
You can listen to him talk about it!

https://www.youtube.com/watch?v=ZbFM3rn4ldo


As a counterpoint to a lot of the speculation in this thread, if you're interested in learning more about how and why we designed Python in Excel, I wrote up a doc (it's quite old but still captures the core design well) here [1]. Disclosure: I was a founding member of the design team for the feature.

[1] https://notes.iunknown.com/python-in-excel/Book+of+Python+in...


I'm genuinely curious why Python instead of something like PowerShell for Excel specifically. It seems a little out of left field, but I also get that it's a more widely adopted language.


Python is the most popular language for data analysis with a rich ecosystem of existing libraries for that task.

Incidentally, I've worked on many products in the past, and I've never seen anything that approaches the level of product-market fit that this feature has.

Also, this is the work of many people at the company. To them goes the real credit for shipping it and getting it out the door to customers.


To associate Excel with all those third-party Python analytics packages. Monte Carlo simulation comes to mind; in the distant past, that required an expensive third-party Excel plug-in.


This was interesting, but it seems focused on how, not why. For example, why not Python as an alternative to VBA, and why cloud-only?


It seems like this is an orchestration layer that runs on Apple Silicon, given that the ChatGPT integration looks like an API call from that layer. It's not clear to me what is being computed in the "private cloud compute".


If I understand correctly, there are three things here:

- on-device models, which will power any task they're able to, including summarisation and conversation with Siri

- private cloud compute models (still controlled by Apple), for when it wants to do something bigger that requires more compute

- external LLM APIs (only ChatGPT for now), for when the above decide it would handle the given prompt better; the user is always asked for confirmation first


The second point makes sense. It gives Apple optionality to cut off the external LLMs at a later date if they want to. I wonder what % of requests will be handled by the private cloud models vs. local. I would imagine TTS and ASR are local for latency reasons. Natural language classifiers would certainly run on-device. I wonder if summarization and rewriting will, though; those are more complex and definitely benefit from larger models.


I bought my shareware copy of DOOM from WBB, alongside many other books.


This is a giant dataset of 536GB of embeddings. I wonder how much compression is possible by training or fine-tuning a transformer model directly using these embeddings, i.e., no tokenization/decoding steps? Could a 7B or 14B model "memorize" Wikipedia?
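One way to read "train directly on the embeddings" is to drop the token embedding and output head entirely and predict the next chunk's embedding vector instead. A minimal PyTorch sketch of that idea; all dimensions and the cosine objective are chosen arbitrarily for illustration, not taken from the dataset:

    import torch
    import torch.nn as nn

    # Hypothetical: feed precomputed chunk embeddings straight into a transformer
    # trunk, skipping the token-embedding lookup and any decoding step.
    class EmbeddingLM(nn.Module):
        def __init__(self, embed_dim=768, hidden_dim=2048, n_layers=8, n_heads=16):
            super().__init__()
            self.proj_in = nn.Linear(embed_dim, hidden_dim)   # stored vectors -> model width
            layer = nn.TransformerEncoderLayer(hidden_dim, n_heads, batch_first=True)
            self.trunk = nn.TransformerEncoder(layer, n_layers)
            self.proj_out = nn.Linear(hidden_dim, embed_dim)  # predict the next chunk embedding

        def forward(self, emb_seq):                           # (batch, seq, embed_dim)
            mask = nn.Transformer.generate_square_subsequent_mask(emb_seq.size(1))
            h = self.trunk(self.proj_in(emb_seq), mask=mask)  # causal: position t sees only <= t
            return self.proj_out(h)

    model = EmbeddingLM()
    x = torch.randn(4, 128, 768)                              # 4 sequences of 128 chunk embeddings
    pred = model(x[:, :-1])                                   # predict embedding t+1 from embeddings <= t
    loss = 1 - nn.functional.cosine_similarity(pred, x[:, 1:], dim=-1).mean()
    loss.backward()

"Memorization" would then be measured in embedding space (how close the predicted vectors land to the stored ones), not in token space.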


How large is the set of binaries needed to do this training job? The current PyTorch + CUDA ecosystem is incredibly gigantic, and manipulating those container images is painful because they are so large. I was hopeful that this would be the beginning of a much smaller training/fine-tuning stack.


That is 100% my intention and hope, and I think we are very close to deleting all of that. Right now on master, I am only using Python for the tokenization preprocessing. In principle, the requirements for llm.c should be extremely minimal. I think this is a few days of work, and it is high on my mind.

Biggest problem right now is finding a place that can host the 135GB of tokens for FineWeb100B. Will probably use S3 or something.

Related see: https://github.com/karpathy/llm.c/issues/482


Could this be a good case for a torrent?


I wonder if enhanced operation of the lymphatic system might be a cause of better mental health outcomes. The lymphatic system doesn't have a "pump", so it relies on muscle contraction to drive circulation. So more movement/exercise drives more lymphatic activity, which may lead to better outcomes, especially if mental disorders are correlated with the buildup of waste materials in the brain.


Well, the brain is a heavily vascularized organ. Proper sleep and exercise are both crucial for cleaning up waste products, and supplying nutrients and good blood flow.


This is exactly the kind of task that I want to deploy a long context window model on: "rewrite Thinking Fast and Slow taking into account the current state of research. Oh, and do it in the voice, style and structure of Tim Urban complete with crappy stick figure drawings."


Then we just need the LLM that will rewrite your book taking into account the current state of LLM hallucination behaviour.


Not me. If I'm going to take the time to read something, I want it to have been written, reviewed, and edited by a human. There is far too much high-fidelity information that I'm missing out on to spend time on low-fidelity stuff.


Most human authors are frankly far too stupid to be worth reading, even if they do put care into their work.

This, IMO, is the actual biggest problem with LLMs training on whatever the biggest available text corpus is: they don't account for the fact that not all text is equally worthy of next-token prediction. This problem is completely solvable, almost trivially so, but I haven't seen anyone publicly describe a (scaled, in production) solution yet.


> This problem is completely solvable, almost trivially so, but I haven't seen anyone publicly describe a (scaled, in production) solution yet.

Can you explain your solution?


I imagine it looks something like "Censor all writing that contradicts my worldview"


It hardly matters what sources you are using if you filter them through something that has less understanding than a two-year-old (if any), no matter how eloquently it can express itself.


Then don't copy and paste your copy of Thinking Fast and Slow into your AI along with my prompt?


(My comment was less about my behavior than an attempt to encourage others to evaluate my thinking, in hopes that they may apply it to their own and benefit our collective understanding.)


Same! Just earlier today I was wanting to do this with "The Moral Animal" and "Guns, Germs, and Steel."

It's probably the AI thing I'm most excited about, and I suspect we're not far from that, although I'm betting the copyright battles are the primary obstacle to such a future at this point.


The thing with Guns, Germs, and Steel is that it makes everything essentially about geographic determinism. There's another book (Why the West Rules--For Now, written before China had really fully emerged on the world stage) which argues that, yes, geography played an important role in which cores emerged earliest. But if you look at the sweep of history, the eastern core was arguably more advanced than the western core at various times. So a head start can't be the only answer.


The book specifically considers Eurasia to be one geographical region, and it does acknowledge the technological developments in China. The fact that Europe became the winner in this race, according to GGS, is a sign that while geography is important, it does not determine the course of history. It is not all about geographic determinism.


It is a snapshot in time, and so wrong if viewed in a longer context.

People from Europe came to have the Industrial Revolution at just the correct moment.

Some small changes in history and it would have happened in India.

It is making a theory to fit the facts.

I do not think the author is a "white supremacist", but the book reads like that: taking all the accidents of history and making them seem like destiny, as though Europeans were meant to rule the world (they do not, they never did, and they are fading from world domination fast).


I enjoyed both GGS and WTWRFN, but in a mode where I basically ignored the thesis, reading instead for the factual information so clearly presented. Like the coverage of the Polynesian diaspora in GGS that has really stuck with me.

Thinking Fast & Slow was a fun read, but I did not retain much more than the basic System I/II concept which I find is a useful device.


I thought the OP was joking!


It's not even clear that the dual process system 1/system 2 metaphor is accurate at all, so it may not be possible to redeem a book whose thesis is that the dual process model is correct.

It's not just that individual studies have failed to replicate. The whole field was in question at least a decade before the book was written, and since then many of the foundational results have failed to replicate. The book was in a sense the last hurrah of a theory that was on its way out, and the replication crisis administered the coup de grace IMO.


>This is exactly the kind of task that I want to deploy a long context window model on: "rewrite Thinking Fast and Slow taking into account the...

I want something similar but for children's literature. From The Mouse and the Motorcycle to Peter Pan, a lot of stuff doesn't hold up.

The books provide plenty of utility. But many things don't hold up to modern thinking. LLMs provide the opportunity to embrace classic content while throwing off the need to improvise as one parses outmoded ideas.


It will not be anything like classic content anymore.

You cannot redact a piece of art to remove its "old ideas". It is like redrawing classic paintings to mask the nipples and remove the blood.

And as for books that could be redacted this way without falling apart: well, don't read such books at all, and don't feed them to children.

Literature for children must not be dumbed down; it should be written just as for adults, only better.


It isn't redaction but reasonable and artful substitution. It isn't about dumbing down, but removing dumb ideas.


Maybe use ChatGPT to make this make sense.


I would actually like to have books that had "Thinking Fast and Slow" as a prerequisite. Many data visualization books could be summed up as: a bar chart is easily consumed by System 1, while visual noise creates mental strain on System 2.


"please finish game of thrones treating the impending zombie invasion as an allegory for global warming"

Also please omit "who has a better story than bran"


Didn't George say it is such an allegory?


Awesome prompt!


For your use case, what if you could use Python in Excel with pandas, numpy and the Anaconda ecosystem readily available? How would that change things?

Disclosure: I was a founding member of the Python in Excel team and am looking for new problems that Python in Excel could solve.
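For context, a Python in Excel cell is entered with =PY( and the cell body is ordinary Python; the xl() helper pulls ranges or tables in as pandas objects. A rough illustration (the table and column names below are made up):

    # Entered in a cell as =PY(...); xl() returns the referenced data as a DataFrame.
    df = xl("SalesTable[#All]", headers=True)       # hypothetical table name
    summary = df.groupby("Region")["Revenue"].agg(["sum", "mean"])
    summary                                          # the last expression is returned to the grid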


IMO this still doesn't change the fact that Excel is a 2D grid. Dealing with multi-dimensional data will always be tricky in that paradigm. Also, you'll still have cell references, no version control, no access control, ...

Excel is an amazing product and I'm sure people will still use it in 10 years. Our thesis is that for financial planning (and various other number-crunching use-cases) our building blocks make more sense.


> doesn't change the fact that Excel is a 2D grid

Or 3D, if the third dimension is tabs.


> looking for new problems that Python in Excel could solve

You could solve releasing Python in Excel for macOS.


That is, of course, on the roadmap. I use a Mac as my daily driver so this one's personal :)


RE: RAG: they haven't released pricing, but if input tokens are priced at GPT-4 levels ($0.01/1K), then sending 10M tokens will cost you $100.


In the announcements today they also halved the pricing of Gemini 1.0 Pro to $0.000125/1K characters, which is a quarter of GPT-3.5 Turbo, so it could potentially be a bit lower than GPT-4 pricing.
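Back-of-the-envelope conversion, assuming roughly 4 characters per token for English text (that ratio is my assumption, not a published figure):

    # Rough per-token comparison under the ~4 chars/token assumption.
    gemini_per_1k_chars = 0.000125
    chars_per_token = 4                                     # assumption
    gemini_per_1k_tokens = gemini_per_1k_chars * chars_per_token
    gpt4_input_per_1k_tokens = 0.01
    print(gemini_per_1k_tokens)                             # 0.0005
    print(gpt4_input_per_1k_tokens / gemini_per_1k_tokens)  # ~20x cheaper than GPT-4 input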


If you think the current APIs will stay that way, then you're right. But when they start offering dedicated chat instances or caching options, you could be back in the penny region.

You probably need a couple of GB to cache a conversation. That's not so easy at the moment, because you have to transfer that data to and from the GPUs and store it somewhere.
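Rough arithmetic behind that "couple of GB" figure, assuming a 7B-class model with standard multi-head attention and an fp16 KV cache (all the shapes below are assumptions):

    # KV cache bytes per token = layers * 2 (K and V) * heads * head_dim * bytes/value
    n_layers, n_heads, head_dim = 32, 32, 128    # assumed 7B-class shape
    bytes_per_value = 2                          # fp16
    per_token = n_layers * 2 * n_heads * head_dim * bytes_per_value
    print(per_token / 1e6, "MB per token")                        # ~0.52 MB
    print(per_token * 8_000 / 1e9, "GB for an 8K-token context")  # ~4.2 GB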


The tokens need to be fed into the model along with the prompt, and this takes time. Naive attention is O(N^2). They probably use at least flash attention, and likely something more exotic tuned to their hardware.

You'll notice in their video [1] that they never show the prompts running interactively. This is for a roughly 800K context. They claim that "the model took around 60s to respond to each of these prompts".

This is not really usable as an interactive experience. I don't want to wait 1 minute for an answer each time I have a question.

[1] https://www.youtube.com/watch?v=SSnsmqIj1MI
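The quadratic term is the issue: a rough comparison of how the attention-score work grows with context length (pure arithmetic, ignoring hardware and kernel details):

    # Naive attention builds an N x N score matrix per layer and head.
    for n in (8_000, 128_000, 800_000):
        print(f"{n:>7} tokens -> {n**2:.1e} score entries per layer/head")
    # An 800K context has ~10,000x the attention work of an 8K context,
    # so a ~60s prefill is not surprising even with optimized kernels.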


GP's point is that you can cache the state after the model has processed the super-long context but before it ingests your prompt.

If you are going to ask "then why don't OpenAI do it now", the answer is that it takes a lot of storage (and IO), so it may not be worth it for shorter contexts; it adds significant complexity to the entire serving stack; and it doesn't fit with how OpenAI originally imagined the "custom-ish" LLM serving game would go: they bet on fine-tuning and dedicated instances instead of long context.

The tradeoff can be reflected in the API and pricing; LLM APIs don't have to look like OpenAI's. What if you had an endpoint to generate a "cache" of your context (or really, of a prefix of your prompt), billed as usual per token, and could then reuse that prefix for a fixed price no matter how long it is?
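A sketch of what such an endpoint could look like; this API is entirely hypothetical (the URLs, fields, and model name are made up), just to show the pricing shape:

    # Hypothetical prefix-cache API: pay per token once to build the cached prefix,
    # then reuse it at a flat per-request price.
    import requests

    cache = requests.post("https://api.example.com/v1/context_cache",
                          json={"model": "long-context-model",
                                "prefix": open("corpus.txt").read()}).json()

    answer = requests.post("https://api.example.com/v1/chat",
                           json={"model": "long-context-model",
                                 "cache_id": cache["id"],        # reuse the precomputed state
                                 "prompt": "Summarize chapter 3."}).json()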


Do you have examples of where this has been done? Based on my understanding, you can do things like cache the embeddings to avoid the tokenization/embedding cost, but you still need to do a forward pass through the model with the new user prompt and the cached context. That is where the naive O(N^2) complexity comes from, and that is the cost that cannot be avoided (because the whole point is to present the next user prompt to the model along with the cached context).
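Note that if the cached state is the KV cache (the per-layer keys and values), the forward pass over the cached context itself doesn't have to be repeated: only the new prompt tokens are processed, attending to the cached keys/values. A rough comparison of the attention work (pure arithmetic):

    # Answering an m-token prompt against an N-token context:
    N, m = 800_000, 2_000
    full_prefill = (N + m) ** 2          # reprocess everything, no cache
    with_kv_cache = m * (N + m)          # only new tokens attend to cached K/V
    print(full_prefill / with_kv_cache)  # ~400x less attention work with a cached prefix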

