
You can listen to him talk about it!

https://www.youtube.com/watch?v=ZbFM3rn4ldo


As a counterpoint to a lot of the speculation in this thread, if you're interested in learning more about how and why we designed Python in Excel, I wrote up a doc (it's quite old but still captures the core design well) here [1]. Disclosure: I was a founding member of the design team for the feature.

[1] https://notes.iunknown.com/python-in-excel/Book+of+Python+in...


I'm genuinely curious why Python instead of something like PowerShell for Excel specifically. It seems a little out of left field, but I also get that it's a more widely adopted language.


Python is the most popular language for data analysis with a rich ecosystem of existing libraries for that task.

Incidentally, I've worked on many products in the past, and I've never seen anything that approaches the level of product-market fit that this feature has.

Also, this is the work of many people at the company. To them goes the real credit for shipping it and getting it out the door to customers.


To associate Excel with all those third-party Python analytics packages. Monte Carlo simulation comes to mind; in the distant past, that required an expensive third-party Excel plug-in.


This was interesting, but it seems focused on how, not why. For example, why not Python as an alternative to VBA, and why cloud-only?


It seems like this is an orchestration layer that runs on Apple Silicon, given that the ChatGPT integration looks like an API call from that layer. It's not clear to me what is being computed in the "private cloud compute".


If I understand correctly, there are three things here:

- on-device models, which will power any task they're able to, including summarisation and conversation with Siri

- private cloud compute models (still controlled by Apple), for when it wants to do something bigger that requires more compute

- external LLM APIs (only ChatGPT for now), for when the above decide it would handle the given prompt better; the user is always asked for confirmation first


The second point makes sense. It gives Apple optionality to cut off the external LLMs at a later date if they want to. I wonder what % of requests will be handled by the private cloud models vs. local. I would imagine TTS and ASR are local for latency reasons. Natural language classifiers would certainly run on-device. I wonder if summarization and rewriting will, though; those are more complex and definitely benefit from larger models.


I bought my shareware copy of DOOM from WBB, alongside many other books.


This is a giant dataset of 536GB of embeddings. I wonder how much compression is possible by training or fine-tuning a transformer model directly using these embeddings, i.e., no tokenization/decoding steps? Could a 7B or 14B model "memorize" Wikipedia?
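One way to read "train directly on the embeddings" is to drop the token embedding and output head entirely and predict the next chunk's embedding vector instead. A minimal PyTorch sketch of that idea; all dimensions and the cosine objective are chosen arbitrarily for illustration, not taken from the dataset:

    import torch
    import torch.nn as nn

    # Hypothetical: feed precomputed chunk embeddings straight into a transformer
    # trunk, skipping the token-embedding lookup and any decoding step.
    class EmbeddingLM(nn.Module):
        def __init__(self, embed_dim=768, hidden_dim=2048, n_layers=8, n_heads=16):
            super().__init__()
            self.proj_in = nn.Linear(embed_dim, hidden_dim)   # stored vectors -> model width
            layer = nn.TransformerEncoderLayer(hidden_dim, n_heads, batch_first=True)
            self.trunk = nn.TransformerEncoder(layer, n_layers)
            self.proj_out = nn.Linear(hidden_dim, embed_dim)  # predict the next chunk embedding

        def forward(self, emb_seq):                           # (batch, seq, embed_dim)
            mask = nn.Transformer.generate_square_subsequent_mask(emb_seq.size(1))
            h = self.trunk(self.proj_in(emb_seq), mask=mask)  # causal: position t sees only <= t
            return self.proj_out(h)

    model = EmbeddingLM()
    x = torch.randn(4, 128, 768)                              # 4 sequences of 128 chunk embeddings
    pred = model(x[:, :-1])                                   # predict embedding t+1 from embeddings <= t
    loss = 1 - nn.functional.cosine_similarity(pred, x[:, 1:], dim=-1).mean()
    loss.backward()

"Memorization" would then be measured in embedding space (how close the predicted vectors land to the stored ones), not in token space.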


How large is the set of binaries needed to do this training job? The current PyTorch + CUDA ecosystem is incredibly gigantic, and manipulating those container images is painful because they are so large. I was hopeful that this would be the beginning of a much smaller training/fine-tuning stack.


That is 100% my intention and hope, and I think we are very close to deleting all of that. Right now on master, I am only using Python for the tokenization preprocessing. In principle, the requirements for llm.c should be extremely minimal. I think this is a few days of work, and it is high on my mind.

Biggest problem right now is finding a place that can host the 135GB of tokens for FineWeb100B. Will probably use S3 or something.

Related see: https://github.com/karpathy/llm.c/issues/482


Could this be a good case for a torrent?


I wonder if enhanced operation of the lymphatic system might be a cause of better mental health outcomes. The lymphatic system doesn't have a "pump", so it relies on muscle contraction to drive circulation. So more movement/exercise drives more lymphatic activity, which may lead to better outcomes, especially if mental disorders are correlated with the buildup of waste materials in the brain.


Well, the brain is a heavily vascularized organ. Proper sleep and exercise are both crucial for cleaning up waste products, and supplying nutrients and good blood flow.


This is exactly the kind of task that I want to deploy a long context window model on: "rewrite Thinking Fast and Slow taking into account the current state of research. Oh, and do it in the voice, style and structure of Tim Urban complete with crappy stick figure drawings."


Then we just need the LLM that will rewrite your book taking into account the current state of LLM hallucination behaviour.


Not me. If I'm going to take the time to read something, I want it to have been written, reviewed, and edited by a human. There is far too much high-fidelity information that I'm missing out on to spend time on low-fidelity stuff.


Most human authors are frankly far too stupid to be worth reading, even if they do put care into their work.

This, IMO, is the actual biggest problem with LLMs training on whatever the biggest available text corpus is: they don't account for the fact that not all text is equally worthy of next-token prediction. This problem is completely solvable, almost trivially so, but I haven't seen anyone publicly describe a (scaled, in production) solution yet.


> This problem is completely solvable, almost trivially so, but I haven't seen anyone publicly describe a (scaled, in production) solution yet.

Can you explain your solution?


I imagine it looks something like "Censor all writing that contradicts my worldview"


It hardly matters what sources you are using if you filter them through something that has less understanding than a two-year-old (if any), no matter how eloquently it can express itself.


Then don't copy and paste your copy of Thinking Fast and Slow into your AI along with my prompt?


(My comment was less about my behavior than an attempt to encourage others to evaluate my thinking, in hopes that they may apply it to their own and benefit our collective understanding.)


Same! Just earlier today I was wanting to do this with "The Moral Animal" and "Guns, Germs, and Steel."

It's probably the AI thing I'm most excited about, and I suspect we're not far from that, although I'm betting the copyright battles are the primary obstacle to such a future at this point.


The thing with Guns, Germs, and Steel is that it makes everything essentially about geographic determinism. There's another book (Why the West Rules--For Now, written before China had really fully emerged on the world stage) which argues that, yes, geography played an important role in which cores emerged earliest. But if you look at the sweep of history, the eastern core was arguably more advanced than the western core at various times. So a head start can't be the only answer.


The book specifically considers Eurasia to be one geographical region, and it does acknowledge the technological developments in China. The fact that Europe became the winner in this race, according to GGS, is a sign that while geography is important, it does not determine the course of history. It is not all about geographic determinism.


It is a snapshot in time, and so wrong if viewed in a longer context.

People from Europe came to have the Industrial Revolution at just the correct moment.

Some small changes in history and it would have happened in India.

It is making a theory to fit the facts.

I do not think the author is a "white supremacist", but the book reads like that: taking all the accidents of history and making them seem like destiny, as though Europeans were meant to rule the world (they do not, they never did, and they are fading from world domination fast).


I enjoyed both GGS and WTWRFN, but in a mode where I basically ignored the thesis, reading instead for the factual information so clearly presented. Like the coverage of the Polynesian diaspora in GGS that has really stuck with me.

Thinking Fast & Slow was a fun read, but I did not retain much more than the basic System I/II concept which I find is a useful device.


I thought the OP was joking!


It's not even clear that the dual process system 1/system 2 metaphor is accurate at all, so it may not be possible to redeem a book whose thesis is that the dual process model is correct.

It's not just that individual studies have failed to replicate. The whole field was in question at least a decade before the book was written, and since then many of the foundational results have failed to replicate. The book was in a sense the last hurrah of a theory that was on its way out, and the replication crisis administered the coup de grace IMO.


>This is exactly the kind of task that I want to deploy a long context window model on: "rewrite Thinking Fast and Slow taking into account the...

I want something similar but for children's literature. From The Mouse and the Motorcycle to Peter Pan, a lot of stuff doesn't hold up.

The books provide plenty of utility. But many things don't hold up to modern thinking. LLMs provide the opportunity to embrace classic content while throwing off the need to improvise as one parses outmoded ideas.


It will not be anything like classic content anymore.

You cannot redact a piece of art to remove its "old ideas". It is like redrawing classic paintings to mask the nipples and remove the blood.

And as for books that could be redacted this way without falling apart: well, don't read such books at all, and don't feed them to children.

Literature for children must not be dumbed down; it should be written just as for adults, only better.


It isn't redaction but reasonable and artful substitution. It isn't about dumbing down, but removing dumb ideas.


Maybe use ChatGPT to make this make sense.


I would actually like to have books that had "Thinking Fast and Slow" as a prerequisite. Many data visualization books could be summed up as: a bar chart is easily consumed by System 1, while visual noise creates mental strain on System 2.


"please finish game of thrones treating the impending zombie invasion as an allegory for global warming"

Also please omit "who has a better story than bran"


Didn't George say it is such an allegory?


Awesome prompt!


For your use case, what if you could use Python in Excel with pandas, numpy and the Anaconda ecosystem readily available? How would that change things?

Disclosure: I was a founding member of the Python in Excel team and am looking for new problems that Python in Excel could solve.
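For context, a Python in Excel cell is entered with =PY( and the cell body is ordinary Python; the xl() helper pulls ranges or tables in as pandas objects. A rough illustration (the table and column names below are made up):

    # Entered in a cell as =PY(...); xl() returns the referenced data as a DataFrame.
    df = xl("SalesTable[#All]", headers=True)       # hypothetical table name
    summary = df.groupby("Region")["Revenue"].agg(["sum", "mean"])
    summary                                          # the last expression is returned to the grid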


IMO this still doesn't change the fact that Excel is a 2D grid. Dealing with multi-dimensional data will always be tricky in that paradigm. Also, you'll still have cell references, no version control, no access control, ...

Excel is an amazing product and I'm sure people will still use it in 10 years. Our thesis is that for financial planning (and various other number-crunching use-cases) our building blocks make more sense.


> doesn't change the fact that Excel is a 2D grid

Or 3D, if the third dimension is tabs.


> looking for new problems that Python in Excel could solve

You could solve releasing Python in Excel for macOS.


That is, of course, on the roadmap. I use a Mac as my daily driver so this one's personal :)


RE: RAG: they haven't released pricing, but if input tokens are priced at GPT-4 levels ($0.01/1K), then sending 10M tokens will cost you $100.


In the announcements today they also halved the pricing of Gemini 1.0 Pro to $0.000125/1K characters, which is a quarter of GPT-3.5 Turbo, so it could potentially be a bit lower than GPT-4 pricing.
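Back-of-the-envelope conversion, assuming roughly 4 characters per token for English text (that ratio is my assumption, not a published figure):

    # Rough per-token comparison under the ~4 chars/token assumption.
    gemini_per_1k_chars = 0.000125
    chars_per_token = 4                                     # assumption
    gemini_per_1k_tokens = gemini_per_1k_chars * chars_per_token
    gpt4_input_per_1k_tokens = 0.01
    print(gemini_per_1k_tokens)                             # 0.0005
    print(gpt4_input_per_1k_tokens / gemini_per_1k_tokens)  # ~20x cheaper than GPT-4 input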


If you think the current APIs will stay that way, then you're right. But when they start offering dedicated chat instances or caching options, you could be back in the penny region.

You probably need a couple of GB to cache a conversation. That's not so easy at the moment, because you have to transfer that data to and from the GPUs and store it somewhere.
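Rough arithmetic behind that "couple of GB" figure, assuming a 7B-class model with standard multi-head attention and an fp16 KV cache (all the shapes below are assumptions):

    # KV cache bytes per token = layers * 2 (K and V) * heads * head_dim * bytes/value
    n_layers, n_heads, head_dim = 32, 32, 128    # assumed 7B-class shape
    bytes_per_value = 2                          # fp16
    per_token = n_layers * 2 * n_heads * head_dim * bytes_per_value
    print(per_token / 1e6, "MB per token")                        # ~0.52 MB
    print(per_token * 8_000 / 1e9, "GB for an 8K-token context")  # ~4.2 GB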


The tokens need to be fed into the model along with the prompt, and this takes time. Naive attention is O(N^2). They probably use at least flash attention, and likely something more exotic tuned to their hardware.

You'll notice in their video [1] that they never show the prompts running interactively. This is for a roughly 800K context. They claim that "the model took around 60s to respond to each of these prompts".

This is not really usable as an interactive experience. I don't want to wait 1 minute for an answer each time I have a question.

[1] https://www.youtube.com/watch?v=SSnsmqIj1MI
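The quadratic term is the issue: a rough comparison of how the attention-score work grows with context length (pure arithmetic, ignoring hardware and kernel details):

    # Naive attention builds an N x N score matrix per layer and head.
    for n in (8_000, 128_000, 800_000):
        print(f"{n:>7} tokens -> {n**2:.1e} score entries per layer/head")
    # An 800K context has ~10,000x the attention work of an 8K context,
    # so a ~60s prefill is not surprising even with optimized kernels.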


GP's point is that you can cache the state after the model has processed the super-long context but before it ingests your prompt.

If you are going to ask "then why don't OpenAI do it now", the answer is that it takes a lot of storage (and IO), so it may not be worth it for shorter contexts; it adds significant complexity to the entire serving stack; and it doesn't fit with how OpenAI originally imagined the "custom-ish" LLM serving game would go: they bet on fine-tuning and dedicated instances instead of long context.

The tradeoff can be reflected in the API and pricing; LLM APIs don't have to look like OpenAI's. What if you had an endpoint to generate a "cache" of your context (or really, of a prefix of your prompt), billed as usual per token, and could then reuse that prefix for a fixed price no matter how long it is?
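A sketch of what such an endpoint could look like; this API is entirely hypothetical (the URLs, fields, and model name are made up), just to show the pricing shape:

    # Hypothetical prefix-cache API: pay per token once to build the cached prefix,
    # then reuse it at a flat per-request price.
    import requests

    cache = requests.post("https://api.example.com/v1/context_cache",
                          json={"model": "long-context-model",
                                "prefix": open("corpus.txt").read()}).json()

    answer = requests.post("https://api.example.com/v1/chat",
                           json={"model": "long-context-model",
                                 "cache_id": cache["id"],        # reuse the precomputed state
                                 "prompt": "Summarize chapter 3."}).json()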


Do you have examples of where this has been done? Based on my understanding, you can do things like cache the embeddings to avoid the tokenization/embedding cost, but you still need to do a forward pass through the model with the new user prompt and the cached context. That is where the naive O(N^2) complexity comes from, and that is the cost that cannot be avoided (because the whole point is to present the next user prompt to the model along with the cached context).
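Note that if the cached state is the KV cache (the per-layer keys and values), the forward pass over the cached context itself doesn't have to be repeated: only the new prompt tokens are processed, attending to the cached keys/values. A rough comparison of the attention work (pure arithmetic):

    # Answering an m-token prompt against an N-token context:
    N, m = 800_000, 2_000
    full_prefill = (N + m) ** 2          # reprocess everything, no cache
    with_kv_cache = m * (N + m)          # only new tokens attend to cached K/V
    print(full_prefill / with_kv_cache)  # ~400x less attention work with a cached prefix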

