I think you are right that there is some deep intuition about current models that takes months, if not years, to hone. However, the intuition of someone who did nothing but talk to and use LLMs nonstop two years ago would be just as good today as that of someone who started from scratch, if not worse, because of antipatterns that don’t apply anymore, such as always starting a new chat and never using a CLI because of context drift.
Also, Simon, with all due respect, and I mean it, I genuinely look in awe at the number of posts you have on your blog and your dedication, but it’s clear to anyone that the projects you created and launched before 2022 far exceed anything you’ve done since. And I will be the first to say that I don’t think that’s because of LLMs not being able to help you. But I do think it’s because you kept replacing what makes you really, really good at engineering with LLMs, slowly but surely, more and more by the month.
If I look at Django, I can clearly see your intelligence, passion, and expertise there. Do you feel that any of the projects you’ve written since LLMs became your main focus are similar?
Think about it this way:
100% of you wins against 100% of me any day.
100% of Claude running on your computer is the same as 100% of Claude running on mine.
95% of Claude and 5% of you, while still better than me (and your average Joe), is nowhere near the same jump from 95% Claude and 5% me.
I do worry when I see great programmers like you diluting their work.
My great regret from the past few years is that experimenting with LLMs has been such a huge distraction from my other work! My https://llm.datasette.io/ tool is from that era though, and it's pretty cool.
I do think your Datasette work is fantastic and I genuinely hope you take my previous message the right way. I’m not saying you’re doing something bad, quite the opposite: I feel like we need more of you, and I’m afraid that because of LLMs we get less of you.
It used to be that the bots had a short context window, and they struggled with getting confused by past context, so it was much better to make a new chat every now and then to keep the thread on track.
The opposite is true now. The context windows are enormous, and the bots are able to stay on task extremely well. They're able to utilize any previous context you've provided as part of the conversation for the new task, which improves their performance.
The new pattern I am using is a master chat that I only ever change if I am doing something entirely different.
That’s cool. I know context windows are arbitrarily larger now because consumers think that larger window = better, but I think the sentiment that the model can’t even use the window effectively still stands?
I still find LLMs perform best with a potent and focussed context to work with, and performance goes down quite significantly the more context it has.
I worked at a startup experimenting with gemini-2.0-flash (the year-old model), using its full 1m context window to query technical documents. We found it to be extremely successful at needle-in-a-haystack type problems.
As we migrated to newer models (gemini-3.0 and the o4-mini models) we again found they performed even better with x00k tokens. Our system prompt grew to about 20k tokens and the bots were able to handle it perfectly. Our issue became time to first token with large context, rather than the bot quality.
The ultra large 1m+ llama models were reported to be ineffective at >1m context. But at this point, it becomes so cost prohibitive to use anyway.
I am continuing to have success using Cursor's Auto model and GPT-5.1 with extremely long conversations. I use different chats for different problems more for my own compartmentalisation of thoughts than as a necessity for the bot.
> 95% of Claude and 5% of you, while still better than me (and your average Joe), is nowhere near the same jump from 95% Claude and 5% me.
I see what you're saying, but I'm not sure it is true. Take simonw and tymscar, put them each in charge of a team of 19 engineers (of identical capabilities). Is the result "nowhere near the same jump" as simonw vs. tymscar alone? I think it's potentially a much bigger jump, if there are differences in who has better ideas and not just who can code the fastest.
Yeah... and besides managerial skills, also product (using the word loosely) sense, user empathy, clarity of vision, communication skills. They've always been multipliers for programmers, even more so in this moment.
It's one of the main reasons why I buy on Steam.
Makes games much more engaging, especially for me because I prefer hard action games with little to no story.
DDR5 is ~8GT/s, GDDR6 is ~16GT/s, GDDR7 is ~32GT/s. It's faster but the difference isn't crazy and if the premise was to have a lot of slots then you could also have a lot of channels. 16 channels of DDR5-8200 would have slightly more memory bandwidth than RTX 4090.
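As a rough check on that last claim, here is a minimal sketch of the peak-bandwidth arithmetic. It assumes 64-bit DDR5 channels and, for the RTX 4090, a 384-bit bus running GDDR6X at 21 GT/s; the 21 GT/s figure is not from the comment above and is an assumption of this sketch. The function name is made up for illustration.

```python
def bandwidth_gb_s(transfer_rate_mt_s: float, bus_width_bits: int) -> float:
    """Peak theoretical bandwidth in GB/s: transfers per second times bytes per transfer."""
    bytes_per_transfer = bus_width_bits / 8
    return transfer_rate_mt_s * 1e6 * bytes_per_transfer / 1e9


# 16 channels of DDR5-8200, each channel 64 bits wide.
ddr5_16ch = bandwidth_gb_s(8200, 16 * 64)

# RTX 4090: 384-bit bus, assumed 21 GT/s GDDR6X (assumption, see above).
rtx_4090 = bandwidth_gb_s(21000, 384)

print(f"16ch DDR5-8200: {ddr5_16ch:.1f} GB/s")  # 1049.6 GB/s
print(f"RTX 4090:       {rtx_4090:.1f} GB/s")   # 1008.0 GB/s
```

These are theoretical peaks only; sustained bandwidth on either platform would be lower.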
Yeah, so DDR5 is 8GT/s and GDDR7 is 32GT/s.
Bus width is 64 vs 384. That already makes the VRAM 4*6 (24) times faster.
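Spelled out, the 24x figure compares a single 64-bit DDR5 channel against a full 384-bit GDDR7 bus, using the rounded transfer rates quoted above. A quick sketch of that arithmetic (variable names are illustrative only):

```python
# Rounded transfer rates from the thread, in GT/s.
ddr5_rate_gt_s = 8
gddr7_rate_gt_s = 32

# One DDR5 channel vs. a full GPU memory bus, in bits.
ddr5_bus_bits = 64
gddr7_bus_bits = 384

rate_ratio = gddr7_rate_gt_s / ddr5_rate_gt_s   # 4.0
width_ratio = gddr7_bus_bits / ddr5_bus_bits    # 6.0
total_ratio = rate_ratio * width_ratio          # 24.0

# How many 64-bit channels it takes to match the GPU bus width.
channels_to_match_width = gddr7_bus_bits // ddr5_bus_bits  # 6

print(total_ratio, channels_to_match_width)
```

The width half of the ratio shrinks as channels are added; the rate half does not.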
You can add more channels, sure, but each channel makes it less and less likely for you to boot. Look at modern AM5 struggling to boot at over 6000 with more than two sticks.
So you’d have to get an insane six channels to match the bus width, at which point your only choice to be stable would be to lower the speed so much that you’re back to the same orders of magnitude difference, really.
Now we could instead solder that RAM, move it closer to the GPU and cross-link channels to reduce noise. We could also increase the speed and oh, we just invented soldered-on GDDR…
The bus width is the number of channels. They don't call them channels when they're soldered but 384 is already the equivalent of 6. The premise is that you would have more. Dual socket Epyc systems already have 24 channels (12 channels per socket). It costs money but so does 256GB of GDDR.
> Look at modern AM5 struggling to boot at over 6000 with more than two sticks.
The relevant number for this is the number of sticks per channel. With 16 channels and 64GB sticks you could have 1TB of RAM with only one stick per channel. Use CAMM2 instead of DIMMs and you get the same speed and capacity from 8 slots.
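The capacity and slot counts can be checked the same way. This sketch assumes each CAMM2 module carries two 64-bit channels (a dual-channel module), which is not stated in the comment; the per-module capacity is simply what falls out of that assumption.

```python
channels = 16
stick_gb = 64
sticks_per_channel = 1

# Total capacity with one DIMM per channel.
total_gb = channels * sticks_per_channel * stick_gb  # 1024 GB, i.e. 1 TB

# Assumption: one CAMM2 module serves two channels.
camm2_channels_per_module = 2
camm2_slots = channels // camm2_channels_per_module  # 8

# Capacity each CAMM2 module would need to reach the same total.
camm2_module_gb = total_gb // camm2_slots  # 128

print(total_gb, camm2_slots, camm2_module_gb)
```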
But it would still be faster than splitting the model up on a cluster though, right? But I’ve also wondered why they haven’t just shipped gpus like cpus.
Man I'd love to have a GPU socket. But it'd be pretty hard to get a standard going that everyone would support. Look at sockets for CPUs, we barely had cross over for like 2 generations.
But boy, a standard GPU socket so you could easily BYO cooler would be nice.
The problem isn't the sockets. It costs a lot to spec and build new sockets, we wouldn't swap them for no reason.
The problem is that the signals and features that the motherboard and CPU expect are different between generations. We use different sockets on different generations to prevent you plugging in incompatible CPUs.
We used to have cross-generational sockets in the 386 era because the hardware supported it. Motherboards weren't changing so you could just upgrade the CPU. But then the CPUs needed different voltages than before for performance. So we needed a new socket to not blow up your CPU with the wrong voltage.
That's where we are today. Each generation of CPU wants different voltages, power, signals, a specific chipset, etc. Within the same ±1 generation you can swap CPUs because they're electrically compatible.
To have universal CPU sockets, we'd need a universal electrical interface standard, which is too much of a moving target.
AMD would probably love to never have to tool up a new CPU socket. They don't make money on the motherboard you have to buy. But the old motherboards just can't support new CPUs. Thus, new socket.
You're right, but loads of times I just left that there because I probably did something more involved in the map that I ended up deleting later without realising.
This sounds like the kind of situation where the LSP could suggest the simpler code, I'll see if there's an issue for it already and suggest it if not.
Elixir has one opinionated formatter -- Quokka -- that will rewrite the code above properly. It can also reuse linting rules as rewrite policies. Love using it.
You most likely asked an AI for this. They always think there is an `if` keyword in case statements in Gleam. There isn't one, sadly.
EDIT: I am wrong. Apparently there is one, but it's a bit of a strange thing: it can only be used as a guard on clauses in `case` expressions, and without doing any calculations.