At what point do LLMs enable bad engineering practices, if instead of working to abstract or encapsulate toilsome programming tasks we point an expensive slot machine at them and generate a bunch of verbose code and carry on? I'm not sure where the tradeoff leads if there's no longer a pain signal for things that need to be re-thought or re-architected. And when anyone does create a new framework or abstraction, it doesn't have enough prior art for an LLM to adeptly generate, and fails to gain traction.
How much of "good engineering practices" exist because we're trying to make it easy for humans to work with the code?
Pick your favorite GoF design pattern. Is that the best way to do it for the computer, or the best way to do it for the developer?
I'm just making this up now, maybe it's not the greatest example; but, let's consider the "visitor" pattern.
There's some framework that does a big loop and calls the visit() function on each object. If you want to add a new type, you inherit from that interface, implement visit() on your class, and all is well. As a "good" engineering practice, this makes sense to a developer: you don't have to touch much code, and your stuff lives in its own little area. That all feels right to us as developers because we don't have a big context window.
But what if your code was all generated, and you want to add a new type that does something that would have lived in visit()? You tell the LLM, "add this new functionality to the loop for this type of object." Maybe it writes a case statement and puts the logic right in the loop. That "feels" bad if there's a human in the loop, but does it matter to the computer?
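A minimal sketch of the two shapes being contrasted (Circle/Square and the "drew …" strings are illustrative names, not from the original discussion):

```python
# Visitor-style: each type owns its behavior; the framework loop never changes.
class Circle:
    def visit(self):
        return "drew circle"

class Square:
    def visit(self):
        return "drew square"

def render_all(shapes):
    # The framework's "big loop": it only knows about the visit() interface.
    return [s.visit() for s in shapes]

# Generated-code alternative: branch on the concrete type right in the loop.
def render_all_inline(shapes):
    out = []
    for s in shapes:
        if isinstance(s, Circle):
            out.append("drew circle")
        elif isinstance(s, Square):
            out.append("drew square")
    return out
```

Both functions produce identical results; the difference is purely about where a reader (human or model) has to look when a new type is added.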
Yes, we're early: LLMs aren't deterministic, and verification may be hard now. But that may change.
In the context of a higher-level language, y=x/3 and y=x/4 look the same, but I bet the generated assembly does a shift on the latter and a multiply-by-a-constant on the former. While the "developer interface", the source code, looks similar (like writing to a visitor pattern), the generated assembly will look different. Do we care?
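The two tricks compilers typically use can be checked in a few lines. This is a sketch, not the author's code; 0xAAAAAAAB is the standard "magic" reciprocal for unsigned 32-bit division by 3 (it equals ceil(2**33 / 3)), and the shift-by-2 is the power-of-two case:

```python
# x / 4: for unsigned power-of-two divisors, compilers emit a right shift.
def div4(x):
    return x >> 2

# x / 3: compilers emit a multiply by a magic reciprocal plus a shift.
# Valid for any 32-bit unsigned x.
def div3(x):
    return (x * 0xAAAAAAAB) >> 33

# Both agree with ordinary integer division.
for x in (0, 1, 7, 99, 2**32 - 1):
    assert div4(x) == x // 4
    assert div3(x) == x // 3
```

Same "developer interface" (a division), very different machine-level work underneath.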
LLMs have limited working memory, like humans, and most of the practices that increase human programming effectiveness increase LLM effectiveness too. In fact more so, because LLMs are goldfish that retain no mental model between runs, so the docs had better be good, abstractions tight, and coding practices consistent such that code makes sense locally and globally.
So are we basically saying that LLMs work most effectively on codebases that exhibit good coding practices, but are not themselves particularly good at producing such quality code, since they were trained on all the code that exists?
I don't know what conclusion to draw from that. Maybe that there's no such thing as a free lunch, after all.
Code is a design tool, just like lines on an engineering drawing. Most times you do not care if it was with a pen or a pencil, or if it was printed out. But you do care about the cross section of the thing depicted. The only time you care about whether it’s pen or pencil is for preservation.
So I don’t care about assembly because it does not matter usually in any metric. I design using code because that’s how I communicate intent.
If you learn how to draw, you very quickly find that no one talks about lines (which are mostly all you draw); you hear about shapes, texture, edges, values, balance… It's in these higher abstractions that intent resides.
Same with coding. No one thinks in keywords, brackets, or lines of code. Instead, you quickly build higher abstractions, and that's where you live. The upside is that those concepts have no ambiguity.
Great Q, and your framing "there's no longer a pain signal for things that need to be re-thought or re-architected" perfectly encapsulates a concern I hadn't yet articulated so cleanly. Thanks for that!
It's easy to get this way with enough scrolling, try to focus on the things around you in real life. If you aren't reading LinkedIn or HN, how much do you actually hear about AI in day-to-day life? If someone at work directly asks you to do something using AI, you might make some effort to do it. But otherwise let the news and hype cycle play out. You don't need to anticipate or keep abreast of where people think things will be in ten years... they are almost certainly wrong. Think of LinkedIn and HN as entertainment at best. Work on personal coding projects without AI, build relationships with non-tech people, go outside.
It’s notable that just the English “implementation” of FizzBuzz here is longer and more ambiguous than the naive Python implementation, never mind the boilerplate (which itself is also longer than the Python).
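For reference, the naive Python implementation being compared against is only a few lines (returning a list rather than printing, so the behavior is easy to check):

```python
def fizzbuzz(n):
    """FizzBuzz for 1..n: multiples of 3 -> Fizz, of 5 -> Buzz, of both -> FizzBuzz."""
    out = []
    for i in range(1, n + 1):
        if i % 15 == 0:
            out.append("FizzBuzz")
        elif i % 3 == 0:
            out.append("Fizz")
        elif i % 5 == 0:
            out.append("Buzz")
        else:
            out.append(str(i))
    return out
```

Every term in this version is unambiguous; there is no equivalent English paragraph that pins down the behavior as tightly in fewer words.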
The explosion of frameworks and YAML tools the author describes can be attributed to the fact that English is an extremely poor language for program specification, and requires all kinds of guardrails and annotation to accomplish the same specificity as a typical computer program.
LLM coding isn't a new level of abstraction. Abstractions are (semi-)reliable ways to manage complexity by creating building blocks that represent complex behavior, that are useful for reasoning about outcomes.
Because model output can vary widely from invocation to invocation, let alone model to model, prompts aren't reliable abstractions. You can't send someone all of the prompts for a vibecoded program and know they will get a binary with generally the same behavior. An effective programmer in the LLM age won't be saving mental energy by reasoning about the prompts, they will be fiddling with the prompts, crossing their fingers that it produces workable code, then going back to reasoning about the code to ensure it meets their specification.
What I think the discipline is going to find after the dust settles is that traditional computer code is the "easiest" way to reason about computer behavior. It requires some learning curve, yes, but it remains the highest level of real "abstraction", with LLMs being more of a slot machine for saving the typing of some boilerplate.
I think the analogy to high level programming languages misunderstands the value of abstraction and notation. You can’t reason about the behavior of an English prompt because English is underspecified. The value of code is that it has a fairly strong semantic correlation to machine operations, and reasoning about high level code is equivalent to reasoning about machine code. That’s why even with all this advancement we continue to check in code to our repositories and leave the sloppy English in our chat history.
Yep. Any statement in Python or another language can be mapped to something the machine will do. And it will be the same thing every single time (concurrency and race issues aside). There's no English sentence that can be as clear.
We’ve created formal notation to shorten writing. And computation is formal notation that is actually useful. Why write pages of specs when I could write a few lines of code?
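A small illustration of that tradeoff (my example, not the commenter's): the Gregorian leap-year rule takes a careful paragraph to state unambiguously in English, but one line of code:

```python
def is_leap(year):
    # Divisible by 4, except century years, except centuries divisible by 400.
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)
```

The prose version has to spell out the two exceptions and their precedence; the code carries them in its operator structure.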
There's also creative space inside the formal notation. It's not just "these are the known abstractions, please lego them together", the formal syntax and notation is just one part of the whole. The syntax and notation define the forms of poetry (here's the meter, here's the rhyme scheme, here's how the whitespace works), but as software developers we're still filling in the words that fit that meter and rhyme scheme and whitespace. We're adding the flowery metaphors in the way we choose variable names and the comments we choose to add and the order we define things or choose to use them.
Software developers can use the exact same "lego block" abstractions ("this code just multiplies two numbers") and tell very different stories with it ("this code is the formula for force power", "this code computes a probability of two events occurring", "this code gives us our progress bar state as the combination of two sub-processes", etc).
LLMs have only so many "stories" they are trained on, and so many ways of thinking about the "why" of a piece of code rather than mechanical "what".
Computers only care about the what, and have no use for the why. Humans care about the latter too and the programmer lives at the intersection of both. Taking a why and transforming it into a what is the coding process.
Software engineering is all about making sure the what actually solves the why, making the why visible enough in the what so that we can modify the latter if the former changes (it always does).
Current LLMs are not about transforming a why into a what. They're about transforming an underspecified what into some what that we hope fits the why. But as we all know from the 5 Whys method, whys are recursive structures, and most software engineering is about diving into the details of the why. The what is easy once that's done, because computers are simple mechanisms if you choose the correct level of abstraction for the project.
It's disheartening that a potentially worthwhile discussion — should we invest engineering resources in LLMs as a normal technology rather than as a millenarian fantasy? — has been hijacked by a (at this writing) 177-comment discussion on a small component of the author's argument. The author's argument is an important one that hardly hinges at all on water usage specifically, given the vast human and financial capital invested in LLM buildout so far.
Going to a popular restaurant that accepts app delivery orders (or a grocery store in a neighborhood where people prefer to pay for delivery) is an objectively bad experience. The kitchen or checkout line is backed up with delivery orders, there are a bunch of delivery drivers double-parked or loitering near the front, and due not to any moral failing but rather what must be a crushing grind, the drivers are for the most part rushed and inconsiderate of the staff or other customers.
The class of people who order delivery regularly are generally trading way more money than makes sense for the short-term reward of convenient food; too little of that money benefits the class of people who do the delivering; and as the article points out, it is essentially harming the business it's being ordered from.
I would love to see more restaurants and stores declining to support this kind of system. While there may be some marginal profit now, in the long run the race to the bottom is going to mean fewer sustainable businesses.
At the very least, I make an effort to pick up food in person these days. Saves me a lot of money, is better for the restaurant, and since it's not my livelihood I can just show up a bit early, park properly, and hang around, ensuring that the food will be as fresh as possible when I get home and avoiding any rush.
The animosity I sometimes see between the restaurant staff and the delivery drivers can be really uncomfortable. It's not shocking, they have competing incentives and I think there's a pretty stark class/culture divide, but it's unfortunate when a system like this pits workers against each other that are just both trying to do their job as best they can.
I feel like this needs an editor to have a chance of reaching almost anyone… there are ~100 section/chapter headings that seem to have been generated through some kind of psychedelic free association, and each section itself feels like an artistic effort to mystify the reader with references, jargon, and complex diagrams that are only loosely related to the text. And all wrapped here in a scroll-hijack that makes it even harder to read.
The effect is that it's unclear at first glance what the argument even might be, or which sections might be interesting to a reader who is not planning to read it front-to-back. And since it's apparently six hundred pages in printed form, I don't know that many will read it front-to-back either.
From a rhetorical perspective, it's an extended "Yes-set" argument or persuasion sandwich. You see it a lot with cult leaders, motivational speakers, or political pundits. The problem is that you have an unpopular idea that isn't very well supported. How do you smuggle it past your audience? You use a structure like this:
* Verifiable Fact
* Obvious Truth
* Widely Held Opinion
* Your Nonsense Here
* Tautological Platitude
This gets your audience nodding along in "Yes" mode and makes you seem credible so they tend to give you the benefit of the doubt when they hit something they aren't so sure about. Then, before they have time to really process their objection, you move onto and finish with something they can't help but agree with.
The stuff on the history of computation and cybernetics is well researched with a flashy presentation, but it's not original nor, as you pointed out, does it form a single coherent thesis. Mixing in all the biology and movie stuff just dilutes it further. It's just a grab bag of interesting things added to build credibility. Which is a shame, because it's exactly the kind of stuff that's relevant to my interests[3][4].
> "Your manuscript is both good and original; but the part that is good is not original, and the part that is original is not good." - Samuel Johnson
The author clearly has an Opinion™ about AI, but instead of supporting it they're trying to smuggle it through in a sandwich, which I think is why you have that intuitive allergic reaction to it.
https://wii-film.antikythera.org/ - This is a 1-hour talk by the author which summarizes what seems to be the gist of the book. I haven't read the book completely. I read a few sections.
Personally, I think the book does not add anything novel. Reading Karl Friston and Andy Clark would be a better investment of time if the notion of predictive processing seems interesting to you.
I guess I am the odd one out here. Reading it front-to-back has been a blast so far, and even though I find my own site's design to be a bit more readable for long text, I certainly appreciate the strangeness of this one.
Ooh, that looks very cool. What's much needed is a concrete definition of AGI, plus a scientifically backed (in the correct domains) operationalization of that definition that allows direct comparisons between humans and current AIs: one that isn't impossible for humans and isn't trivially easy for AIs to saturate.
I got the same impression as well. I think I've become so cynical to these kinds of things that whenever I see this kind of thing, I immediately assume bad faith / woo and just move on to the next article to read.
It's interesting to call this a pre-mortem as it seems mainly organized around thinking positively past the imperfections of the technology. It's like a pre-mortem for the housing crisis that focuses on the benefits of subprime mortgage lending.
What I'd expect to see is an analysis of how to address or prevent the same situation as previous bubbles: that society has allocated resources to a specific investment that are far in excess of what that investment can fundamentally be expected to return. How can we avoid thinking sloppily about this technology, or getting taken in by hucksters' just-so stories of its future impact? How can we successfully identify use-cases where revenues exceed investment? When the next exciting tech comes around, how can we harness it well as a society without succumbing to irrational exuberance?
I don't know whether you could have an independent government institution to help regulate booms and busts, like the Fed does with the money supply. I'm not sure what you'd do with AI, but there are fairly obvious things that could have been done with housing, like restricting lending on the upside and spending on infrastructure in the bust.
Elected politicians have perverse incentives to let bubbles run so they can claim it's their policies providing never ending growth.
I think this leaves out what is probably the most likely future for this technology, having a similar destiny to most technologies as a tool. Both of these visions assume (I think incorrectly) a trend towards ubiquity, where either every interaction you as a person have is mediated by computers, or where within a certain "room" every interaction anyone has is mediated by computers.
But it seems more likely that like other technologies developed by humanity, we will see that computers are not efficient for, or extensible to, every task, and people will naturally tend to reach for computers where they are helpful and be disinclined to do so when they aren't helpful. Some computers will be in rooms, some will get carried around or worn, some will be integrated into infrastructure.
Similar to the automobile, steam powered motors, and electricity, we may predict a future where the technology totally pervades our lives, but in reality we eventually develop a sort of infrastructure that delimits the tool's use to a certain extent, whether it is narrow or wide. If that's the case then the work for the field is less about shoving the tech into every interaction, and more about developing better abstractions to allow people to use compute in an empowering rather than a disempowering way.
It already IS ubiquitous. What is the path to non-ubiquity then? Most people are depending on it in many personal contexts. A lot of people are even using it in their jobs whether others agree with it or not. Everyday it's becoming more ubiquitous than before.
Smart phones are this way for example. You may see them as just tools, but we became centaurs with our phones. I don't think being a "tool" precludes it from being a centaur or ubiquitous. I agree with you on some points, but I don't think the distinction you're making is valid here.