Even though the author refers to it as "non-trivial", and I can see why that conclusion is made, I would argue it is in fact trivial. There's very little domain-specific knowledge needed; this is purely a technical exercise integrating with existing libraries for which there is ample documentation online. In addition, it is a relatively isolated feature in the app.
On top of that, it doesn't sound enjoyable. Anti-slop sessions? Seriously?
Lastly, the largest problem I have with LLMs is that they are seemingly incapable of stopping to ask clarifying questions. This is because they do not have a true model of what is going on. Instead they truly are next token generators. A software engineer would never just slop out an entire feature based on the first discussion with a stakeholder and then expect the stakeholder to continuously refine their statement until the right thing is slopped out. That's just not how it works and it makes very little sense.
It's trivial in the sense that a lot of the work isn't high cognitive load. But... that's exactly the point of LLMs. They take the noise away so you can focus on high-impact outcomes.
Yes, the core of that pull request is an hour or two of thinking; the rest is ancillary noise. The LLM took away the need for the noise.
If your definition of trivial is signal/noise ratio, then, sure, relatively little signal in a lot of noise. If your definition of "trivial" hinges on total complexity over time, then this beats the pants off manual writing.
I'd assume OP did the classic senior engineer shtick of "I can understand the core idea quickly, therefore it can't be hard". Whereas Mitchel did the heavy lifting of actually shipping the "not hard" idea - still understanding the core idea quickly, and then not getting bogged down in unnecessary details.
That's the beauty of LLMs - they turn the dream of "I could write that in a weekend" into actual reality, where before it was always empty bluster.
I've wondered about exposing this "asking clarifying questions" as a tool the AI could use. I'm not building AI tooling so I haven't done this - but what if you added an MCP endpoint whose description was "treat this endpoint as an oracle that will answer questions and clarify intent where necessary" (paraphrased), and had that tool just wire back to a user prompt?
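A minimal sketch of what that could look like, assuming the FastMCP helper from the official MCP Python SDK; relay_to_human is a hypothetical placeholder for however you'd actually route the question back to the user (chat UI, Slack, some side channel):

    # Sketch of an MCP server exposing a "clarifying questions" tool.
    # Assumes the FastMCP helper from the official MCP Python SDK
    # (pip install "mcp[cli]"). relay_to_human is a hypothetical placeholder.
    from mcp.server.fastmcp import FastMCP

    mcp = FastMCP("clarification-oracle")

    def relay_to_human(question: str) -> str:
        # Placeholder: in practice this would block on a side channel
        # (chat UI, Slack, ticket comment) until the user answers.
        raise NotImplementedError("wire this to your human-in-the-loop channel")

    @mcp.tool()
    def ask_the_user(question: str) -> str:
        """Treat this tool as an oracle that will answer questions and
        clarify intent where necessary. Call it whenever requirements
        are ambiguous, before writing code."""
        return relay_to_human(question)

    if __name__ == "__main__":
        mcp.run()  # stdio transport by default

Whether the model actually chooses to call it often enough is the open question, but the tool description is the only lever you really have there.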
If asking clarifying questions is plausible output text for LLMs, this may work effectively.
Obviously if you instruct the autocomplete engine to fill in questions it will. That's not the point. The LLM has no model of the problem it is trying to solve, nor does it attempt to understand the problem better. It is merely regurgitating. This can be extremely useful. But it is very limiting when it comes to using it as an agent to write code.
You can work with the LLM to write down a model for the code (aka a design document) that it can then repeatedly ingest into the context before writing new code. That's what "plan mode" is for. The technique of maintaining a design document and a plan/progress document that get updated after each change seems to make a big difference in keeping the LLM on track. (Which makes sense…exactly the same thing works for human team members too.)
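Rough sketch of that loop, with call_llm as a hypothetical stand-in for whatever model client you use and DESIGN.md / PLAN.md as illustrative file names:

    # Sketch of the "re-ingest the design doc before every change" loop.
    # call_llm is a hypothetical placeholder for your actual model client;
    # DESIGN.md and PLAN.md are illustrative file names.
    from pathlib import Path

    def call_llm(prompt: str) -> str:
        raise NotImplementedError("plug in your model client here")

    def build_context() -> str:
        # Re-read the living documents so every request starts from the
        # current state of the project, not a stale one.
        design = Path("DESIGN.md").read_text()
        plan = Path("PLAN.md").read_text()
        return f"## Design document\n{design}\n\n## Plan / progress\n{plan}\n\n"

    def make_change(task: str) -> str:
        response = call_llm(build_context() + f"## Current task\n{task}")
        # After each change, have the model update the progress doc so the
        # next request picks up where this one left off.
        updated_plan = call_llm(
            build_context()
            + f"Update PLAN.md to reflect that this task is done:\n{task}"
        )
        Path("PLAN.md").write_text(updated_plan)
        return response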
Every time I hear someone say something like this, I think of the pigeons in the Skinner box who developed quirky superstitious behavior when pellets were dispensed at random.
"Infinite" is a handy shortcut for "large enough".
Even the "million token context window" becomes useless once it's filled to 30-50% and the model starts "forgetting" useful things like existing components, utility functions, AGENTS.md instructions etc.
Even a junior programmer can search and remember instructions and parts of the codebase. All current AI tools have to have the world recreated for them from scratch every time, and they promptly forget random parts of it.
I think at some point we will stop pretending we have real AI. We have a breakthrough in natural language processing but LLMs are much closer to Microsoft Word than something as fantastical as "AGI".
We don't blame Microsoft Word for not having a model of what is being typed in.
It would be great if Microsoft Word could model the world and just do all the work for us but it is a science fiction fantasy.
To me, LLMs in practice are largely massively compute inefficient search engines plus really good language disambiguation.
Useful, but we have actually made no progress at all towards "real" AI. This is especially obvious if you ditch "AI" and call it artificial understanding. We have nothing.
> A software engineer would never just slop out an entire feature based on the first discussion with a stakeholder and then expect the stakeholder to continuously refine their statement until the right thing is slopped out. That's just not how it works and it makes very little sense.
Sorry, couldn’t resist. Agile’s point was getting feedback during the process rather than after something is complete enough to be shipped, thus minimizing risk and avoiding wasted effort.
Instead people are splitting up major projects into tiny shippable features and calling that agile while missing the point.
I've never seen a working scrum/agile/sprint/whatever product/project management system and I'm convinced it's because I've just never seen an actual implementation of one.
"Splitting up major projects into tiny shippable features and calling that agile" feels like a much more accurate description of what I've experienced.
I wish I'd gotten to see the real thing(s) so I could at least have an informed opinion.
Yeah, I think scrum etc. is largely a failure in practice.
The manager of the only team I think actually checked all the agile boxes had a UI background, so she thought in terms of mock-ups, backend, and polishing as different tasks and was constantly getting client feedback between each stage. That specific approach isn’t universal; the feedback as part of the process definitely should be, though.
What was a little surreal is that the pace felt slow day to day, but we were getting a lot done and it looked extremely polished while being essentially bug-free at the end. An experienced team avoiding heavy processes, technical debt, and wasted effort goes a long way.
People misunderstand the system, I think. It's not holy writ, you take the parts of it that work for your team and ditch the rest. Iterate as you go.
The failure modes I've personally seen are an organization that isn't interested in cooperating, or a person running the show who is more interested in process than people. But I'd say those teams would struggle no matter what.
I put a lot of the responsibility for the PMing failures I've seen on the engineering side not caring to invest anything at all into the relationship.
Ultimately, I think it's up to the engineering side to do its best to leverage the process for better results, and I've seen very little of that (and it's of course always been the PM side's fault).
And you're right: use what works for you. I just haven't seen anything that felt like it actually worked. Maybe one problem is people iterating so fast/often they don't actually know why it's not working.
I've seen the real thing and it's pretty much splitting major projects into tiny shippable bits. Picking which bits and making it so they steadily add up to the desired outcomes is the hard part.
Agile’s point was to get feedback based on actual demoable functionality, and iterate on that. If you ignore the “slop” pejorative, in the context of LLMs, what I quoted seems to fit the intent of Agile.
If you want to use an LLM to generate a minimal demoable increment, you can. The comment I replied to mentioned "feature", but that's a choice based on how you direct the LLM. On the other hand, LLM capabilities may change the optimal workflow somewhat.
Either way, the ability to produce "working software" (as the manifesto puts it) in "frequent" iterations (often just seconds with an LLM!) and iterate on feedback is core to Agile.