> This is the part where I simply don't understand the objections people have to coding agents
Because I have a coworker who is pushing slop at unsustainable levels, and proclaiming to management how much more productive he is. It’s now even more of a risk to my career to speak up about how awful his PRs are to review (and I’m not the only one on the team who wishes to speak up).
The internet is rife with people who claim to be living in the future where they are now a 10x dev. Making these claims costs almost nothing, but it is negatively affecting my day-to-day work and that of many others.
I’m not necessarily blaming these internet voices (I don’t blame a bear for killing a hiker), but the damage they’re doing is still real.
I don't think you read the sentence you're responding to carefully enough. The antecedent of "this" isn't "coding agents" generally: it's "the value of an agent getting you past the blank page stage to a point where the substantive core of your feature functions well enough to start iterating on". If you want to respond to the argument I made there, you have to respond to the actual argument, not a broader one that's easier (and much less interesting) to take swipes at.
I have to agree. My experience working on a team with mixed levels of seniority and coding experience is that everybody got some increase in productivity and some increase in quality.
The ones who spend more time developing their agentic coding as a skillset have gotten much better results.
In our team people are also more willing to respond to feedback because nitpicks and requests to restructure/rearchitect are evaluated on merit instead of how time-consuming or boring they would have been to take on.
> My experience working on a team with mixed levels of seniority and coding experience is that everybody got some increase in productivity and some increase in quality.
Is that true? There have been a couple of papers showing that people perceive themselves as more productive because the AI feels like motion (you're "stuck" less often), when in reality it has been a net negative.
Don't mention AI, just point out why the code is bad. I've had co-workers who were vim wizards and others who literally hunted and pecked to type. At no point did their tools ever come up when reviewing their code. AI is a tool like anything else; treat it that way. This also means that the OP's default can't be AI == bad; focus on the result.
I don't think it is a long term solution. More like training wheels. Ideally the engineers learn to use AI to produce better code the first time. You just have a quality gate.
Edit: Do I advocate for this? 1000%. This isn't crypto burning electricity to make a ledger. This objectively will make the life of the craftsmanship-focused engineer easier. Sloppy, execution-oriented engineers are not a new phenomenon, just magnified by the fire hose that an agentic AI can be.
The environmental cost of AI is mostly in training, afaik. The inference energy cost is similar to the Google searches and Reddit page loads you might do during handwritten dev, last I checked. This might be completely wrong though.
I hear this argument a lot, but it doesn’t hold water for me. Obviously the use of the AI is the thing that makes it worthwhile to do the training, so you obviously need to amortize the training cost over the inference. I don’t know whether or not doing so makes the environmental cost substantially higher, though.
> If you can describe why it is slop, an AI can probably identify the underlying issues automatically
I would argue against this. Most of the time, the things we find in review stem from extra considerations (business, architectural, etc.) that the AI doesn't have context for, and it is quite bothersome to provide that context.
I generally agree that results from vague one-shot prompting might vary.
I also feel all of those things can be captured over time in a compendium that is fed back in as input. For example, every time it is right or wrong, comment on it and add it to an .md file. Better yet, have the CLI AI tool append it.
We know that what is included as part of the prompt (like the above) gets attended to more reliably.
My intent isn't to make more work, it's just to make it easier to highlight the issues with code that's mindlessly generated, or is overly convoluted when a simple approach will do.
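The compendium idea above can be sketched mechanically. Here is a minimal, hypothetical helper that appends a dated, review-confirmed convention to a shared notes file; the filename `AGENTS.md` and the function name are illustrative assumptions, not from any particular tool:

```python
from datetime import date
from pathlib import Path

def log_convention(note: str, path: str = "AGENTS.md") -> None:
    """Append one reviewed-and-confirmed convention to the compendium file.

    The file is plain markdown, so a CLI agent that reads it as context
    will pick the note up on future runs. AGENTS.md is an assumed name.
    """
    entry = f"- {date.today().isoformat()}: {note}\n"
    with Path(path).open("a", encoding="utf-8") as f:
        f.write(entry)

# Example: record a nitpick from review so it isn't repeated.
log_convention("Prefer the existing retry helper over ad-hoc retry loops.")
```

The point is only that each review finding costs one line to record, after which it is part of every future prompt.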
> i.e. ones that can run offline and fake apis/databases
I can see a place for this, but these are no longer e2e tests. I guess that's what "hermetic" means? If so, it's almost sinister to still call them e2e tests. They're just frontend tests.
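For concreteness, the "hermetic" style under discussion usually means the app talks to an injected client and the test swaps in an in-memory fake for the real API or database. A minimal sketch, with entirely made-up names (`UserService`, `FakeUserApi`):

```python
class FakeUserApi:
    """In-memory stand-in for a real user API; no network involved."""
    def __init__(self):
        self._users = {}

    def create(self, name: str) -> int:
        uid = len(self._users) + 1
        self._users[uid] = name
        return uid

    def get(self, uid: int) -> str:
        return self._users[uid]

class UserService:
    """App logic under test; depends only on the client's interface."""
    def __init__(self, api):
        self.api = api

    def register(self, name: str) -> int:
        if not name.strip():
            raise ValueError("name required")
        return self.api.create(name)

def test_register_roundtrip():
    svc = UserService(FakeUserApi())
    uid = svc.register("ada")
    assert svc.api.get(uid) == "ada"

test_register_roundtrip()
```

Such a test runs offline and survives refactors of everything behind the interface, but, per the objection above, it never exercises the real integration point.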
> A) refactor pretty much everything underneath them without breaking the test
This should always be true of any type of test, unless it's behavior you want to keep from breaking.
> B) test realistically (an underrated quality)
Removing major integration points from a test is anything but realistic. You can do this, but don't pretend you're getting the same quality as colloquial e2e tests.
> C) write tests which more closely match requirements rather than implementation
If you’re ever testing implementation you’re doing it wrong. Tests should let you know when a requirement of your app breaks. This is why unit tests are often kinda harmful. They test contracts that might not exist.
> try to isolate the bugs to smaller units of code (or interactions between small pieces of code).
This is why unit tests before e2e tests.
It's higher risk to build on components without unit test coverage, even if the paltry smoke/e2e tests say it's fine per the customer's input examples.
Is it better to fuzz low-level components or high-level user-facing interfaces first?
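On fuzzing low-level components first: the cheapest version needs nothing beyond the standard library. A toy sketch, where `parse_csv_line` stands in for any small unit and the loop checks an invariant rather than specific cases, so a failure localizes the bug to that one unit:

```python
import random
import string

def parse_csv_line(line: str) -> list:
    """Toy low-level component: split a comma-separated line into fields."""
    return line.split(",")

random.seed(0)  # deterministic run for reproducibility
for _ in range(1000):
    # Generate fields with no commas, so the round trip must be exact.
    fields = ["".join(random.choices(string.ascii_letters, k=random.randint(0, 5)))
              for _ in range(random.randint(1, 4))]
    line = ",".join(fields)
    # Invariant: parsing the joined fields reproduces them exactly.
    assert parse_csv_line(line) == fields
```

Fuzzing the high-level UI instead finds bugs too, but attributing a failure to the responsible component then requires the very isolation work the parent comment describes.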
IIUC in relation to Formal Methods, tests and test coverage are not sufficient but are advisable.
Competency Story: The customer and product owner can write BDD tests in order to validate the app against the requirements
Prompt: Write playwright tests for #token_reference that run a named, factored-out login sequence, and then test, as a human user would, that clicking Home navigates to / (given browser MCP and, recently, the Gemini 2.5 Computer Operator model)