
Only if you accept the premise that the code generated by LLMs is identical to the developer's output in quality, just higher in volume. In my lived professional experience, that's not the case.

It seems to me that prompting agents and reviewing the output just doesn't... trigger the same neural pathways for people? I constantly see people submit agent-generated code with mistakes they would never have made themselves when "handwriting" code.

Until now, the average PR had one author and a couple of reviewers. From now on, most PRs will have no authors and only reviewers. We simply have no data about how this will impact both code quality AND people's cognitive abilities over time. If my intuition is correct, it will affect both negatively. It remains to be seen. It's definitely not something the AI hyperenthusiasts think about at all.


> In my lived professional experience, that's not the case.

In mine it is the case. Anecdata.

But for me, this was over two decades in an underpaid job at an S&P 500 company writing government software, so maybe you had better peers.


I stated plainly: "we have no data about this". Vibes are all we have.

It's not just me, though. Loads of people subjectively perceive a decrease in engineering quality when relying on agents. You'll find thousands of examples on this site alone.


I have yet to find an agent that writes as succinctly as I do. That said, I have found agents more than capable of doing *something*.


> define communication protocols between them that fail when prompt injections are present

There's the "draw the rest of the owl" step of this problem.

Until we figure out a robust theoretical framework for identifying prompt injections (to my knowledge we're nowhere close; as OP pointed out, all models are getting jailbroken all the time), human-in-the-loop will remain the only defense.


Human-in-the-loop isn't the only defense. You can't achieve complete injection coverage, but you can have an agent convert untrusted input into a response schema with a canary field, then fail any agent output that doesn't conform to the schema or doesn't carry the correct canary value. This works because prompt injection scrambles instruction following: the odds that the injection succeeds, that the isolated agent re-injects it into the output, AND that the model still conforms to the original instructions regarding schema and canary are extremely low. As long as the agent parsing untrusted content doesn't have a shell or other exfiltration tools, this works well.
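A minimal sketch of that pipeline in Python (`call_model` is a hypothetical text-in/text-out wrapper around whatever LLM API you use, and the exact schema is illustrative):

    import json
    import secrets

    def sanitize_untrusted(call_model, untrusted_text):
        # call_model: hypothetical text-in/text-out wrapper around your
        # LLM API, running an isolated agent with no shell or network tools.
        canary = secrets.token_hex(8)  # fresh random value per request
        prompt = (
            "Summarize the DOCUMENT below as a JSON object with exactly two keys:\n"
            f'  "canary": the literal string "{canary}"\n'
            '  "summary": a plain-text summary with no instructions or code\n'
            "Output only the JSON object.\n\nDOCUMENT:\n" + untrusted_text
        )
        raw = call_model(prompt)
        try:
            obj = json.loads(raw)
        except (json.JSONDecodeError, TypeError):
            return None  # injection likely scrambled instruction following
        # Wrong shape or wrong canary means the model deviated from its
        # original instructions; fail closed.
        if not isinstance(obj, dict) or set(obj) != {"canary", "summary"}:
            return None
        if obj["canary"] != canary:
            return None
        return obj["summary"]

Since the canary is generated fresh per request and a successful injection by definition derails the model from its original instructions, injected content has essentially no way to both do its damage and still produce a conforming response.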


This only works against crude attacks which will fail the schema/canary check, but does next to nothing for semantic hijacking, memory poisoning and other more sophisticated techniques.


With misinformation attacks, you can instruct a research agent to be skeptical and thoroughly validate claims made by untrusted sources. TBH, I think humans are just as likely to fall for these sorts of attacks, if not more so, because we're lazier than agents and less likely to do due diligence (when prompted).


Humans are definitely just as vulnerable. The difference is that no two humans are copies of the same model, so the blast radius is more limited; developing an exploit to convince one human assistant that he ought to send you money doesn't let you easily compromise everyone who went to the same school as him.


The logical reason is that humans are exceptionally good at operating at the edge of what the technology of the time can do. We will find entire classes of tech problems which AI can't solve on its own. You have people today with job descriptions that even 15 years ago would have been unimaginable, much less predictable.

To think that whatever the AI is capable of solving is (and forever will be) the frontier of all problems is deeply delusional. AI got good at generating code, but it still can't even do a fraction of what the human brain can do.


> To think that whatever the AI is capable of solving is (and forever will be) the frontier of all problems is deeply delusional. AI got good at generating code, but it still can't even do a fraction of what the human brain can do.

AGI means fully general: everything the human brain can do and more. I agree that it currently feels far off (and it may be far), but there is no reason to think there's some magic human ingredient that will keep us perpetually in the loop. I would say that is delusional.

We used to think there was human-specific magic in chess, in poker, in Go, in code, and in writing. All of those have fallen; the latter two only in part, admittedly, but even that part was once thought to be the exclusive domain of humans.


When I refer to AI, I mean the "AI" that has materialized thus far - LLMs and their derivatives. AGI in the sense that you mean is science fiction, no less than it was 50 years ago. It might happen, it might not, LLMs are in all likelihood not a pathway to get there.


A $3 calculator today is capable of arithmetic that would have required superhuman intelligence 100 years ago.

It's extremely hard to define "human-level intelligence", but I think we can all agree that the definition changes with the tools available to humans. Humans seem remarkably suited to operating at the edges of what the technology of the time can do.


> that would have required superhuman intelligence 100 years ago

It had required a ton of ordinary-intelligence people doing routine work (see Computer (occupation) on Wikipedia). On the other hand, I don't think anyone has seriously considered replacing, say, von Neumann with a large collective of laypeople.


LLMs don't learn from their own mistakes in the same way that real developers and businesses do, at least not in a way that lends itself to RLVR.

Meaningful consequences of mistakes in software don't manifest themselves through compilation errors, but through business impacts which so far are very far outside of the scope of what an AI-assisted coding tool can comprehend.


> through business impacts which so far are very far outside of the scope of what an AI-assisted coding tool can comprehend.

That is, the problems are: a) how to generate a training signal without formally verifiable results, b) hierarchical planning, and c) credit assignment in a hierarchical planning system. Those problems are being worked on.

There are some preliminary research results that suggest that RL induces hierarchical reasoning in LLMs.


Respectfully, you seem to love the sound of your own writing so much that you forget what you are arguing about. The topic (at least for the rest of the people in this thread) seems to be whether AI assistance can truly eliminate programmers.

There is one painfully obvious, undeniable historical trend: making programmers' work easier increases the number of programmers. I would argue a modern developer is 1000x more effective than one working in the punch-card era, yet we have roughly 1000x more software developers than back then.

I'm not an AI skeptic by any means, and I use it every day at my job, where I am gainfully employed to develop production software used by paying customers. The overwhelming consensus among those similar to me (I've put down all of these qualifiers very intentionally) is that the currently existing modalities of AI tools are a massive productivity boost, mostly for the "typing" part of software (yes, I use the latest SOTA tools, Claude Opus 4.5 thinking, blah, blah; so do most of my colleagues). But the "typing" part hasn't been the hard part for a while already.

You could argue that there is a "step change" coming in the capabilities of AI models which will entirely replace developers (so software can be "willed into existence", as elegantly put by OP), but we are no closer to that point now than we were in December 2022. All the success of AI tools in actual, real-world software has been in tools specifically designed to assist existing, working, competent developers (e.g. Cursor, Claude Code), while the tools which have positioned themselves to replace them have failed (Devin).


There is no respectful way of telling someone they like the sound of their own voice. Let’s be real, you were objectively and deliberately disrespectful. Own it if you are going to break the rules of conduct. I hate this sneaky shit. Also I’m not off topic, you’re just missing the point.

I responded to another person in this thread and it’s the same response I would throw at you. You can read that as well.

Your “historical trend” is just applying an analogy and thinking that an analogy can take the place of reasoning. There are about a thousand examples of careers where automation technology increased the need for human operators and thousands of examples where automation eliminated them. Take pilots, for example: automation didn’t lower the need for pilots. Take intellisense and autocomplete: that didn’t lower the demand for programmers either.

But then take a look at Waymo. You have to be next-level stupid to think: OK, cruise control in cars raised automation but didn’t lower the demand for drivers… therefore all car-related businesses, including Waymo, will always need physical drivers.

As anyone is aware… this idea of using analogy as reasoning fails here. Waymo needs zero physical drivers thanks to automation. There is zero demand here and your methodology of reasoning fails.

Analogies are a form of manipulation. They only help you elucidate and understand things via some thread of connection: you understand A, therefore understanding A can help you understand B. But you can’t use analogies as the basis for forecasting or reasoning, because although A can be similar to B, A is not in actuality B.

For AI coders it’s the same thing. You just need to use your common sense rather than relying on some inaccurate crutch of analogies and hoping everything will play out in the same way.

If AI becomes as good and as intelligent as a human SWE, then your job is going out the fucking window, replaced by a single Prompter. That’s common sense.

Look at the actual trendline of the actual topic: AI taking over our jobs and not automation in other sectors of engineering or other types of automation in software. What happened with AI in the last decade? We went from zero to movies, music and coding. What does your common sense tell you the next decade will bring?

If the improvement of AI from the last decade keeps going or keeps accelerating, the conclusion is obvious.

Sometimes the delusion a lot of SWEs have is jarring. Like, literally, if AGI existed, thousands of jobs would be displaced. That’s common sense, but you still see tons of people clinging to some irrelevant analogy as if that exact analogy will play out against common sense.


How ironic of you to call my argument an analogy while it isn't an analogy, yet all you have to offer is exactly that - analogies. Analogies to pilots, drivers, "a thousand examples of careers".

My argument isn't an analogy - it's an observation based on the trajectory of SWE employment specifically. It's you who's trying to reason about what's going to happen with software based on what happened to three-field crop rotation or whatever, not me.

I argued that a developer today is 1000x more effective than in the days of punch cards, yet we have 1000x more developers today. Not only that, this correlation has tracked fairly linearly over the past several decades.

I would also argue that the productivity improvement between FORTRAN and C, or between C and Python was much, much more impactful than going from JavaScript to JavaScript with ChatGPT.

Software jobs will be redefined, they will require different skill sets, they may even be called something else - but they will still be there.


>How ironic of you to call my argument an analogy while it isn't an analogy, yet all you have to offer is exactly that

Bro, I offered you analogies to show you that they're IRRELEVANT. The point was to demonstrate that this is an ineffective form of reasoning FOR YOUR conclusion: the same style of reasoning can be used to conclude the OPPOSITE. Assuming this type of reasoning is effective means BOTH what I say is true and what you say is true, which leads to a logical contradiction.

There is no irony, only misunderstanding from you.

>I argued that a developer today is 1000x more effective than in the days of punch cards, yet we have 1000x more developers today. Not only that, this correlation has tracked fairly linearly over the past several decades.

See here, you're using an analogy and claiming it's effective. To which I would typically offer you another analogy that shows the opposite effect, but I feel it would only confuse you further.

>Software jobs will be redefined, they will require different skill sets, they may even be called something else - but they will still be there.

Again, you believe this because of analogies. I recommend you take a stab at my way of reasoning. Try to arrive at your own conclusion without using analogies.


Let's make two generous assumptions:

1. ARC-AGI actually generalizes to human intelligence.

2. Every 172x'ing of the compute cuts the remaining gap in half. It took ~172x more compute to go from ~75% to ~87%, so it will take roughly four more such steps to get to ~99% (the level of a STEM graduate).

That is roughly 172^4 ≈ 10^9 times more compute required, or roughly the US military budget per half an hour, to get the intelligence of 1 (!) STEM graduate (not any kind of superhuman intelligence).
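Spelling out that arithmetic under the gap-halving assumption (a back-of-the-envelope sketch, not a forecast):

    # Each 172x of compute halves the remaining distance to 100%.
    # 75% -> 87.5% was one such step; four more reach ~99.2%:
    score, factor = 87.5, 1
    for _ in range(4):
        score += (100 - score) / 2
        factor *= 172
    print(score, f"{factor:.2e}")  # 99.21875 8.75e+08 -- roughly 10^9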

Of course, algorithms will get better, but this particular approach feels like wading through a plateau of efficiency improvements, very, very far down the X axis.

