WXLCKNO · 2026-02-11T19:02:19 1770836539

Saw this the other day and loved it. Especially seeing Opus 4.5 degrading prior to the 4.6 release (IIRC) and Codex staying very stable and even improving over time.

But FYI, the blog post is not about the actual model being dumbed down, but about the command-line interface.

WXLCKNO · 2026-02-11T18:59:57 1770836397

"It appears minor on the surface but their response to all the comments tells you everything you need to know."

I mean I hope it's just a single developer being stubborn rather than guidance from management asking everyone to simplify Claude Code for maximum mass appeal. But I agree otherwise, it's telling.

WXLCKNO · 2026-02-11T18:54:04 1770836044

Exactly how I feel. I'm happy that more people are using these tools and learning (hopefully) about engineering, but it shouldn't degrade the core experience for, let's say, "more advanced" users who don't see themselves as vibe coders and want precise control over what's happening.

jonahx · 2026-02-11T19:22:25 1770837745

> learning (hopefully) about engineering

Not a chance.

If anything, the reverse, in that it devalues engineering. For most, LLMs are a path to an end-product without the bother or effort of understanding. No different than paid engineers were, but even better because you don't have to talk to engineers or pay them.

The sparks of genuine curiosity here are a rounding error.

croes · 2026-02-11T19:36:21 1770838581

If I give pupils the solution book will they learn or just copy the answers?

There is a reason why nowadays games start to help massively if the player gets stuck.

lukan · 2026-02-11T20:13:38 1770840818

"There is a reason why nowadays games start to help massively if the player gets stuck"

You mean those "free" games that are hard and grindy by design, where the offered help comes in the shape of paid perks to solve the challenges?

croes · 2026-02-11T20:32:21 1770841941

No, those paid games where NPCs start to point to clues if the player takes too long to solve a riddle, or where you can skip the hard parts if you fail too often.

WXLCKNO · 2026-02-11T18:50:54 1770835854

Yeah, just that it's not real time and you have to toggle to see it. It also lags a bunch in longer threads. Definitely a downgrade.

koakuma-chan · 2026-02-11T19:03:17 1770836597

I mean yes, they claim that it's "Claude Code Native" or something, but it does feel laggy and takes multiple seconds to start. What do they even mean by native, didn't they acquire Bun? It's not native. They need to rewrite it in Rust, I'm serious.

WXLCKNO · 2026-02-11T19:23:09 1770837789

Codex feels much faster. For a while after the rewrite (to Rust also, I think?) it was bad because you couldn't copy anything from the terminal, but since then it's gotten much, much better.

WXLCKNO · 2026-02-11T18:50:17 1770835817

Sorry I'm dumber than the average Anthropic employee, might just take me a few more days for it to "click" that I'm no longer seeing useful information and that this is good.

WXLCKNO · 2026-02-11T18:49:02 1770835742

Jokes about vibe-coded CLI aside, I think that's the issue for me, the defaults are being tailored to vibe coders. (and the general weirdness of trying to fix it with verbose mode)

I like that people who were afraid of CLIs perhaps are now warming up to them through tools like Claude Code but I don't think it means the interfaces should be simplified and dumbed down for them as the primary audience.

Sure, you can press CTRL+O, but that's not real time and you have to toggle between that and your current real-time activity. Plus it's often laggy as hell.

WXLCKNO · 2026-02-06T17:40:03 1770399603

I run a loop where I have 4 agents review in parallel after each implementation phase. It just increases the odds of finding issues.

I've switched this over to a team of 4 now that talk to each other to discuss issues they find and it's amazing. They confirm between themselves and if they wrongly identified something the others correct them.

dangus · 2026-02-07T14:15:44 1770473744

So, the answer is yes, a single agent makes too many mistakes and you have to run four of them (4x usage cost) to improve the quality.

I understand that it works better, but I am rightfully pointing out that it's less efficient.

An analogy would be putting a V8 engine into a pickup truck to make it go as fast as a Mazda Miata.

WXLCKNO · 2026-02-05T00:56:54 1770253014

Oh god am I glad to read this. Thought it was my microphone or something.

WXLCKNO · 2026-01-31T01:05:37 1769821537

AI slop commenter account above

WXLCKNO · 2026-01-25T17:29:39 1769362179

A few more years and some more RAM on these earbuds and we'll be able to run some nice local earbud Kubernetes clusters