I find Claude models often use “tricks” like bash one-liners, essentially excelling at surgical fixes. They do what I want more reliably on smaller tasks.
GPT-5 can often be better at larger architectural changes, but I find that comes at the cost of instability and broken PRs. It more often fails to capture intent, argues back, or just completely spirals out of control.
GPT-5 Codex seemed to refuse valid requests like “make a change to break a test so we can test CI” (it over-indexed on our agents.md and other instructions, then refused on the basis of “ethics” or some such).
More like “I tried what others claim, extensively, and it does not work for me; please let me know if I’m doing something wrong,” to which the response is often yours, reframing the observation as a fallacy.
I see many people insisting that because it didn’t work when they tried it for some little thing, it’s broken and useless. And a few people saying that, actually, it works really well if you’re willing to learn how to use it.
I'm not sure I've ever seen anyone here say it hasn't worked for them but that they're open to learning how to use it right. It's definitely not common.