> I've had cases where it completely misinterpreted what I was asking for, a ver...

jjani · 2025-08-08T19:23:14 1754680994

It appears to be overtuned on extremely strict instruction following, interpreting things in a very unhuman way, which may be a benefit to agentic tasks at the cost of everything else.

My limited API testing with gpt-5 also showed this. As an example, the instruction "don't use academic language" caused it to drop roughly half of what it had output without that instruction. The other frontier models, and even open-source Chinese ones like Kimi and DeepSeek, understand perfectly well what we mean by it.

int_19h · 2025-08-08T20:29:52 1754684992

It's not great at agentic tasks either. Not least because it seems very timid about doing things on its own, and demands (not asks - demands) that the user confirm every tiny step.