
> Sonnet and Opus don't benchmark as well as O3/Grok4 at pure coding

Do any of the others have a "Claude Code"-style local agent? Seems like a big gap IMO. Though it should be pretty easy for them to close that gap.

I don't usually take too many moral stances but I feel like I can't use Grok. It's bad enough Musk did his Nazi salute but his AI product itself is a Nazi too? It might be good at coding but I really can't stomach using it.



FWIW, people report that Grok 4 is not very good at coding, and xAI admitted as much when they said they would release a separate coding model in "the next few weeks".

Also, Google does have Gemini CLI, OpenAI does have Codex CLI, and then there is Aider which can support any model. I think the big difference is that Anthropic's models are the best for this use-case right now, and Anthropic has the Max plan which makes a massive difference to the cost of using Claude Code compared to competitors (although the Gemini CLI has insane free tiers).

I'm not sure how this will play out in the future, because it seems to me that Claude Code does not have much of a moat beyond Anthropic having the best coding models right now, and them offering model usage at heavily discounted prices.


> people report that Grok 4 is not very good at coding

There are agentic models and oracle models. You can place a model on a two-by-two grid: agent vs oracle on one axis, high safety vs low safety on the other.

https://ghuntley.com/cars

Grok is oracle and low safety.


Grok4 is pretty decent at planning and figuring out libraries and APIs.

For code it falls down past simple scripts and utilities.



