What you describe could also come down to a difference in hallucination rate [0]. Opus 4.5 leads on that metric, while Gemini 3 Pro does quite badly there compared with how it performs on other benchmarks.

[0] https://artificialanalysis.ai/?omniscience=omniscience-hallu...


