
Is it possible people are also voting for response speed?


I suspect people on LLM Arena rarely ask complex questions, and reasoning models seem to perform worse than simple models when the goal is just casual conversation or retrieving embedded knowledge. Reasoning models probably 'overthink' in such cases, and they're slower, too.


The LLM Arena deletes your prompt when you restart, so what's the point in writing a complicated prompt and testing an exhaustive number of model pairs?

It's easy to pin this on the users, but that website is hostile to putting in any effort.

This is something I've noticed a lot, actually. A lot of AI projects just give you an input field and call it a day, expecting the user to do the heavy lifting.



