LM arena public voting is not objective for LLM evaluation (reddit.com)
2 points by lostmsu 11 months ago | 2 comments


The post has been deleted.


TL;DR: A Reddit user claimed to have run a script that posed as a human voter on Chatbot Arena, upvoting Gemini thousands of times and similarly downvoting OpenAI's rival model. According to the claim, the script identified which model was behind each answer by recognizing model-specific responses to artificially limited topics and to certain forms of gibberish.



