We need better LLM evaluations | Hacker News


		We need better LLM evaluations (generatingconversation.substack.com)
		2 points by mehulashah 11 months ago | 1 comment

mehulashah 11 months ago

I’ve been thinking about this. We seem to be overfitting LLMs to benchmarks that are meant to represent “intelligence”, but I’m not sure they represent the kinds of tasks that small businesses and enterprises want done. Is it time for another Anon et al. that sets up a benchmark and a council à la TPC? Hopefully without the politics.
