
Correctness?

I haven't seen anyone seriously attempt to benchmark ChatGPT output without heavily cherry-picking it first.



