
For some tasks, the scores are significantly affected by the subjects and prompts used in the tests. I don't think these figures are valid, though it is good to try to evaluate them. Overall, it is a good report.


