Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

In their paper, they explain that "in the case of math problems with deterministic results, the model is required to provide the final answer in a specified format (e.g., within a box), enabling reliable rule-based verification of correctness. Similarly, for LeetCode problems, a compiler can be used to generate feedback based on predefined test cases."

Basically, they have an external source-of-truth that verifies whether the model's answers are correct or not.



Consider applying for YC's Fall 2025 batch! Applications are open till Aug 4

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: