N_Lens 10 days ago | on: DeepSeekMath-V2: Towards Self-Verifiable Mathemati...

Also, on the impressive IMO-ProofBench Basic benchmark, the model achieved nearly 99% accuracy, though it fell slightly behind Gemini Deep Think on the Advanced subset. The approach shifts from "result-oriented" to "process-oriented" verification, which is particularly important for theorem proving, where rigorous step-by-step derivation matters more than a numerical answer.

AlexCoventry 10 days ago

"Process-oriented" verification has been a thing for a while in chain-of-thought (CoT) mathematical reasoning. Google had a paper about it last year [1]. The key term to look for is "process reward model." I particularly like RL Tango [2].

[1] https://arxiv.org/abs/2406.06592

[2] https://arxiv.org/abs/2505.15034
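
For anyone unfamiliar with the distinction, here is a toy Python sketch of result-oriented versus process-oriented scoring. It is not taken from DeepSeekMath-V2 or from either paper above. The function names, the keyword-based `toy_step_scorer` (a stand-in for a learned process reward model), and the choice to aggregate step scores with `min` are all assumptions made for illustration.

```python
from typing import Callable, List


def outcome_reward(final_answer: str, reference: str) -> float:
    """Result-oriented: only the final answer is compared to the reference."""
    return 1.0 if final_answer.strip() == reference.strip() else 0.0


def process_reward(steps: List[str], score_step: Callable[[str], float]) -> float:
    """Process-oriented: score every step; the weakest step bounds the total.

    Aggregating with min is an assumption here; other aggregations exist.
    """
    if not steps:
        return 0.0
    return min(score_step(step) for step in steps)


def toy_step_scorer(step: str) -> float:
    """Hypothetical stand-in for a learned process reward model."""
    return 0.0 if "divide by zero" in step else 1.0


proof = ["let x = 2", "divide by zero to cancel terms", "therefore x = 2"]

print(outcome_reward("x = 2", "x = 2"))        # 1.0: the final answer matches
print(process_reward(proof, toy_step_scorer))  # 0.0: one step is invalid
```

The same derivation gets full credit under the outcome check and none under the process check, which is why the latter matters for proofs.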
