
Yes, but I meant it slightly differently than the distills.

The idea is to create the next-gen SOTA non-reasoning model using synthetic training data generated by reasoning models.






So you mean something like, "what if the baseline, off-the-cuff response for the next-gen models was tuned based on the results of the reasoning model excluding the reasoning itself?"

Exactly, although it may still need the reasoning later on to form the proper foundational logic in the weights.
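
To make that concrete, here's a rough sketch of the data prep step being discussed: take the reasoning model's outputs, throw away the reasoning trace, and keep only the final answer as the training target. This is only an illustration, not anyone's actual pipeline. It assumes the reasoning model wraps its chain of thought in `<think>...</think>` tags; the names `strip_reasoning`, `build_sft_pairs`, and the `keep_reasoning` flag are made up for this example.

```python
import re

# Assumption: the reasoning model emits its chain of thought inside <think>...</think>.
THINK_BLOCK = re.compile(r"<think>.*?</think>", re.DOTALL)


def strip_reasoning(output: str) -> str:
    """Drop the reasoning trace and keep only the final answer text."""
    return THINK_BLOCK.sub("", output).strip()


def build_sft_pairs(samples, keep_reasoning=False):
    """Turn (prompt, raw_output) samples into prompt/completion records.

    With keep_reasoning=False the completion is the answer alone, i.e. the
    "off-the-cuff" target. With keep_reasoning=True the trace is retained,
    in case the student turns out to need it.
    """
    records = []
    for prompt, raw_output in samples:
        completion = raw_output.strip() if keep_reasoning else strip_reasoning(raw_output)
        if completion:  # skip samples that contained nothing but reasoning
            records.append({"prompt": prompt, "completion": completion})
    return records


# Hypothetical sample, for illustration only.
samples = [("What is 2+2?", "<think>2 plus 2 is 4.</think>\nThe answer is 4.")]
print(build_sft_pairs(samples))
```

The `keep_reasoning` switch is there because of the caveat above: if answer-only targets aren't enough, the same samples can be reused with the traces left in.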


