Very interesting project. I'd love to read a more technical write-up on how the model was architected and trained. Any pointers?


I link to it from the post, but all the code is open source! You can find the specific training script here: https://github.com/OpenPipe/best-hn/blob/main/stories_train_...

And all the graphs for the blog are from this notebook: https://github.com/OpenPipe/best-hn/blob/main/blog-figures.i...

There's lots of other good stuff in that repo, although it's only organized to a "working researcher" standard, I'm afraid.



