Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

If this is like a multi trillion parameter model, then you know to replicate it it's probably cranking up the parameter count. If this is a <100M model, then you know there is some breakthrough they found that you need to find out, instead of wasting time and money with more parameters.


Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: