Hacker News

Unless there is a massive change in architecture, it will always be far more cost-effective to run inference for many users on a single shared cluster of GPUs than to give each user hardware capable of running SOTA models, when that hardware sits idle 99% of the time and is only used for the 1% of the time where they have asked the model to do something.
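A quick back-of-envelope sketch of that utilization argument (all numbers are hypothetical, picked only to illustrate the math):

```python
# Hypothetical numbers: compare dedicated per-user hardware vs. a shared cluster.
gpu_cost = 30_000            # dollars for hardware capable of running a SOTA model
users = 1_000
utilization_per_user = 0.01  # each user actively queries the model ~1% of the time

# Dedicated: every user buys a full rig that sits idle 99% of the time.
dedicated_total = gpu_cost * users

# Shared: provision just enough GPUs to cover the expected aggregate load
# (ignoring peak-demand headroom and batching gains, which favor sharing further).
gpus_needed = users * utilization_per_user
shared_total = gpu_cost * gpus_needed

print(f"dedicated: ${dedicated_total:,.0f}")                # $30,000,000
print(f"shared:    ${shared_total:,.0f}")                   # $300,000
print(f"ratio:     {dedicated_total / shared_total:.0f}x")  # 100x
```

With these made-up figures the shared cluster needs roughly two orders of magnitude less hardware, which is the core of the point above.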

