
You can already run a large LLM (on the scale of Sonnet 3.5) locally on CPU with 128GB of RAM, which costs under 300 USD, and even that cost can be reduced by leaning on swap space. Response speed will obviously be slower, but I can't imagine people paying much more than 20 USD a month just to avoid waiting an extra 30-60 seconds per response.
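As a rough sketch of what CPU-only local inference looks like (assuming llama-cpp-python and a quantized GGUF model on disk; the model path is a placeholder, not a real file, and n_threads should match your physical core count):

    # pip install llama-cpp-python
    from llama_cpp import Llama

    llm = Llama(
        model_path="models/large-model-q4_k_m.gguf",  # placeholder: any large quantized GGUF model
        n_ctx=4096,       # context window
        n_threads=16,     # match physical CPU cores
        n_gpu_layers=0,   # force CPU-only inference
    )

    out = llm("Explain swap space in one sentence.", max_tokens=128)
    print(out["choices"][0]["text"])

With 4-bit quantization, a model in this size class fits in roughly 40-50GB of memory, which is why 128GB of RAM (plus swap for overflow) is enough.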

And consumer hardware is obviously already being optimized for running models locally.
