
You can already run a large LLM (on the scale of Sonnet 3.5) locally on CPU with 128GB of RAM, which costs under 300 USD, and even that cost can be reduced by leaning on swap space. Response speed will obviously be slower, but I can't imagine people paying much more than 20 USD a month just to avoid waiting an extra 30-60 seconds per response.
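As a rough sketch of what CPU-only local inference looks like (assuming llama-cpp-python and a quantized GGUF model on disk; the model path is a placeholder, not a real file, and n_threads should match your physical core count):

    # pip install llama-cpp-python
    from llama_cpp import Llama

    llm = Llama(
        model_path="models/large-model-q4_k_m.gguf",  # placeholder: any large quantized GGUF model
        n_ctx=4096,       # context window
        n_threads=16,     # match physical CPU cores
        n_gpu_layers=0,   # force CPU-only inference
    )

    out = llm("Explain swap space in one sentence.", max_tokens=128)
    print(out["choices"][0]["text"])

With 4-bit quantization, a model in this size class fits in roughly 40-50GB of memory, which is why 128GB of RAM (plus swap for overflow) is enough.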

And consumer hardware is obviously already being optimized for running models locally.
