Sure, Qwen-3-4B - a 4GB download - is nowhere near as capable as Claude Sonnet 4.
But it is massively more capable than the 4GB models we had last year.
Meanwhile, recent models in the same capability ballpark as Claude Sonnet 4 - like GLM 4.5, Kimi K2 and the largest of the Qwen 3 models - can just about fit on a $10,000 Mac Studio with 512GB of RAM. That's a very notable trend.
El Capitan being much faster than my desktop doesn't mean that my desktop is useless. Same with LLMs.
I've been using Mistral Small 3.x for a bunch of tasks on my own PC and it has been very useful, especially after I wrote a few custom tools with llama.cpp to make it more "scriptable".
Local models are nowhere near as capable as the big frontier models.
While a small model might be fine for your use case, it cannot replace Sonnet-4 for me.