
Would be interesting to see a comparison with Qwen 32B. I've found it to be a fantastic local model (running via Ollama).


Last year, whether a model could fit on local hardware at all was what mattered. This year, inference speed is key.

Proofreading an email at four tokens per second? Great.

Spending half an hour doing deep research on some topic with artifacts, MCP tools, and reasoning at four tokens per second? A bad time.
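The difference the comment above is gesturing at is simple arithmetic: decode speed barely matters for short outputs, but dominates long reasoning sessions. A minimal sketch (the token counts here are illustrative guesses, not benchmarks):

```python
# Back-of-envelope: wall-clock time spent decoding output tokens
# at different generation speeds.
def session_minutes(total_output_tokens: int, tokens_per_second: float) -> float:
    """Minutes spent purely on token generation (ignores prefill and tool latency)."""
    return total_output_tokens / tokens_per_second / 60

# Rough assumed sizes: a short email proofread vs. a deep-research run
# with chain-of-thought and tool calls. Both numbers are hypothetical.
email_tokens = 300
research_tokens = 20_000

for tps in (4, 40):
    print(f"{tps:>3} tok/s: email ~{session_minutes(email_tokens, tps):.1f} min, "
          f"research ~{session_minutes(research_tokens, tps):.0f} min")
```

At 4 tok/s the email costs about a minute, but a 20k-token research session takes well over an hour; at 40 tok/s the same session drops under ten minutes, which is why raw decode speed now matters more than merely fitting the model in memory.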


I agree. Qwen models are great.



