Hacker News

For ultra-large MoEs like DeepSeek and Llama 4, fine-tuning these models is becoming increasingly impractical for hobbyists and local LLM users.

Small and dense models are what local people really need.

Although benchmaxxing is not good, I still find this release valuable. Thank you Qwen.



  Small and dense models are what local people really need.
Disagreed. Small and dense is dumber and slower for local inference. MoEs are what people actually want locally.


YMMV.

Parameter efficiency is an important consideration, if not the most important one, for local LLMs because of hardware constraints.

Do you guys really have GPUs with 80GB of VRAM or an M3 Ultra with 512GB of RAM at home? If I can't run these ultra-large MoEs locally, then these models mean nothing to me. I'm not a large LLM inference provider, after all.

What's more, you also lose the opportunity to fine-tune these MoEs when it's already hard just to run inference on them.
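The hardware constraint above can be sketched with a back-of-envelope estimate: weight memory is roughly parameter count times bytes per parameter, and with an MoE all experts must be resident even though only a few are active per token. The 1.2 overhead factor below (for KV cache and runtime buffers) is an assumption for illustration, not a benchmark.

```python
def weight_memory_gb(num_params_b: float, bits_per_param: float,
                     overhead: float = 1.2) -> float:
    """Approximate memory (GB) to hold model weights at a given quantization.

    num_params_b: parameter count in billions.
    bits_per_param: e.g. 16 for fp16/bf16, 4 for 4-bit quants.
    overhead: rough multiplier for KV cache and buffers (assumed, not measured).
    """
    bytes_total = num_params_b * 1e9 * (bits_per_param / 8)
    return bytes_total * overhead / 1e9

# Illustrative figures at 4-bit quantization:
for name, params in [("7B dense", 7), ("70B dense", 70),
                     ("671B MoE (DeepSeek R1)", 671)]:
    print(f"{name}: ~{weight_memory_gb(params, 4):.0f} GB at 4-bit")
```

Even at 4-bit, a 671B-parameter MoE needs on the order of 400GB just for weights, far beyond a consumer GPU, while a 7B dense model fits on a typical 8-24GB card. Note that an MoE's low *active* parameter count helps compute speed, not memory footprint: every expert still has to sit in RAM/VRAM.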


What people actually want is something like GPT4o/o1 running locally. That's the dream for local LLM people.

Running a 7B model for fun is not what people actually want; 7B models are very niche.


Regarding <10B LLMs: yes, they're not that good. However, <10B is a range that allows many people to do their own tweaking and fine-tuning.


For a local LLM, you can't really ask for a certain performance level; it is what it is.

Instead, you can ask for the architecture, be it dense or MoE.

Besides, let's assume the best open-weight LLM right now is DeepSeek R1. Is it practical for you to run R1 locally? If not, R1 means nothing to you.

Maybe R1 will be surpassed by Llama 4 Behemoth. Is it practical for you to run Behemoth locally? If not, Behemoth also means nothing to you.



