Ok, so Qwen3 32B works now with the update. It seems much better than the Qwen3 ...

Ok, so Qwen3 32B works now with the update.

It seems much better than the Qwen3 30B A3B for quantized local use from what I can tell so far. Not sure yet how it compares to Gemma 3, but it's at least not clearly worse. It definitely does a much better job of formatting output in a friendlier way than Gemma does, but that's not as critical to me.

I suspect many people are getting even worse results out of the A3B than I did, since I saw downloads defaulting to 3-bit quants, but even at a higher quant, for local use it just isn't there yet.

I'm sure there are plenty of use cases for the low active parameter MoE models like sentiment analysis, summaries, etc, but for anything real I'll stick to the dense models. It makes me wonder if Qwen3 has similar problems that Llama 4 had, trying to be a big MoE model with low active parameters producing spotty results.

Qwen3 32B is quite usable, though. The problem I have with it so far is that it seems worse at instruction following, language translation and inferring the meaning of my prompt than Gemma 3. This isn't ideal, because if it can't follow instructions, you can't easily shape its reasoning/response to account for its issues.

One of my prompts simply asks it to do some translation and it occasionally feeds in Chinese characters. That's just not going to be usable for that scenario. Gemma 3's language consistency and quality is closer to production ready.

Gemma 3 does have its own problems with translation though, because if you instruct it to translate and what you want to translate is "what do you know?", it will instead go on talking about its capabilities rather than translating the language. You have to use a few tricks to prevent it from doing that.