Based on my experience with quantized 7B LLaMA models, I'd avoid 3B if you can. I don't have benchmarks to back this up, but I think it's a decent rule of thumb.