
I have two 3090s and it runs fine with `ollama run mixtral`. Although OP probably meant Mistral, given the 7B note.



`ollama run mixtral` defaults to a quantized version (4-bit, IIRC). I'd guess that's why it fits on two 3090s.
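A rough back-of-envelope check supports this. Assuming Mixtral 8x7B has roughly 46.7B total parameters (a commonly cited figure, not from this thread) and each 3090 has 24 GB of VRAM, the weights alone land well under two cards at 4-bit but far over at fp16:

```python
# Back-of-envelope VRAM estimate for Mixtral 8x7B weights.
# Assumptions (not from the thread): ~46.7B total parameters,
# 24 GB VRAM per RTX 3090. Ignores KV cache and runtime overhead.
PARAMS = 46.7e9
GB = 1024 ** 3

def weight_vram_gb(bytes_per_param: float) -> float:
    """VRAM needed for the weights at a given precision."""
    return PARAMS * bytes_per_param / GB

q4_gb = weight_vram_gb(0.5)    # 4-bit quantization: ~21.7 GB
fp16_gb = weight_vram_gb(2.0)  # fp16: ~87 GB

total_vram_gb = 2 * 24  # two 3090s

print(f"4-bit: {q4_gb:.1f} GB, fp16: {fp16_gb:.1f} GB, "
      f"available: {total_vram_gb} GB")
```

So 4-bit weights (~22 GB) leave headroom for the KV cache across 48 GB, while fp16 would need roughly twice the VRAM the pair provides.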



