The quantized model fits in about 20 GB, so 32 GB would probably be sufficient unless you want to use the full context length (long inputs and/or lots of reasoning). 48 GB should be plenty.
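For a rough sense of the arithmetic (a back-of-the-envelope sketch; the parameter count and overhead figure below are assumptions, not measurements):

    # Rough memory math for a 4-bit quantized model (Python)
    params_b = 32                    # assumed parameter count, in billions
    weights_gb = params_b * 0.5      # ~4 bits/param is ~0.5 bytes/param
    overhead_gb = 4                  # assumed KV cache + runtime overhead
    total_gb = weights_gb + overhead_gb
    print(f"~{total_gb:.0f} GB")     # ~20 GB: tight on 32 GB, easy on 48 GB

Longer contexts grow the KV cache, which is why using the full context length can push past what 32 GB comfortably holds.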


I've tried the very early Q4 mlx release on an M1 Max 32GB (LM Studio @ default settings) and have run into severe issues. On the coding tasks I gave it, it froze before it finished reasoning; I guess I should limit the context size. I do love what I'm seeing, though: the output reads very similar to R1, and I mostly agree with its conclusions. The Q8 version should be even better.
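For what it's worth, here is a minimal sketch of capping generation length with the mlx-lm package (the model path is a placeholder, and I can't say whether LM Studio exposes the same knob in its settings):

    # Minimal mlx-lm sketch; the repo name below is hypothetical.
    from mlx_lm import load, generate

    model, tokenizer = load("some-org/some-model-4bit")
    # Capping max_tokens bounds KV-cache growth during long reasoning runs.
    out = generate(model, tokenizer, prompt="Refactor this Swift function...",
                   max_tokens=512)
    print(out)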


Does the Q8 fit within your 32GB? (I'm also using an M1 with 32GB.)


No, Q4 just barely fits, and with a longer context things sometimes freeze. I definitely have to close Xcode.



