As the model weights (even quantized) would be several hundred GB, it's unlikely, unless special inference code is written that loads and processes only a small subset of the weights at a time. But running it that way would be painfully slow.
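For anyone curious what that would look like, here's a minimal sketch of that kind of layer-streaming inference, assuming the weights have been exported as one `.npy` file per layer. The file layout, layer count, hidden size, and the stand-in layer math are all hypothetical, just to illustrate the idea:

```python
import numpy as np

NUM_LAYERS = 80   # assumed layer count, not from any real model
HIDDEN = 8192     # assumed hidden size

def load_layer(i: int) -> np.ndarray:
    # mmap_mode="r" maps the file rather than reading it all into RAM,
    # so only the pages actually touched during the matmul get paged in.
    return np.load(f"weights/layer_{i:03d}.npy", mmap_mode="r")

def forward(x: np.ndarray) -> np.ndarray:
    # Process one layer at a time so resident memory stays at roughly
    # one layer's worth of weights instead of the whole model.
    for i in range(NUM_LAYERS):
        w = load_layer(i)      # pull one layer's weights off disk
        x = np.tanh(x @ w)     # stand-in for the real layer computation
        del w                  # drop the mapping before loading the next layer
    return x

x = np.random.randn(1, HIDDEN).astype(np.float32)
print(forward(x).shape)
```

Even though the mmap keeps RAM usage bounded, every forward pass has to pull the full several hundred GB back off disk. As a rough back-of-envelope, at a few GB/s of SSD bandwidth that's on the order of a minute per token, which is where the "painfully slow" comes from.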