
I would be surprised if you can't. The smallest weight file is apparently 14 GB.


https://github.com/facebookresearch/llama/blob/main/FAQ.md#3

Looks like it needs 14 GB for the weights; it isn't clear what the minimum size for the decoding cache is, but it defaults to settings sized for 30 GB GPUs.


In int8, the 7B model needs only 9 GB of VRAM and 13B needs only 20 GB on a single GPU. https://github.com/oobabooga/text-generation-webui/issues/14...
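The numbers in this thread line up with simple napkin math: weights take roughly (parameter count) × (bytes per parameter), and the observed totals add a few GB of runtime overhead (activations, KV cache) on top. A rough sketch of that estimate (the function name and the overhead interpretation are mine, not from the linked issue):

```python
def weight_vram_gb(n_params: float, bytes_per_param: int) -> float:
    """Back-of-the-envelope VRAM for the weights alone, in GB."""
    return n_params * bytes_per_param / 1e9

# fp16 (2 bytes/param): 7B -> ~14 GB, matching the smallest weight file
print(weight_vram_gb(7e9, 2))   # 14.0

# int8 (1 byte/param): 7B -> ~7 GB of weights; the ~9 GB figure
# reported above presumably includes cache/activation overhead
print(weight_vram_gb(7e9, 1))   # 7.0
print(weight_vram_gb(13e9, 1))  # 13.0
```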



