Basically enough to fit the download in RAM + a bit more. In practice, you kinda... | Hacker News


brucethemoose2 on Nov 29, 2023 | on: Llamafile lets you distribute and run LLMs with a ...

Basically, enough RAM to fit the download, plus a bit more.

In practice, you kinda need a GPU, even a small one. Otherwise prompt processing is really slow.
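As a rough illustration of the "download plus a bit more" rule of thumb, here is a minimal Python sketch. The function name `fits_in_ram` and the 20% headroom are assumptions made for the example; they are not figures from llamafile or from this thread, and real usage also depends on context size and whatever else is running.

```python
GIB = 1024 ** 3  # bytes in one gibibyte


def fits_in_ram(model_bytes: int, ram_bytes: int, headroom: float = 0.2) -> bool:
    """Return True if the model file plus some headroom fits in RAM.

    headroom is a fraction of the model size reserved for everything else
    (context, buffers, the OS). The 0.2 default is an assumed placeholder,
    not a measured value.
    """
    if model_bytes < 0 or ram_bytes < 0 or headroom < 0:
        raise ValueError("sizes and headroom must be non-negative")
    needed = model_bytes * (1.0 + headroom)
    return needed <= ram_bytes


if __name__ == "__main__":
    # Hypothetical sizes: a 4 GiB and a 7 GiB download on an 8 GiB machine.
    print(fits_in_ram(4 * GIB, 8 * GIB))
    print(fits_in_ram(7 * GIB, 8 * GIB))
```

To check a real file, pass `os.path.getsize(path)` as `model_bytes` and the machine's physical memory as `ram_bytes`.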

wazoox on Dec 3, 2023

It's really decent without any GPU. Image analysis takes a while, but text prompts are fine. My Ryzen laptop does 2.5 to 4 tokens per second, my Mac pro more like 8.
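To put those rates in perspective, a small Python sketch converts tokens per second into wait time. The 200-token reply length is an assumption picked for illustration, and this only covers generation, not prompt processing.

```python
def generation_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Time to generate num_tokens at a steady tokens_per_second rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


if __name__ == "__main__":
    reply_tokens = 200  # assumed reply length, not from the thread
    for rate in (2.5, 4.0, 8.0):  # rates quoted in the comment above
        secs = generation_seconds(reply_tokens, rate)
        print(f"{rate} tok/s: {secs:.0f} s for {reply_tokens} tokens")
```

So under that assumption, a reply takes somewhere between 50 and 80 seconds at the laptop's rates and about 25 seconds at 8 tokens per second.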
