Conventional wisdom says that running LLMs locally requires computers with high-performance specifications, especially GPUs with lots of VRAM. But is this actually true?
Thanks to the open-source llama2.c project, I was able to port it so that vintage machines running DOS can actually run inference on Llama 2 models. Of course, there are severe limitations, but the results may surprise you.