Over the past year or so, various projects have made it possible to run LLMs on just about anything. Some hardware is still better than the rest: Nvidia GPUs remain the best for token throughput (via TensorRT-LLM), but AMD GPUs are competitive (via vLLM), and even CPUs can run LLMs at decent speeds (via llama.cpp).
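
For anyone curious what the CPU path looks like in practice, here's a minimal sketch using the llama-cpp-python bindings for llama.cpp; the model path, quantization, and thread count are placeholders you'd adjust for your own machine:

    # pip install llama-cpp-python
    from llama_cpp import Llama

    # Load a quantized GGUF model; the path is a placeholder.
    llm = Llama(
        model_path="./models/llama-2-7b.Q4_K_M.gguf",
        n_ctx=2048,    # context window
        n_threads=8,   # roughly match your physical core count
    )

    # Run a completion entirely on CPU.
    out = llm("Explain tokenization in one sentence.", max_tokens=64)
    print(out["choices"][0]["text"])

Quantization (here Q4_K_M) is what makes CPU inference tractable; smaller quants trade answer quality for speed and memory.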



Thank you!



