Inference is not cheap -- SOTA LLMs require 100s of gigabytes of VRAM for infere...

andrewjl · on Dec 19, 2022

Apple has made a sample project available for Stable Diffusion that can run on both macOS and iOS.[1] The sample code is here.[2]

I'll concede that Stable Diffusion may or may not meet the threshold for SOTA, but I still think this indicates that inference will eventually be supported on consumer-grade hardware for any compelling LLM. The possibilities for creative tools are just too vast.

[1] https://machinelearning.apple.com/research/stable-diffusion-...

[2] https://github.com/apple/ml-stable-diffusion