Apple has made a sample project available for Stable Diffusion that can run on both macOS and iOS.[1] The sample code is here.[2]
I'll concede that Stable Diffusion may or may not meet the threshold for SOTA, but I still think this indicates that inference will eventually be supported on consumer-grade hardware for any compelling LLM. The possibilities for creative tools are just too vast.