This is also compounded by the fact that LLMs are not deterministic: every response to the same prompt comes out different. And people tend to judge on one-off experiences.
They can be. The cloud-hosted LLMs add a gratuitous randomization step to make the output seem more human. (In keeping with the moronic idea of selling LLMs as sci-fi human-like assistants.)
But you don't have to add that randomization, and not much is lost if you don't. (Output from my self-hosted LLMs is deterministic.)
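
For instance, here is a minimal sketch of what that looks like with a locally hosted model loaded through the Hugging Face transformers library (the model name is just a placeholder, any causal LM works). Passing do_sample=False skips the sampling step entirely, so the model always picks its highest-probability token and repeated runs produce identical output, barring hardware-level floating-point quirks:

    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_name = "Qwen/Qwen2.5-0.5B-Instruct"  # placeholder; swap in whatever you self-host
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)

    inputs = tokenizer("Explain what a mutex is.", return_tensors="pt")

    # do_sample=False means greedy decoding: no temperature, no top-p,
    # no randomness. Same prompt in, same text out.
    output = model.generate(
        **inputs,
        do_sample=False,
        max_new_tokens=100,
        pad_token_id=tokenizer.eos_token_id,
    )
    print(tokenizer.decode(output[0], skip_special_tokens=True))

The hosted APIs expose the same idea as a temperature knob; turning it down to zero gets you most of the way to reproducible output, it's just not the default.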