> Also, I didn't really notice a significant difference in code quality; even the best model (GPT-4) writes code that doesn't work,

Interesting. Personally, I have noticed a difference, mostly in how well the models pick up small details and context. That said, I do agree that the open Llama models are generally fairly serviceable.

Recently I have leaned towards Claude 3.5 Sonnet, as it seems slightly better, though that varies by programming language as well.

As for them being slow, I haven't really noticed a difference. I use them mostly through the API with Open WebUI, and the answers come quickly enough.