> In the last two weeks, I've heavily used Claude and DeepSeek Chat. ChatGPT is much more reliable than both
Which reliability problems did you face? I heard about connection issues caused by heavy traffic on DeepSeek's side, but those would go away if you self-hosted the model.
Obviously the reliability problems would go away if you self-host, but the point is that most people rely on external providers because they can't locally run models of similar quality. So what do you do if DeepSeek cuts you off? For most people, getting the dozen or so H100s needed for the full 671B model is essentially impossible.
You use a DeepSeek model hosted not by DeepSeek but by another provider (e.g. DeepInfra, currently). Hopefully a robust provider market will emerge and thrive, even if open models start thinning out.
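Since most of these hosts expose OpenAI-compatible APIs, switching providers is usually just a base-URL change. A minimal sketch, assuming DeepInfra's OpenAI-compatible endpoint and its model id for R1 (check the provider's docs for current values):

```python
# Sketch: calling DeepSeek-R1 through a third-party host instead of DeepSeek's
# own API. The base_url and model id below assume DeepInfra's OpenAI-compatible
# endpoint; other providers work the same way with different values.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.deepinfra.com/v1/openai",  # provider's OpenAI-compatible endpoint
    api_key="YOUR_DEEPINFRA_API_KEY",                # your key for that provider
)

resp = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1",
    messages=[{"role": "user", "content": "Summarize the CAP theorem in two sentences."}],
)
print(resp.choices[0].message.content)
```

If one host goes down, pointing the same client at another host's URL and model id is a two-line change, which is exactly the kind of fallback being described here.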
Have people tried using R1 for real-world use cases? I attempted to use the 7B Ollama variant for my UI generation [1] and GitLab Postgres schema analysis [2] tasks, but the results were not satisfactory.
- UI Generation: The generated UI failed to function due to errors in the JavaScript, and the overall user experience was poor.
- Gitlab Postgres Schema Analysis: It identified only a few design patterns.
I am not sure if these are suitable tasks for R1. I will try a larger variant as well (a sketch of how I invoked the model follows below).
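For reference, this is roughly how I queried the local model, using Ollama's HTTP API on its default port. The tag `deepseek-r1:7b` assumes you have already pulled it with `ollama pull deepseek-r1:7b`:

```python
# Sketch: one-shot generation against a locally running Ollama server.
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",  # Ollama's default local endpoint
    json={
        "model": "deepseek-r1:7b",
        "prompt": "Generate an HTML page with a JS-validated email form.",
        "stream": False,  # return a single JSON object instead of a token stream
    },
    timeout=300,
)
print(resp.json()["response"])
```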
This is the same approach we took when we added LLM capability to Appian, a low-code tool. The LLM generates the Appian workflow configuration file, the user reviews it and makes changes if required, and then finally publishes it.
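The shape of that loop is simple enough to sketch. None of the functions below are Appian APIs; they are hypothetical stand-ins showing where the human review step sits:

```python
# Hedged sketch of a generate -> review -> publish loop. All functions are
# hypothetical placeholders, not part of any real platform's API.

def generate_workflow_config(requirement: str) -> str:
    """Placeholder: ask an LLM to draft a workflow configuration file."""
    return f"<workflow><!-- draft generated for: {requirement} --></workflow>"

def publish_workflow(config: str) -> None:
    """Placeholder: push the approved configuration to the platform."""
    print("published:", config)

def generate_review_publish(requirement: str) -> None:
    draft = generate_workflow_config(requirement)
    print("--- draft ---\n" + draft)
    if input("Publish as-is? [y/N] ").strip().lower() != "y":
        draft = input("Paste your edited configuration: ")  # human corrects the draft
    publish_workflow(draft)  # nothing ships without explicit approval

generate_review_publish("approve purchase orders over $10k")
```

The point of the pattern is that the LLM never publishes directly; a human sign-off gates every change.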
This is an important point. As people rely more on AI/LLM tools, reliability will become even more critical.
In the last two weeks, I've heavily used Claude and DeepSeek Chat. ChatGPT is much more reliable than both.
Claude struggles with long-context chats and often shifts to overly concise responses. DeepSeek often has its "fail whale" moments.