If your application is pricing sensitive, check out DeepInfra.com - they have a ...

asaddhamani · 2025-07-08T08:21:27 1751962887

DeepInfra is amazing on price. They have the Qwen3 embedding model for $0.002 per million tokens, which is an order of magnitude cheaper than most alternatives, with better benchmark scores. But P99 latency is high and the variance is huge, which is a problem for latency-sensitive workloads. If they can fix that, it'll be a no-brainer to use them. DeepInfra does tend to have the lowest prices of any API provider.
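
If you want to check tail latency yourself before committing, here's a rough sketch of how you could summarize timings you collect from any provider. The `latency_summary` helper is a hypothetical name and the sample numbers are made up for illustration. They are not measurements of DeepInfra.

```python
import statistics

def latency_summary(samples_ms):
    """Return median, P99, and standard deviation for latency samples in ms."""
    # quantiles(n=100) returns 99 cut points: index 49 is P50, index 98 is P99.
    cuts = statistics.quantiles(samples_ms, n=100, method="inclusive")
    return {
        "p50": cuts[49],
        "p99": cuts[98],
        "stdev": statistics.stdev(samples_ms),
    }

# Made-up samples: mostly fast requests plus a few slow outliers.
samples = [100.0] * 97 + [2000.0] * 3
summary = latency_summary(samples)
print(summary)
```

With a distribution like this the median looks fine while P99 is far worse, which is the pattern that hurts latency-sensitive workloads.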