
Oh, Azure's speech recognition API beats it handily on English, both in accuracy and speed.

Another is Deepgram. Even this obscure vendor seems to be able to handle the samples I tried better than Whisper: https://picovoice.ai/platform/cat/

But yeah, go with Azure as your starting point. It is good and the price is likely acceptable unless you're transcribing all of YouTube.



Umm, I want to pay zero and run it locally.


If you use the large-v2 version of Whisper and give it a prompt indicating what it's transcribing, it can do extremely well. But do use the prompt feature.
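
For anyone who hasn't tried the prompt feature, here is a rough sketch of what I mean, assuming the open-source openai-whisper Python package (with ffmpeg installed). The helper names `build_prompt` and `transcribe` and the file name `talk.mp3` are made up for illustration; `initial_prompt` is the package's own parameter on `transcribe`.

```python
import sys


def build_prompt(topic: str, vocabulary: list[str]) -> str:
    """Compose a short context prompt: what the audio is about, plus likely terms.

    Hypothetical helper; the exact wording of the prompt is an assumption,
    not something Whisper requires.
    """
    terms = ", ".join(vocabulary)
    if terms:
        return f"{topic}. Terms that may appear: {terms}."
    return f"{topic}."


def transcribe(audio_path: str, prompt: str, model_name: str = "large-v2") -> str:
    """Transcribe a local audio file, passing the prompt as initial context."""
    # Imported lazily so build_prompt works without the package installed.
    import whisper  # pip install openai-whisper

    model = whisper.load_model(model_name)  # downloads weights on first use
    result = model.transcribe(audio_path, initial_prompt=prompt)
    return result["text"]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python transcribe.py talk.mp3
    prompt = build_prompt(
        "A conference talk about speech recognition",
        ["Whisper", "Deepgram", "Azure"],
    )
    print(transcribe(sys.argv[1], prompt))
```

The prompt only nudges the decoder toward the right vocabulary and style, so keep it short and put in the names and jargon you expect to hear.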


Yeah, exactly, this is why there’s hype. It’s the best model that you can easily use for free.



