Another is Deepgram. Even this obscure vendor seems to handle the samples I tried better than Whisper: https://picovoice.ai/platform/cat/
But yeah, go with Azure as your starting point. It is good, and the price is likely acceptable unless you're transcribing all of YouTube.