We recently did a comparative analysis of cloud speech-to-text providers for a p...

splatcollision · on Oct 24, 2017

For Watson Speech to Text: did you choose the model that matches your source audio quality? It defaults to a "Broadband" model intended for high-quality audio sources, but you can also select "Narrowband" for things like phone-quality audio. No guarantees, but in my experience, matching the model to the source quality makes some difference.
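A rough sketch of picking the model from the file's sample rate, in case it helps. Assumptions on my part: the audio is US English (hence the `en-US_` model names), the input is a WAV file, and 16 kHz is the cutoff between the two models, which is my understanding of the Watson docs rather than anything tested here. The helper names are made up for illustration.

```python
import wave

# Watson model identifiers; the en-US prefix assumes US English audio.
BROADBAND_MODEL = "en-US_BroadbandModel"
NARROWBAND_MODEL = "en-US_NarrowbandModel"

# Assumed cutoff: broadband for audio sampled at 16 kHz or above,
# narrowband for anything lower (e.g. 8 kHz telephone audio).
BROADBAND_MIN_RATE_HZ = 16000


def pick_watson_model(sample_rate_hz: int) -> str:
    """Return the model name that matches the given sample rate."""
    if sample_rate_hz >= BROADBAND_MIN_RATE_HZ:
        return BROADBAND_MODEL
    return NARROWBAND_MODEL


def pick_model_for_wav(path: str) -> str:
    """Read the sample rate from a WAV header and pick a model for it."""
    with wave.open(path, "rb") as wav_file:
        return pick_watson_model(wav_file.getframerate())
```

The returned string would then go in as the model parameter of the recognize request. Checking the rate up front avoids silently sending phone audio to the default broadband model.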

I've not compared them extensively, but for real-time streaming I found that Watson beat the Google API for a specific use case. Your mileage may vary!

They also provide a handy Mic / File reader interface for browsers: https://github.com/watson-developer-cloud/speech-javascript-...

adrianbg · on Oct 24, 2017

Here are some benchmarks on telephone speech, including both APIs and human transcription services:

https://remeeting.com/app/benchmarks

Google actually did pretty badly for us on extended telephone speech. Not sure why.