I've recently been trying out open source solutions for voice recognition for a personal project, and you're right that they lag far behind proprietary solutions. Pocketsphinx is still very limited, and Kaldi takes quite a bit of work to set up in a usable fashion. There were a few other options I looked at that I can't recall off the top of my head, but they were all in similar condition.
As the article says, latency is still a problem, and it's a huge one in current open source solutions: some of the stuff I was testing easily had 5 seconds of latency. I know that can be improved with configuration, but when dealing with vocabularies of 10 words or so, that's pretty bad.
I feel like anyone who is seriously interested in this space has been scooped up by the big companies, and the open source solutions have really languished because of it. It's one of the first areas I've seen where open source alternatives are really behind the proprietary solutions. Kind of bummed me out.
Kaldi is the best. TensorFlow integration was just added, which will hopefully speed up development (though I haven't seen any pretrained models for that yet).