
Julius [1] is a pretty good offline speech recognition engine. In my tests it reaches about 95% accuracy with grammar-based models, and it supports continuous dictation. There is also a decent Python module that supports Python 2, and Python 3 with a few tweaks.

HOWEVER:

The only continuous dictation models available for Julius are Japanese, since it is a Japanese project. This is mainly a matter of training data. The VoxForge project is working towards releasing an English model once it has 140 hours of training data (last time I checked it was at around 130). Even so, the quality is likely to be far below that of commercial speech recognition products, which are generally trained on thousands of hours of audio.

[1] http://julius.osdn.jp/en_index.php



Julius is my preferred speech recognition engine. I've built an application[0] which enables users to control their Linux desktops with their voices, and uses Julius to do the heavy lifting.

[0]: https://github.com/SacredData/COMPUTER
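
For anyone wondering what "Julius does the heavy lifting" can look like in practice, here is a rough sketch, not taken from the project above. It assumes Julius is run in module mode (`-module`), where it sends recognition results to a connected client as XML-like text ending with a line containing only a period. The function names, the confidence threshold, and the sample message are all made up for illustration; the sample is hand-written in the shape of that output, not captured from a real run.

```python
import re

# Matches one word hypothesis and its confidence (CM) in a module-mode message.
WHYPO_RE = re.compile(r'<WHYPO\s+WORD="([^"]*)"[^>]*?\bCM="([0-9.]+)"')

# Silence/sentence markers to drop; the exact names depend on the
# grammar or language model in use (assumption).
SILENCE = {"silB", "silE", "<s>", "</s>"}


def parse_recogout(message):
    """Return (word, confidence) pairs found in one message."""
    return [(word, float(cm)) for word, cm in WHYPO_RE.findall(message)]


def to_command(pairs, min_conf=0.8):
    """Join the recognized words into a command string.

    Returns None if nothing was recognized or any word falls below
    min_conf (hypothetical threshold), so a shaky result is rejected
    as a whole rather than partially executed.
    """
    words = [(w, cm) for w, cm in pairs if w not in SILENCE]
    if not words or any(cm < min_conf for _, cm in words):
        return None
    return " ".join(w for w, _ in words)


# Hand-written sample in the shape of a module-mode result (illustrative only).
SAMPLE = '''<RECOGOUT>
  <SHYPO RANK="1" SCORE="-2888.78" GRAM="0">
    <WHYPO WORD="silB" CLASSID="0" PHONE="silB" CM="1.000"/>
    <WHYPO WORD="OPEN" CLASSID="1" PHONE="ow p ax n" CM="0.953"/>
    <WHYPO WORD="TERMINAL" CLASSID="2" PHONE="t er m ax n ax l" CM="0.912"/>
    <WHYPO WORD="silE" CLASSID="3" PHONE="silE" CM="1.000"/>
  </SHYPO>
</RECOGOUT>
.'''

if __name__ == "__main__":
    print(to_command(parse_recogout(SAMPLE)))
```

In a real client the message would come from a TCP socket connected to the Julius process, and the resulting string would be looked up in a table of desktop actions.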


After a quick look, it seems Julius doesn't use the new deep-learning stuff?

In terms of data, http://www.openslr.org/12/ says it has 300+ hours of speech and text from LibriVox audiobooks. Using LibriVox recordings seems like a great way to build a large, freely available dataset.



