
> I’m not sure whether it’s my affinity for code golf, but when I see someone bragging about line count, I don’t expect the use of multi-million-line nonstandard libraries.

I wouldn't say the headline is "misleading" - no one is about to be fooled into thinking 300 lines of C++ could be capable of state-of-the-art speech recognition. The headline is squarely in the territory of "complete nonsense, but you can tell that without having to read further than the headline".



I’m not sure what gives you the confidence to make absolute statements like this. It might be unlikely, but code golfers, demo sceners and the like regularly do crazy stuff with ridiculously little code.


Are you saying state-of-the-art speech recognition cannot fit in 284 lines of C++ without any third-party libraries?


Yes, assuming the lines are of reasonable length.


I suppose we have a challenge on our hands.


As soon as you succeed, I'm pretty sure someone will complain that the sequence of matrix multiplications in the AI parameter file also counts as "code" in the wider sense.


Feel free to consider the parameter file "output" that doesn't count against lines of code.

Anything you used in the process of generating it will be input that does count.


So 10000+ hours of WAV files...


Download a large number of random German-language videos off YouTube, but only ones with handmade subtitles. Correlate audio with text. Record audio, transform to text.

I posit this can be done in fewer than 284 lines of C++ while having an error rate equal to or better than the state of the art for everyday speech.

Gentlemen, ready your putters…


> Anything you used in the process of generating it will be input that does count.


That doesn’t apply to the OP, does it?


Why wouldn't it?



