
Since it says the audio is stripped from the video before processing, I wonder how well it'd do if asked to transcribe by lip reading?



It looks like it actually only considers one frame for every second of video, so that certainly wouldn't work.


Yeah. If that interval can't be adjusted, then you're likely right. Oh well. ;)



