
That is not the case here; I never encountered this with whisper-large-v3 or similar ASR models. Part of the reason, I guess, is that those subs are burnt into the movie, which makes them hard to extract. Standalone subs still need the matching video file before you can line up the audio and the text. So nothing beats YouTube videos, where the two are already aligned.


At least for English, those "fansubs" aren't typically burnt into the movie*, but ride along in the video container (MP4/MKV) as subtitle streams. They can typically be extracted as SRT files (plain text with sentence-level timestamps).

*Although it used to be more common for AVI files in the olden days.
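To make "plain text with sentence-level timestamps" concrete, here is a minimal sketch of reading an SRT file once it has been pulled out of the container (for text-based streams, something like `ffmpeg -i input.mkv -map 0:s:0 subs.srt` usually does it). The names `Cue`, `to_ms` and `parse_srt` are made up for illustration, and the parser assumes well-formed cues separated by blank lines; real-world files are often messier.

```python
import re
from dataclasses import dataclass


@dataclass
class Cue:
    index: int      # running cue number from the file
    start_ms: int   # start time in milliseconds
    end_ms: int     # end time in milliseconds
    text: str       # subtitle text, possibly multi-line


# SRT timestamps look like 00:01:02,345 (some files use '.' instead of ',')
_TIME = re.compile(r"(\d+):(\d{2}):(\d{2})[,.](\d{3})")


def to_ms(stamp: str) -> int:
    """Convert an SRT timestamp to milliseconds."""
    m = _TIME.fullmatch(stamp.strip())
    if m is None:
        raise ValueError(f"bad timestamp: {stamp!r}")
    h, mi, s, ms = map(int, m.groups())
    return ((h * 60 + mi) * 60 + s) * 1000 + ms


def parse_srt(data: str) -> list[Cue]:
    """Parse SRT text into cues; blocks that don't look like cues are skipped."""
    data = data.lstrip("\ufeff").replace("\r\n", "\n").strip()
    cues = []
    for block in re.split(r"\n\s*\n", data):
        lines = block.split("\n")
        if len(lines) < 2 or "-->" not in lines[1]:
            continue
        start, end = lines[1].split("-->")
        # Anything after the end timestamp (e.g. position hints) is ignored
        cues.append(Cue(int(lines[0].strip()), to_ms(start),
                        to_ms(end.split()[0]), "\n".join(lines[2:])))
    return cues
```

Each cue then carries the start/end times you would need to pair the text with the matching stretch of audio.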


SRT is ancient. Nowadays everyone uses ASS subtitles, which can be styled arbitrarily.


In general? In the past I've known ASS to be used a lot for things like anime, but less for live action shows.


I have also found them inside MKVs as the subtitle track. I think SRT was the default because most content was ripped from DVD/BD, but now most of the content comes from streaming sources and you need to convert the subtitles anyway.


WebVTT (a SubRip successor) is probably more widely used than ASS.
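The two formats are close enough that a basic conversion is mostly a header plus a timestamp tweak. A rough sketch, assuming a plain SRT file with no styling tags or positioning hints (`srt_to_vtt` is a hypothetical helper, not from any library):

```python
import re

# SRT writes 00:00:01,000 where WebVTT expects 00:00:01.000
_SRT_TIMING = re.compile(r"(\d{2,}:\d{2}:\d{2}),(\d{3})")


def srt_to_vtt(srt: str) -> str:
    """Convert simple SRT text to WebVTT."""
    body = srt.lstrip("\ufeff").replace("\r\n", "\n").strip()
    out = []
    for line in body.split("\n"):
        # Only touch timing lines so commas in the dialogue are left alone
        if "-->" in line:
            line = _SRT_TIMING.sub(r"\1.\2", line)
        out.append(line)
    # The numeric SRT counters are kept; WebVTT accepts them as cue identifiers
    return "WEBVTT\n\n" + "\n".join(out) + "\n"
```

Going the other way, or starting from ASS, loses styling information, so this only covers the easy direction.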


By legit providers, probably.


flashbacks of trying to track down subs sync’d to a specific release



