Hacker News new | past | comments | ask | show | jobs | submit login

Check out this model, I've had limited success with it. Best I've done so far is to just add the labels it gives to the overlapping segments whisper spits out, which means some sentences have multiple speakers, but that's mostly the case because of cross-talk. I'd say it gets it right ~80% of the time with the 5 speakers I've done it on across ~16 hours of audio.

https://huggingface.co/pyannote/speaker-diarization




I will!




Consider applying for YC's Spring batch! Applications are open till Feb 11.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: