Hacker News

'Better' is always a loaded term with ASR. Gemini 1.5 Flash can transcribe for $0.01 per hour of audio and gives strong results. If you want timing and speaker info, you need to use the previous version and a -lot- of prompt tweaking, or else it will hallucinate the timing info. Give it a try; it may be a lot better for your use case.
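
Since hallucinated timing is the main risk, it can help to sanity-check whatever timestamps come back. Below is a minimal sketch, not anything from the Gemini API itself: it assumes you prompted the model to emit one line per utterance as `[MM:SS] Speaker N: text` (a hypothetical format), and the names `LINE_RE` and `check_timestamps` are made up for illustration. It only flags timing that is obviously implausible; it cannot confirm that a plausible timestamp is correct.

```python
import re

# Hypothetical line format that the prompt asks the model to produce:
#   [MM:SS] Speaker 1: text
LINE_RE = re.compile(r"^\[(\d+):([0-5]\d)\]\s+([^:]+):\s+(.*)$")


def check_timestamps(transcript: str, audio_seconds: float) -> list[str]:
    """Return a list of problems found in the transcript's timing info."""
    problems = []
    previous = -1  # latest timestamp seen so far, in seconds
    for number, line in enumerate(transcript.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        match = LINE_RE.match(line.strip())
        if match is None:
            problems.append(f"line {number}: does not match the expected format")
            continue
        seconds = int(match.group(1)) * 60 + int(match.group(2))
        if seconds > audio_seconds:
            problems.append(f"line {number}: timestamp is past the end of the audio")
        if seconds < previous:
            problems.append(f"line {number}: timestamp goes backwards")
        previous = max(previous, seconds)
    return problems
```

For example, running `check_timestamps(text, 3600)` on the output for a one-hour recording returns an empty list when every line parses, stays within the hour, and never steps backwards. Anything it does return is a hint that the prompt needs more tweaking.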

