

b0a04gl 79 days ago | on: Launch HN: Issen (YC F24) – Personal AI language t...

How are you handling latency on turn overlaps: a buffered stream with early intent cutoff, or full duplex with partial decoding?

mariano54 79 days ago

We process audio in 200ms chunks and transcribe after 400ms of silence. 3 voice chunks (per VAD) automatically trigger an interruption, unless it's a back channel like "yeah" or "right" or something like that.
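A rough sketch of that chunk logic, to make the timing concrete. This is not Issen's code: the class and function names are made up, the back-channel word list is a guess, and it assumes the 3 voice chunks must be consecutive and that the back-channel check runs on a transcript of the interrupting speech, neither of which the comment states.

```python
from dataclasses import dataclass

CHUNK_MS = 200          # chunk size, from the comment
SILENCE_MS = 400        # silence required before transcribing, from the comment
INTERRUPT_CHUNKS = 3    # voice chunks that trigger an interruption, from the comment
BACK_CHANNELS = {"yeah", "right", "mhm", "ok"}  # assumed list, not from the comment


@dataclass
class TurnTracker:
    """Hypothetical tracker fed one VAD decision per 200ms chunk."""
    voice_run: int = 0
    silence_run: int = 0
    heard_voice: bool = False

    def feed(self, is_voice: bool) -> list[str]:
        events = []
        if is_voice:
            self.voice_run += 1
            self.silence_run = 0
            self.heard_voice = True
            # Assumption: the voice chunks must be consecutive.
            if self.voice_run == INTERRUPT_CHUNKS:
                events.append("interrupt_candidate")
        else:
            self.silence_run += 1
            self.voice_run = 0
            # Only transcribe if there was speech before the silence.
            if self.heard_voice and self.silence_run * CHUNK_MS >= SILENCE_MS:
                events.append("transcribe")
                self.heard_voice = False
        return events


def is_back_channel(text: str) -> bool:
    """Assumed check: a short acknowledgement should not interrupt."""
    return text.strip().lower().strip(".,!?") in BACK_CHANNELS
```

With these numbers, 400ms of silence is two silent chunks in a row, and an interruption candidate appears after 600ms of continuous speech.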

Whisper can transcribe in <100ms. We then wait for the turn detection model, LLM, and TTS to trigger a streamed response back to the client.
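The stage order described here could be wired up roughly like this. It is only a sketch with placeholder stages passed in as callables: `respond` and its parameter names are hypothetical, and synthesizing each LLM piece as it arrives is an assumption about how the streaming works.

```python
from typing import Callable, Iterator


def respond(
    audio: bytes,
    transcribe: Callable[[bytes], str],          # e.g. a Whisper wrapper
    turn_is_over: Callable[[str], bool],         # turn detection model
    llm_stream: Callable[[str], Iterator[str]],  # yields text pieces
    tts: Callable[[str], bytes],                 # text piece -> audio
) -> Iterator[bytes]:
    """Yield audio for the client as soon as each piece is ready."""
    text = transcribe(audio)
    if not turn_is_over(text):
        return  # user is still talking; keep listening
    for piece in llm_stream(text):
        # Assumption: each LLM piece is synthesized and sent immediately.
        yield tts(piece)
```

Because `respond` is a generator, the caller can forward each audio piece to the client without waiting for the full LLM reply.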
