You ultimately still need a jitter buffer large enough to absorb retransmissions. Otherwise you’ve got stuttering audio. And dynamically adjusting this jitter buffer is hard.
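
To make the sizing problem concrete, here is a rough sketch. The running jitter estimate follows the interarrival jitter formula from RFC 3550 (gain of 1/16). The `target_buffer_ms` helper and its `jitter_multiplier` are hypothetical names, and the rule "a few multiples of jitter plus one round trip for a retransmission" is an assumption for illustration, not how any particular stack does it.

```python
class JitterEstimator:
    """Running interarrival jitter estimate in the style of RFC 3550."""

    def __init__(self) -> None:
        self.jitter = 0.0
        self._prev_transit = None

    def update(self, send_ts_ms: float, recv_ts_ms: float) -> float:
        # Transit time includes an unknown clock offset, but only the
        # difference between consecutive transit times is used.
        transit = recv_ts_ms - send_ts_ms
        if self._prev_transit is not None:
            d = abs(transit - self._prev_transit)
            # Exponential smoothing with gain 1/16, as in RFC 3550.
            self.jitter += (d - self.jitter) / 16.0
        self._prev_transit = transit
        return self.jitter


def target_buffer_ms(jitter_ms: float, rtt_ms: float,
                     jitter_multiplier: float = 4.0) -> float:
    # Assumed heuristic: cover several multiples of the jitter estimate,
    # plus one round trip so a single retransmission can still arrive in time.
    return jitter_multiplier * jitter_ms + rtt_ms
```

The hard part is everything this leaves out: how fast to grow or shrink the buffer, and how to do it without audible time-stretching or dropped frames.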
reply
> And dynamically adjusting this jitter buffer is hard

Unappreciated part of this entire conversation.

reply
I'm not an expert, but can't we exploit the fact that an LLM doesn't need to receive audio as a continuous, uninterrupted stream? Couldn't we just send the data and pipe it into the LLM, deduplicating whenever a resend happens?

  x...y...y[dedup]...z
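
Something like this minimal sketch is what I have in mind. It assumes every chunk carries a sequence number starting at 0; the `(seq, payload)` packet shape and the `dedup_chunks` name are made up for illustration. Resent chunks are dropped, and chunks that arrive early are held until the gap is filled.

```python
from typing import Dict, Iterable, Iterator, Tuple


def dedup_chunks(packets: Iterable[Tuple[int, bytes]]) -> Iterator[bytes]:
    """Yield each audio chunk exactly once, in sequence order."""
    next_seq = 0                     # next sequence number to hand downstream
    pending: Dict[int, bytes] = {}   # chunks that arrived ahead of a gap
    for seq, payload in packets:
        if seq < next_seq or seq in pending:
            continue                 # already delivered or already buffered
        pending[seq] = payload
        # Flush every chunk that is now contiguous with what was delivered.
        while next_seq in pending:
            yield pending.pop(next_seq)
            next_seq += 1


received = [(0, b"x"), (1, b"y"), (1, b"y"), (2, b"z")]
chunks = list(dedup_chunks(received))   # the resent b"y" is dropped
```

There is no playout deadline here, so a late chunk only delays what follows it instead of causing a glitch.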
reply
You’re absolutely correct. A jitter buffer is necessary for a human listener, but an LLM isn’t aware of a time lapse, just like it isn’t aware of the time since your last message in the conversation (unless the chat harness explicitly informs it).
reply