100%. I have to hold the floor by filling the space with "ummmmmmmm.... uhhhh...." which inevitably distracts me from my point altogether. Poor user experience.
reply
Seems like there's a big risk of having that habit leak into human conversation. A lot of people try really hard to train themselves not to add those fillers.
reply
Have you tried telling it to pause to let you think?

I often use it while I’m walking and tell it to not respond until I initiate a conversation.

reply
I’ve tried this and it says it will but just keeps cutting in. I hate this feature so much.
reply
If anyone has an alternative I’m all ears.

This would be a killer feature for me and something I’ve tried to use on cross-country road trips.

reply
If you're setting this up yourself instead of using a lab's built-in speech functionality, you can run a small LLM in parallel (a local model, or a small hosted one like Haiku) that acts as a gate deciding whether or not to run TTS on the response. Its only job is to decide whether the transcription it receives is from someone who is done talking or from someone who is likely still mid-thought or mid-sentence.
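
A rough sketch of that gate, just to make the shape concrete. Everything named here is hypothetical: `heuristic_is_done` is a crude stand-in for the small gate model (so the sketch runs without any external service), and `generate_reply` and `speak` are placeholders for whatever main LLM call and TTS call you actually use.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

# Words that usually mean the speaker is still mid-thought when they come last.
# Assumption: a real gate would ask a small LLM instead of using this list.
TRAILING_WORDS = {"um", "uh", "so", "and", "but", "because", "like", "or"}


def heuristic_is_done(transcript: str) -> bool:
    """Stand-in for the gate model: True if the speaker seems finished."""
    text = transcript.strip().lower()
    if not text or text.endswith("..."):
        return False  # nothing said yet, or trailing off
    words = text.rstrip(".?!, ").split()
    if not words:
        return False
    return words[-1] not in TRAILING_WORDS


def handle_transcript(
    transcript: str,
    is_done: Callable[[str], bool],
    generate_reply: Callable[[str], str],
    speak: Callable[[str], None],
) -> bool:
    """Run the gate and the main model in parallel; speak only if the gate agrees."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        done_future = pool.submit(is_done, transcript)
        reply_future = pool.submit(generate_reply, transcript)
        if not done_future.result():
            # Best effort: a call that is already running is not interrupted,
            # its result is simply never spoken.
            reply_future.cancel()
            return False
        speak(reply_future.result())
        return True


if __name__ == "__main__":
    fake_reply = lambda t: "Paris"  # placeholder for the main LLM
    handle_transcript("What is the capital of France?", heuristic_is_done, fake_reply, print)
    handle_transcript("I think we should, um", heuristic_is_done, fake_reply, print)
```

Running both calls in parallel means the gate adds little latency when the speaker really is done. The cost is a wasted main-model call whenever the gate says to keep listening.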
reply
I know it's not the perfect solution for you, but I use a voice recorder and send the LLM the transcript. And my god is it working great.

Usually I just explain the things I want it to do. The longest was a 30-minute ramble explaining the methods section of a paper in non-chronological order. It worked unbelievably well for me.

reply
I find this is a problem even with human conversations. Some people just aren’t very good at telegraphing when they’ve finished ‘their turn’ talking. Or worse yet, aren’t willing to take turns in the first place.
reply