Voice-to-voice models can call tools. No need for TTS.