The diarization is on Voxtral Mini Transcribe V2, not Voxtral Mini 4B.
Ahh, yeah, and it explicitly doesn't work for realtime streams. Good catch!
Do you have experience with that model for diarization? Does it feel accurate, and what's its realtime factor on a typical GPU? Diarization has been the biggest thorn in my side for a long time.
You can test it yourself for free at https://console.mistral.ai/build/audio/speech-to-text. I tried it on an English-language podcast episode, and apart from identifying one host as two different speakers (only once, for a few sentences at the start), the rest was flawless from what I could see.
Amazing. Thank you.
> Do you have experience with that model

No, I just heard about it this morning.
