Parakeet is the name of a speech-to-text model from NVIDIA. It's roughly comparable to Whisper from OpenAI.

It's the model doing the actual transcription; the app is a wrapper around it.

Yep, here's the v2 and v3:

https://huggingface.co/nvidia/parakeet-tdt-0.6b-v2

https://huggingface.co/nvidia/parakeet-tdt-0.6b-v3

It's almost instant on my new M5 Max w/ 36GB of memory, but I used both with Handy on my previous 2019 Intel Mac w/ 16GB memory and was completely surprised at just how fast it was for being on-device! Not instant, but only a couple seconds.

I’m using it on an M3 Max with 32GB, and I’m getting 60-70x realtime on recordings with crazy good accuracy. I can get an hour of audio transcribed in about a minute. Whisper gives similar results, but at half the speed.
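For anyone wondering how "60-70x realtime" maps to wall-clock time, here's a quick back-of-the-envelope sketch. The function name `transcription_seconds` is made up for illustration, and the Whisper numbers are just my 60-70x halved, not a separate benchmark.

```python
def transcription_seconds(audio_seconds: float, realtime_factor: float) -> float:
    """Wall-clock time to transcribe audio at a given realtime factor.

    A realtime factor of 60 means 60 seconds of audio are processed
    per second of compute.
    """
    if realtime_factor <= 0:
        raise ValueError("realtime_factor must be positive")
    return audio_seconds / realtime_factor


ONE_HOUR = 60 * 60  # seconds of audio

# Realtime factors quoted above, plus the same range halved
# (an assumption standing in for "Whisper at half the speed").
for label, factor in [("60x", 60), ("70x", 70), ("30x", 30), ("35x", 35)]:
    seconds = transcription_seconds(ONE_HOUR, factor)
    print(f"{label}: {seconds:.1f} s for one hour of audio")
```

So at 60x an hour of audio takes 60 seconds, and at half that speed it takes about two minutes.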

Transcription this good used to cost A LOT; now it rounds down to free.
