Recall is, at its core, an API for meeting-recording bots. As someone building an application that relies heavily on conversational data, I find recording meetings really important. Recall makes that process as easy as an API call, standardized across the various meeting platforms. It's a huge PITA to set up that infrastructure yourself: getting bots to join meetings, handling each platform's quirks, encoding and storing video data, etc.
The transcription service is just something they offer to make transcribing recordings easier and lower-friction, since transcription is one of the most common first post-processing steps for any conversational data.
I actually agree that it’s become incredibly easy to transcribe conversations using open-source models, and that’s not where Recall adds the most value. The hard part is building the infrastructure that gives you real-time access to the raw audio, video, and transcript data directly from the meeting platforms. We abstract all of that away and provide you with a clean interface to that data. Once you have the data, you can use any of the models you mentioned to do your own transcription, or transcribe using Recall’s transcription models.
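To make the "bring your own model" point concrete, here is a rough Python sketch of keeping the audio source and the transcription backend separate. None of these names come from Recall's API: `AudioChunk`, `Utterance`, `transcribe_meeting`, and `fake_transcriber` are all hypothetical, and the stand-in transcriber just decodes bytes as text so the example runs without a real model.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class AudioChunk:
    """Hypothetical unit of audio delivered by a recording bot."""
    speaker: str
    start_s: float  # offset from the start of the meeting, in seconds
    pcm: bytes      # raw audio payload


@dataclass
class Utterance:
    speaker: str
    start_s: float
    text: str


# Any callable that turns audio bytes into text can be plugged in here:
# an open-source model, a hosted service, or a vendor's own transcription.
Transcriber = Callable[[bytes], str]


def transcribe_meeting(
    chunks: Iterable[AudioChunk], transcriber: Transcriber
) -> List[Utterance]:
    """Run the chosen transcriber over each chunk, in chronological order."""
    utterances: List[Utterance] = []
    for chunk in sorted(chunks, key=lambda c: c.start_s):
        text = transcriber(chunk.pcm).strip()
        if text:  # skip chunks that produced no text (e.g. silence)
            utterances.append(Utterance(chunk.speaker, chunk.start_s, text))
    return utterances


def fake_transcriber(pcm: bytes) -> str:
    """Stand-in for a real model: treats the payload as UTF-8 text."""
    return pcm.decode("utf-8")
```

Swapping models then means passing a different callable to `transcribe_meeting`; the code that receives audio from the meeting doesn't need to change.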