Looks like Wispr Flow uses a cloud model [0]:
> Cloud based speech processing infrastructure for 1B users
It gets to be a messy comparison: my iPhone can do STT pretty well, fully on device, with no latency, while Wispr Flow requires a cloud model. To be fair, older Apple devices do as well. It's not quite apples and oranges, but I think those technical details make this an indirect comparison in a few ways.
For on-device with low system resource usage, Apple's is pretty damn good.
For context, I was working on a podcast app with on-device transcription, and had to park that idea for years until on-device transcription reached today's performance.
I found another dreadful iPhone input "feature" yesterday. If you are browsing around in third-party CarPlay apps and are ready to tap your selection, but press the accelerator first, it truncates the list to only a few items and scrolls to the top.
Way to reduce driving distractions, guys! What's next? If the car is moving, Maps changes destinations?
I really wish human-computer interaction research were more broadly applied, and that if you did dumb stuff like the whole automotive / CarPlay world does, you'd be liable in court.
I once had a car that hid the backup cam behind a legal disclaimer every time you turned it on. I'm sure at least one pedestrian was hit by a car in reverse while that screen was on. The manufacturer should be 100% liable for the poor UI decision.
Yeah, that's unfortunate considering you can have it do nearly all of that (download maps, navigate to a business, all while offline), except asking Siri to do it for you.
> I once had a car that hid the backup cam behind a legal disclaimer every time you turned it on.
My car pops up a dialog telling me (in a paragraph+) to pay attention while in semi-autopilot which I have to click "ok" on to get back to the map. It's very ironic, and extremely dangerous.
https://www.theregister.com/on-prem/2023/08/16/those-who-rel...
Would be great if they could at least fix two major bugs:
* input simply fails (seemingly) randomly where it is supported, and many apps from major vendors don't support dictation input at all (e.g. OneNote); there should at least be a fallback (a la Dragon Dictate from decades ago) for those cases
* capitalization is still random, leaving you with many errors to correct
But Apple mostly seems to see accessibility as a way to enable performative press releases, not actual functionality...
The streaming dictation they added in that release is also much appreciated, although occasionally buggy.
Sometimes it gets to “fever dream where you’re suddenly unable to successfully perform everyday tasks” levels of insanity.
And the worst part is: it used to be fine. I’d type at more or less full-keyboard levels of speed and accuracy on my iPhone 4S.
Open your Settings app. Tap on General. Scroll down and select Keyboard. Toggle off Slide to Type.
I’d much rather have “cheap, dependable, and good enough” over oligarch pricing for what used to be a one time software purchase any day.
I have a friend named Zi in my contacts. For some reason iOS kept autocorrecting “I” to “Zi”, and would do it too far back for me to notice.
What’s weird is that this is exactly the kind of dumb bug that Apple usually irons out.
One of my primary methods of interacting with an iPhone is through speech and the state of Apple speech transcription is pretty horrible. It bothers me greatly.
I know some of the workarounds, but it does feel like it’s in the Stone Age.
I don’t think it’s a microphone issue since iPhone microphones are fairly decent and I don’t think it’s a CPU issue either because Apple Silicon seems to be some of the best on the market. Which leaves us with the software…
Maybe they should put that cash hoard to good use and buy up some of these transcription companies or license their IP so we get truly high-quality transcription.