This doesn't feel relatable at all to me. If my writing speed is bottlenecked by thinking about what I'm writing, and my talking speed is significantly faster, that just means I've removed the bottleneck by not thinking about what I'm saying.
GRRM: How do you write so many books?... Don't you ever spend hours staring at the page, agonizing over which of two words to use, and asking 'am I actually any good at this?'
SK: Of course! But not when I'm writing.
(Full video here: https://www.youtube.com/watch?v=v_PBqSPNTfg )
In either case, different strokes for different folks, and what ultimately matters is whether you get good results. I think the upside is high, so I broadly suggest people try it out.
In principle I don't see why they should have different amounts of thought. That'd be bounded by how much time it takes to produce the message, I think. Typing permits backtracking via editing, but speaking permits 'semantic backtracking' which isn't equivalent but definitely can do similar things. Language is powerful.
And importantly, to backtrack in visual media I tend to need to re-saccade through the text with physical eye motions, whereas with audio my brain just has an internal buffer I can access at the speed of thought.
Typed messages might have higher _density_ of thought per token, though how valuable is that really, in LLM contexts? There are diminishing returns on how perfect you can get a prompt.
Also, audio permits a higher bandwidth mode: one can scan and speak at the same time.
And your 5-minute prompt just turned into half an hour of typing.
With voice you get on with it, and then start iterating, getting Claude to plan with you.
I haven't been impressed with agentic coding myself so far, but I did notice that using voice works a lot better imo; it keeps me focused on getting on with letting the agent do the work.
I've also found it good for stopping me doing the same thing in Slack messages. I ramble my general essay to ChatGPT/Claude, then get them to summarize and rewrite it into a few lines in my own voice. It stops me spending an hour crafting a Slack message and tends to soften the tone.
The Claude App version works from your phone and has a virtual environment it can use to write code and push it to a GitHub repo :)
My go-to prompt finisher, which I have mapped to a hotkey due to frequent use, is "Before writing any code, first analyze the problem and requirements and identify any ambiguities, contradictions, or issues. Ask me to clarify any questions you have, and then we'll proceed to writing the code"
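Since this finisher gets appended the same way every time, the hotkey is essentially doing text expansion. A minimal sketch of that idea, assuming a made-up helper name (`with_finisher` is illustrative, not any real tool's API):

```python
# The FINISHER text is the prompt quoted in the comment above; the
# function is a hypothetical stand-in for whatever a hotkey/expander runs.
FINISHER = (
    "Before writing any code, first analyze the problem and requirements "
    "and identify any ambiguities, contradictions, or issues. Ask me to "
    "clarify any questions you have, and then we'll proceed to writing "
    "the code"
)

def with_finisher(prompt: str) -> str:
    """Return the prompt with the finisher appended as its own paragraph."""
    return prompt.rstrip() + "\n\n" + FINISHER

print(with_finisher("Add retry logic to the upload client."))
```

Keeping the finisher as a separate paragraph makes it easy to spot and delete on the rare prompt where you don't want the clarifying-questions step.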
It's like a reasoning model. Don't ask, prompt 'and here is where you come up with apropos questions' and you shall have them, possibly even in a useful way.
Claude on macOS and iOS has native voice-to-text transcription. I haven't tried it, but since you can access Claude Code from the apps now, I wonder if you can use the Claude app's transcription as input to Claude Code.
Yeah, Claude/ChatGPT/Gemini all offer this, although Gemini's is basically unusable because it will immediately send the message if you stop talking for a few seconds
I imagine you totally could use the app's transcription and paste it in, but keeping the friction to an absolute minimum (e.g., just needing to press one hotkey) feels nice.
It's incredibly cheap and works reliably for me.
I have got it to paste my voice transcriptions into Chrome (Gemini, Claude, ChatGPT) as well as Cursor.
My main gripe is when the recording window loses focus, I haven't found a way to bring it back and continue the recorded session. So occasionally I have to start from scratch, which is particularly annoying if it happens during a long-winded brain dump.
Superwhisper offers some AI post-processing of the text (e.g., tidying grammar or formatting bullets), but this doesn't seem necessary and just makes things a bit slower.
My regular workflow is to talk (I use VoiceInk for transcription) and then say "tell me what you understood". This puts your words into a well-structured format, lets you confirm the CLI agent actually got it, and having it state the task explicitly likely also helps it stay on track.
I use a keyboard shortcut to start and stop recording and it will put the transcription into the clipboard so I can paste into any app.
It's a huge productivity boost - the OP is correct about not overthinking coherence; the models are very good at knowing what you mean (Opus 4.5 with Claude Code in my case).
I am using Whisper Medium. The only problem I see is that at the end of the message it sometimes appends a "bye" or a "thank you", which is kind of annoying.
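Those trailing "thank you"/"bye" artifacts are a known Whisper quirk at the end of audio, so one option is to strip them in post-processing. A hedged sketch, with a guessed phrase list you would tune to whatever your model actually emits:

```python
import re

# Common end-of-audio hallucinations to strip; this list is an assumption,
# extend it with whatever junk your Whisper model tends to append.
TRAILING_JUNK = re.compile(
    r"\s*(thank you|thanks|bye|goodbye)[.!]?\s*$",
    re.IGNORECASE,
)

def strip_trailing_junk(text: str) -> str:
    """Remove a single hallucinated sign-off from the end of a transcript."""
    return TRAILING_JUNK.sub("", text)

print(strip_trailing_junk("Refactor the parser to use a stack. Thank you."))
# → "Refactor the parser to use a stack."
```

Anchoring the pattern at the end of the string keeps it from touching a genuine "thanks" in the middle of a message.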
As for local transcription, locally running models aren't quite good enough yet in my experience.
They use right-ctrl as their trigger. I've set mine to double tap and then I can talk with long pauses/thinking and it just keeps listening till I tap to finish.
Also haven't tried it, but on the latest macOS 26 Apple updated their STT models, so their built-in voice dictation may be good enough.