> "Yes, it finished. It changed chart.py and styles.css. Do you want me to tell you what specific changes it made to the files?"

A verbal diff sounds practically useless. Does it first read out the entire left-hand base, and then read out the entire right-hand target? Does it say loudly "REMOVING ... ADDING ... "? How would it read out something like Struct->Field? This seems lower fidelity than a visual confirmation, and I just don't think that voice commands make sense with this kind of work.

reply
It would tell me about the changes like a human would.

"It changed the plot function so it takes another parameter called linewidth. It also added an input field in the stylecontrols section where the user can ...".

reply
How would you detect the presence of bugs in this scenario? How would you make sure the LLM isn't adding yet another useless, redundant function to the code base? Even if there isn't a bug in this PR, do you not want to be familiar with the actual shape of the code in case you need to dig through it while bug hunting later?

Every time I try to take a hands-off approach to the code like this, I come to regret it later. The code ends up bloated and labyrinthine. When I let it grow unabated, it becomes gradually more difficult for the LLM to understand the intended structure as the project becomes too big for the model to keep the whole thing in its context.

reply

> How would you detect the presence of bugs in this scenario?

I would ask the AI: "Did the last commit introduce any bugs or unintended consequences?" In fact, I already use this prompt after every change I make manually.

> How would you make sure the LLM isn't adding yet another useless, redundant function to the code base?

By asking the AI. In fact, I already run a long "Can you refactor anything in this codebase to reduce redundancy or improve readability, performance, or maintainability?" prompt pretty regularly.
reply
Are you ever reading the code? What do you do when the LLM can't fix a bug? Do you not wish you had a more intimate first-hand knowledge of the code when fixing things yourself?

Please don't tell me that never happens. I've had one just in the last week, and I use both OpenAI and Anthropic foundation models.

reply
In my current workflow, yes, I read all code.

In fact, I usually let multiple LLMs implement the same feature, and then I compare them. I even run my own arena in which I calculate Elo scores for LLMs from my perspective of which one implemented features better.
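For reference, a standard Elo update for one such pairwise comparison could look like the sketch below. The model names, the starting rating of 1000, and K = 32 are illustrative assumptions, not details of the arena described above.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the standard Elo logistic model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return new (rating_a, rating_b) after one comparison.

    score_a is 1.0 if A's implementation was judged better,
    0.0 if B's was, and 0.5 for a tie. k is an assumed K-factor.
    """
    expected_a = expected_score(rating_a, rating_b)
    delta = k * (score_a - expected_a)
    # Elo is zero-sum: whatever A gains, B loses.
    return rating_a + delta, rating_b - delta


# Hypothetical model names and starting ratings.
ratings = {"model_a": 1000.0, "model_b": 1000.0}

# Suppose model_a's implementation of a feature was judged better.
ratings["model_a"], ratings["model_b"] = update_elo(
    ratings["model_a"], ratings["model_b"], score_a=1.0
)
print(ratings)
```

Each head-to-head judgment feeds one update, so the ratings drift toward whichever model wins more often against comparably rated opponents.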

Having the ability to control code agents via voice would not take away my ability to do that. But I think in the future, that will become less and less necessary. If we look back at this conversation in five years, it will look very archaic, and we will be used to having superhuman AI do everything for us. In 10 years, it will sound like a strange idea that humans were once fiddling with code to improve the quality.

reply
Something something wasting machine cycles with a compiler.

Something something taking the crafts and the man out of craftsmanship to just get it out the door as quickly as possible.

All jest aside, I mostly agree with you, but I'd tack on another 20 years for a total of 30.

Though I don't think people are as excited about this technological jump (understandably) as they were when the teletype came on the scene. I too like the potential, but I dislike the whole discourse around it, the ethics involved, and the way it's deployed. Such is life, I suppose.

reply
Fair enough. Thank you for sating my curiosity. I'm not quite as optimistic as you, but I'm excited at the potential to be proven wrong. :)
reply
I can't tell if this is sarcasm.
reply
I like how people think that if LLMs get to the point where they write code you can ship without reviewing it, humans will still be in the loop "sshing into a code space" and "implementing features". Do you really think you'll even know what files are in that repo? Or that you'll be a necessary part of the process whatsoever?