You could get to "something that works" rather fast, but it took a long time to 1) evaluate other options (maybe before, maybe after), 2) refine it, and 3) test it and build confidence around it.
I think your point stands, but no one really knows where. The next year or so is going to be everyone trying to figure that out (this is also why we hear a lot of "we need to reinvent GitHub").
Shame that what is left for the humans is the shitty, tedious part of the work. It reminds me of the quote: "I want AI to do my laundry and dishes so that I can do art and writing, not for AI to do my art and writing so that I can do laundry and dishes." I believe the LLM providers went with the wrong approach from the off: the focus should've been on complementing labour, not displacing it. And I believe they have learned an expensive lesson along the way.
But the first time I say “No, it should be …” it’s nearly game over. If you say it 3+ times in a row, you’re basically doomed.
Sure, you can get it to fix the bug, but it comes at the cost of subsequent prompts often barely working.
The moment I hit the "no, it should be..." point, I know it's the end of it.
Sometimes I can salvage something by asking for a summary of the work and reasoning done, then doing a fresh restart. But oftentimes, it's manual corrections and a full restart from there.
The person who builds an agentic IDE or GitHub alternative that natively does the process you describe will be a multibillionaire.
Do you want a demo of what this is capable of?
And it's not just easier because it's cheap; it's easier because you're not emotionally attached to that code. Just let it produce slop, log what worked and what didn't, nuke the project, and start over.
It just gets incredibly boring.