But I'm optimistic that this will gradually improve in time.
Today it’s my turn to be that person. Large scientific code base with a bunch of nontrivial, handwritten modules accomplishing distinct, but structurally similar in terms of the underlying computation, tasks. Pointed GPT Pro at it, told it what new functionality I wanted, and it churns away for 40 minutes and completely knocks it out of the park. Estimated time savings of about 3-4 weeks. I’ve done this half a dozen times over the past two months and haven’t noticed any drop off or degradation. If anything it got even better with 5.4.
The codebase itself is architected and documented to be LLM friendly and claude.md gives very strong harnesses how to do things.
As architect Claude is abysmal, but when you give it an existing software pattern it merely needs to extend, it’s so good it still gives me probably something like 5x feature velocity boost.
Plus when doing large refactorings, it forgets much fever things than me.
Inventing new architecture is as hard as ever and it’s not great help there - unless you can point it to some well documented pattern and tell it ”do it like that please”.
Even after deleting everything from the first feature and going back to the checkpoint just before initial development, I can no longer get it to accomplish anything meaningful without my direct guidance.