I strongly suspect most closed source code developed under commercial or internal pressure is pretty awful after a few years of development.
All LLM code has to do is suck less than existing code. And that's presuming the code quality doesn't improve as the models, the harnesses and our ways of working with them improve.
LLMs doesn't have this benefit. You forget to add the correct to the system prompt, and the LLM will repeat the same mistake over and over, and worse than that, their mistakes aren't based on their understanding, it's basically random guesses.
Humans, even bad coders, still seem to have some sort of architecture in mind, even if it's spaghetti, whereas LLMs (obviously) don't think more than a few steps, and never about the full scope of what they're contributing too, and on purpose too, because you want the context to be as small as possible when you work with LLMs.
With LLMs you need to thread carefully between "What does the LLM need to know?" and "Can I skip passing this to the LLM this time?" while a human you can more or less dump them everything you sit on, and let them shift it through, and they'll mostly make it out OK.
Whilst I don't claim any true "understanding" as that is a very loaded term that doesn't mean it's just random guesses.
Anyone using recent LLM coding agents on a regular basis would probably agree that there's something going on that fits some non-athropomorphizing, non-sentience-assigning definition of "understanding"
As for the point about improvement - I think that's an orthogonal issue to the overall code quality. With regard to human codebases - there's plenty of scenarios that negate the improvement of individuals. We're comparing organizations with LLMs - not individuals with LLMs and that makes a significant difference.
i dont see why software engineers are paid so well, and are so hard to hire?
just dump a bunch of requirements on a homeless person and itll just work out
But anyway, let the LLM verify the code to give advice on improvements but don't let it write code unverified. That's my opinion on it anyway.
But as soon as you do minimal reviews and high-level corrections, applications turn out just fine.
Can there be bugs? Sure. That's the price of not reading or understanding every line. It should depend on the criticality of your software how much of these you tolerate and how much you don't (reviewing, understanding, testing everything 100% like you were used to if you had written it yourself will kill most if not all of your gained speed)
But I never got the impression of unmaintainability or unfixable bugs.
Actually the other side around: A really good cleanup pass, architectural changes, or bugfixes are seldom more than a few prompts and 2 hours away, provided your overall base is decent and you actually gave a fuck from the start.
I've yet to come across a human developer who's output would meet this standard, despite writing every line.
In fact, having an LLM review our code is catching quite a few bugs before it reaches QA.
The humans may skip unit tests and need reminding; the AI always write unit tests once it's in AGENTS.md or whatever, but my experience* was that 5-10% of the time the LLM's attempt at a "test" would, instead of executing the code and examining the results, open the source code as a text file and run a regex to find/exclude certain substrings.
* At the start of this year, because Anthropic and OpenAI were both offering free trials. IDK how much things have changed since then, some things change fast in this domain, other things don't.
Riding the exponential means you have to update priors more often.
My observation is that they are equally bad and hard to maintain or even more so than the new ones.
One thing I’ve noticed is that the LLM assisted ones have a lot more comments which is nice but take more time to read.
And they do it faster than any human developer.
They clearly are only assistants for the moment, you can use them to do work ... but only if you could do the said work yourself alone in the first place.
I'm an experienced developer, but I don't count myself as a web dev or a python dev; I can review the web and python stuff I get out of the AI (sometimes I need to ask the AI follow-up questions so I can find official documentation for what it did), but I can't write it.
But the difference I allude to here is more like how "book reviewer" is a different job than "book author": yes, if you can review a book, you can also write one. Eventually.
On a more serious note, I think the problem will be the inability to handle/maintain the systems once they are too big and nobody has no idea what's inside of them or what they do.
Is this a good idea? Probably not—in the past we would only do that when the architecture was causing serious problems since it always has tons of behaviors that will accidentally not get carried forward, some of which are load bearing and will cause bugs.
Now we can do it in an afternoon and get the same long term bug behavior.