Wow, it’s true, AI really is set to match human performance on large, complex software systems! ;)
Do they??
My team lead has worked on the same software for 30 years. He has the ability to hear me discuss a bug I noticed, and then pinpoint not only the likely culprit, but the exact function that's causing it.
And with one you need to train a guy for 25 years and with the other you need plan mode for a few minutes and then it runs 24/7.
https://www.joelonsoftware.com/2000/04/06/things-you-should-...
A decade ago, I was sitting in on a meeting about a rewrite and, before I could say anything, someone in the first year of her career asked why anyone thought a rewrite would be any cleaner once all the edge cases were handled. Afterwards, I asked her where she learned this. She said "I don't know, it just seems kind of obvious." She went on to be a great engineer and is now a great manager.
Greenfield guy comes in, promises the world, and starts from some first-principles, white-papered architecture. It's really lovely until they onboard the first user. Then they slowly commit all the "sins" (features that drive revenue) of the first system.
The firm is stuck supporting N systems indefinitely, because the perfect new system takes so long to cover even 30% of the original system's use cases that management takes a flier on... bear with me... a second rewrite. Now they have three systems.
I've seen more 3rd systems than I've seen actual decommissioning of original systems into a single clean new system.
The answer is chipping away, modularizing, and replacing piecemeal Ship of Theseus style. But that does not drive big hires and big promotions.
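For what it's worth, the piecemeal version is pretty mundane in code. A minimal sketch (module and operation names are made up, not anyone's real system): a thin facade in front of the legacy code, moving one operation at a time behind it.

    # Hypothetical stand-ins for the old and new implementations.
    def legacy_billing(operation, payload):
        return ("legacy", operation, payload)

    def new_billing(operation, payload):
        return ("new", operation, payload)

    # Operations already migrated; this set grows one feature at a time.
    MIGRATED = {"create_invoice", "void_invoice"}

    def billing(operation, payload):
        # Callers only ever see the facade, so the old system can be
        # hollowed out plank by plank with no big-bang cutover.
        handler = new_billing if operation in MIGRATED else legacy_billing
        return handler(operation, payload)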
Including all of the above.
Maybe some that people said were that bad, but they just needed some elbow grease. Remember, it takes guts to be amazing!
The reason Oracle can continue failing at those massive projects is simple: everyone fails at them routinely, and often it's the customer's fault.
it will kill all the people in that hospital too
What do you think the fake Delve attestation scandal was about? https://news.ycombinator.com/item?id=47444319
(Screams in "deployed a new product in 2026 that only works in Internet Explorer", in healthcare.)
Cleaning up other people's AI mess for them, for free, is definitely not a good use of time.
But won’t those more complex systems presumably solve more complex problems than the systems that humans could build? Or within a comparable time?
I think it is reasonably safe to assume at this point in the game that these AI systems are increasingly able to reason rigorously about novel problems presented to them, of ever increasing complexity and sophistication.
I think the problem will get worse. I dislike the marketing around AI, but I do think it is a useful tool to help those who have experience move faster. If you are not an expert, AI seems to create a complex solution to whatever it is you were trying to do.
I've been watching non-developers vibe code stuff, and the general failure mode seems to be ignorance of 3-pick-2 tradeoffs.
They'll spam "make it more reliable" or some such, and the AI will best-effort add more intermediary Redis caches or similar patterns.
But because the vibe coders don't actually know what a Redis cache is or how it works, they'll never make the architectural trade-offs to truly fix things.
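The kind of thing that gets bolted on usually looks roughly like this: a toy read-through cache sketch in Python with redis-py, made-up names, assuming a local Redis. It buys read latency at the cost of staleness and a new failure mode, which is exactly the trade-off that never gets weighed:

    import json
    import redis  # assumes a local Redis instance; purely illustrative

    r = redis.Redis()

    def get_user(user_id, load_from_db):
        # Serve from cache when possible; this read can be up to 60s stale.
        cached = r.get(f"user:{user_id}")
        if cached is not None:
            return json.loads(cached)
        # Otherwise hit the database and cache the result. Redis being down
        # is now a second failure mode in front of every read.
        user = load_from_db(user_id)
        r.setex(f"user:{user_id}", 60, json.dumps(user))
        return user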
I often wonder if it’s the statistical nature of the LLM mixed with a request in the prompt.
“These are highly complicated pieces of equipment… almost as complicated as living organisms.
In some cases, they’ve been designed by other computers.
We don’t know exactly how they work.”
Now how did that work out ;-)
Here’s a slightly different future - these AI rescue consultants are bots too, just trained for this purpose.
Plausible?
I have already seen Claude 4.7 handle pretty complex refactors without issues. Scale and correctness aren’t even 1% of the issue they were last year. You just have to get the high-level design right, or explicitly ask it to critique your design before building it.
Do you think people are not giving their agents specs and asking for input?
Commits, design reviews, whitepapers, code reviews, test suites. And, pretty concerningly: chat logs and even keystrokes from employees nowadays.
The way we train specialized bots now is incredibly inefficient, but that part is rapidly improving.
That's serious levels of circular thinking right there.
We train humans to do things untrained humans can not do.
- AI Hype
- AI Psychosis
- AI keeps getting better and better until it can work around big AI slop code bases
The belief in this is a form of AI psychosis, I think.
Maybe in the future but certainly no evidence of this anytime soon
Here's some anecdotal evidence from me: I cleaned up multiple GPT-4.x-era vibecoded projects recently with the latest Claude model and integrated one of those into a fairly large open-source codebase.
This is something AI completely failed at last year.
Maybe you should try something like this or listen to success stories before claiming 'certainly no evidence' in future?
What evidence is there that we're not at or close to a plateau of what LLMs are capable of? How do you know the growth rate from 2023 to the present will continue into 2029? E.g., is it more training data? More GPUs? What if we're kind of reaching the limits of those things already?
I don't see why we would assume that we are at a plateau for RL. In many other settings, Go for instance, RL continues to scale until you reach compute limits. Some things are more easily RL'd than others, but ultimately this largely unlocks data. We are not yet compute/energy/physical world constrained. I think you would start observing clear changes in the world around you before that becomes a true bottleneck. Regardless, currently the vast majority of compute is used for inference not training so the compute overhang is large.
Assuming that we plateau at {insert current moment} seems wishful and I've already had this conversation any number of times on this exact forum at every level of capability [3.5, 4, o1, o3, 4.6/5.5, mythos] from Nov 2022 onwards.
And the answer appears to be that the improvement is accelerating. So how could it be stopping?
1) same business logic implemented in two different places, with extra code to sync between them
2) fixing apparently simple bugs results in lots of new code being written
It’s a sign I need to at least temporarily dedicate more effort to overseeing work in that area.
I somewhat agree with the AI psychosis framing of the OP. It takes some taste and discipline to avoid letting things dissolve into complete slop.
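The first sign above, in miniature (hypothetical names, just to make the shape concrete):

    # The same discount rule implemented twice...
    def checkout_discount(total):
        return total * 0.9 if total > 100 else total

    def invoice_discount(total):
        # ...then "fixed" separately during a later bug report, so the two
        # copies now disagree at exactly total == 100.
        return total * 0.9 if total >= 100 else total

    def reconcile_discounts(total):
        # The "extra code to sync between them", written instead of
        # deleting one of the copies.
        return min(checkout_discount(total), invoice_discount(total))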
* A belief that AI will keep getting better, presented without evidence, does not yield a lot of skepticism around these parts.
* Your comment saying it is wrong to believe AI will keep getting better, also presented without evidence, is downvoted.
I think it will be needless, verbose complexity.
I kind of imagine someone having an unlimited budget of free amazon stuff shipped to their house.
In theory, they are living a prosperous life of plenty.
In reality, they will be drowning in something that isn't prosperity.
The explanation, in turn, can be fed back to recreate the functionality of the original code.
At that point, why care about the code at all? If it works, it works. If it doesn't, tell the model to fix it. You did ask for tests, right?
That is where we're indisputably headed. It's not quite a lossless loop yet, but those who say it won't or can't happen bear a heavy burden of proof.
You have not seen the spreadsheets that accounts run the firm on.
Bloody kids!
The issues have all been structural, not local. It's easier to treat it like a rewrite using the original as a super detailed product spec. Working on the existing codebase works, but you have to aggressively modularize everything anyway to untangle it rather than attack it from the top down.
All of these projects have gone well, but I haven't run into a case where a feature they thought was implemented isn't possible. That will happen eventually.
It's honestly good, quick work as a contractor. But I do hope they invest in building expertise from that point rather than treating it like a stable base to continue vibecoding on.
I exaggerate only a little.
Are you sure about this? Yes, there is a stable set, but those patterns are used in all the wrong places, particularly where they don't belong, because juniors and now AIs can recite them and want to use them everywhere. And that's before even discussing whether the stable set itself is correct, which is dubious at this point.
(None of the above is theoretical.)
Violets are blue
AI is great
And so are you
In their current forms, it's unlikely for a product that actually needs to work.
It's not getting that complex and working with current LLMs.
I thought the same when I saw development outsourced to Indians who struggled to write a for loop.
I was wrong.
It turns out that customers will keep doubling down on mistakes until they’re out of funds, and then they’ll hire the cheapest consultants they can find to fix the mess with whatever spare change they can find under the couch cushions.
Source: being called in with a one week time budget to fix a mess built up over years and millions of dollars.
Scrape off all the soil, put it in casks, and bury it in a concrete bunker for 10000 years. Then relocate everyone and attempt to rebuild.
We didn't create the DNA we rely on to produce food and lumber; we just set up the conditions and hope the process produces something we want instead of deleting all the bananas.
Farming is a fine, honorable, and valuable function for society, but I have no interest in being a farmer. I build things; I don't plant seeds, pray to the gods, and hope they grow into something I want.
If the farming situation were as dire as you seem to suggest, we'd have unpredictable famines all the time, but we don't
Planting is merely setting up the conditions. We didn't write the DNA, and we couldn't write it if we wanted to, because we are an infinity away from understanding all the actual processes that descend from it. And when we utilize DNA that we simply found, and didn't and couldn't hope to write, it's always, at best, a case of hoping it goes right again this time.
It's really nowhere near as complicated as making distributed systems reliable. It's really quite simple: read a fucking book.
Well, actually read a lot of books. And write a lot of software. And read a lot of software. And do your goddamn job, engineer. Be honest about what you know, what you know you don't know, and what you urgently need to find out next.
There is no magic. Hard work is hard. If you don't like it, get the fuck out of this profession and find a different one to ruin.
We all need to get a hell of a lot more hostile and unwelcoming towards these lazy assholes.