I’ve been driving Claude as my primary coding interface the last three months at my job. Other than a different domain, I feel like I could have written this exact article.
The project I’m on started as a vibe-coded prototype that quickly got promoted to a production service we sell.
I’ve had to build the mental model after the fact, while refactoring and ripping out large chunks of nonsense or dead code.
But the product wouldn’t exist without that quick and dirty prototype, and I can use Claude as a goddamned chainsaw to clean up.
On Friday, I finally added a type checker pre-commit hook and fixed the 90 existing errors (properly, no type ignores) in ~2 hours. I tried full-agentic first, and it failed miserably. Then I went through error by error with Claude: we tightened up some existing types, fixed some clunky abstractions, and got a nice, clean result.
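To illustrate what fixing a type error "properly, no type ignores" can look like, here is a minimal Python sketch. The `User` class and the `find_user` / `display_name` functions are hypothetical and not taken from the project described above. The assumption is a checker such as mypy, which would flag an unguarded `user.name` because `find_user` may return `None`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class User:
    name: str


# Hypothetical lookup table standing in for a real data source.
_USERS = {1: User(name="Ada")}


def find_user(user_id: int) -> Optional[User]:
    """Return the user, or None when the id is unknown."""
    return _USERS.get(user_id)


def display_name(user_id: int) -> str:
    user = find_user(user_id)
    # A type checker would reject `user.name` at this point because `user`
    # may be None. Silencing it with `# type: ignore` hides a real crash;
    # handling the None case fixes the error and the bug together.
    if user is None:
        return "<unknown>"
    return user.name
```

Here `display_name(1)` returns `"Ada"` and `display_name(2)` returns `"<unknown>"` instead of raising `AttributeError`.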
AI-assisted coding is amazing, but IMO for production code there’s no substitute for human review and guidance.
Then use ideation to architect, dive into details and tell the AI exactly what your choices are: how certain methods should be called, how logging and observability should be set up, what language to use, type checking, coding style (configure ruthless linting and formatting before you write a single line of code), what testing methodology, framework, unit, integration, e2e. Database changes, how you will handle migrations: spell out as much as possible so the AI is as confined as possible to how you would do it.
Then, create a plan file, have it manage it like a task list, and implement in parts. Before starting, it needs to present you a plan. In it you will notice it makes mistakes, misunderstands some things that you maybe didn’t clarify before, or just forgets. You add to AGENTS.md or whatever, make changes to the AI’s plan, tell it to update the plan.md and, when satisfied, proceed.
After done, review the code. You will notice there is always something to fix. Hardcoded variables, a sql migration with seed data that should actually not be a migration, just generally crazy stuff.
The worst is that the AI is always very loose on requirements. You will notice all its fields are nullable, records have little to no validation, you report an error when testing and it tries to solve it with a brittle async solution, like LISTEN/NOTIFY or a callback, instead of doing the architecturally correct solution. Things that at scale are hell to debug, especially if you did not write the code.
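The "all fields nullable, no validation" pattern is easy to show side by side. This is only a sketch: `LooseOrder` and `StrictOrder` and their fields are made-up names, and the validation rules are placeholders for whatever your real requirements are.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LooseOrder:
    # The pattern described above: everything optional, nothing checked.
    customer_id: Optional[int] = None
    quantity: Optional[int] = None
    email: Optional[str] = None


@dataclass(frozen=True)
class StrictOrder:
    # Required fields have no defaults, so they cannot be silently omitted.
    customer_id: int
    quantity: int
    email: str

    def __post_init__(self) -> None:
        # Reject bad records at construction time instead of debugging
        # them later in production.
        if self.customer_id <= 0:
            raise ValueError("customer_id must be positive")
        if self.quantity <= 0:
            raise ValueError("quantity must be positive")
        if "@" not in self.email:
            raise ValueError("email must contain '@'")
```

`LooseOrder()` happily builds an empty record, while `StrictOrder(customer_id=1, quantity=0, email="a@example.com")` raises `ValueError` immediately.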
If you do this and iterate you will gradually end up with a solid harness and you will need to review less.
Then port it to other projects.
Personally, I think it's just the natural flow when you're starting out. If he keeps going, his opinion is going to change and as he gets to know it better, he'll likely go more and more towards vibecoding again.
It's hard to say why, but you get better at it. Even if it's really hard to put into words why.
It may actually be true. Your feeling might be right - but I strongly caution you against trusting that feeling until you can explain it. Something you can’t explain is something you don’t understand.
Have you ever learned a skill? Like carving, singing, playing guitar, playing a video game, anything?
It's easy to get better at it without understanding why you're better at it. As a matter of fact, very very few people master the discipline enough to be able to grasp the reason for why they're actually better
Most people just come up with random shit which may or may not be related. Which I just abstained from.
This is something everyone who cares about improving in a skill does regularly - examine their improvement, the reasons behind it, and how to add to them. That’s the basis of self-driven learning.
And that's not really explainable without exploring specific examples. And now we're in thousands of words of explanation territory, hence my decision to say it's hard to put it into words.
For instance, if I say “I noticed I run better in my blue shoes than my red shoes” I did not learn anything. If I examine my shoes and notice that my blue shoes have a cushioned sole, while my red shoes are flat, I can combine that with thinking about how I run and learn that cushioned soles cause less fatigue to the muscles in my feet and ankles.
The reason the difference matters is that if I don’t do the learning step, when I buy another pair of blue shoes but they’re flat soled, I’m back to square one.
Back to the real scenario, if you hold on to your ungrounded intuition re what tricks and phrasing work without understanding why, you may find those don’t work at all on a new model version or when forced to change to a different product due to price, insolvency, etc.
One thing I will add: I actually don’t think it’s wrong to start out building a vibe coded spaghetti mess for a project like this… provided you see it as a prototype you’re going to learn from and then throw away. A throwaway prototype is immensely useful because it helps you figure out what you want to build in the first place, before you step down a level and focus on closely guiding the agent to actually build it.
The author’s mistake was that he thought the horrible prototype would evolve into the real thing. Of course it could not. But I suspect that the author’s final results when he did start afresh and build with closer attention to architecture were much better because he has learned more about the requirements for what he wanted to build from that first attempt.
But that's boring nerd shit and LLMs didn't change who thinks boring nerd shit is boring or cool.
Some people do find it unfun, saying it deprives them of the happy "flow" of banging out code. Reaching "flow" when prompting LLMs arguably requires a somewhat deeper understanding of them as a proper technical tool, as opposed to a complete black box, or worse, a crystal ball.
SWEs spend 20% of the time writing code for exactly the same reason brick-layers spend 20% of their time laying bricks
I kinda like how you can just use it for anything you like. I have bazillion personal projects, I can now get help with, polish up, simplify, or build UI for, and it's nice. Anything from reverse engineering, to data extraction, to playing with FPGAs, is just so much less tedious and I can focus on the fun parts.
I use LLMs in my every day work. I’m also a strong critic of LLMs and absolutely loathe the hype cycle around them.
I have done some really cool things with Copilot and Claude and I keep sharing them within my working circle because I simply don’t want to interact that much with people who aren’t grounded on the subject.
I started using Copilot at work because that's what the company policy was. It's a pretty strict environment, but it's perfectly serviceable and gets a lot of fresh, vetted updates. IDE integration with VS Code was a huge plus for me.
Claude Code is definitely a messier, buggier frontend for the LLM. It's clunkier to navigate and it has much more primitive context management tools. IDE integration is clunky with VS Code, too.
However, if you want to take advantage of the Anthropic subscription services, I've found Claude Code is the way to go... Simply because Anthropic works hard to lock you into their ecosystem if you want the sweet discounts. I'm greedy, so I bit the bullet for all of the LLM coding stuff I do in my personal life.
Previously, takes were necessarily shallower or not as insightful ("worked with caveats for me, ymmv") - there just wasn't enough data - although a few have posted fairly balanced takes (@mitsuhiko for example).
I don't think we've seen the last of hypers and doomers though.
What’s really happening is that you’re all of those people in the beginning. Those people are you as you go through the experience. You’re excited after seeing it do the impossible and in later instances you’re critical of the imperfections. It’s like the stages of grief, a sort of Kübler-Ross model for AI.
I recently had to rewrite a part of such a prototype that had 15 years of development on it, which was a massive headache. One of the most useful things I used LLMs for was asking it to compare the rewritten functionality with the old one, and find potential differences. While I was busy refactoring and redesigning the underlying architecture, I then sometimes was pinged by the LLM to investigate a potential difference. It sometimes included false positives, but it did help me spot small details that otherwise would have taken quite a while of debugging.
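A mechanical complement to the LLM comparison described above is to run the old and new implementations on the same inputs and report every mismatch. To be clear, this is not what the commenter did (they asked the LLM to read both versions); it is a swapped-in, rough sketch of differential testing, and `legacy_discount` / `rewritten_discount` are hypothetical stand-ins for the real functions.

```python
from typing import Callable, Iterable, List, Tuple, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def find_differences(
    old: Callable[[T], R],
    new: Callable[[T], R],
    inputs: Iterable[T],
) -> List[Tuple[T, R, R]]:
    """Return (input, old_result, new_result) for every input where they differ."""
    differences = []
    for value in inputs:
        old_result = old(value)
        new_result = new(value)
        if old_result != new_result:
            differences.append((value, old_result, new_result))
    return differences


# Hypothetical legacy function and its rewrite.
def legacy_discount(total: int) -> int:
    return total // 10 if total >= 100 else 0


def rewritten_discount(total: int) -> int:
    # Subtle behavior change: the boundary moved from >= to >.
    return total // 10 if total > 100 else 0
```

`find_differences(legacy_discount, rewritten_discount, [50, 100, 150])` returns `[(100, 10, 0)]`, which is exactly the kind of small boundary detail that otherwise takes a while to find by debugging.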
Professional software engineers like many of us have a big blind spot when it comes to AI coding, and that's a fixation on code quality.
It makes sense to focus on code quality. We're not wrong. After all, we've spent our entire careers in the code. Bad code quality slows us down and makes things slow/insecure/unreliable/etc for end users.
However, code quality is becoming less and less relevant in the age of AI coding, and to ignore that is to have our heads stuck in the sand. Just because we don't like it doesn't mean it's not true.
There are two forces contributing to this: (1) more people coding smaller apps, and (2) improvements in coding models and agentic tools.
We are increasingly moving toward a world where people who aren't sophisticated programmers are "building" their own apps with a user base of just one person. In many cases, these apps are simple and effective and come without the bloat that larger software suites have subjected users to for years. The code is simple, and even when it's not, nobody will ever have to maintain it, so it doesn't matter. Some apps will be unreliable, some will get hacked, some will be slow and inefficient, and it won't matter. This trend will continue to grow.
At the same time, technology is improving, and the AI is increasingly good at designing and architecting software. We are in the very earliest months of AI actually being somewhat competent at this. It's unlikely that it will plateau and stop improving. And even when it finally does, if such a point comes, there will still be many years of improvements in tooling, as humanity's ability to make effective use of a technology always lags far behind the invention of the technology itself.
So I'm right there with you in being annoyed by all the hype and exaggerated claims. But the "truth" about AI-assisted coding is changing every year, every quarter, every month. It's only trending in one direction. And it isn't going to stop.
Strongly disagree with this thesis, and in fact I'd go completely the opposite: code quality is more important than ever thanks to AI.
LLM-assisted coding is most successful in codebases with attributes strongly associated with high code quality: predictable patterns, well-named variables, use of a type system, no global mutable state, very low mutability in general, etc.
I'm using AI on a pretty shitty legacy area of a Python codebase right now (like, literally right now, Claude is running while I type this) and it's struggling for the same reason a human would struggle. What are the columns in this DataFrame? Who knows, because the dataframe is getting mutated depending on the function calls! Oh yeah and someone thought they could be "clever" and assemble function names via strings and dynamically call them to save a few lines of code, awesome! An LLM is going to struggle deciphering this disasterpiece, same as anyone.
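For what it's worth, the two complaints above (mutated data and function names assembled from strings) have a fairly boring fix. This is a sketch under assumptions: plain dicts stand in for DataFrame rows so it runs without pandas, and `normalize_price`, `add_tax`, `STEPS`, and `run_step` are invented names, not anything from the codebase being described.

```python
from typing import Callable, Dict, List

Row = Dict[str, float]


def normalize_price(rows: List[Row]) -> List[Row]:
    # Returns new rows instead of mutating the input, so the set of
    # columns at each step stays predictable.
    return [{**row, "price": round(row["price"], 2)} for row in rows]


def add_tax(rows: List[Row]) -> List[Row]:
    # Hypothetical flat 25% rate, chosen only for the example.
    return [{**row, "tax": row["price"] * 0.25} for row in rows]


# The pattern being complained about, shown only as a comment:
#     func = globals()["transform_" + kind]   # callee unknown until runtime
#     func(rows)
# An explicit table keeps every callee greppable and type-checkable.
STEPS: Dict[str, Callable[[List[Row]], List[Row]]] = {
    "normalize": normalize_price,
    "tax": add_tax,
}


def run_step(kind: str, rows: List[Row]) -> List[Row]:
    try:
        step = STEPS[kind]
    except KeyError:
        raise ValueError(f"unknown step: {kind!r}") from None
    return step(rows)
```

`run_step("tax", [{"price": 10.0}])` returns `[{"price": 10.0, "tax": 2.5}]` and leaves the input list untouched, so a reader (human or LLM) can tell which columns exist after each step.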
Meanwhile for newer areas of the code with strict typing and a sensible architecture, Claude will usually just one-shot whatever I ask.
edit: I see most replies are saying basically the same thing here, which is an indicator.
It actually becomes more and more relevant. AI constantly needs to reread its own code and fit it into its limited context, in order to take it as a reference for writing out new stuff. This means that every single code smell, and every instance of needless code bloat, actually becomes a grievous hazard to further progress. Arguably, you should in fact be quite obsessed about refactoring and cleaning up what the AI has come up with, even more so than if you were coding purely for humans.
Strong disagree. I just watched a team spend weeks trying to make a piece of code work with AI because the vibe-coded code was such spaghetti garbage that even the AI couldn’t tell what needed to be done and was basically playing ineffective whack-a-mole: it would fix the bug you asked about by reintroducing an old bug or introducing a new one, because no one understood what was happening. And humans couldn’t even step in like normal because no one understood what was going on.
In 1998, I'm sure there were newspaper companies who failed at transitioning online, didn't get any web traffic, had unreliable servers that crashed, etc. This says very little about what life would be like for the newspaper industry in 1999, 2000, 2005, 2010, and beyond.
AI will get better at making good, maintainable and explainable code because that’s what it takes to actually solve problems tractably. But saying “code quality doesn’t matter because AI” is definitely not true, both experientially and as a prediction. Will AI do a better job in the future? Sure. But that will be because its code quality improves, not because code quality is less important.
Guns, wheels, cars, ships, batteries, televisions, the internet, smartphones, airplanes, refrigeration, electric lighting, semiconductors, GPS, solar panels, antibiotics, printing presses, steam engines, radio, etc. The pattern is obvious, the forces are clear and well-studied.
If there is (1) a big gap between current capabilities and theoretical limits, (2) huge incentives for those who improve things, (3) no alternative tech that will replace or outcompete it, (4) broad social acceptance and adoption, and (5) no chance of the tech being lost or forgotten, then technological improvement is basically a guarantee.
These are all obviously true of AI coding.
It isn't even a good job of cherry picking: we never got mainstream supersonic passenger aircraft after the Concorde because aerospace technology hasn't advanced far enough to make it economically viable, and the slowdown in progress and massively increasing costs in semiconductors for cutting-edge processes are very well known.
It is absolutely the case that virtual reality technology will only get better over time. Maybe it'll take 5, or 10, or 20, or 40 years, but it's almost a certainty that we'll eventually see better AR/VR tech in the future than we have in the past.
Would you bet against that? You'd be crazy to imo.
Whether what they're using in 20 years is produced by the company formerly known as Facebook or not is a whole different question.
Spaghetti code is still spaghetti code. Something that should be a small change ends up touching multiple parts of the codebase. Not only does this increase costs, it just compounds the next time you need to change this feature.
I don't see why this would be a reality that anyone wants. Why would you want an agent going in circles, burning money and eventually finding the answer, if simpler code could get it there faster and cheaper?
Maybe one day it'll change. Maybe there will be a new AI technology which shakes up the whole way we do it. But if the architecture of LLMs stays as it is, I don't see why you wouldn't want to make efficient use of the context window.
I said that (a) apps are getting simpler and smaller in scope and so their code quality matters less, and (b) AI is getting better at writing good code.
Think about what happened to writing when we went from scribes to the printing press, and from the printing press to the web. Books and essays didn't get bigger. We just got more people writing.
> However, code quality is becoming less and less relevant in the age of AI coding, and to ignore that is to have our heads stuck in the sand. Just because we don't like it doesn't mean it's not true.
> [...]
> We are increasingly moving toward a world where people who aren't sophisticated programmers are "building" their own apps with a user base of just one person. In many cases, these apps are simple and effective and come without the bloat that larger software suites have subjected users to for years. The code is simple, and even when it's not, nobody will ever have to maintain it, so it doesn't matter. Some apps will be unreliable, some will get hacked, some will be slow and inefficient, and it won't matter. This trend will continue to grow.
I do agree with the fact that more and more people are going to take advantage of agentic coding to write their own tools/apps to make their life easier.
And I genuinely see it as a good thing: computers were always supposed to make our lives easier. But I don't see how it can be used as an argument for "code quality is becoming less and less relevant".
If AI is producing 10 times more lines than are necessary to achieve the goal, that's more resources used. With the prices of RAM and SSD skyrocketing, I don't see it as a positive for regular users. If they need to buy a new computer to run their vibecoded app, are they really reaping the benefits?
But what's more concerning to me is: where do we draw the line?
Let's say it's fine to have a garbage vibecoded app running only on its "creator" computer. Even if it gobbles gigabytes of RAM and is absolutely not secured. Good.
But then, if "code quality is becoming less and less relevant", does this also apply to public/professional apps?
In our modern societies we HAVE to use dozens of software everyday, whether we want it or not, whether we actually directly interact with them or not.
Are you okay with your power company cutting power because their vibecoded monitoring software mistakenly thought you didn't pay your bills?
Are you okay with an autonomous car driving over your kid because its vibecoded software didn't see them?
Are you okay with cops coming to your door at 5AM because a vibecoded tool reported you as a terrorist?
Personally, I'm not.
People can produce all the trash they want on their own hardware. But I don't want my life to be ruled by software that was not given the quality controls it should have had.
I mean, I agree, but you could say this at any point in time throughout history. An engineer from the 1960s could scoff at the web and the explosion in the number of programs and the decline in efficiency of the average program.
An artist from the 1700s would scoff at the lack of training and precision of the average artist/designer from today, because the explosion in numbers has certainly translated to a decline in the average quality of art.
A film producer from the 1940s would scoff at the lack of quality of the average YouTuber's videography skills. But we still have millions of YouTubers and they're racking up trillions of views.
Etc.
To me, the chief lesson is that when we democratize technology and put it in the hands of more people, the tradeoff in quality is something that society is ready to accept. Whether this is depressing (bc less quality) or empowering (bc more people) is a matter of perspective.
We're entering a world where FAR more people will be able to casually create and edit the software they want to see. It's going to be a messier world for sure. And that bothers us as engineers. But just because something bothers us doesn't mean it bothers the rest of the world.
> But then, if "code quality is becoming less and less relevant", does this also apply to public/professional apps?
No, I think these will always have a higher bar for reliability and security. But even in our pre-vibe coded era, how many massive brandname companies have had outages and hacks and shitty UIs? Our tolerance for these things is quite high.
Of course the bigger more visible and important applications will be the slowest to adopt risky tech and will have more guardrails up. That's a good thing.
But it's still just a matter of time, especially as the tools improve and get better at writing code that's less wasteful, more secure, etc. And as our skills improve, and we get better at using AI.
I'm curious about software that's actively used but that nobody maintains. If it's a personal anecdote, that's fine as well.
It's the opposite, code quality is becoming more and more relevant. Before now you could only neglect quality for so long before the time to implement any change became so long as to completely stall out a project.
That's still true, the only thing AI has changed is it's let you charge further and further into technical debt before you see the problems. But now instead of the problems being a gradual ramp up it's a cliff, the moment you hit the point where the current crop of models can't operate on it effectively any more you're completely lost.
> We are in the very earliest months of AI actually being somewhat competent at this. It's unlikely that it will plateau and stop improving.
We hit the plateau on model improvement a few years back. We've only continued to see any improvement at all because of the exponential increase of money poured into it.
> It's only trending in one direction. And it isn't going to stop.
Sure it can. When the bubble pops there will be a question: is using an agent cost effective? Even if you think it is at $200/month/user, we'll see how that holds up once the cost skyrockets after OpenAI and Anthropic run out of money to burn and their investors want some returns.
Think about it this way: If your job survived the popularity of offshoring to engineers paid 10% of your salary, why would AI tooling kill it?
What you're missing is that fewer and fewer projects are going to need a ton of technical depth.
I have friends who'd never written a line of code in their lives who now use multiple simple vibe-coded apps at work daily.
> We hit the plateau on model improvement a few years back. We've only continued to see any improvement at all because of the exponential increase of money poured into it.
The genie is out of the bottle. Humanity is not going to stop pouring more and more money into AI.
> Sure it can. When the bubble pops there will be a question: is using an agent cost effective? Even if you think it is at $200/month/user, we'll see how that holds up once the cost skyrockets after OpenAI and Anthropic run out of money to burn and their investors want some returns.
The AI bubble isn't going to pop. This is like saying the internet bubble is going to pop in 1999. Maybe you will be right about short term economic trends, but the underlying technology is here to stay and will only trend in one direction: better, cheaper, faster, more available, more widely adopted, etc.
Again it's the opposite. A landscape of vibe coded micro apps is a landscape of buggy, vulnerable points of failure. When you buy a product, software or hardware, you do more than buy the functionality: you buy the assurance it will work. AI does not change this. Vibe code an app to automate your lightbulbs all you like, but nobody is going to be paying millions of dollars a year for vibe coded slop apps, and sales like that are what keep the tech industry afloat.
> Humanity is not going to stop pouring more and more money into AI.
There's no more money to pour into it. Even if you did, we're out of GPU capacity and we're running low on the power and infrastructure to run these giant data centres, and it takes decades to bring new fabs or power plants online. It is physically impossible to continue this level of growth in AI investment. Every company that's invested into AI has done so on the promise of increased improvement, but the moment that stops being true everything shifts.
> The AI bubble isn't going to pop. This is like saying the internet bubble is going to pop in 1999.
The internet bubble did pop. What happened after is an assessment of how much the tech is actually worth, and the future we have now 26 years later bears little resemblance to the hype in 1999. What makes you think this will be different?
Once the hype fades, the long-term unsuitability for large projects becomes obvious, and token costs increase by ten or one hundred times, are businesses really going to pay thousands of dollars a month on agent subscriptions to vibe code little apps here and there?
This is what everyone says when technology democratizes something that was previously reserved for a small number of experts.
When the printing press was invented, scribes complained that it would lead to a flood of poorly written, untrustworthy information. And you know what? It did. And nobody cares.
When the web was new, the news media complained about the same thing. A landscape of poorly researched error-ridden microblogs with spelling mistakes and inaccurate information. And you know what? They were right. That's exactly what the internet led to. And now that's the world we live in, and 90% of those news media companies are dead or irrelevant.
And here you are continuing the tradition of discussing a new landscape of buggy, vulnerable products. And the same thing will happen and already is happening. People don't care. When you democratize technology and you give people the ability to do something useful they never could do before without having to spend years becoming an expert, they do it en masse, and they accept the tradeoffs. This has happened time and time again.
> The internet bubble did pop... the future we have now 26 years later bears little resemblance to the hype in 1999. What makes you think this will be different?
You cut out the part where I said it only popped economically, but the technology continued to improve. And the situation we have now is even better than the hype in 1999:
They predicted video on demand over the internet. They predicted the expansion of broadband. They predicted the dominance of e-commerce. They predicted incumbents being disrupted. All of this happened. Look at the most valuable companies on earth right now.
If anything, their predictions were understated. They didn't predict mobile, or social media. They thought that people would never trust SaaS because it's insecure. They didn't predict Netflix dominating Hollywood. The internet ate MORE than they thought it would.
Ok, so another fundamental proposition is monetary resources are needed to fund said technology improvement.
What's wrong with LLMs? They require immense monetary resources.
Is that a problem for now? No because lots of private money is flowing in and Google et al have the blessing of their shareholders to pump up the amount of cash flows going into LLM based projects.
Could all this stop? Absolutely, many are already fearing the returns will not come. What happens then? No more huge technology leaps.
What part of renting your ability to do your job is "democratizing"? The current state of AI is the literal opposite. Same for local models that require thousands of dollars of GPUs to run.
Over the past 20 years software engineering has become something that just about anyone can do with little more than a shitty laptop, the time and effort, and an internet connection. How is a world where that ability is rented out to only those that can pay "democratic"?
> When the printing press was invented, scribes complained that it would lead to a flood of poorly written, untrustworthy information. And you know what? It did. And nobody cares.
A bad book is just a bad book. If a novel is $10 at the airport and it's complete garbage then I'm out $10 and a couple of hours. As you say, who cares. A bad vibe coded app and you've leaked your email inbox and bank account and you're out way more than $10. The risk profile from AI is way higher.
Same is even more true for businesses. The cost of a cyberattack or an outage is measured in the millions of dollars. It's simple maths: the cost of the risk of compromise far outweighs the savings from cheaper upfront software.
> You cut out the part where I said it only popped economically, but the technology continued to improve.
The improvement in AI models requires billions of dollars a year in hardware, infrastructure, and energy. Do you think that investors will continue to pour that level of investment into improving AI models for a payout that might only come ten to fifteen years down the road? Once the economic bubble pops, the models we have are the end of the road.
If it generates the slop version in a week but it takes me 3 more weeks to clean it up, could I have just done it right the first time myself in 4 weeks instead? How much money have I wasted in tokens?
In both cases, you feel super productive all the time, because you are constantly putting in instructions and getting massive amounts of output, and this feels like constant & fast progress. It's scary how easy it is to waste time on LLMs while not even realizing you are wasting time.
Soooooo....
As one who hasn't taken the plunge yet -- I'm basically retired, but have a couple of projects I might want to use AI for -- "time" is not always fungible with, or a good proxy for, either "effort" or "motivation"
> How much money have I wasted in tokens?
This, of course, may be a legitimate concern.
> If it generates the slop version in a week but it takes me 3 more weeks to clean it up, could I have just done it right the first time myself in 4 weeks instead?
This likewise may be a legitimate concern, but sometimes the motivation for cleaning up a basically working piece of code is easier to find than the motivation for staring at a blank screen and trying to write that first function.
Cleaning up agent slop code by hand is also a miserable experience and makes me hate my job. I do it already at $DAYJOB because my boss thinks “investing” in third worlders for pennies on the dollar and just giving them a Claude subscription will be better than investing in technical excellence and leadership. The ROI on this strategy is questionable at best, at least at my current job. Code review by humans is still the bottleneck, and delivering proper working features has not accelerated because they require much more iteration because of slop.
Would much rather spend the time making my own artisanal tradslop instead if it’s gonna take me the same amount of time anyway - at least it’s more enjoyable.
I completely agree that this is the case right now, but I do wonder how long it will remain the case.
The AIs are more than capable of producing a mountain of docs from which to rebuild, sanely. They’re really not that capable, without a lot of human pain, of making a shit codebase good.
I often see criticism towards projects that are AI-driven that assumes that codebase is crystalized in time, when in fact humans can keep iterating with AI on it until it is better. We don't expect an AI-less project to be perfect in 0.1.0, so why expect that from AI? I know the answer is that the marketing and Twitter/LinkedIn slop makes those claims, but it's more useful to see past the hype and investigate how to use these tools, which are invariably here to stay.
That's a big leap of faith and... kinda contradicts the article as I understood it.
My experience is entirely opposite (and matches my understanding of the article): vibing from the start makes you take orders of magnitude more time to perfect. AI is a multiplier as an assistant, but a divisor as an engineer.
1. Autocomplete. Pretty simple; you only accept auto-completes you actually want, as you manually write code.
2. Software engineering design and implementation workflow. The AI makes a plan, with tasks. It commits those plans to files. It starts sub-agents to tackle the tasks. The subagents create tests to validate the code, then write code to pass the tests. The subagents finish their tasks, and the AI agent does a review of the work to see if it's accurate. Multiple passes find more bugs and fix them in a loop, until there is nothing left to fix.
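The control flow of the second workflow can be sketched in a few lines. Everything here is a stand-in: `Task`, `run_workflow`, and the `implement` / `review` / `fix` callables are hypothetical placeholders for agent calls, not any real tool's API. The `max_passes` cap is my own assumption; the description above just loops until nothing is left to fix.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Task:
    description: str
    done: bool = False


def run_workflow(
    tasks: List[Task],
    implement: Callable[[Task], None],
    review: Callable[[List[Task]], List[str]],
    fix: Callable[[str], None],
    max_passes: int = 5,
) -> int:
    """Implement every task, then review and fix until a pass finds nothing.

    Returns the number of review passes used. All callables are
    hypothetical stand-ins for sub-agent calls.
    """
    for task in tasks:
        implement(task)  # e.g. a sub-agent writes tests, then code
        task.done = True

    for pass_number in range(1, max_passes + 1):
        findings = review(tasks)
        if not findings:
            return pass_number  # a clean pass ends the loop
        for finding in findings:
            fix(finding)
    raise RuntimeError("review loop did not converge")
```

With a stub reviewer that reports two open bugs and a stub fixer that removes them, the first pass finds both and the second pass comes back clean, so the function returns 2.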
I'm amazed that nobody thinks the latter is a real thing that works, when Claude fucking Code has been produced this way for like 6 months. There's tens of thousands of people using this completely vibe-coded software. It's not a hoax.
also Claude Code is notoriously poorly built, so I wouldn't tout it as SOTA
And people can look at the results (illegally) because that whole bunch of code has been leaked. Let's just say it's not looking good. These are the folks who actually made and trained Claude to begin with, they know the model more than anyone else, and the code is still absolute garbage tier by sensible human-written code quality standards.
There is something at this point kind of surreal in the fact that you know everyday there will be this exact blog post and these exact comments.
Like, it's been literal years and years and y'all are still talking about the thing that's supposed to do other things. What are we even doing anymore? Is this dead internet? It boggles the mind we are still at this level of discourse frankly.
Love 'em, hate 'em, I don't care, y'all need to freaking get a grip! Like for the love of god read a book, paint a picture! Do something else! This blog is just a journey to snooze town and we all must at some level know that. This feels like literal brain virus.