Developers should write their own code and use LLMs to design and verify. The result: better, faster architecture and planning, pre-cleaned PRs, and no skill atrophy or loss of understanding on the developer's part.
I come in knowing what I need to build and with at least one idea of how it should be done. I present the problem, constraints, and potential solutions, and ask for criticisms and alternatives. I can keep it as broad as possible or get more granular: struct layouts, API endpoints, etc. I go back and forth until there's an approach I prefer, and then I code that approach.
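Roughly, the prompt I start from has this shape (a minimal sketch in Python; the template and field names are my own convention, not any tool's API):

    # Skeleton of the design-discussion prompt described above.
    # Purely illustrative; the "..." placeholders are whatever detail you want.
    DESIGN_PROMPT = """\
    Problem: {problem}
    Constraints: {constraints}
    Approaches I'm considering:
    {candidate_solutions}

    Criticize each approach, point out failure modes I've missed,
    and propose alternatives. Be specific; don't just agree with me.
    """

    prompt = DESIGN_PROMPT.format(
        problem="...",
        constraints="...",
        candidate_solutions="1. ...\n2. ...",
    )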
> it can code pretty well given a very tight and limited scope.
It's wildly better at tight and limited scope than large scale changes but even then I would rather code it myself.
One thing I would like to see is the use of LLMs for smarter semi-manual editing.
While programming I often need to make very similar changes in several places. If the instances are similar enough I can get away with recording a one-off keyboard macro and replaying it, but if the differences are too tricky to handle that way, I end up doing a lot of manual editing.
It would be nice to see LLMs tightly integrated into the editor so I can do a simple "place the cursor at things like this" based on an example or two. I'm sure more ideas are possible for using LLMs to more quickly perform the semantic changes you intend, instead of just prompting for a big diff. I feel there's a lot more innovation possible in this direction, where you're still "coding it yourself", just faster.
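To make the idea concrete, here's a minimal sketch (assuming a hypothetical ask_llm() helper; no real editor exposes an API like this yet, and the prompt wording is just illustrative):

    import json

    def ask_llm(prompt: str) -> str:
        # Hypothetical: wire this to whatever model/provider you use.
        raise NotImplementedError

    def find_similar_sites(buffer: str, example: str) -> list[int]:
        """Given one example edit site, ask the model for the line numbers
        of all similar sites so the editor can drop a cursor on each."""
        reply = ask_llm(
            "Source buffer:\n" + buffer
            + "\n\nOne example of the kind of site I want to edit:\n" + example
            + "\n\nReply with only a JSON array of matching line numbers."
        )
        return json.loads(reply)

The human still makes the actual edits; the model only does the "multi-cursor by meaning" part.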
What I did was make one commit by hand (involving multiple files), and then told Codex (last year's Codex!) to make the equivalent changes to other instances in the code base.
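The same trick works outside any particular tool. A rough sketch using the OpenAI Python client (the model name and prompt wording are my own assumptions, not what Codex did):

    import subprocess
    from openai import OpenAI

    def replicate_commit(target_file: str) -> str:
        """Show the model a hand-made commit and ask for the equivalent
        change in another file. Returns a diff to review, not apply blindly."""
        example = subprocess.run(
            ["git", "show", "HEAD"], capture_output=True, text=True, check=True
        ).stdout
        with open(target_file) as f:
            target = f.read()
        client = OpenAI()  # reads OPENAI_API_KEY from the environment
        resp = client.chat.completions.create(
            model="gpt-4o",  # assumption: any capable model works here
            messages=[{
                "role": "user",
                "content": "I made this change by hand:\n" + example
                + "\n\nMake the equivalent change to this file and reply "
                + "with a unified diff only:\n" + target,
            }],
        )
        return resp.choices[0].message.content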
Never understood that argument, because there are two steps in design: finding a good solution (discussing prior art, tradeoffs, …) and then nailing the technical side of that solution (data structures, formulas, …). Is it the former, the latter, or both?
It does not actually, and not any faster.
Again, I've lost count of how many times I've had an in-depth architectural discussion with ChatGPT, with it giving my approach the final mark of approval ("This is excellent"), only to discover a flaw in it, or a radically simpler and better approach, bring that back to it, and have it proclaim "Yeah, this is a much better approach".
These LLMs are in many cases sycophantic confirmation machines. Yes, they are useful to some extent in helping you refine your ideas and think of edge cases. But they are nowhere close to actually thinking better and faster. Faster in the wrong direction is not just slow, you are actually going backward.
A paradigm shift is an earth-shattering, very important change - a complete change in thinking, etc. LLMs are not that. They are simply some pretty new tools. Nice tools, but they will whip off your metaphorical thumb just as quickly as a misused table saw.
You'll note that you mention "engineers are offloading": that's not a paradigm shift. That's a bunch of engineers discovering a better slide rule.
I'm old enough to remember moving on from slide rules (I still have mine) through calculators (ditto) to using fag packets and napkins for their real intended purpose.
The drill-driver also took engineering by storm but no-one ever used the term paradigm shift (to be fair, I don't think it was invented at the time and I can't be arsed to look it up).
If this sounds melodramatic it’s likely that it hasn’t fully taken root where you are yet.
I see opinions split between “it’s just a dirty, untrustworthy tool that is making our lives and the world a living hell” and “this is the second coming of Christ”. The reality is that right now we lie toward the first end of that spectrum, but I am looking over the hill and seeing four horses, and they are stampeding this way.
If she says “I’m sorry, I don’t know how to ‘are you still just a script’” then I have my answer. :P
LLMs are remarkable these days, but they’re still missing some essential insight. I’m far less confident now, though, that this will require another big breakthrough rather than just a combination of tweaks.
Engage as a person, please.
And it's literally just a black box that generates more JavaScript for their Next.js app.
Generally, the whole point of the "Power to the people?" (and to some extent the "On being left behind") section(s) is to underscore the two antithetical claims made by many LLM marketers: 1. LLMs are so powerful and so natural and easy that someone with no experience can create amazing software, and 2. LLM usage is a core skill, one that if you don't begin training now you'll be left behind.
Obviously, both of these can't be simultaneously 100% true--either it's easy enough for the non-programming layperson to successfully generate software for an intentional purpose, or, LLM assisted programming is a skill you need to train to avoid professional obsolescence in modern society. So, the article disagrees with the majority of both claims, and accepts a weakened/minor portion of each: 1. LLM output is easy to generate but accurate prompting matters, and 2. when used for software development professionally, some amount of skilled human intervention does indeed seem necessary. And now these two claims do align.
However, if professional software engineers who work with and read code constantly, armed with the best software practices to aid LLMs we can determine, cannot use modern AI tools without shooting their feet off at relatively frequent rates, certainly you'd expect the layperson who must put an even greater amount of undue faith in the validity of the results to be at extremely high-risk of foot-shooting. It's not "gatekeeping" to forewarn people against unwarranted trust in LLM output, nor is it "gatekeeping" to suggest that modern tech communicators/marketers describing an overly flowery LLM tooling landscape might be doing people a disservice.
I've had this conversation with managers in multiple organizations this year: "Yes, you could totally vibe code that instead of paying for a SaaS. But you have strict contractual and professional obligations about data security. Do you want to be deposed and asked, 'So, did you really just vibe code the system that led to the data leak? Did the vibe coders have any professional qualifications? Did they even look at the code?'"
Similarly, a backend server that handles 8 million users a day is expected to stay up.
Now, there are 10,000 things that have less demanding requirements. I'm actually really delighted that people are able to vibe code their own tools with minimal knowledge of software engineering! We have been chronically underproducing niche software all along.
But if your software already has on-call shifts (and SLAs, etc) like the GP, then I think you want to be smart about how you combine human expertise with LLMs.
It feels like a dunk to write that. But I genuinely do think there's so much motivated reasoning on both sides of this issue, and one signal of that is when people tip their hands like this.
I was going to argue that companies got to choose their own auditors, so of course there were some bad ones out there. But looking at the market, it seems like (1) the race to the bottom has gotten ridiculous, and (2) the insurance companies do not currently trust the auditors in any meaningful way. So, yeah, point to you.
Once upon a time, I went through SOC2 audits where the auditors asked lots of questions about Vault and really tried to understand how credentials got handled. Sure, that was exceptional even at the time.
But that still leaves a whole pile of other audits and regulatory frameworks I need to comply with. Probably most of these frameworks will eventually accept "The code was written by an LLM and reviewed by an actual programmer." I am less certain that you'll be able to get away with vibe coding regulated systems any time soon.
My thing here is: you want to summon some kind of deus ex machina reason why the unpredictability (say) of agent-generated software will fail in the real world, but the concrete one you came up with fails to make that argument, pretty abruptly. Which makes me think the argument is less about the world as it is and more about the world as you'd hope it would be, if that makes sense.
Would you have the same reaction to requiring an approval for a production deployment? That’s driving the development process.
---
Also jfc I need to cool it with the buzzwords, sorry I just got home from “talk like this all day” $job
It does? You mean "it tests itself faster", which is not really a test now, is it?
Funny, I thought that the major hurdle is improving accuracy and reliability, as it's always been. Engineering is necessary and useful, but it's a much simpler problem, which is why everyone is jumping on it.
I'm sure it was very difficult to program in machine code, but if now (or soon) anyone can just write software using an LLM without any sort of learning, that changes everything. LLMs can plan and create something usable from simple instructions or ideas, and they will only get better.
I think LLMs will be (and already are) useful for many more things than programming anyway.
Did you read the section "Power to the People?" ? In it, the author dismantles your thesis with powerful, highly plausible arguments.
1. You don't have to be an LLM expert to get good, consistent results with LLMs.
My best vibe-code process after years of using LLMs is to have Claude Code create a plan file, cycle it through Codex until Codex finds nothing more to review, and then have an agent implement it (a rough sketch of the loop follows point 4 below). This process is trivial yet produces amazing results.
It's solved by better and better harnesses.
2. You don't have to write technical specs. The LLM does that for you. You just tell it "I want the next-tab button to wrap back to the first one" and it generates a technical plan. Natural language is fine.
3. Software that seems to work only to fail down the line in production is already how software works today. With LLMs you can paste the stacktrace or user bug email and it will fix it.
This is why vibe-coding works. Instead of simulating in your head how an app will run by looking at its code, you run the app and tell the LLM what isn't working correctly. The app spec is derived iteratively through a UX feedback loop.
4. I don't understand TFA's goalposts, but letting people who are only interested in the LLM process (rather than the software craftsmanship) create software would be a huge democratization of software.
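To make point 1 concrete, the plan/review cycle described above is essentially this loop (a sketch assuming hypothetical run_claude()/run_codex() wrappers around whichever agent CLIs or APIs you actually use):

    def run_claude(prompt: str) -> str:
        # Hypothetical wrapper around your Claude Code invocation.
        raise NotImplementedError

    def run_codex(prompt: str) -> str:
        # Hypothetical wrapper around your Codex invocation.
        raise NotImplementedError

    def refine_plan(task: str, max_rounds: int = 5) -> str:
        """Cycle a plan between a writer and a reviewer until the reviewer
        has nothing left to flag, then hand the plan to an implementer."""
        plan = run_claude("Write an implementation plan for: " + task)
        for _ in range(max_rounds):
            review = run_codex(
                "Review this plan. Reply APPROVED if there is nothing to fix:\n" + plan
            )
            if "APPROVED" in review:
                break
            plan = run_claude(
                "Revise the plan to address this review:\n" + review
                + "\n\nPlan:\n" + plan
            )
        return plan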
> 1. You don't have to be an LLM expert to get good, consistent results with LLMs.
You don't get good, consistent results with LLMs, expert or not.
> 2. You don't have to write technical specs. The LLM does that for you. You just tell it "I want the next-tab button to wrap back to the first one" and it generates a technical plan. Natural language is fine.
Try this: have Claude write a section in your specs titled "Performance Optimizations" and see the gibberish it comes up with: fluffy lists with no actually useful content specific to the project. This is a severe problem with LLM-driven speccing that I have encountered countless times. I now rarely allow them to touch the specs document.
> 3. Software that seems to work only to fail down the line in production is already how software works today. With LLMs you can paste the stacktrace or user bug email and it will fix it.
And pretty soon you have a big ball of mud. But I guess if the rate of bugs accelerates, the LLMs can also "fix" them faster.
> This is why vibe-coding works. Instead of simulating in your head how an app will run by looking at its code, you run the app and tell the LLM what isn't working correctly. The app spec is derived iteratively through a UX feedback loop.
I should tell you about the markdown viewer with the specific features I want, which I have tried to build purely through LLM vibe-coding, and how none of the models have been able to do it.
Would there even be a debate in the tech community if such unassailable arguments existed? The author is entirely entitled to his opinion, just as I am allowed to disagree with him (not sure why I am also downvoted). The good thing is, if I'm right, we will see it in less than 10 years.
I don't buy that's true. The "only" part, anyway. Look at how UX with software has evolved. This is gonna be an old-man-yells-at-clouds take, but before smartphones, there were hotkeys. And man, you could fly with those things. The computers running things weren't as fast as they are today, but you could mash in a whole sequence through muscle memory and just wait for it to complete. Now you have to poke at your phone, wait for it to respond, poke at it some more. It's really not great for getting fast at it. AI advancement is going to be like that. Directionally it will generally be better, but there's going to be some niche where, y'know what, ChatGPT-4o really had it in a way that 5.5 does not. (Rose-colored glasses not included.)
Then came the new Claude update, which many people say is worse. Even Anthropic says it got worse.[1] HN discussion back on April 15th: [2]
Some of this is a pricing issue. Turning "default reasoning effort" down from "high" to "medium" was a form of shrinkflation. Maybe this technology is hitting a price/performance wall.
[1] https://www.anthropic.com/engineering/april-23-postmortem