I am a very AI-forward person, but hallucinations are becoming more pernicious than ever even as they get less frequent, especially if the code actually works. A human absolutely has to guide these processes at a macro level for sustainability for SaaS as it evolves with business needs.
Maybe for one and done systems with no maintenance/no updates/no security patches you can reduce humans to SDETs, but systems like that are more the exception than the norm.
At least with concurrent and distributed systems stuff (which is really all I know nowadays), it is great at getting a prototype, but the code is generally mediocre-at-best and pretty sub-optimal. I don't know if it's because it is trained on a lot of mediocre and/or buggy code but for concurrency-heavy stuff I've been having to rewrite a lot of it myself.
I think that AI is great for getting a rough POC, and admittedly often a rough POC is good enough for a project (and a lot of projects never get beyond a rough POC), but I think software engineers will be needed for stuff that needs to be more polished.
I'm pretty sure it still saves me time, and if nothing else it's an excuse to write TLA+, and that's fun.
I had this same discussion at work the other day. I had an 80k line generated project dropped on my plate. It doesn’t use anything built into the web framework or orm. It’s a maintenance nightmare.
Example: I got Claude to generate a language server for TLA+ so I could have nice integration with Neovim. It took like 45 minutes of arguing with Claude and then it worked fine. This is incredibly low-stakes stuff: realistically the worst case scenario is that the text in the file gets screwed up, and I'm somewhat protected by Git if that happens.
That said, I am a little concerned how cavalier people have been deploying AI code everywhere. I don't want pacemaker firmware to be written by some intern in an afternoon with Claude.
Even still, other professions interact with the real social world which is not necessarily the case with programming. A lawyer will always be needed because judgments are and must be made by humans only. Software on the other hand can be built and tested in its own loop, especially now with human readable specifications. For example, I wanted to build an app and told Claude and it planned out the features, which I reviewed and accepted, then it built, wrote tests, used MCPs including the browser for interacting with the UI and taking screenshots of it, finding any bugs and regressions, and so on until an hour later it came back with the full app. Such a loop is not possible in other professions.
It's when you have to iterate to handle changing business needs and scale issues, and integrate with other systems, that the entropy becomes a scary concern over a long enough timeline.
And it's not just "checking" - it's wholesale rejections of code, reframing prompts to target specific classes or approaches, etc... I don't think you will take the human out of planning any time soon.
Honestly, I believe lower court judges will be the first job in the legal industry to become fully automated.
PMs turning their brain off and letting the LLMs extrapolate from quick and dirty bashing of text into a template (or, PMs throwing customer feedback at a slackbot to generate a jira ticket from it) can be better than PMs doing nothing but passing ill-defined reqs directly into the ticket, but that's a low bar. And it doesn't by itself solve the problem of the details that got generated for this ticket subtly conflicting with the details that got generated for (and implemented in) a different ticket 8 months ago.
The bottleneck is understanding, never "code generation."
Below is an axiom which has served me well over the years. Perhaps it will for you as well.
When making software, remember that it is a snapshot of
your understanding of the problem. It states to all,
including your future-self, your approach, clarity, and
appropriateness of the solution for the problem at hand.
Choose your statements wisely.
I'm guessing they've tried (or been induced to try by upper management), but given up because they don't know how to debug any problems that arise due to the LLM working itself into a corner.
Coding-agent LLMs act a lot like junior devs. And junior devs are: eager to write code before gathering requirements; often reaching for dumb brute-force solutions that require more work from them and are more error-prone, rather than embracing laziness/automation; getting confused and then "spinning their wheels" trying things that clearly won't work instead of asking for help; not recognizing when they've created an X-Y problem, and have then solved for their Y but not actually solved for the original problem X; etc.
The way you compensate for those inexperience-driven flaws in junior devs' approach, is to have them paired with, or fast-iteration-code-reviewed by, senior devs.
Insofar as a PM has development experience, it's usually only to the level of being a "junior dev" themselves. But to compensate for LLMs-as-junior-devs, they really need senior-dev levels of experience.
The good PMs know all of this, and so they're generally wary of taking responsibility for driving the actual coding-agent development process on all but the most trivial change requests. A large part of a PM's job is understanding task assignment / delegation based on comparative advantage; and from their perspective, it's obvious that wielding LLMs in solution-space (as opposed to problem-space, as they do) is something still best left to the engineers trained to navigate solution-space.
Halfway there feels way overblown, and only seems to further devalue the work that devs do. Having clearly written requirements would be fantastic, and even as someone less pro-AI I can see great utility for it here, but it's not halfway there to implementation. Not even 25% in all honesty, since edge cases and unforeseen consequences can cause changes to the spec midway through development.
Super glad to have gotten out when I did...
Judging by every PM I’ve worked with, 0% chance of this happening. Much sooner would see SWEs making PMs redundant than the other way around. Unless of course you want a system that falls apart like a house of cards as soon as you get a single user for your vaporware.
And then someone copy pastes it into Claude and now those inaccuracies become part of the code and tests.
It's the equivalent of writer's block, which is why common advice given to writers is to put anything they can onto the page and then edit it later.
The PM has historically often not had a detailed enough mental model of the implementation to spot the hard parts in advance or a detailed enough mental model of the customer desires to know if it's gonna be the right thing or not.
Those are the things that killed waterfall.
You can use LLM tools to help you improve both those areas. Synthesizing large amounts of text and looking for inconsistencies.
But the 80th-percentile-or-lower person who was already not working hard to try to get ahead of those things still isn't going to work any harder than the next person and so won't gain much of a real edge.
Normally waterfall works where the scope is extremely well defined and articulated in design plans, which shortens dev time because, prior to AI, code was mostly deterministic. Here we have to do waterfall-level documentation while iterating on a non-deterministic solution (code gen) to non-deterministic requirements (per usual).
It's bonkers.
I still think the technology is cool though.
And to answer the questioner.. Have you worked with a PM? Most of the ones I've worked with try to be simultaneously in charge yet not responsible for anything. Validating something implies skill and responsibility.
Nobody "deserves" anything. They do have the jobs though. Thinking that the world isn't full of people doing what they need to do to get by who don't give a shit about fitting a fantasy ideal is wild.
We see it with code too right? It’s harder to review code than to write it.
On top of that the LLM can work so fast that the amount of things that need validating grows!
This is where humans get lazy and the problems come in IMO. Whether it's a PM not validating their ticket, or a dev doing a bad code review.
Add on to that that the incentives currently are to move fast and trust the AI.
It becomes clear to me that a lot of that review work either won’t be done at all, or won’t be nearly thorough enough.
Reviewing code is harder than reviewing text because code does something and has interdependencies, and therefore must be correct in its function; do not mix the two. This is like saying that editing an article or novel is harder than actually writing it, which is blatantly incorrect.
Hahahahahaha. Sorry, I couldn't help myself; this reads like satire. The answer is "real life experience says otherwise".
I feel compelled to point out to you that this is a completely unsustainable, unsupportable, unsubstantiable claim. You have met ~0% of PMs, and of the ones you've met maybe you've experienced a non-zero percentage of their work, but statistically that's also very unlikely.
If you think you can say what most PMs do or what PMs are likely to do, then, I'm sorry, but you are not even thinking like an engineer. You're thinking, actually, a lot more like a PM to many of us.
> just like good devs
I'm so sorry, my sides just can't handle the starry-eyed nature of these takes. This is just too much for me.
To many of us this reads like you've never met people before. But who knows, maybe you live in Lake Wobegon, where all the women are strong, all the men are good-looking, and all the children are above average! If so then we're jealous, but you still should be more careful about how unrigorous your mental model is because it will make you a worse engineer.
Experience with different PMs and developers aside, the older you get in the profession the more you will hopefully realize that none of your quality effort fantasy matters. Sales happen and money rolls in independently of whether you think the PMs or the people who call themselves engineers do a "good job". Businesses thrive on sales and marketing, not engineering.
As to your latter point, not sure why you think I think business doesn't continue on even with bad employees, of course it does and I didn't say otherwise. But that does not mean they're doing a good job, those two are orthogonal concepts.
And I'm not sure how we even got to this, the original point was that I personally as a dev can physically see PM productivity increasing with AI, even as other devs in this thread seem not to. For a competent PM, a tool that automates a detailed first draft fundamentally changes the psychology of ticket creation. If your argument is just "bad PMs will still be bad," then sure, I agree, but that doesn't really engage with how the tooling changes the workflow for everyone else.
Uh. We're not talking about knowing what good is, which is completely irrelevant to anything in this thread. You made a claim without qualification about what it is more likely for PMs to do. I can't tell if you've lost the chain or are engaging in some kind of motte and bailey fallacy. Either way it's a bad sign for this conversation.
I'm going to summarize the threads so far. I hope it highlights why what you've said sounds so silly:
Someone: "I see X failing to do Y."
You: "X definitely do Y. Why would you think that X aren't doing Y? Doing Y is the obvious thing for X to do."
Someone: "I literally am seeing it happen right now."
You: "Well then those X are bad."
Someone: "Yeah, no shit. They just said as much."
You: "But most X would do Y."
Someone: "In my experience that is false."
Someone else: "Mine too."
Someone else: "Mine as well."
Someone else: "Same."
You: "The bad ones shouldn't have their jobs."
Someone: "They do though."
You: "But we can tell which ones are the bad ones."
Someone: "Bartender, another drink please."
If your technology relies on humans using it in ways that go against the ways they are inclined to use it, then that is an issue with the technology.
Are advanced calculators bad because a student could use the CAS to ace calculus homework, exams or the SAT without actually learning the material?
Is copy/paste bad because a person could use it to copy/paste code from one place to another without noticing some of the areas they need to update in the new location, adding bugs and missing a chance to learn some more subtleties of the system?
Is Git bad because a manager could use it to just measure performance by number of lines of code committed instead of doing more work to actually understand everyone's performance?
Many tools can be used lazily in ways that will directly work against a long term goal of improving knowledge and productivity.
ok, so for some of the jobs we're doing plausible sounding goo is just fine. and that's kinda sad. but the 'just playing around' case is fine for PSG, this isn't a serious effort but just seeing how things might work out without much effort.
taking the remainder, where understanding and intent are important, the role of the ai is to produce PSG, but the intentional person now goes through everything and plucks out all the nonsense. this may take more or less time than simply writing it, but we should understand this is resulting in less real engagement by the ultimate author. where this is actually interesting is a parallel to Burroughs's cut-up method - where source text and audio were randomly scrambled and sometimes really clever and novel stuff pops out.
but whether the current model of vibe coding has much to offer in the second case is really quite unclear. the extent to which coding is the production of boilerplate is really a problem with APIs and abstraction design. if we can get LLMs to mitigate some of that in the short term without causing too much distraction, that's fine, but we should really be using that to inform the solution to the fundamental problem.
so for me what's missing in your model is how LLMs are supposed to be used 'properly'. I don't think laziness is really the right cut here, make-work is make-work, and there's plenty of real work to be done. but in what sense does LLM usage for code actually improve our understanding of these systems and get us more agency?
> People who use AI because they are trying to avoid doing work fall into a completely different category than people using AI as a force multiplier and for skills/capabilities enhancements / quality improvement.
This statement is absolutely true. There are ways to use LLM tools to significantly improve the quality of your work instead of to avoid doing hard work. (And the result can easily become something that requires more hard thought, not less.)
Some that I frequently enjoy that are usable even if you don't want the machine to generate your actual code at all:
* consistency-check passes asking it to look for issues or edge cases
* evaluation of test coverage to suggest any missed tests or proposed new ones
* evaluation of feasibility of different refactoring approaches (chasing down dependencies and call trees much faster than I would be able to do by hand, etc.)
> the extent to which coding is the production of boilerplate is really a problem with APIs and abstraction design. if we can get LLMs to mitigate some of that in the short term without causing too much distraction, that's fine, but we should really be using that to inform the solution to the fundamental problem.
I generally would disagree with this, though. I don't think there's solely a problem with abstraction design, I think the inherent complexity of many systems in the business world is very high (though obviously different implementations make it different levels of painful). If that's a problem, it's a people/social one, not a technology problem.
In my future we lean into the fact that people want features, they want complexity, for many things - everybody's ideal just-for-them workflow/tooling would look slightly different than the next person's - and use these tools to build things that do more, not less. Like the evolution of spellcheck from something you manually ran, to something that constantly ran, to something that can autocorrect in a generally useful way when typing on a touchscreen.
Let's get back to finding more features/customization to delight users with.
This isn’t actually an argument for or against anything, I don’t know why people say this. It is entirely possible that people are using this brand new, historically unprecedented tool wrong.
Cars have been a huge success in spite of requiring people to learn a bunch of new things to use them.
The classic "you're holding it wrong" was about the iPhone 4: sure, people could learn to hold the iPhone in such a way that they didn't block the particular parts of the antenna that were (supposedly) the problem. But "holding an iPhone" is a fairly natural thing to do, and if the way that people are going to do it naturally doesn't allow its antenna to connect properly, then that's a technology problem, not a human problem.
If the selling point for AI is "you can just talk to it, and it will do stuff for you!" (which may or may not be yours, personally, but it is for a lot of people), then you have to be able to acknowledge that "describing a problem or desire using natural language" is something that humans already do naturally. Thus, if they have to learn to describe their problem in very specific ways in order to get the AI to do what they want, and most people are not doing that, then that's a failure of the technology.
For the specific case at hand, what's being described is similar to the problem of self-driving cars: you're selling the benefit as being the AI taking a lot of the work off your shoulders; all you have to do is constantly check its work just in case it makes a mistake. Which is something that we already know, empirically and with lots and lots of data, that humans are bad at.
Once again, it's a technology issue. Not a human issue.
Cars can take you from place to place much faster than a horse can, all you have to do is learn to drive and constantly keep your hand on the wheel.
Part of using a technology is, well, learning how to use it. It's not the technology's fault that humans are lazy or not able to pay attention and crash.
Some people are lazy, plain and simple. If they want to blindly accept what the LLM tells them without critical analysis and review then that's on them.
Just lol. Is this what you guys mean by productivity boost?
Comical. LLMs aren’t all that great - it’s more that most orgs are horribly inefficient. Like it’s amazing how bad they are.
That’s why Elon succeeded with SpaceX - he saw how horribly inefficient the industry was. And used that thinking to take a gamble and it’s paid off.
Considering that that’s been a running complaint for like 50 years, it doesn’t seem like project management is going to get better on its own at this point. So, yes, an LLM does represent a productivity boost in that area.
When the org is misaligned, mismanaged, has poor customer feedback loops, bad product market fit, too much bureaucracy, etc etc no amount of AI slop is going to make a meaningful impact on its bottom line. In fact, it will likely do the opposite through a combination of exponentially increasing complexity, workforce deskilling, layoffs, and rising token prices. The real bottleneck is and always has been communication & alignment.
It might make the employees _happier_ in the interim though, which, I believe, is what we're predominantly seeing during this AI mania. People fed up with the bullshit jobs of rewriting the same service for the 5th time in 2 years or creating TPS reports weekly just for their manager to throw them directly in the trash are absolutely giddy that they no longer have to do this manually. I think we need to question the economic value of these jobs in the first place, though.
I've worked at big tech prior to LLMs becoming a thing, and consistently saw projects of 20-50 people carried by 2-3 individuals that actually understood what needed to be done. I don't think this ratio will be any better with genAI, and I also don't think that tokenmaxxing has any meaningful correlation with impact. Bullshit jobs (and questionable personal projects) just get done faster now. Yay, I guess.
In the long run these highly inefficient firms are going to get destroyed by people who have a vision and can do what 100+ firms are doing and package it together as one solution that is far superior on dimensions that matter to firms.
At least when the PM still wrote it you could outright tell it was bullshit and made no sense. Now that is just obfuscated.
The truth of the matter is that software starts as basic CRUD and then given time and users evolves into its own special snowflake. Every single system given enough time and users will become a “complex system”.
Yes please, I've seen the vibecoded slop PMs put out every day because software engineering is simply not a skill they have, and I'd love to make a LOT of money fixing their crap once it dies in production <3
I can tell you right now most PMs are absolutely useless, glorified project managers who don’t know how to think and get in the way - and don’t know how to enable engineers to be more productive.