I can't imagine SWEs will be reduced to SDETs any more than attorneys will be reduced to spell-checkers on AI-powered case briefs.

I am a very AI-forward person, but hallucinations are becoming more pernicious than ever even as they get less frequent, especially if the code actually works. A human absolutely has to guide these processes at a macro level for sustainability for SaaS as it evolves with business needs.

Maybe for one and done systems with no maintenance/no updates/no security patches you can reduce humans to SDETs, but systems like that are more the exception than the norm.

reply
Even more than the "hallucinations", I've noticed that the code is just generally quite bad.

At least with concurrent and distributed systems stuff (which is really all I know nowadays), it is great at getting a prototype, but the code is generally mediocre-at-best and pretty sub-optimal. I don't know if it's because it is trained on a lot of mediocre and/or buggy code but for concurrency-heavy stuff I've been having to rewrite a lot of it myself.

I think that AI is great for getting a rough POC, and admittedly often a rough POC is good enough for a project (and a lot of projects never get beyond a rough POC), but I think software engineers will be needed for stuff that needs to be more polished.

reply
I'm getting the impression that LLMs are just not very good at "reasoning" about time. I have definitely had success getting a coding agent to produce decent concurrent code, but I had to basically lead it by the nose, and I strongly suspect that in most cases it would have taken less time to just do it the old-fashioned way.
reply
I've had good luck having it translate TLA+ specs to programming languages. The specs are written by me and my fingers, and I've done most of the interesting concurrency reasoning beforehand.

I'm pretty sure it still saves me time, and if nothing else it's an excuse to write TLA+, and that's fun.

reply
Numerous real world technical requirements can be solved with existing code, lightly modified. That’s basically LLM code’s bread and butter. The further you get from that, the closer the “time saved using LLM” line gets to zero, and once it crosses, it becomes the “time wasted using LLM” line. I think embedded and concurrent systems are going to require more unique code solutions than, say, a crud web app with a few interesting feature-building junkets.
reply
The code is quite terrible, but no one has ever cared about code quality, at least in my experience. All they’ve ever cared about is that “it works”. It’s why an army of juniors always write most of the code.

I had this same discussion at work the other day. I had an 80k line generated project dropped on my plate. It doesn’t use anything built into the web framework or ORM. It’s a maintenance nightmare.

reply
I think there are plenty of projects where "good enough" really is "good enough"...maybe most apps? If you're just making a shitty simple app, I don't really care about code quality.

Example: I got Claude to generate a language server for TLA+ so I could have nice integration with Neovim. It took like 45 minutes of arguing with Claude and then it worked fine. This is incredibly low-stakes stuff: realistically the worst case scenario is that the text in the file gets screwed up, and I'm somewhat protected by Git if that happens.

That said, I am a little concerned how cavalier people have been deploying AI code everywhere. I don't want pacemaker firmware to be written by some intern in an afternoon with Claude.

reply
Yes, I agree, the low-stakes, low-evolution code is perfect for LLMs. The project I was handed is not that at all.
reply
Maybe you can ask Claude to reverse engineer what the original prompt was.
reply
By SDET I mean someone who reviews code rather than writes it. Maybe we have different definitions of that term, because you also mention humans being needed to guide the processes.

Even still, other professions interact with the real social world which is not necessarily the case with programming. A lawyer will always be needed because judgments are and must be made by humans only. Software on the other hand can be built and tested in its own loop, especially now with human readable specifications. For example, I wanted to build an app and told Claude and it planned out the features, which I reviewed and accepted, then it built, wrote tests, used MCPs including the browser for interacting with the UI and taking screenshots of it, finding any bugs and regressions, and so on until an hour later it came back with the full app. Such a loop is not possible in other professions.

reply
No one's arguing you can't stand up a good MVP.

It's when you have to iterate to handle changing business needs, scale issues, and integrate with other systems where the entropy becomes a scary concern over a long enough timeline.

And it's not just "checking" - it's wholesale rejections of code, reframing prompts to target specific classes or approaches, etc... I don't think you will take the human out of planning any time soon.

reply
I agree, humans will always be there.
reply
> A lawyer will always be needed because judgments are and must be made by humans only.

Honestly, I believe lower court judges will be the first job in the legal industry to become fully automated.

reply
This afternoon I was speaking with a friend and mentioned that I need to find a lawyer for contracts. His immediate response was, "you don't need a lawyer, just use AI". Not an avenue I'm interested in going down.
reply
IMO the code-generation for boilerplate and the improvement of copypasta quality are much bigger improvements than that.

PMs turning their brain off and letting the LLMs extrapolate from quick and dirty bashing of text into a template (or, PMs throwing customer feedback at a slackbot to generate a Jira ticket from it) can be better than PMs doing nothing but passing ill-defined reqs directly into the ticket, but that's a low bar. And it doesn't by itself solve the problem of the details that got generated for this ticket subtly conflicting with the details that got generated for (and implemented in) a different ticket 8 months ago.

reply
If you do that someone still needs to make sure the details make sense which, from experience, sometimes they will and sometimes they won't. When I open tickets using automation I often back into the ticket from a running implementation that passes tests so the description is at least internally consistent but there are often still issues that need corrected.
reply
That's what a good PM and developer pair should be doing, it's just that it's a lot faster for both of them now to review and work in tandem to get the feature done, because the bottleneck is the code generation.
reply
> That's what a good PM and developer pair should be doing, it's just that it's a lot faster for both of them now to review and work in tandem to get the feature done, because the bottleneck is the code generation.

The bottleneck is understanding, never "code generation."

Below is an axiom which has served me well over the years. Perhaps it will for you as well.

  When making software, remember that it is a snapshot of 
  your understanding of the problem. It states to all, 
  including your future-self, your approach, clarity, and 
  appropriateness of the solution for the problem at hand. 
  Choose your statements wisely.
reply
> Honestly, with the first step, it seems the PMs are already halfway there to implementation of the feature so I wonder if in the future they'll just do everything themselves

I'm guessing they've tried (or been induced to try by upper management), but given up because they don't know how to debug any problems that arise due to the LLM working itself into a corner.

Coding-agent LLMs act a lot like junior devs. And junior devs are: eager to write code before gathering requirements; often reaching for dumb brute-force solutions that require more work from them and are more error-prone, rather than embracing laziness/automation; getting confused and then "spinning their wheels" trying things that clearly won't work instead of asking for help; not recognizing when they've created an X-Y problem, and have then solved for their Y but not actually solved for the original problem X; etc.

The way you compensate for those inexperience-driven flaws in junior devs' approach, is to have them paired with, or fast-iteration-code-reviewed by, senior devs.

Insofar as a PM has development experience, it's usually only to the level of being a "junior dev" themselves. But to compensate for LLMs-as-junior-devs, they really need senior-dev levels of experience.

The good PMs know all of this, and so they're generally wary to take responsibility for driving the actual coding-agent development process on all but the most trivial change requests. A large part of a PM's job is understanding task assignment / delegation based on comparative advantage; and from their perspective, it's obvious that wielding LLMs in solution-space (as opposed to problem-space, as they do) is something still best left to the engineers trained to navigate solution-space.

reply
deleted
reply
> it seems the PMs are already halfway there to implementation

Halfway there feels way overblown, and only seems to further devalue the work that devs do. Having clearly written requirements would be fantastic, and even as someone less pro-AI I can see great utility for it here, but it's not halfway there to implementation. Not even 25% in all honesty, since edge cases and unforeseen consequences can cause changes to the spec midway through development.

reply
> Then devs can copy paste this Jira ticket content into the LLM agent of choice

Super glad to have gotten out when I did...

reply
> Honestly, with the first step, it seems the PMs are already halfway there to implementation of the feature so I wonder if in the future they'll just do everything themselves and a few devs will be around as SDETs rather than full blown implementers.

Judging by every PM I’ve worked with, 0% chance of this happening. You'd much sooner see SWEs making PMs redundant than the other way around. Unless of course you want a system that falls apart like a house of cards as soon as you get a single user for your vaporware.

reply
Except... no one validates the generated tickets, and it's full of inaccuracies.

And then someone copy pastes it into Claude and now those inaccuracies become part of the code and tests.

reply
The PMs validate it, why do you think they don't read over it to make sure it fits what they want? You might say "well they're lazy, look why they didn't write enough detail to start off with" but for lots of people, reviewing something to make sure it's close to what they want and then tweaking it is much easier than writing it from scratch.

It's the equivalent of writer's block and is why common advice given to writers is to put anything they can onto the page, then edit it later.

reply
> The PMs validate it, why do you think they don't read over it to make sure it fits what they want?

The PM has historically often not had a detailed enough mental model of the implementation to spot the hard parts in advance or a detailed enough mental model of the customer desires to know if it's gonna be the right thing or not.

Those are the things that killed waterfall.

You can use LLM tools to help you improve both those areas. Synthesizing large amounts of text and looking for inconsistencies.

But the 80th-percentile-or-lower person who was already not working hard to try to get ahead of those things still isn't going to work any harder than the next person and so won't gain much of a real edge.

reply
I'm glad you mentioned it and TFA briefly mentioned waterfall. The second graph shown in the article, with documentation overlapping the dev cycle, is like the worst of both agile and waterfall. It's supposedly real-time waterfall.

Normally waterfall works where the scope is extremely well-defined and articulated in design plans. That shortens dev time because, prior to AI, code was mostly deterministic. Here we have to do a waterfall level of documentation while iterating on a non-deterministic solution (code gen) to non-deterministic requirements (per usual).

It's bonkers.

I still think the technology is cool though.

And to answer the questioner.. Have you worked with a PM? Most of the ones I've worked with try to be simultaneously in charge yet not responsible for anything. Validating something implies skill and responsibility.

reply
Then they're just bad PMs and don't deserve to have the job. That can be said in any profession, devs or lawyers or doctors who blindly accept LLM output without review are bad employees.
reply
> Then they're just bad PMs and don't deserve to have the job.

Nobody "deserves" anything. They do have the jobs though. Thinking that the world isn't full of people doing what they need to do to get by who don't give a shit about fitting a fantasy ideal is wild.

reply
Deserving and having are two different things, that doesn't mean they can't be criticized either way. By the same logic bad devs and bad dev practices can also be criticized.
reply
"They're bad PMs" does not meaningfully respond to people saying the world is full of bad PMs. They know. It was already given. Giving it again in response isn't engaging thoughtfully.
reply
deleted
reply
I think validating a fully generated novel of a ticket, is much harder than thinking through the problem in the first place and creating your own ticket.

We see it with code too right? It’s harder to review code than to write it.

On top of that the LLM can work so fast that the amount of things that need validating grows!

This is where humans get lazy and the problems come in IMO. Whether it's a PM not validating their ticket, or a dev doing a bad code review.

Add on to that that the incentives currently are to move fast and trust the AI.

It becomes clear to me that a lot of that review work either won’t be done at all, or won’t be nearly thorough enough.

reply
The tickets are not "novel"-length, they are about a few bulleted lists of the sections I mentioned above. In that case it is indeed way easier to review than a ticket only saying "do X with Y data."

Reviewing code is harder than reviewing text because code does something and has interdependencies and therefore must be correct in its function. Do not mix the two. This is like saying an editor reviewing an article or novel is harder than actually writing the novel, which is blatantly incorrect.

reply
Most real tickets are more complicated than “Do x with Y data” and also have many interdependencies throughout the business
reply
Most? That's doubtful especially when a lot of tickets are simply CRUD which are fine being generated by an LLM. Those that are more complex require more review and interdependency management, sure, but to say that that is most tickets is simply not correct.
reply
I agree. I hate getting tickets like this because they’ve often gone down the wrong path and I have to work backwards to understand the actual problem and the right way to solve it
reply
Just this week I pushed back on some requirements in a very detailed product spec I was implementing, to speed up time to ship. The PM had no idea what I was talking about because the requirements were invented by an LLM. This is not a bad PM; discipline doesn't scale.
reply
> The PMs validate it, why do you think they don't read over it to make sure it fits what they want?

Hahahahahaha. Sorry, I couldn't help myself; this reads like satire. The answer is "real life experience says otherwise".

reply
Yeah I was so tempted to ask if this person has ever actually met a project/product manager...
reply
Maybe you both just have bad PMs, because just like good devs they should also be reviewing their work. My point was that it is more likely for PMs to review and edit a generated ticket than to have to write it all themselves which they often won't do.
reply
> My point was that it is more likely for PMs to

I feel compelled to point out to you that this is a completely unsustainable, unsupportable, unsubstantiable claim. You have met ~0% of PMs, and of the ones you've met maybe you've experienced a non-zero percentage of their work, but statistically that's also very unlikely.

If you think you can say what most PMs do or what PMs are likely to do, then, I'm sorry, but you are not even thinking like an engineer. You're thinking, actually, a lot more like a PM to many of us.

> just like good devs

I'm so sorry, my sides just can't handle the starry-eyed nature of these takes. This is just too much for me.

To many of us this reads like you've never met people before. But who knows, maybe you live in Lake Wobegon, where all the women are strong, all the men are good-looking, and all the children are above average! If so then we're jealous, but you still should be more careful about how unrigorous your mental model is because it will make you a worse engineer.

Experience with different PMs and developers aside, the older you get in the profession the more you will hopefully realize that none of your quality effort fantasy matters. Sales happen and money rolls in independently of whether you think the PMs or the people who call themselves engineers do a "good job". Businesses thrive on sales and marketing, not engineering.

reply
What a strange response. By your logic you've met ~0% of developers too yet I assume you can distinguish good development practices from bad. I also mentioned good PMs which by definition review and write good tickets with a clear explanation of the problem and what they want the solution to be. If personally meeting millions of people is the epistemic standard you have to know something then I'm not sure how you know anything at all.

As to your latter point, not sure why you think I think business doesn't continue on even with bad employees, of course it does and I didn't say otherwise. But that does not mean they're doing a good job, those two are orthogonal concepts.

And I'm not sure how we even got to this, the original point was that I personally as a dev can physically see PM productivity increasing with AI, even as other devs in this thread seem not to. For a competent PM, a tool that automates a detailed first draft fundamentally changes the psychology of ticket creation. If your argument is just "bad PMs will still be bad," then sure, I agree, but that doesn't really engage with how the tooling changes the workflow for everyone else.

reply
> yet I assume you can distinguish good development practices from bad

Uh. We're not talking about knowing what good is, which is completely irrelevant to anything in this thread. You made a claim without qualification about what it is more likely for PMs to do. I can't tell if you've lost the chain or are engaging in some kind of motte and bailey fallacy. Either way it's a bad sign for this conversation.

I'm going to summarize the threads so far. I hope it highlights why what you've said sounds so silly:

Someone: "I see X failing to do Y."

You: "X definitely do Y. Why would you think that X aren't doing Y? Doing Y is the obvious thing for X to do."

Someone: "I literally am seeing it happen right now."

You: "Well then those X are bad."

Someone: "Yeah, no shit. They just said as much."

You: "But most X would do Y."

Someone: "In my experience that is false."

Someone else: "Mine too."

Someone else: "Mine as well."

Someone else: "Same."

You: "The bad ones shouldn't have their jobs."

Someone: "They do though."

You: "But we can tell which ones are the bad ones."

Someone: "Bartender, another drink please."

reply
This failure is human laziness, not an issue with the technology. People who use AI because they are trying to avoid doing work fall into a completely different category than people using AI as a force multiplier and for skills/capabilities enhancements / quality improvement.
reply
It's also the only way to get those massive increases in productivity.
reply
This is very much a "you're holding it wrong" response.

If your technology relies on humans using it in ways that go against the ways they are inclined to use them, then that is an issue with the technology.

reply
I don't think that works as a critique of LLMs because it's far too broadly applicable to well-accepted tools.

Are advanced calculators bad because a student could use the CAS to ace calculus homework, exams or the SAT without actually learning the material?

Is copy/paste bad because a person could use it to copy/paste code from one place to another without noticing some of the areas they need to update in the new location, adding bugs and missing a chance to learn some more subtleties of the system?

Is Git bad because a manager could use it to just measure performance by number of lines of code committed instead of doing more work to actually understand everyone's performance?

Many tools can be used lazily in ways that will directly work against a long term goal of improving knowledge and productivity.

reply
but in this case that's exactly what AI is doing, and no more. it's filling in the gaps with some plausible sounding goo so that the person doesn't have to worry about the details.

ok, so for some of the jobs we're doing plausible sounding goo is just fine. and that's kinda sad. but the 'just playing around' case is fine for PSG, this isn't a serious effort but just seeing how things might work out without much effort.

taking the remainder, where understanding and intent are important, the role of the ai is to produce PSG, but the intentional person now goes through everything and plucks out all the nonsense. this may take more or less time than simply writing it, but we should understand this is resulting in less real engagement by the ultimate author. where this is actually interesting is a parallel to Burroughs' cut-up method - where source text and audio were randomly scrambled and sometimes really clever and novel stuff pops out.

but to say the current model of vibe coding has much to offer in the second case is really quite unclear. to the extent to which coding is the production of boilerplate is really a problem with APIs and abstraction design. if we can get LLMs to mitigate some of that in the short term without causing too much distraction, that's fine, but we should really be using that to inform the solution to the fundamental problem.

so for me what's missing in your model is how LLMs are supposed to be used 'properly'. I don't think laziness is really the right cut here, make-work is make-work, and there's plenty of real work to be done. but in what sense does LLM usage for code actually improve our understanding of these systems and get us more agency?

reply
I don't disagree with your take on most jobs or vibe coding as shown in countless proof-of-concept/0-to-1 demos. But the comment I was replying to was dismissing this statement from another commenter:

> People who use AI because they are trying to avoid doing work fall into a completely different category than people using AI as a force multiplier and for skills/capabilities enhancements / quality improvement.

This statement is absolutely true. There are ways to use LLM tools to significantly improve the quality of your work instead of to avoid doing hard work. (And the result can easily become something that requires more hard thought, not less.)

Some that I frequently enjoy, usable even if you don't want the machine to generate your actual code at all:

* consistency-check passes asking it to look for issues or edge cases
* evaluation of test coverage to suggest any missed tests or propose new ones
* evaluation of the feasibility of different refactoring approaches (chasing down dependencies and call trees much faster than I would be able to do by hand, etc.)

> to the extent to which coding is the production of boilerplate is really a problem with APIs and abstraction design. if we can get LLMs to mitigate some of that in the short term without causing too much distraction, that's fine, but we should really be using that to inform the solution to the fundamental problem.

I generally would disagree with this, though. I don't think there's solely a problem with abstraction design, I think the inherent complexity of many systems in the business world is very high (though obviously different implementations make it different levels of painful). If that's a problem, it's a people/social one, not a technology problem.

In my future we lean into the fact that people want features, they want complexity, for many things - everybody's ideal just-for-them workflow/tooling would look slightly different than the next person's - and use these tools to build things that do more, not less. Like the evolution of spellcheck from something you manually ran, to something that constantly ran, to something that can autocorrect generally-usefully when typing on a touchscreen.

Let's get back to finding more features/customization to delight users with.

reply
> This is very much a "you're holding it wrong" response

This isn’t actually an argument for or against anything, I don’t know why people say this. It is entirely possible that people are using this brand new, historically unprecedented tool wrong.

Cars have been a huge success in spite of requiring people to learn a bunch of new things to use them.

reply
It's not about having to learn things; it's about the required methods of using the tool going directly against the grain of the way people in general operate.

The classic "you're holding it wrong" was about the iPhone 4: sure, people could learn to hold the iPhone in such a way that they didn't block the particular parts of the antenna that were (supposedly) the problem. But "holding an iPhone" is a fairly natural thing to do, and if the way that people are going to do it naturally doesn't allow its antenna to connect properly, then that's a technology problem, not a human problem.

If the selling point for AI is "you can just talk to it, and it will do stuff for you!" (which may or may not be yours, personally, but it is for a lot of people), then you have to be able to acknowledge that "describing a problem or desire using natural language" is something that humans already do naturally. Thus, if they have to learn to describe their problem in very specific ways in order to get the AI to do what they want, and most people are not doing that, then that's a failure of the technology.

For the specific case at hand, what's being described is similar to the problem of self-driving cars: you're selling the benefit as being the AI taking a lot of the work off your shoulders; all you have to do is constantly check its work just in case it makes a mistake. Which is something that we already know, empirically and with lots and lots of data, that humans are bad at.

Once again, it's a technology issue. Not a human issue.

reply
> selling the benefit as being the AI taking a lot of the work off your shoulders; all you have to do is constantly check its work just in case it makes a mistake.

Cars can take you from place to place much faster than a horse can, all you have to do is learn to drive and constantly keep your hand on the wheel.

Part of using a technology is, well, learning how to use it. It's not the technology's fault that humans are lazy or not able to pay attention and crash.

reply
Maybe they are holding it wrong then. Like someone else said, people had to be taught how to drive a car and that cannot be in any sense said to be the car's fault.

Some people are lazy, plain and simple. If they want to blindly accept what the LLM tells them without critical analysis and review then that's on them.

reply
I second this
reply
lol

Just lol. Is this what you guys mean by productivity boost?

Comical. LLMs aren’t all that great - it’s more that most orgs are horribly inefficient. Like it’s amazing how bad they are.

That’s why Elon succeeded with SpaceX - he saw how horribly inefficient the industry was. And used that thinking to take a gamble and it’s paid off.

reply
> most orgs are horribly inefficient

Considering that that’s been a running complaint for like 50 years, it doesn’t seem like project management is going to get better on its own at this point. So, yes, an LLM does represent a productivity boost in that area.

reply
The problem is that organizations are inefficient in such a way that extra output from white collar workers doesn't translate to improved org-wide performance in a positively correlated, linear fashion.

When the org is misaligned, mismanaged, has poor customer feedback loops, bad product market fit, too much bureaucracy, etc etc, no amount of AI slop is going to make a meaningful impact on its bottom line. In fact, it will likely do the opposite through a combination of exponentially increasing complexity, workforce deskilling, layoffs, and rising token prices. The real bottleneck is and always has been communication & alignment.

It might make the employees _happier_ in the interim though, which, I believe, is what we're predominantly seeing during this AI mania. People fed up with the bullshit jobs of rewriting the same service for the 5th time in 2 years or creating TPS reports weekly just for their manager to throw them directly in the trash are absolutely giddy that they no longer have to do this manually. I think we need to question the economic value of these jobs in the first place, though.

I've worked at big tech prior to LLMs becoming a thing, and consistently saw projects of 20-50 people carried by 2-3 individuals that actually understood what needed to be done. I don't think this ratio will be any better with genAI, and I also don't think that tokenmaxxing has any meaningful correlation with impact. Bullshit jobs (and questionable personal projects) just get done faster now. Yay, I guess.

reply
Correct, most people should be fired.

In the long run these highly inefficient firms are going to get destroyed by people who have a vision and can do what 100+ firms are doing and package it together as one solution that is far superior on dimensions that matter to firms.

reply
If only it was that simple. The reason these inefficient companies continue to exist is due to regulatory capture and monopolistic behavior. Competing with them doesn't just require better efficiency.
reply
The idea that PM tickets are now much improved because they paste their unbaked wrong "idea of what the ticket is" into ChatGPT to expand into a 500 word behemoth is hilarious.

At least when the PM still wrote it you could outright tell it was bullshit and made no sense. Now that is just obfuscated.

reply
You're probably right but that sounds like it's still a win to me.
reply
Not sure what your point is, LLMs don't have to be all that great to still show a productivity boost and especially if the organization is inefficient, then even more so.
reply
[flagged]
reply
Two hour old account just to make comments like this, it will get flagged. Next time use your main account.
reply
deleted
reply
Maybe for some subset of software (like CRM panels or something) PMs will do everything. But if you're projecting the way one sort of software (i.e. user-facing, business-use-oriented software) is developed and put to use onto software writ large, then no, I don't think so.
reply
Sure, I'm just talking about 90% of software which is basic CRUD, not complex systems or microcontroller programming. In that case it's likely that just a PM could build something with LLMs.
reply
For basic CRUD we’ve had no code solutions that PMs could have been using for decades.

The truth of the matter is that software starts as basic CRUD and then given time and users evolves into its own special snowflake. Every single system given enough time and users will become a “complex system”.

reply
I literally can’t tell if this comment is a joke or not.
reply
The last sentence was partly facetious, sure, but the first paragraph is not. I have seen ticket quality go up quite a bit from a few years ago.
reply
> Honestly, with the first step, it seems the PMs are already halfway there to implementation of the feature so I wonder if in the future they'll just do everything themselves

Yes please, I've seen the vibecoded slop PMs put out every day because software engineering is simply not a skill they have, and I'd love to make a LOT of money fixing their crap once it dies in production <3

reply
I’m a former PM who’s now a founder and all the engineers I worked with loved me.

I can tell you right now most PMs are absolutely useless and glorified project managers who don’t know how to think and get in the way - and don’t know how to enable engineers to be more productive.

reply
I already do the latter, not very difficult to get into. Good consulting money.
reply