by locusofself 9 hours ago

by happytoexplain 8 hours ago

No, but that's the crux of the AI problem in software. Time to write code was never the bottleneck. AI is most useful for learning, either via conversation or by seeing examples. It makes writing code faster too, but only a little after you take into account review. The cases where it shines are high-profile and exciting to managers, but not common enough to make a big difference in practice. E.g., AI can one-shot a script to get logs from a paginated API, convert them to ndjson, and save them to files grouped by week, with minimal code review, but only if I'm already experienced enough to describe those requirements, and, most importantly, that's not what I'm doing every day anyway.
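For anyone who hasn't written one of these, here is roughly what that kind of script looks like. This is only a sketch under assumptions: `fetch_page` is a hypothetical callable standing in for the real API client, the `items`/`next` page shape and the `timestamp` field are made up for illustration, and "grouped by week" is taken to mean ISO weeks.

```python
import json
from collections import defaultdict
from datetime import datetime
from pathlib import Path
from typing import Callable, Iterable, Iterator, Optional


def iter_pages(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield log records from a cursor-paginated API.

    fetch_page is hypothetical: it takes a cursor (None for the first page)
    and returns {"items": [...], "next": cursor_or_None}.
    """
    cursor = None
    while True:
        page = fetch_page(cursor)
        yield from page["items"]
        cursor = page.get("next")
        if cursor is None:
            break


def week_key(timestamp: str) -> str:
    """Map an ISO 8601 timestamp to an ISO week label such as '2024-W01'."""
    year, week, _ = datetime.fromisoformat(timestamp).isocalendar()
    return f"{year}-W{week:02d}"


def write_weekly_ndjson(records: Iterable[dict], out_dir: Path) -> dict:
    """Write one NDJSON file per ISO week; return {week: record_count}."""
    by_week = defaultdict(list)
    for record in records:
        # Assumes every record carries an ISO 8601 "timestamp" field.
        by_week[week_key(record["timestamp"])].append(record)
    out_dir.mkdir(parents=True, exist_ok=True)
    for week, items in by_week.items():
        with open(out_dir / f"{week}.ndjson", "w", encoding="utf-8") as fh:
            for item in items:
                fh.write(json.dumps(item) + "\n")  # one JSON object per line
    return {week: len(items) for week, items in by_week.items()}
```

Usage would be something like `write_weekly_ndjson(iter_pages(my_client_call), Path("logs"))`. Even here the review questions are the ones you need experience to ask: which timestamp field, which week definition, what happens on a partial page.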

by brandensilva 7 hours ago

I'm finding that in some cases I'm dealing with even more code, given how much code AI outputs. So yeah, for some tasks I find myself extremely fast, but for others I find myself spending ungodly amounts of time reviewing code I never wrote to make sure it doesn't destroy the project with unforeseen, convincing slop.

by ritlo 8 hours ago

A related Dirty Secret that's going to become clear from all this is that a very large proportion of code in the wild (yes, even in 2026—maybe not in FAANG and friends, IDK, but across all code that is written for pay in the entire economy) has limited or no automated test coverage, and is often being written with only a limited recorded spec that's usually fleshed out only to the degree needed (very partial) as a given feature is being worked on.

What do the relatively hands-off "it can do whole features at a time" coding systems need to function without taking up a shitload of time in reviews? Great automated test coverage, and extensive specs.

I think we're going to find there's very little time-savings to be had for most real-world software projects from heavy application of LLMs, because the time will just go into tests that wouldn't otherwise have been written, and much more detailed specs that otherwise never would have been generated. I guess the bright-side take of this is that we may end up with better-tested and better-specified software? Though so very much of the industry is used to skipping those parts, and especially the less-capable (so far as software goes) orgs that really need the help and the relative amateurs and non-software-professionals that some hope will be able to become extremely productive with these tools, that I'm not sure we'll manage to drag processes & practices to where they need to be to get the most out of LLM coding tools anyway. Especially if the benefit to companies is "you will have better tests for... about the same amount of software as you'd have written without LLMs".

We may end up stuck at "it's very-aggressive autocomplete" as far as LLMs' useful role in them, for most projects, indefinitely.

On the plus side for "AI" companies, low-code solutions are still big business even though they usually fail to deliver the benefits the buyer hopes for, so there's likely a good deal of money to be made selling companies LLM solutions that end up not really being all that great.

by ansibsha 6 hours ago

> better-specified software

Code is the most precise specification we have for interfacing with computers.

by tmaly 5 hours ago

There are some cases where AI is generating binary machine code, albeit small amounts. What do we have when we don't have the code?

by marginalia_nu 4 hours ago

Machine code is still code, even if the representation is a bit less legible than the punch cards we used to use.

by interestpiqued 2 hours ago

You’re missing the point of a spec

by slopinthebag 7 hours ago

Re. productivity: if LLMs are a genuine boost for 1/3 of the work, neutral for 1/3, and actually worse for 1/3, it's likely we aren't really seeing performance improvements because 1) people are using them for everything and 2) we're still learning how to best use them.

So I expect over time we will see genuine performance improvements, but Amdahl's law dictates it won't be as much as some people and CEOs are expecting.
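To put rough numbers on that, here is a back-of-the-envelope sketch of the generalized Amdahl's law, where overall speedup is 1 / sum(fraction / factor). The per-third factors used below (3x, 1x, 0.5x) are made-up illustrations, not measurements.

```python
def overall_speedup(parts):
    """Generalized Amdahl's law.

    parts is a list of (fraction_of_work, speedup_factor) pairs whose
    fractions sum to 1. Returns 1 / sum(fraction / factor).
    """
    total = sum(fraction for fraction, _ in parts)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    return 1.0 / sum(fraction / factor for fraction, factor in parts)


# Hypothetical factors: 3x on a third, no change on a third, 0.5x on a third.
used_everywhere = overall_speedup([(1 / 3, 3.0), (1 / 3, 1.0), (1 / 3, 0.5)])

# Same tool, but only applied to the third where it actually helps.
used_selectively = overall_speedup([(1 / 3, 3.0), (2 / 3, 1.0)])

# Ceiling: even an infinitely fast tool on that third leaves 2/3 untouched.
ceiling = overall_speedup([(1 / 3, float("inf")), (2 / 3, 1.0)])
```

With those assumed factors, using the tool for everything comes out at about 0.9x (a net slowdown), using it selectively at about 1.29x, and the ceiling is 1.5x no matter how good the tool gets on that one third.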

by dboreham 4 hours ago

Bingo. Hopefully there are some business opportunities for us in that truth.

by _wire_ 8 hours ago

> because the time will just go into tests that wouldn't otherwise have been written

Writing tests to ensure a program is correct is the same problem as writing a correct program.

Evaluating conformance is a different category of concern from ensuring correctness. Tests are about conformance, not correctness.

Ensuring correct programs is like cleaning in the sense that you can only push dirt around, you can't get rid of it.

You can push uncertainty around, but you can't eliminate it.

This is the point of Gödel's theorem. Shannon's information theory observes similar aspects for fidelity in communication.

As Douglas Adams noted: ultimately you've got to know where your towel is.

by layer8 3 hours ago

A competent programmer proves the program he writes correct in his head. He can certainly make mistakes in that, but it’s very different from writing tests, because proofs abstract (or quantify) over all states and inputs, which tests cannot do.
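A toy illustration of that gap, as a sketch only: the function and the sample years below are invented for this example. The naive rule passes a plausible set of spot checks, but a check that quantifies over a whole range of inputs (here by brute force against `calendar.isleap`, which only works because the domain is small and finite) exposes it.

```python
import calendar


def is_leap_year_naive(year: int) -> bool:
    """Deliberately incomplete rule: ignores the century exceptions."""
    return year % 4 == 0


def passes_spot_checks() -> bool:
    """A plausible hand-written test suite that the naive rule satisfies."""
    samples = {2020: True, 2023: False, 2024: True, 2000: True}
    return all(is_leap_year_naive(y) == expected for y, expected in samples.items())


def find_counterexamples(start: int, stop: int) -> list:
    """Compare against calendar.isleap for every year in [start, stop)."""
    return [y for y in range(start, stop) if is_leap_year_naive(y) != calendar.isleap(y)]
```

The spot checks all pass, yet `find_counterexamples(1900, 2101)` turns up the century years the rule gets wrong. A proof covers every input at once; a test suite only covers the inputs someone thought to write down.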

by shimman 8 hours ago

These companies don't care about saving time or lowering operating costs, they have massive monopolies to subsidize their extremely poor engineering practices with. If the mandate is to force LLM usage or lose your job, you don't care about saving time; you care about saving your job.

One thing I hope we'll all collectively learn from this is how grossly incompetent the elite managerial class has become. They're destroying society because they don't know what to do outside of copying each other.

It has to end.

by SchemaLoad 2 hours ago

The submitter with their name on the Jira ticket saves time; the reviewer who has to actually verify the work loses a lot of time and likely just lets issues slip through.

by marginalia_nu 8 hours ago

To be honest, sometimes it's still beneficial.

For fairly straightforward changes it's probably a wash, but ironically enough it's often the trickier jobs where it can be beneficial, as it will provide an ansatz that can be refined. It's also very good at tedious chores.

by misnome 7 hours ago

And spotting stuff in review! Sometimes it’s false positives but on several occasions I’ve spent ~15-30 minutes teaching-reviewing a PR in person, checked afterwards and it matched every one of the points.

by bluGill 8 hours ago

Some, but not very much. Writing code is hard. AI will do a lot of tedious code that you procrastinate writing.

by hard24 8 hours ago

Also when you are writing code yourself you are implicitly checking it whilst at the back of your mind retaining some form of the entire system as a whole.

People seem to gloss over this... As a CEO if people don't function like this I'd be awake at night sweating.

by bonesss 8 hours ago

That’s the reverse-centaur issue I see: humans are not great at repetitive, nuanced, similar-seeming tasks, so putting the onus on humans to retroactively approve high volumes of critical code has them managing a critical failure mode at their weakest and worst. Automated reviews should be enhancing known good-faith code; manual reviews of high-volume, superficially sound but subversive code are begging for issues over time.

Which results in the software engineering issue I’m not seeing addressed by the hype: bugs cost tens to hundreds of times their coding cost to resolve if they require internal or external communication to address. Even if everyone has been 10x’ed, the math still strongly favours not making mistakes in the first place.

An LLM workflow that yields 10x an engineer but psychopathically lies and sabotages client-facing processes/resources once a quarter is likely an NNPP (net negative producing programmer), once opportunity and volatility costs are factored in.

by demosito666 4 hours ago

> Even if everyone has been 10x’ed, the math still strongly favours not making mistakes in the first place

The math depends on the importance of the software. A mistake in a typical CRUD enterprise app with 100 users has zero impact on anything. You will fix it when you have time; the important thing is that the app was delivered in a week a year ago and has been solving some problem ever since. It has already made an enormous profit if you compare it with today’s (yesterday’s?) manual development that would take half a year and cost millions.

A mistake in nuclear reactor control code would be a totally different thing. Whatever time savings you made on coding are irrelevant if they allowed a critical bug to slip through.

Between the two extremes you thus have a whole spectrum of tasks that either benefit or lose from applying coding with LLMs. And there are also more axes than this low-to-high failure cost, which also affect the math. For example, even a non-important but large app will likely soon degrade into an unmanageable state if developed with too little human intervention, and you will be forced to start from scratch, losing a lot of time.

by bluGill 3 hours ago

I have found AI extremely good at finding all those really hard bugs though. AI is a greater force multiplier when there is a complex bug than in greenfield code.

by bluGill 8 hours ago

Sort of. I work on a system too large for anyone to know the whole thing. Often people who don't know each other do something that will break the other. (Often because of the number of different people - most individuals go years between incidents.)

by raw_anon_1111 7 hours ago

No, I’m keeping up with the system as a whole because I’m always working at a system level when I’m using AI instead of worrying about the “how”.

by ansibsha 6 hours ago

No you’re not. The “how” is your job to understand, and if you don’t you’ll end up like the devs in the article.

We as an industry have been able to offload a lot of “how” via deterministic systems built by humans with expert understanding. LLMs give you the illusion of this.

by raw_anon_1111 6 hours ago

No, in my case the “how” is:

1. I spoke to sales to find out about the customer

2. I read every line of the contract (SOW)

3. I did the initial requirements gathering over a couple of days with the client - or maybe up to 3 weeks

4. I designed every single bit of AWS architecture and code

5. I did the design review with the client

6. I led the customer acceptance testing

> We as an industry have been able to offload a lot of “how” via deterministic systems built by humans with expert understanding. LLMs

I assure you the mid-level developers or, god forbid, foreign contractors were not “experts” with 30 years of coding experience and, at the time, 8 years of pre-LLM AWS experience. It’s been well over a decade - ironically before LLMs - since my responsibility was only for code I wrote with my own two hands.

by ansibsha 2 hours ago

Yes, and trusting an LLM here is not a good idea. You know it will make important mistakes.

I’m not saying trusting cheap devs is a good idea either. I do think cheap devs are actually at risk here.

by raw_anon_1111 1 hour ago

I am not “trusting” either - I’m validating that they meet the functional and non-functional requirements just like with an LLM. I have never blindly trusted any developer when my neck was the one on the line in front of my CTO/director or customer.

I didn’t blindly trust the Salesforce consultants either. I also didn’t verify every line of oSql (not a typo) they wrote.

by icedchai 46 minutes ago

Actually, it's SOQL. I did Salesforce crap for many years.
