Developers should write their own code and use LLMs to design and verify. The result: better, faster architecture and planning, pre-cleaned PRs, and no skill atrophy or loss of understanding on the developer's part.
I come in knowing what I need to build and with at least one idea of how it should be done. I present the problem, constraints, and potential solutions, and ask for criticisms and alternatives. I can keep it as broad as possible or get more granular: struct layouts, API endpoints, etc. I go back and forth until there's an approach I prefer, and then I code that approach.
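Roughly, the prompt I start from has this shape (a minimal sketch in Python; the template and field names are my own convention, not any tool's API):

    # Skeleton of the design-discussion prompt described above.
    # Purely illustrative; the "..." placeholders are whatever detail you want.
    DESIGN_PROMPT = """\
    Problem: {problem}
    Constraints: {constraints}
    Approaches I'm considering:
    {candidate_solutions}

    Criticize each approach, point out failure modes I've missed,
    and propose alternatives. Be specific; don't just agree with me.
    """

    prompt = DESIGN_PROMPT.format(
        problem="...",
        constraints="...",
        candidate_solutions="1. ...\n2. ...",
    )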
> it can code pretty well given a very tight and limited scope.
It's wildly better at tight and limited scope than large scale changes but even then I would rather code it myself.
One thing I would like to see is the use of LLMs for smarter semi-manual editing.
While programming I often need to make very similar changes in several places. If the instances are similar enough I can get away with recording a one-off keyboard macro and replaying it, but if the differences are too tricky to handle that way, I end up doing a lot of manual editing.
It would be nice to see LLMs tightly integrated into the editor so I can do a simple "place the cursor at things like this" based on an example or two. I'm sure more ideas are possible for using LLMs to more quickly perform the semantic changes you intend, instead of just prompting for a big diff. I feel there's a lot more innovation possible in this direction, where you're still "coding it yourself", just faster.
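To make the idea concrete, here's a minimal sketch (assuming a hypothetical ask_llm() helper; no real editor exposes an API like this yet, and the prompt wording is just illustrative):

    import json

    def ask_llm(prompt: str) -> str:
        # Hypothetical: wire this to whatever model/provider you use.
        raise NotImplementedError

    def find_similar_sites(buffer: str, example: str) -> list[int]:
        """Given one example edit site, ask the model for the line numbers
        of all similar sites so the editor can drop a cursor on each."""
        reply = ask_llm(
            "Source buffer:\n" + buffer
            + "\n\nOne example of the kind of site I want to edit:\n" + example
            + "\n\nReply with only a JSON array of matching line numbers."
        )
        return json.loads(reply)

The human still makes the actual edits; the model only does the "multi-cursor by meaning" part.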
What I did was make one commit by hand (involving multiple files), and then told Codex (last year's Codex!) to make the equivalent changes to other instances in the code base.
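The same trick works outside any particular tool. A rough sketch using the OpenAI Python client (the model name and prompt wording are my own assumptions, not what Codex did):

    import subprocess
    from openai import OpenAI

    def replicate_commit(target_file: str) -> str:
        """Show the model a hand-made commit and ask for the equivalent
        change in another file. Returns a diff to review, not apply blindly."""
        example = subprocess.run(
            ["git", "show", "HEAD"], capture_output=True, text=True, check=True
        ).stdout
        with open(target_file) as f:
            target = f.read()
        client = OpenAI()  # reads OPENAI_API_KEY from the environment
        resp = client.chat.completions.create(
            model="gpt-4o",  # assumption: any capable model works here
            messages=[{
                "role": "user",
                "content": "I made this change by hand:\n" + example
                + "\n\nMake the equivalent change to this file and reply "
                + "with a unified diff only:\n" + target,
            }],
        )
        return resp.choices[0].message.content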
Never understood that argument, because there are two steps in design: finding a good solution (discussing prior art, tradeoffs, …) and then nailing the technical side of that solution (data structures, formulas, …). Is it the former, the latter, or both?
It does not actually, and not any faster.
Again, I've lost count of how many times I've had an in-depth architectural discussion with ChatGPT, with it giving my approach the final mark of approval ("This is excellent"), only to discover a flaw in it, or a radically simpler and better approach, bring that back to it, and have it proclaim "Yeah, this is a much better approach".
These LLMs are in many cases sycophantic confirmation machines. Yes, they are useful to some extent in helping you refine your ideas and think of edge cases. But they are nowhere close to actually thinking better and faster. Faster in the wrong direction is not just slow, you are actually going backward.
A paradigm shift is an earth-shattering, very important change - a complete change in thinking, etc. LLMs are not that. They are simply some pretty new tools. Nice tools, but they will whip off your metaphorical thumb just as quickly as a misused table saw.
You'll note that you mention "engineers are offloading": that's not a paradigm shift. That's a bunch of engineers discovering a better slide rule.
I'm old enough to remember moving on from slide rules (I still have mine) through calculators (ditto) to using fag packets and napkins for their real intended purpose.
The drill-driver also took engineering by storm but no-one ever used the term paradigm shift (to be fair, I don't think it was invented at the time and I can't be arsed to look it up).
If this sounds melodramatic it’s likely that it hasn’t fully taken root where you are yet.
I see opinions split between “it’s just a dirty, untrustworthy tool that is making our lives and the world a living hell” and “this is the second coming of Christ”. The reality is that right now we lie toward the first end of that spectrum, but I am looking over the hill and seeing four horses, and they are stampeding this way.
If she says “I’m sorry, I don’t know how to ‘are you still just a script’” then I have my answer. :P
LLMs are remarkable these days, but they’re still missing some essential insight. I’m far less confident now, though, that this will require another big breakthrough rather than just a combination of tweaks.
Engage as a person, please.
And it's literally just a black box that generates more JavaScript for their Next.js app.
Generally, the whole point of the "Power to the people?" (and to some extent the "On being left behind") section(s) is to underscore the two antithetical claims made by many LLM marketers: 1. LLMs are so powerful and so natural and easy that someone with no experience can create amazing software, and 2. LLM usage is a core skill, one that if you don't begin training now you'll be left behind.
Obviously, both of these can't be simultaneously 100% true--either it's easy enough for the non-programming layperson to successfully generate software for an intentional purpose, or, LLM assisted programming is a skill you need to train to avoid professional obsolescence in modern society. So, the article disagrees with the majority of both claims, and accepts a weakened/minor portion of each: 1. LLM output is easy to generate but accurate prompting matters, and 2. when used for software development professionally, some amount of skilled human intervention does indeed seem necessary. And now these two claims do align.
However, if professional software engineers who work with and read code constantly, armed with the best software practices to aid LLMs we can determine, cannot use modern AI tools without shooting their feet off at relatively frequent rates, certainly you'd expect the layperson who must put an even greater amount of undue faith in the validity of the results to be at extremely high-risk of foot-shooting. It's not "gatekeeping" to forewarn people against unwarranted trust in LLM output, nor is it "gatekeeping" to suggest that modern tech communicators/marketers describing an overly flowery LLM tooling landscape might be doing people a disservice.
I've had this conversation with managers in multiple organizations this year: "Yes, you could totally vibe code that instead of paying for a SaaS. But you have strict contractual and professional obligations about data security. Do you want to be deposed and asked, 'So, did you really just vibe code the system that led to the data leak? Did the vibe coders have any professional qualifications? Did they even look at the code?'"
Similarly, a backend server that handles 8 million users a day is expected to stay up.
Now, there are 10,000 things that have less demanding requirements. I'm actually really delighted that people are able to vibe code their own tools with minimal knowledge of software engineering! We have been chronically underproducing niche software all along.
But if your software already has on-call shifts (and SLAs, etc) like the GP, then I think you want to be smart about how you combine human expertise with LLMs.
It feels like a dunk to write that. But I genuinely do think there's so much motivated reasoning on both sides of this issue, and one signal of that is when people tip their hands like this.
I was going to argue that companies got to choose their own auditors, so of course there were some bad ones out there. But looking at the market, it seems like (1) the race to the bottom has gotten ridiculous, and (2) the insurance companies do not currently trust the auditors in any meaningful way. So, yeah, point to you.
Once upon a time, I went through SOC2 audits where the auditors asked lots of questions about Vault and really tried to understand how credentials got handled. Sure, that was exceptional even at the time.
But that still leaves a whole pile of other audits and regulatory frameworks I need to comply with. Probably most of these frameworks will eventually accept "The code was written by an LLM and reviewed by an actual programmer." I am less certain that you'll be able to get away with vibe coding regulated systems any time soon.
My thing here is: you want to summon some kind of deus ex machina reason why the unpredictability (say) of agent-generated software will fail in the real world, but the concrete one you came up with fails to make that argument, pretty abruptly. Which makes me think the argument is less about the world as it is and more about the world as you'd hope it would be, if that makes sense.
Would you have the same reaction to requiring an approval for a production deployment? That’s driving the development process.
---
Also jfc I need to cool it with the buzzwords, sorry I just got home from “talk like this all day” $job
It does? You mean "it tests itself faster", which is not really a test now, is it?
Funny, I thought that the major hurdle is improving accuracy and reliability, as it's always been. Engineering is necessary and useful, but it's a much simpler problem, which is why everyone is jumping on it.
I'm sure it was very difficult to program in machine code, but if now (or soon) anyone can just write software using an LLM without any sort of learning, that changes everything. LLMs can plan and create something usable from simple instructions or ideas, and they will only get better.
I think LLMs will be (and already are) useful for many more things than programming anyway.
Did you read the section "Power to the People?" ? In it, the author dismantles your thesis with powerful, highly plausible arguments.
1. You don't have to be an LLM expert to get good, consistent results with LLMs.
My best vibe-code process after years of using LLMs is to have Claude Code create a plan file, cycle it through Codex until Codex finds nothing more to review, and then have an agent implement it (a rough sketch of the loop follows point 4 below). This process is trivial yet produces amazing results.
It's solved by better and better harnesses.
2. You don't have to write technical specs. The LLM does that for you. You just tell it "I want the next-tab button to wrap back to the first one" and it generates a technical plan. Natural language is fine.
3. Software that seems to work only to fail down the line in production is already how software works today. With LLMs you can paste the stacktrace or user bug email and it will fix it.
This is why vibe-coding works. Instead of simulating in your head how an app will run by looking at its code, you run the app and tell the LLM what isn't working correctly. The app spec is derived iteratively through a UX feedback loop.
4. I don't understand TFA's goalposts, but letting people who are only interested in the LLM process (rather than the software craftsmanship) create software would be a huge democratization of software.
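To make point 1 concrete, the plan/review cycle described above is essentially this loop (a sketch assuming hypothetical run_claude()/run_codex() wrappers around whichever agent CLIs or APIs you actually use):

    def run_claude(prompt: str) -> str:
        # Hypothetical wrapper around your Claude Code invocation.
        raise NotImplementedError

    def run_codex(prompt: str) -> str:
        # Hypothetical wrapper around your Codex invocation.
        raise NotImplementedError

    def refine_plan(task: str, max_rounds: int = 5) -> str:
        """Cycle a plan between a writer and a reviewer until the reviewer
        has nothing left to flag, then hand the plan to an implementer."""
        plan = run_claude("Write an implementation plan for: " + task)
        for _ in range(max_rounds):
            review = run_codex(
                "Review this plan. Reply APPROVED if there is nothing to fix:\n" + plan
            )
            if "APPROVED" in review:
                break
            plan = run_claude(
                "Revise the plan to address this review:\n" + review
                + "\n\nPlan:\n" + plan
            )
        return plan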
> 1. You don't have to be an LLM expert to get good, consistent results with LLMs.
You don't get good, consistent results with LLMs, expert or not.
> 2. You don't have to write technical specs. The LLM does that for you. You just tell it "I want the next-tab button to wrap back to the first one" and it generates a technical plan. Natural language is fine.
Try this: have Claude write a section in your specs titled "Performance Optimizations" and see the gibberish it comes up with: fluffy lists with no actually useful content specific to the project. This is a severe problem with LLM-driven speccing that I have encountered countless times. I now rarely allow them to touch the specs document.
> 3. Software that seems to work only to fail down the line in production is already how software works today. With LLMs you can paste the stacktrace or user bug email and it will fix it.
And pretty soon you have a big ball of mud. But I guess if the rate of bugs accelerates, the LLMs can also "fix" them faster.
> This is why vibe-coding works. Instead of simulating in your head how an app will run by looking at its code, you run the app and tell the LLM what isn't working correctly. The app spec is derived iteratively through a UX feedback loop.
I should tell you about the markdown viewer with the specific features I want, which I have tried to build purely through LLM vibe-coding, and how none of the models have been able to do it.
Would there even be a debate in the tech community if such unassailable arguments existed? The author is entirely entitled to his opinion, just as I am allowed to disagree with him (not sure why I am also downvoted). The good thing is, if I'm right, we will see it in less than 10 years.
I don't buy that's true. The "only" part, anyway. Look at how UX with software has evolved. This is gonna be an old-man-yells-at-clouds take, but before smartphones, there were hotkeys. And man, you could fly with those things. The computers running things weren't as fast as they are today, but you could mash in a whole sequence through muscle memory and just wait for it to complete. Now you have to poke at your phone, wait for it to respond, poke at it some more. It's really not great for getting fast at it. AI advancement is going to be like that. Directionally it will generally be better, but there's going to be some niche where, y'know what, ChatGPT-4o really had it in a way that 5.5 does not. (Rose-colored glasses not included.)
Then came the new Claude update, which many people say is worse. Even Anthropic says it got worse.[1] HN discussion back on April 15th: [2]
Some of this is a pricing issue. Turning "default reasoning effort" down from "high" to "medium" was a form of shrinkflation. Maybe this technology is hitting a price/performance wall.
[1] https://www.anthropic.com/engineering/april-23-postmortem