by shreddude23 hours ago |

by williamdclt19 hours ago|

> projects I simply could not have ever approached alone.

I think that's part of the divide between enthusiasts and naysayers. If you use GenAI on things that you couldn't approach alone, it's an incredible tool. If you use it on stuff that you're pretty good at, it's not a gamechanger (and if you're an expert, it's a minor boost at best). Many people's jobs are about doing what they're experts at.

by pmontra15 hours ago|

I'm about to complete a new non-trivial piece of functionality in a project for a customer of mine. I spent an hour writing the spec. Then I asked Claude (Sonnet 4.6) to check if I missed something. I did: the sort of minor issues one notices after starting to write code, edge cases etc. That made me think about more issues, and after a few iterations we settled on a spec. I asked Claude to make an implementation plan and we ended up with 9 steps. It wrote the code for each step with new automated tests and I performed some manual QA, which found further issues we hadn't thought about. We are at step 8 of 9 after about 12 hours of work. I would have needed a week to get there alone, with time spent researching and fixing bugs I created along the way, an inevitable part of our job but not exactly the most pleasant one.

This speedup is great. It improves the overall quality of the product (as perceived by the users) because I can ask Claude to add features that my customers and I would have dismissed because they take too long to implement. We would have settled for a more basic UX.

So is it a game changer? It is, in the same way that HTML / CSS frameworks like Bootstrap were game changers: suddenly every developer could create a decent and consistent UI in a fraction of the time, with a few bells and whistles that we wouldn't have bothered coding. As a side effect a lot of web apps felt like look-alike mass-produced products and web designers had to reinvent themselves, but the economics led inevitably in that direction. Would I again spend one or two weeks doing alone what I could write in a day or two with an LLM? Not anymore, not at this cost ($20 per month).

by jowsie10 hours ago|

I'd love to read a full transcript of someone going through this kind of collaborative programming. I see this kind of process mentioned a lot but can't quite figure out the details in my head. If anyone has a link to a blog post or similar showing this process in depth, I'd love to give it a read :)

by sntran13 minutes ago|

I think it will click once you actually sit down with the AI agent, toggle Plan mode, and just tell it what you want to do in a couple of sentences. It will immediately start building up the plan, presenting what it thinks is the right approach, with the steps to take and open questions that you can look at and answer. Then send your answers back to the AI. Repeat. That process alone would get you way further than trying to do it by yourself.

You can then tell it to start implementing step 1, and you pick it up from there. It's very much how you would approach an expert for help, but you can always audit.

by nsvd26 hours ago|

Jon Gjengset has some live streams where he does agentic coding.

by dahart4 hours ago|

> If you use it on stuff that you’re pretty good at, it’s not a gamechanger (and if you’re an expert, it’s a minor boost at best).

This was probably true last year, and it’s a common talking point, but I’ve seen too many examples now of deep experts using Claude & Codex in the last year to solve very big problems, and write or rewrite large systems. The experts do complain that the LLMs can sometimes get stuck or go off the rails and they need to pay attention and actively steer. But nobody I know who’s using it is still claiming the LLMs aren’t a game changer, even quite a few people who were staunch holdouts for a long time. I was skeptical myself, for a long time, but had my oh shit moment late last year.

One caveat - to get expert results, you do need to have some experience using LLMs, you need to use it to write plans and design docs, know how to use ‘skills’ and MCPs, use it to review code, and (for now) you need to understand context compaction and when/why to use sub-agents. If you’re a domain expert but an AI noob, it’s less effective than an expert who knows how to use AI and has experience.

One of the biggest problems with humans is we’re wired to spot patterns and draw conclusions and then we have a really hard time seeing and accepting change and updating our mental rules. The LLMs are getting better. They have already gotten better, and they’re going to continue getting better. It’s too early to draw conclusions, and many conclusions people have already declared are out of date and no longer true.

by bawolff17 hours ago|

I think part of it is we often notice bad AI usage. The LLM-generated "art" by someone with bad taste, or the patches to open source projects by people who can't program at all and are terrible.

If the use is half decent, people just don't notice it.

by tstrimple12 hours ago|

Anti-AI zealots (from a practical usability position. Not necessarily the moral ones) are like the people who looked at The Daily WTF and decided no humans are capable of programming. They had plenty of examples to point at, but refuse to look at decent to great programmers. The stories of "The AI deleted my database!" are prevalent and boosted by these folks because it confirms their biases. It literally doesn't matter if the LLM wrote strong warnings about the action about to be taken. They don't see that aspect of it. Just the fact that someone claims "The AI deleted my database!" is enough for them.

Despite all the liars telling me gaming is easier on Linux than Windows, most new games have some sort of issues launching with default settings. CC is able to dive into both the exact error logs and the recent community feedback on what tweaks / configurations are needed to make it work. I rarely have to go beyond two prompts before a game is playable. CC and Proton are enabling the Linux gaming experience far more than Linus ever has or ever was interested in.

by Flere-Imsaho4 hours ago|

> Despite all the liars telling me gaming is easier on Linux than Windows, most new games have some sort of issues launching with default settings. CC is able to dive into both the exact error logs and the recent community feedback on what tweaks / configurations are needed to make it work. I rarely have to go beyond two prompts before a game is playable. CC and Proton are enabling the Linux gaming experience far more than Linus ever has or ever was interested in.

Heh - I've just gone through a similar journey transitioning from Windows to Bazzite to play Steam games on Linux. I wouldn't have bothered pre-LLMs because my day job is Linux/Software and the thought of trying to fix issues here just to play games put me off.

by LouisSayers18 hours ago|

I find it's a huge boost for my day-to-day work.

If you work on architecture and Claude docs, then you can essentially just have it fill in the gaps. Work then mostly becomes a matter of defining what the next piece of functionality is (which you can also use Claude to help with).

The stuff that used to take days now takes hours. It's not perfect, but if you get your codebase into a good shape then the payoff is huge.

by mattmanser12 hours ago|

I re-read something I did 6 months ago doing this.

It's so obviously AI, and it has much less value than I thought now that I look at it with fresh eyes.

Worse, it doesn't read like I wrote it. I don't recognize myself in the doc.

by jorl1717 hours ago|

While I think this is true

> If you use GenAI on things that you couldn't approach alone, it's an incredible tool.

I think this isn't true in all cases

> If you use it on stuff that you're pretty good at, it's not a gamechanger (and if you're an expert, it's a minor boost at best).

I think even then there's a divide.

I mostly work greenfield projects (and love it!). For these, AI has been a literal game changer. Our projects are built faster, with one or two orders of magnitude more automated tests, and all quality metrics are up.

Meanwhile, nearly all of my friends complain that AI doesn't help them. But they mostly work in very large existing codebases.

Still, even in large projects I think AI (the expensive variant) has been a complete gamechanger for me. Sure, I spend a lot on tokens, but I just feel happier and enjoy what I do more. The line people keep repeating about "thinking at a higher abstraction level" is what I feel. I really am thinking about architecture and larger patterns, instead of the boring nitty-gritty (which wasn't boring at all when I was a kid learning to code!...)

I think a key factor in all of this, to me, has been dictation. Most of the time, I don't write -- I use voice-to-text. I don't even read what comes out of it -- the LLMs get it (it is mostly unintelligible to anyone else).

This means when I'm planning a big feature, I give a gigantic brain dump to the LLM in perfect stream of consciousness way, going through ideas, pros and cons, edge cases, what exists, what doesn't exist, where I'm sure of something, where I'm not sure and want the LLM to browse the state-of-the-art. Sometimes I spend 20 minutes just talking to the microphone before I send the first prompt. When I pair that with Opus, I find that I am able to build much faster and to go through alternative designs much more frequently as well.

I keep trying to tell all my friends: use voice to text and braindump to the computer. But they refuse... I couldn't imagine having to type everything nowadays. Even though I'm a fast typer, it's still much slower than the speed of my thought, which, granted, is still faster than the speed of my voice.

In effect, I filter much less, but I've come to think that's positive for the good LLMs: I throw all the edge cases and what ifs I'm thinking about -- all those years of experience dealing with similar systems.

If I wanted to go back to work in-office, that would be my major problem: I need to be able to talk with my computer all the time, loudly, and pacing through my room.

by bthallplz8 hours ago|

Yay for dictation! It's so nice to just think aloud and then have an easily editable record of your thoughts, even when you aren't feeding the outputs to LLMs.

by 400thecat9 hours ago|

How do you use voice-to-text? You mean, in the browser? I am only familiar with Claude Code, which I have installed on a remote server, and there, obviously, voice-to-text does not work. I have to type, which is tiring.

by bigfudge8 hours ago|

I’ve installed Hex on os x. You just hold down a hot key to talk and it writes into whatever text entry widget is focussed.

by jorl172 hours ago|

There are many tools for this, and I use the one that I tried first, so there are probably better-suited alternatives out there.

I run MacWhisper, and I paired it with BetterTouchTool so it triggers on any input when I double tap the fn/globe icon.

Obviously all of my transcriptions through it are entirely local. I usually use the Large V3 Turbo model, though in the beginning I used Parakeet v3, which was slightly faster but produced more mistakes (and kept a lot of filler words -- 'ahhm', 'hummm').

However, if I'm interacting with the Claude or ChatGPT/Codex apps, I often use their voice recognition instead, because it tends to be more accurate, especially with punctuation, albeit significantly slower. OpenAI's is noticeably better than Anthropic's, but I feel like that gap has closed a bit recently (might be all in my head, though).

Like I said I don't really care about mistakes in the transcription. If you try to read it, it feels like a fever dream, but the LLMs get it.

If I say "taken" it may have "take and". If I say "all the while calling the method" it might have "although a while. while. call in the met of". This is a rather extreme example but I've seen them happen. The repetition of words happens because I'm talking with "hums and ahs" and do repeat words or just the ends of words. It's very rare for the models, especially Opus, to have any issue with this transcription. When they do, they tend to signal to me they didn't get it, or I catch them in the act. But, like I said, it really is very very rare.

As an example, I've got quite a significant feature to work on, which would have probably taken me weeks to design and implement, and I've used this exact method today to ink out the plan:

- I have spent the last couple of days researching the feature in my off-time and just "thinking about it in the background" (think: I fall asleep thinking of it -- a habit I've always had)

- I spent ~25 minutes brainstorming out loud. The transcript ended with ~17,000 characters and ~3,000 words.

- I sent that transcript, in Cursor, to Opus 4.6-High with instructions on how to iterate on it and how I want to work while planning

- I then spent about 1.5 hours with it iterating and building the actual plan (and supporting technical decision document, which points at the FULL transcript of the whole interaction). Many of my original ideas made it to the final plan, others got scrapped or simplified, and others still got added. It contains a mixture of my ideas, Opus' ideas and our push-back on "each other".

- Now I have a multi-step plan, with at least 8 distinct stages to implement this massive feature which I know for a fact would have taken me weeks to implement, and I expect to implement it in at most 3 days, but very likely it will be a day and a half.

Final context (with regard to your Claude Code question): My main development environment is Cursor, though for personal projects I also use Codex and Claude Code. For the initial "researching of the feature in my off-time" I often have interactions with ChatGPT and Claude where they have no access to the codebase, and I have them go find out what the state of the art on specific topics is. All of these interactions also involve me using my voice to talk to them (though nowadays I don't typically use their voice mode, I just let them reply in text). Then I brood over that.

by CPLX1 hours ago|

This is exactly my workflow and it’s just incredible. I use aqua and wispr flow depending on which one seems to be returning the best results that day.

by jiggunjer15 hours ago|

[dead]

by dawnerd18 hours ago|

And in a team setting it can really accelerate tech debt especially if used by people that know just enough to be dangerous.

by seventytwo7 hours ago|

The dangerous thing is when you’re a novice and can’t identify the BS. That’s why for people with “good” and “expert” skill, it’s not a huge boost. They can identify the BS, and what’s left is modestly helpful.

The highest danger in using AI comes precisely to people who stand the most to gain from it.

by erikerikson2 hours ago|

I am more of a "huh, interesting demo, I'm gonna check in on it later" sayer than a naysayer. My biggest reason, with coding, is that I already, before AI, struggled to deal with too many distractions from my coding and too many piles of low quality output. I should probably check in since it's been a bit, but every time I've tried to generate some simple project, I look through it and think: what terrible garbage, with so many errors. After two decades of developing my craft, I struggled with most of my fellow human programmers too. The business loves delivering it now, even if someone then has to revisit it hundreds of times to fix it in little bits, for a total effort cost 10-100 (or more) times higher.

by jesse_dot_id18 hours ago|

Same. I'm a DevOps engineer, so a jack of all trades master of none type of guy, and Claude Code backfills my knowledge gaps and turns me into kind of a superhero. I think it's key to already have a pretty good idea of what you're looking at, though.

by zahlman3 hours ago|

A lot of the time people relate an anecdote about how Claude helped do some cool thing, my reaction is that it's not a thing I would have thought about doing in the first place, and that I still can't really imagine wanting to do myself, even though it indeed sounds cool.

This is no exception.

by sntran11 minutes ago|

You will be surprised that there are lots of things you want to do yourself but haven't been able to (not just ability, but time and effort).

by thih92 hours ago|

> I honestly don’t understand AI naysayers.

As an AI naysayer, I see and appreciate the productivity gains, but I don’t like the associated cost, mostly the spike in workflow centralization and opaqueness.

by doctorwho4214 hours ago|

Maybe because the scale of investment outstrips the value?

What trillion dollar problem is AI solving?

by fragmede12 hours ago|

If you're going to put it that way, companies, globally, spend something on the order of $20 trillion on office workers. If corporations didn't have to spend that money on them, and everything else in order to support them, they wouldn't.

by luckystarr8 hours ago|

Then the workers wouldn't spend 20 trillion and the economy as a whole would tank.

by archagon21 hours ago|

[flagged]

by donkey_brains20 hours ago|

Just as bad as the technical debt is the cognitive debt in your codebase. When something breaks, your only recourse is to ask the AI how to fix it, since it wrote it and you did not have time to review all of its code. Except now the code base is so large it won’t fit into the context window, and the AI can’t help you, and…you’re screwed.

by shmoogy20 hours ago|

If you're vibing such complex things, you should probably be in the habit of also generating detailed documentation and commits so the AI can follow breadcrumbs. Add some playbooks for how to debug and it's actually pretty good. It's too complex for local models' context though, so you're probably still correct, albeit there are ways to mitigate or delay this.

by jplusequalt5 hours ago|

>there are ways to mitigate or delay this.

Yeah, like writing the code yourself!

by rvnx23 hours ago|

[flagged]

by jazzyjackson22 hours ago|

I’ll explain it: these tools are non-deterministic and people have different experiences with them. For a few people every interaction is totally fumbled and they think the cheerleaders of gen AI must be lying; for others the chatbot hits one home run after another and lets them add microcontrollers to their CAN bus. When these people’s good luck runs out and they start getting mixed results like the average user, they assert the service must have been downgraded.

by triMichael22 hours ago|

I'll add to that: you are more likely to have a good experience if it has a lot of relevant data that it was trained on. You are also more likely to have a good experience if errors don't cause major issues.

So one-shotting a game of Snake should be great (tons of training data, errors are easily caught because it's a small program). Similar with building a lot of web UI front end, or one-shotting a personal project. On the other hand, I haven't been convinced that it's good enough to maintain large codebases or assist with niche topics that are not very well documented.

by thewebguyd21 hours ago|

> if it has a lot of relevant data that it was trained on

This became evident to me the moment I tried to have these models work on some PowerShell tasks for me. Even Opus today struggles with PowerShell.

Since anything in PS is probably some internal sysadmin tool, there's not much public code out there outside of Microsoft's documentation. Plus the Verb-Noun naming scheme makes it really easy to just hallucinate cmdlets (which it does, often). It's easier to have the LLM just do things in Python using the M365 Graph API than with any of the provided PowerShell cmdlets.

OTOH, I've been using Claude for a lot of Swift & SwiftUI work lately and it has no problems there, and I'd imagine there's even less publicly available training data for that, so to be honest I'm not entirely sure why it fails so badly at PowerShell.

by picofarad13 hours ago|

I have deepseek or grok write bash-likes in pwsh often enough to wonder what sort of things you're doing in pwsh...

I use it to wrap ping.exe with colors and fewer columns, for example. yt-dlp wrapper to fetch 480p bestaudio with English subtitles, no playlist, works on a surprising number of video sites.

It does make cmdlets up, you're right, there.
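For anyone curious what the yt-dlp wrapper described above might look like, here is a minimal sketch in Python rather than pwsh. The function name `build_ytdlp_args` and the `max_height` parameter are made up for illustration, and the format selector is only one plausible reading of "480p bestaudio". It only builds the argument list, so nothing is downloaded until you pass it to something like `subprocess.run`.

```python
def build_ytdlp_args(url: str, max_height: int = 480) -> list[str]:
    """Build a yt-dlp command line: capped video height, best audio,
    English subtitles, and no playlist expansion.

    Hypothetical helper for illustration; not the commenter's actual script.
    """
    # Prefer separate video+audio streams capped at max_height,
    # falling back to a single combined stream under the same cap.
    fmt = f"bestvideo[height<={max_height}]+bestaudio/best[height<={max_height}]"
    return [
        "yt-dlp",
        "-f", fmt,
        "--write-subs",        # save subtitles alongside the video
        "--sub-langs", "en",   # English subtitles only
        "--no-playlist",       # download just the one video, even if the URL is in a playlist
        url,
    ]
```

To actually run it you would do something like `subprocess.run(build_ytdlp_args(url), check=True)`, assuming `yt-dlp` is on your PATH.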

by lowbloodsugar21 hours ago|

> On the other hand, I haven't been convinced that it's good enough to maintain large codebases or assist with niche topics that are not very well documented.

Same is true of humans. So far my experience is that addressing the issue with the help of AI is faster than not (ie comprehending the system and creating the documentation).

by cauch19 hours ago|

I don't understand comments along the lines of "same is true of humans".

This feels a bit like whataboutism.

It also feels like people don't listen to each other.

For example, reading the previous comment, it feels like the thing that reduced the enthusiasm was that at first GenAI looked like it was "reading, understanding and using its own knowledge to answer the problem", but as soon as the situation is more niche or more complex, GenAI looks like it "does not understand the code, just does the equivalent of a StackOverflow search and tries to apply the solutions that it found there, and this is why it felt like it understood the code before".

That does not at all mean that GenAI is not terribly useful, or even better than humans in some situations.

But it feels that answering "same with humans" misses this point: it's the opposite. Humans usually try to understand the code and are bad at covering a very large range of very well documented subjects. That's the "uncanny valley" they talk about: they assumed GenAI performance on a subject X is due to a "human-like" approach, and it feels very strange when this impression falls apart.

by lowbloodsugar13 hours ago|

No, I mean I’m in the camp that believes AI and the human brain are analogous and work the same way. Someone once replied, “then why do I need to supervise them?” and I pointed out that there are people whose job is literally “supervisor”.

by cauch38 minutes ago|

I don't think that is what the parent comment you answered means.

The comment you answered says that, in their experience, AI and the human brain are not analogous: AI is good at storing a large amount of knowledge and repeating it (or extrapolating from patterns in that knowledge), but bad at understanding the code as a human does. That explains why a human is more efficient when dealing with something that doesn't have a lot of documentation (on which the AI built its knowledge).

Humans are bad at storing large amounts of knowledge, and this is why we need supervisors for humans.

AI is bad at understanding new stuff: it needs to be able to connect the new stuff with a lot of examples it has been trained on (this does not mean the stuff is "identical", but "connected"), and this is why we need supervisors for AI.

We need supervisors for both humans and AI, but for different, uncorrelated reasons.

by dyauspitr22 hours ago|

I still don’t get it. I can dictate a prompt, and sometimes I do it so quickly the text looks like a drunken parrot dictated it, and it still always gets exactly what I’m asking for. I’m just going to attribute malice to the naysayers.

by bonoboTP22 hours ago|

Some people are really bad at specifying what they want to ask for. Or they already start prompting with the attitude that it can't possibly work so they don't even really try, or stop at the first failure to point and say how bad it is.

by thewebguyd21 hours ago|

People are really, really bad at specifying what they actually want. I've worked in IT for my whole career, starting in help desk (now an IT manager). My days on the service desk were enough proof that people have no idea what they actually want, or at least, they really struggle to put it into words.

It's the famous "email broken, fix pls" but in the form of an LLM prompt.

by bonoboTP19 hours ago|

Well, today's multimodal LLM agents with tools would at least have a good chance of doing something with even such an underspecified query. Because fixing things is simpler to specify, the agent could look at the config and network settings, send a test email, take a screenshot etc. and get a good idea of what's broken. But when you want some new feature or new app, you can't do without actually asking for specifics, or at least you shouldn't complain if it didn't read your mind correctly. Or at least accept that you have to iterate. I think many average people can get this if they are motivated, and they can incrementally say what they don't like even in vague terms and it can get better. But some just stop without trying to ask for changes.

It can be frustrating to observe people interacting with these things. But it was just as frustrating 20 years ago, so maybe it's just a constant.

by rvnx20 hours ago|

Similarly, doing service desk, the thing that makes me flip the table is how people start by explaining what does not work, instead of explaining what they are trying to do.

by bonoboTP19 hours ago|

It's hard even at the highest levels, such as in writing scientific papers or giving scientific conference talks. People just generally have a hard time stepping outside of their own context and thinking with the head of someone who has a different set of facts and assumptions. It's hard to know how much context you both share, and how to tailor the explanation so you don't start from Adam and Eve but explain just enough context and strip irrelevant tangents.

I don't think this is just about intention and willingness, it's just simply hard.

by skydhash20 hours ago|

Or maybe people see how complex the code is and all the failure points, and don’t feel it’s ethical to use the output. In most of the comments, the most relevant point is that the poster is not an expert in the domain they got help with. While they can observe the result, they don’t have a causal model of the situation.

by camel_gopher22 hours ago|

It’s a probabilistic parrot

by foobarbecue20 hours ago|

What's the difference (stochastic vs probabilistic)?

Or... were you illustrating?

by amelius19 hours ago|

I still would like to hear a public apology from the stochastic parrot crowd for their deceptive framing. Or maybe it was just incompetence.

by trumpdong10 hours ago|

"everyone who doesn't share my opinion is deceptive or maybe incompetent"

by jplusequalt5 hours ago|

>projects I simply could not have ever approached alone.

Learned helplessness.
