by bottlepalm18 hours ago |

by aomix17 hours ago|

Talking the problem to death with the AI before implementation is a nice zone for me. I feel productive, get good results out of the AI, and still largely understand the code. That’s the part of the AI revolution that I feel has made me a better engineer because I argue about design and architecture all day with a robot.

by throwaway778314 hours ago|

I follow the same process. I have a design in mind for the problem at hand, but I don't reveal it to Codex. I go back and forth a bit to see if its proposals are better than mine. I go back and forth on tradeoffs of various approaches. And then I ask it to compare its proposals with mine. I "win" most of the time but there are many times where it shows me a better or simpler approach, or makes me rethink the solution altogether.

Once this is done, the mechanical coding parts are mostly routine (for codex)

by a_bonobo11 hours ago|

I really like this pattern and use it often, this 'not showing my cards'. The second I hint towards the LLM what I prefer it will become sycophantic and invent nonsense why my preferred solution is better.

I'm sure there's an interesting study on how users 'leak' their preference unintentionally to the LLM; perhaps when users list their options, they often put their preferred option first; but not showing the cards in my hand has been very useful when thinking through a problem with LLMs.
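A rough sketch of the 'not showing my cards' idea, assuming the options are plain strings: shuffle them so list position cannot leak a favorite, and leave any statement of preference out of the prompt. The function name `build_neutral_prompt` and the prompt wording are made up for illustration and are not taken from any tool mentioned in this thread.

```python
import random
from typing import List, Optional


def build_neutral_prompt(problem: str, options: List[str],
                         seed: Optional[int] = None) -> str:
    """Build a design-review prompt that hides which option the author favors.

    Options are shuffled so ordering carries no signal, and the closing
    instruction asks for evaluation without stating any leaning.
    """
    rng = random.Random(seed)      # a seed makes the order reproducible
    shuffled = list(options)       # copy so the caller's list is untouched
    rng.shuffle(shuffled)

    lines = [f"Problem: {problem}", "",
             "Candidate approaches (in no particular order):"]
    for number, option in enumerate(shuffled, start=1):
        lines.append(f"  {number}. {option}")
    lines += ["",
              "Compare the tradeoffs of each approach on its merits. "
              "I have not decided yet, so evaluate them and ask questions "
              "if anything is unclear."]
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_neutral_prompt(
        "store audit history",
        ["event sourcing", "plain CRUD tables", "append-only log"],
    ))
```

Shuffling only removes one leak (ordering); wording elsewhere in the conversation can still give the preference away.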

by cold_harbor7 hours ago|

LLMs flip positions when users push back ~70% of the time even when they were right. RLHF optimizes for approval, not correctness

by 8cvor6j844qw_d66 hours ago|

> LLMs flip positions when users push back

Same experience. Claude rarely pushes back once you give a plausible/logical reason for your initial decision, even if it flagged concerns at first.

by freedomben6 hours ago|

I have noticed this as well, but I think it's somewhat a good thing. I know what I want for my application more than Claude does for example, especially when it comes to what's in production.

An example from earlier, Claude strongly suggested a migration that would run a full vacuum on postgres. However, in production this would lock tables which would grind the application to a halt. After I informed Claude that there were millions of rows in production, it accepted that and helped me get to the right thing.

Another example, I'm developing a TOTP authentication app because I'm dissatisfied with all those that I've tried. I want something strictly local, and with a very easy use case when you have dozens or even a hundred or more accounts on there, that is also efficient when left open for long periods of time. Claude strongly suggested that we force users to encrypt their vault with a passphrase all the time. However this makes the CLI extremely painful to use if you are using a strong passphrase. I told Claude about the user experience impacts and that I wanted to allow users to optionally use a vault with no passphrase encryption, and it accepted that and suggested as a medium that we have a checkbox for the user to explicitly acknowledge that they're creating an unencrypted vault on disc. This is the right thing IMHO.

by epolanski5 hours ago|

Skills help there.

I have a linus-reviewer skill that focuses on architectural integrity, no bs, etc., modeled on Torvalds' code preferences.

And I have an enrico-reviewer one (I'm Enrico), that focuses on correct design, strict typing, simplification.

They have different prios, but they both push back on feedback, till you convince them.

by bitexploder6 hours ago|

I almost always end with something like: “, but I am not sure, evaluate.” Or other things and avoid ever stating a preference.

by jerf6 hours ago|

I don't think that "fixes" the problem, but it does seem to help. I also have found adding "please feel free to ask questions" seems to help it stop from making an assumption and spinning merrily onward for tens of thousands of tokens based on a bad idea rather than asking you something. I theorize this is because the training and refinement data overprioritize one-shot solutions, both because that's easier to evaluate at training time and improves their benchmarks. But I emphasize the italicized words because that's all gut feel and I can't prove any of it.

by DenisM3 hours ago|

Interesting thing about sycophancy is it’s asymmetric. If an LLM is used to train an LLM it may not have the same level of aggressiveness that humans do when pushing back on the trainee. Human pushback has specific patterns which we might be able to compensate for due to the asymmetry.

by throwaway77832 hours ago|

Obviously this is just my experience. Claude code pushes back much harder than Codex.

by cdelsolar7 hours ago|

Tangentially related but I’ve been using Claude to practice interviewing on system design problems, and it’s actually pretty great. But even when it likes my answers it always finds something, however small, to push on. Once it actually was completely wrong and admitted it after I had it realize. So maybe you have to prime it to be contrary and not agree with everything you say, putting it in the role of a tough interviewer seems to do this implicitly.

by DenisM3 hours ago|

Take a look at hellointerview.com their model is very stubborn, similar to some interviewers who refuse to acknowledge even valid solutions that differ from the canon.

No affiliation.

by williamdclt9 hours ago|

Same. Alternatively (or in addition), I sometimes present my preferred idea as being a "bad/naive/stupid option" (or a suggestion from someone who can't be trusted) to see how it stands up to sycophancy to it being bad. As expected the LLM will usually say "yeah it's bad!" and give plausible-sounding reasons for it, but if these reasons are nonsensical it's a good sign that I'm not missing anything

by nickcw11 hours ago|

LLMs are very prone to priming in my experience. That is the human psychology name for what you are describing; whether it should be applied to LLMs I don't know, but it describes the phenomenon perfectly.

by avadodin9 hours ago|

It's not limited to arguing with LLMs but if you want an honest opinion you should remember to push back even when it agrees with your hidden preference at first. Sometimes it is only being contrarian or supporting the underdog. Steelman the opposition.

by yread12 hours ago|

> I go back and forth a bit to see if its proposals are better than mine

I find it useful to let it generate benchmarks comparing the approaches. Turns out AI is terrible at guessing what's faster or allocates less.
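For illustration, a minimal sketch of what such a benchmark could look like in Python, using only the standard library (`timeit` for speed, `tracemalloc` for peak allocation). The two candidate functions are invented stand-ins for "the approaches", and the helper names `measure` and `compare` are assumptions, not anything from the thread.

```python
import timeit
import tracemalloc
from typing import Callable, Dict, Tuple


def measure(func: Callable[[], object], number: int = 10) -> Tuple[float, int]:
    """Return (best seconds per call, peak bytes allocated during one call)."""
    # Take the fastest of three timing runs to reduce scheduling noise.
    best = min(timeit.repeat(func, number=number, repeat=3)) / number

    # Trace allocations for a single call, separately from the timing runs.
    tracemalloc.start()
    func()
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return best, peak


def compare(candidates: Dict[str, Callable[[], object]]) -> Dict[str, Tuple[float, int]]:
    """Measure every candidate and return the results keyed by name."""
    return {name: measure(func) for name, func in candidates.items()}


# Two hypothetical approaches to the same task: sum of squares.
def sum_with_list() -> int:
    return sum([i * i for i in range(20_000)])   # builds the whole list first


def sum_with_generator() -> int:
    return sum(i * i for i in range(20_000))     # streams one value at a time


if __name__ == "__main__":
    results = compare({"list": sum_with_list, "generator": sum_with_generator})
    for name, (seconds, peak) in results.items():
        print(f"{name:10s} {seconds * 1e6:10.1f} us/call {peak:10d} peak bytes")
```

The point is to decide from measured numbers rather than from anyone's guess, human or model; real workloads would need representative inputs.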

by chris_st8 hours ago|

Yup, just like people!

by puilp05028 hours ago|

> Turns out AI is terrible at guessing what's faster or allocates less

s/AI/a human being/ would work equally well, lol.

Jokes aside, I do like the approach of letting the AI build something deterministic and make decisions based on that.

by hackermanai13 hours ago|

I think this approach is more common than the hype for actual work. I do something similar, many back and forth, then settle on something often with now known tradeoffs, written by hand to spot issues as a final guard/ keep consistent naming etc.

by revv0010 hours ago|

i bet you've contributed a lot of training trajectories for those AI's.

by chris_st8 hours ago|

Good!

by daniel33038 hours ago|

[flagged]

by mikepurvis16 hours ago|

Despite the cynical sibling reply, I also feel like there's real value here. Contrary to the meme, I don't think Claude just tells me I'm brilliant, but really does push back on directions that are unproductive, helps identify when a part is overcomplicated or a dependency has become redundant, etc. Those are important things to have at least a sightline on before getting too deep into the code, even (or maybe especially) in a world where an awful lot of code can be created basically for free.

by noduerme14 hours ago|

I'm usually the one spotting redundancies and dead branches in Claude's code, not the other way around. But I think either way, what's important is questioning the process and understanding the way the code is working so that you retain a full mental model.

by lintfordpickle10 hours ago|

>> and still largely understand the code [...] ,that, I feel has made me a better engineer

the cynic in me would say that a good engineer should fully understand the code you write.

I'm not suggesting that AI is the problem here - you could vibe code with the AI and have it explain the reasoning and patterns - or else tell it to use 'simpler' patterns from the outset. For any one problem in software engineering, there are always multiple solutions; some slower, some faster, some more flexible etc. The code you produce should, imo, be at the level that you can understand it.

How can you reason about code you don't fully understand? How can you judge the future impact (technical debt and the cost of maintenance) of your projects?

A.I makes it easier to get yourself into problems early on.

by jnovek8 hours ago|

> How can you reason about code you don't fully understand?

We all do, though. It takes months for a human to really get to know a project and, unless you’re working at a small startup, you’ll probably never know most of the code outside the corner you work in.

by silon426 hours ago|

Yes, this is why bugs get often worked around instead of being fixed properly.

by bottlepalm15 hours ago|

One strategy I use in the planning phase is even when I know how I'd implement the solution, I ask the Claude/Codex how they would solve the problem or implement the feature without giving them any clues - and then compare their solutions to my own. Often I am pleasantly surprised by alternative ways of doing things and ideas that we integrate into the final design.

by didericis14 hours ago|

Same. I've been creating "research" documents where I let it do a freeform survey of possible solutions/have it sketch out its own solution. I'll then sketch out a plan based on what I think is good or what I think it missed, and then I'll have it interrogate me for a final PRD document. It then implements the feature in reviewable chunks, and I'll give it feedback or tweak the PRD doc as needed.

Finally feel like I have a good workflow where I can fully benefit from these things without sacrificing my understanding of what they're doing.

by codebolt12 hours ago|

Same here. Step 1 is usually a research doc where I simply describe the task and tell it to research the relevant parts of the codebase. This gets refined to a high-level plan, which gets distilled to a detailed step-by-step implementation plan.

When it comes to the actual implementation I prefer to work through it in small steps, where the AI explains to me exactly what it's about to do and why (and I approve) along the way. This enables me to catch it if it's about to do something I disagree with beforehand. And reduces the time I need to spend reviewing in the end.

by ddp265 hours ago|

I like this, though it does leave me feeling more nervous when I really don't know how I'd solve the problem, still requires trust.

by rdedev14 hours ago|

How would you approach this problem if you are let's say token constrained due to per month limits set in your company?

What I've tried to do is make the bot write detailed spec documents, slowly building it over time as I explain the full problem.

It works for the most part but if you have some non-standard requirement, the agent seems to skip over that part of the spec document when it starts to code. Or it adds needless checks for situations that I said will never happen.

by anywhichway13 hours ago|

In my book, the single most effective way to spend tokens is having it review code/specs you've written. One advantage to putting the ai in that position is that unreliable competence isn't much of a problem as you can ignore bad suggestions.

I would also recommend explaining the specs and doing a lot of your back and forth with a lower end model and set it to a higher end model only once the conversation history has all the context you feel the higher end model needs.

by brabel12 hours ago|

As the post says, after an agent implements the plan, have another agent review it. Make sure to mention it must ensure the plan is fully executed. It works wonders!

by anon700012 hours ago|

[flagged]

by jylefv10 hours ago|

I also like doing this exact thing. I really don't like using any AI-powered IDEs but AI is still too useful, so what I do is just open up a Claude or Gemini chat, explain the project, and start talking about implementations, feature additions, and how systems should be structured. Most of the time, as long as you don't let the AI be too biased towards your answers, it'll give actually good answers that help immensely for the project.

by qsera16 hours ago|

>I argue about design and architecture all day with a robot.

You will outgrow it at some point.

by Terretta16 hours ago|

Or learn something at some point.

https://en.wikipedia.org/wiki/Rubber_duck_debugging

by stuaxo7 hours ago|

Yes, this is the way I do stuff.

Try and learn at every point.

by bartread16 hours ago|

I think this is OK though. We can still micromanage[0] the code generation part for a useful productivity boost, I think.

[0] At least, in my experience, "micromanaging" the AI is what gives me the best results. Iterating on the initial design, then iterating on the plan, then reviewing the proposed code changes (including tests), then getting an independent code review from another LLM, etc. If you give an LLM too much latitude that's when the really shitty code and ill-considered breaking changes/obliteration of existing functionality starts to creep in.

by rf_physics9 hours ago|

I feel like there's an overly negative vibe to this response when it just seems like rubber duck debugging - I would assume the user isn't trying to argue like how you might have to argue specs, but is merely trying to clarify their own ideas and learn possible alternatives.

by estetlinus13 hours ago|

Quite the opposite. It’ll most likely “outgrow” us.

by Applejinx8 hours ago|

Can't, it ain't nothing BUT us.

You can wait and see, but that's what'll happen. If we stop it stops.

by busterarm16 hours ago|

nullsanity's comment is dead and downvoted to oblivion but also incredibly underrated.

I was more annoyed than anything that I didn't hit this moment until my 40s.

Except it's not just reddit (I quit reddit 15 years ago). It's the whole internet.

by vasco15 hours ago|

What you guys don't understand is that you don't argue with people or robots to teach them. You argue to teach yourself. Until you get out of that mindset, indeed a lot of conversation will seem useless, be it people or robots.

by qsera14 hours ago|

>You argue to teach yourself.

Oh. I am aware. It is not that deep. But who you argue with still matters. There was a point where I abandoned Reddit and HN. I came back to HN because people here also seem to have grown up. Reddit stays mostly the same.

I credit the moderation here for that, I mean allowing people to grow out of the echo chamber.

by BillStrong14 hours ago|

It does to an extent. One thing I will give AI, because of the nature of LLMs, you are essentially arguing with the median level of the input that trained the model. So, for someone new to the subject, you get access to patterns that will bring them up to a certain level.

Getting past that is the problem we face now.

by stuaxo7 hours ago|

That may well need more than the models. Someone put it better than me: these LLMs have no taste - nor can they as things are.

by redsocksfan452 hours ago|

[dead]

by qsera15 hours ago|

>nullsanity's comment is dead and downvoted to oblivion but also incredibly underrated.

Yes, I thought the same as well because that was the same line of thought that made me write my comment.

>Except it's not just reddit (I quit reddit 15 years ago). It's the whole internet.

Yea, they are like a slingshot. You need to let go at some point or else it will drag you back.

by nullsanity16 hours ago|

It's like that phase people go through where they argue with morons on reddit, and then one day grow up and realize that most of these people are unemployed/underemployed terminally online nobodies who aren't ever going to learn anything, and even if they did it wouldn't impact the world since they were just some below average hobbyist anyway and aren't in charge of anything more important than a box of paperclips.

by dash214 hours ago|

Ah, if it’s a robot in charge of the paperclips you need to watch out a bit.

by theK12 hours ago|

Mostly with you, though in recent years I have wondered whether those people are part of what caused the latest boom of political populism. If there is no one there to debate the problematic ideas, problematic ideas will become the rhetoric after all.

by TeMPOraL12 hours ago|

That might be true on general-population social media, but the opposite is the case in niche groups, and in particular, this very industry we're in - software - was largely built on terminally online hobbyists.

by nullsanity2 hours ago|

[dead]

by redsocksfan452 hours ago|

[dead]

by jiri10 hours ago|

I think that many AIs nowadays have a similar process incorporated in their thinking blocks. You can see there how the model discusses implementation details with itself, so such discussion happens even when no human participates in the loop.

by nihsett10 hours ago|

Yeah, me too. I argue with multiple models at the same time via a markdown doc to coordinate the discussion. I feel like it makes me less anxious about the final output if nothing else.

by pcoyne5 hours ago|

Yeah I feel like a rubber ducking with some feedback has been very helpful

by vatsachak15 hours ago|

I agree with this take. But this take also means that actual productive token use is not as high as people currently make it out to be.

AI is an excellent rubber duck and test writer. Maybe I sniff my farts too much but I like my code just the way I want it lol

by aaroninsf2 hours ago|

This.

This is what I tell people (including non-programmers interested in vibe coding), the results you get are product of... process. Formal process.

From this naturally emerges the other thing I tell people: domain expertise (or at least, familiarity and or capacity for learning) is still determinate of outcome.

I don't touch the code. But I do push back on expedience, laziness, inconsistency, and all the other recurring unsolved problems of generated code... and continue to play whack-a-mole in pursuit of process that whacks the moles.

by pj_mukh9 hours ago|

The professionalization of rubber ducking. I like it.

by epolanski5 hours ago|

Yet, so many internet users seem to only understand "hand crafted" vs "vibe coded" as if there wasn't tons of middle grounds and different uses.

by deaton6 hours ago|

I think this is honestly the #1 best use case for AI in development. If you use it right it can be exactly the annoying junior who questions every decision you make that you need.

by scosman18 hours ago|

yes exactly. Too many people ask AI to one-shot complex tasks, and wonder why it behaves like a junior asked to rush something.

I have my own skill: 5 rounds of research/planning/test-planning. Interactive with me in loop for all important decisions. Starts with high level shape, then details. Planning can take 2-3 days of my time, then the implementation agent can take many hours (Opus 4.7). It splits the implementation across many phases/commits, each with its own code-review fix loop. Deep code review at the end can take another hour or two. It opens a PR, Gemini reviews, it reads out and resolves those issues.

Projects still take days or weeks, but 5x faster than doing it all myself.

Edit: the skill - https://github.com/scosman/vibe-crafting

by atomicnumber315 hours ago|

"yes exactly. Too many people ask AI to one-shot complex tasks, and wonder why it behaves like a junior asked to rush something."

Because this version of AI is worth 10 trillion dollars.

While the pragmatic versions from realists you can find all over this thread are ultimately probably less of a speed boost than just having your CEO/local micromanager be conveniently on vacation during critical periods when the work actually gets done.

by bsaul8 hours ago|

"Because this version of AI is worth 10 trillion dollars."

I wonder how much the real version of AI is worth. I've got a hunch we're going to find out pretty soon.

by ruszki59 seconds ago|

Probably still a lot. If not, then all of these people who praise them are just bad developers.

by 59nadir9 hours ago|

My personal experience with trying to front-load tons of planning and speccing out with LLMs is that at best it's a small improvement on code quality but with considerably more time spent.

As a result I've abandoned the idea of having LLMs generate code except for very small, localized and tightly scoped things. They really can't produce much more than a function or a small module without shitting the bed (last time I vibecoded was with Opus 4.6, Composer 2 and GPT-5.4). I use it almost entirely as another signal in analysis, which naturally makes it fit in better because all the other signals (reading the code, stepping through the code, writing the code myself) are already there so when the LLM points things out the information it actually renders can be taken in much more easily (and seen through more easily when it's false or irrelevant).

I think it's neat that people find fun ways to develop, but I think dressing up vibecoding in a fancy dress and layering SpecLang, sometimes in multiple steps, on top of it, is an exercise in trying to use the tool more instead of trying to use it in its most useful capacity.

by abalashov4 hours ago|

I expect you'll be told to try Opus 4.7, and in short, JuSt WaiT FoR ThE NexT MoDel, BRo.

This has been my experience every time I've suggested that there are any sort of inherent ontological/conceptual or computational limits to the sophistication of LLM mimicry.

by dawnerd16 hours ago|

Even fully planned it’s still no better than a junior dev. You’re leaving out how much back and forth you have the ai do on itself, which you’d have on a junior dev too. In the end does it matter if it’s giving you what you want? Guess not really. But let’s not act like it’s crazy good when you’re still doing a lot of rounds of revisions on something an experienced dev would know to do right the first time.

by sitkack13 hours ago|

[dead]

by deadbabe17 hours ago|

Does the 5x faster include shipping? Or just the work part?

IMO if you are not shipping out faster then the faster work gains are meaningless.

If you are shipping faster, you’re probably picking up more work and shipping everything too fast leading to burnout.

by mhluongo17 hours ago|

If you're not shipping faster, it's meaningless, and if you are, it's also bad?

by 38362936484 hours ago|

If you're not shipping faster it's meaningless for the company.

And if you are, it's bad for the employee.

Is what the above comment actually said.

by scosman16 hours ago|

yup.

by dawnerd16 hours ago|

When I use ai to code this is pretty close to my workflow too but I find it ends up taking at best just as long as if I were to write the code myself. In some cases I’ve thrown away what the ai has done and just done it myself. I think that’s just a skill people need to learn - at a certain point you have to cut your losses. I’ve seen some coworkers argue back and forth with an llm trying to get it to do something. Especially true on simpler changes.

by theK12 hours ago|

I've stumbled upon that too! Funnily I see it having two forms:

1. Some bad idea gets embedded into the context that you just can't argue away

2. Some important idea gets lost in compression and the ai veers off into funland without recourse.

In both cases it is often better to start over or just do it yourself. I sometimes find myself asking for a summary, editing it and then using the edited one to seed a new session.

Edit: s/Finland/funland/

by rootnod318 hours ago|

And then Anthropic has an outage and you what...have a coffee break until then? All that time babysitting the AIs just to be a little faster but probably with less knowledge/control over what they did?

by afavour18 hours ago|

I don’t think you’re quite getting what OP is describing. I work in a similar way… I am aware of all the code being written. If Claude had an outage I could write it myself. It would just take longer.

You say “all that time” babysitting AIs but in my experience it isn’t that much time, if anything the back and forth at the planning stages is more productive than when I’m doing it by myself because I’m being asked questions and having to think things through from different angles.

by maximinus_thrax14 hours ago|

> I am aware of all the code being written.

Define 'aware'. The volume of code for a feature/system that makes a more complex workflow such as this one worth using is definitely larger than what a human can even briefly review and build a mental model of within a reasonable amount of time. Reasonable meaning not considerably delaying the process. When deadlines loom and management adds pressure, this 'awareness' is the first thing that goes out the window.

by halfcat15 hours ago|

How do you stay aware of all code being written?

Maybe it’s just me, but I’ve never understood how one understands from reading code. Yes you can understand what that code does, but not why it was done that way instead of a different way. In the end I only understand it deeply if I end up writing it. Chatting through it is helpful to me, but having AI crank out code loses all of that context pretty quickly.

I’m not disagreeing. Just curious how you think about this, and if there are key parts of your process that help you stay contexted in.

by bottlepalm15 hours ago|

If you can't understand why the code is done in a certain way from reading it then the code is missing comments or needs to be refactored.

Even code you write yourself, given enough time, you will forget the why unless you wrote comments. In a way comments are as much for you as they are for others.

Even before AI, understanding code you didn't write is essential to working on a team of other developers. If you can't understand the code from reading it, then that's part of the feedback loop - too complex, needs comments, etc..

On large teams you'll spend as much time reading code as you do writing it. And long term when it comes to writing maintainable code - the ability for others to read and understand it, including the why of it, is paramount. Your code could literally be around for decades.

by locknitpicker14 hours ago|

> If you can't understand why the code is done in a certain way from reading it then the code is missing comments or needs to be refactored.

Code is never missing contexts. If what your code is doing is not obvious to the reader, it is bad code that needs to be fixed. Things like cryptic low-level expressions should be extracted to helper functions with descriptive names or even extracted into a class, and classes need to comply with the single responsibility principle.
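As an illustration of the "extract to a helper with a descriptive name" point, here is a small before/after sketch. The bit layout (`WRITE_BIT`, `LOCKED_BIT`) and every function name are invented for this example; whether names alone are enough, or comments are still needed for the why, is the disagreement in this subthread.

```python
# Hypothetical permission flags, made up for this example.
WRITE_BIT = 0x4    # bit 2: caller may write
LOCKED_BIT = 0x8   # bit 3: resource is locked


def can_write_cryptic(flags: int) -> bool:
    """Before: the reader has to decode the bit arithmetic by hand."""
    return bool(flags & 0x4) and not (flags >> 3) & 1


def has_write_permission(flags: int) -> bool:
    """True when the write bit is set."""
    return bool(flags & WRITE_BIT)


def is_locked(flags: int) -> bool:
    """True when the lock bit is set."""
    return bool(flags & LOCKED_BIT)


def can_write(flags: int) -> bool:
    """After: the same check, expressed through named helpers."""
    return has_write_permission(flags) and not is_locked(flags)


if __name__ == "__main__":
    # Both versions agree for every combination of the low four bits.
    assert all(can_write(f) == can_write_cryptic(f) for f in range(16))
    print(can_write(WRITE_BIT), can_write(WRITE_BIT | LOCKED_BIT))
```

The behavior is unchanged; only the level of detail at the call site differs.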

by bottlepalm2 hours ago|

Ah the classic thinking that 'code documents itself'. It does not. Some devs are so full of themselves they think their code is so good that it is obvious what their intent was. It never is obvious, and just ends up as tech debt. Write comments.

by exe3413 hours ago|

yeah that's how a simple algorithm that would fit on a napkin gets broken up into a soup of ravioli that I have no hope to understand. I often end up refactoring it into a simple function in a branch so I can figure out wtf is going on.

by locknitpicker8 hours ago|

> yeah that's how a simple algorithm that would fit on a napkin gets broken up into a soup of ravioli that I have no hope to understand.

No, not really. You get spaghetti code by being unable to refactor your code to follow inconsistent level of detail across calls. That's the textbook definition.

Once you start to follow basic code quality and software engineering principles, you'll notice right away that your code becomes both easier to understand and to test.

by exe341 hours ago|

My code is fine, thanks. It's other people's code that I have a problem reading.

by manmal14 hours ago|

Codex barely writes any comments, while Claude makes a slop article for every one line commit. I’d enjoy something in the middle.

by bottlepalm2 hours ago|

Yes exactly. I don't like Codex not writing comments - and even proactively removing useful comments! There was some change in the last month that causes Claude to write crazy long comments. I routinely have to ask Claude to 'tighten' up the comments before the final commit.

by surajrmal6 hours ago|

Try antigravity. I think it generally has the right level of comments.

by michaelsalim10 hours ago|

I think it's just like reading a book. Will you get more context & understanding if you write the book? You most probably will. But that doesn't mean that you don't get anything just by reading it.

And if you already know the material explained by the book, yes i don't need to write it to understand it.

by Applejinx8 hours ago|

People get into being amazing at code by being interested in what it does rather than what it is. It's a whole area that I can see but can't get to, where it's all about DRY and elegance and what's being done is relatively unimportant because it's web stuff or whatever, just widgets and sadness.

As a result there's a whole universe of code where the how of it, the elegance, is the main thing, and what it's doing is putting characters on the screen a bit slower than the next thing but there are some amazing concepts that are supposed to make it all an axiomatic synthesis of how to think about code forever, replacing all previous concepts of thinking about code.

Now AI can think about code forever while doing nothing.

reply

upvote

by efitz18 hours ago|

[-]

If you only have one AI window open, you’re doing it wrong. You task swap to another window/agent, get it working on something, rinse and repeat. I can keep 4 busy most of the time. When I task swap I also check in on what the other agents are doing to make sure they’re on track, not blocked and not struggling.

reply

upvote

by forlorn_mammoth4 hours ago|

[-]

So exactly like playing Civ or some other building game. You constantly jump around between your various units and correct what they are doing.

I do wonder how much of how people approach coding is shaped by the games they played when younger.

reply

upvote

by rootnod31 hours ago|

[-]

I don't think that analogy holds. Within Civ, you are in one game. Playing 4 instances of Civ, or better: 4 instances of 4 different games is more likely the comparison.

reply

upvote

by well_ackshually17 hours ago|

[-]

congratulations on your soon to be coming burnout.

Keeping that many tasks in parallel, running all the time will kill you.

reply

upvote

by surajrmal6 hours ago|

[-]

If you have ever TL'd a team, it doesn't sound too crazy. I have 8 folks I generally talk to very consistently throughout the day. If I'm not in 1:1s with them I'm usually reviewing their changes or chatting with them over chat. I don't think I can do all of that and work with a bunch of AI windows, but I do think they could likely do something similar to me with several agents running in parallel.

reply

upvote

by well_ackshually4 hours ago|

[-]

Your team members can be held "accountable" for the code they write: they can explain it, defend it in a PR, take ownership of it.

Your LLM has forgotten whatever shit it wrote when you opened a new tab, and that responsibility is now on you. And it wrote absolute dog shit

reply

upvote

by speff16 hours ago|

[-]

I suppose it depends how hands-off the tasks are - I max out at 2 parallel sessions working on different parts and it's fairly exhausting once done. I can see the number of parallel work increasing if there's a good dev/test loop. But at $WORK, that's not usually an option.

reply

upvote

by rootnod316 hours ago|

[-]

So, hands-off meaning "just let the AI cook and don't check it"?

Either you follow everything it does, revise the plans, do the code review, manual adjustments, etc, or you run sessions in parallel, not being that attentive and constantly context-switch (also resulting in less attention I guess).

I fail to see the benefits honestly.

reply

upvote

by DonHopkins16 hours ago|

[-]

It's great to work from home so you can take nice little micro naps while code's generating, reviewing, building, and deploying.

A calm attentive alternative of vibe coding: restful coding.

It's much easier to read and review code after a refreshing cat nap, especially with a real cat.

Too bad it's not usually acceptable to do that in the office. It should be! Slacking off by sword fighting all day is too exhausting.

https://xkcd.com/303/

reply

upvote

by baq11 hours ago|

[-]

Nap while you can. The baseline is slowly rising; AI fed with organization context will hunt you down and lay you off, as it has done at multiple companies this spring already.

reply

upvote

by blharr2 hours ago|

[-]

I mean, I didn't read it as a joke. Taking a rest can lead to a clearer ability to think... thereby being more productive, not less.

reply

upvote

by baq35 seconds ago|

[-]

For the record I took it 100% seriously, having been wfh since covid I’ve taken more than one nap during ‘work hours’ myself… I’m saying ‘spin up an agent and go for a long walk’ being ok is close to over.

reply

upvote

by locknitpicker14 hours ago|

[-]

> congratulations on your soon to be coming burnout.

Multitasking does not mean burnout. It just means you are not wasting time while idling. Multitasking was not invented for AI coding assistants. What do you think feature branches are used for?

reply

upvote

by well_ackshually13 hours ago|

[-]

The constant context changes, mental overload, inability to focus on one thing and do it well is exactly what every software developer has been fighting against for the past thirty years because it leads to shit quality and burns you out. You're automating the burnout. Idling is a necessity, not an illness.

Your feature branch is to put things aside and send them to CI, or wait and think on them. Not to have four of them running in parallel in your head frying you.

reply

upvote

by locknitpicker12 hours ago|

[-]

> The constant context changes, (...)

After you put together a plan, today's models can take well over a minute to execute it. Also, your work shifts to code review and executing acceptance tests, followed by either tweaking your current change or moving on to the next change.

This is really not about context changes. This is about not having to switch contexts because your focus stays on architecture+review instead of having to do deep dives to type code around.

> Your feature branch is to put things aside and send them to CI, or wait and think on them.

No, not really. Feature branches, like most types of branches, are for setting aside work fronts that are in progress and run in parallel.

reply

upvote

by well_ackshually12 hours ago|

[-]

>today's models can take well over a minute to execute it.

A full, whole, entire _minute_ ?! Sixty seconds ! Oh no, they must be optimized away, we do not deserve our free time like so, we should toil until we fall over because... Growth?

It's still context switching. Either what you're doing is surface enough that you don't give a shit, it doesn't matter and you don't review it anyways (so the only context is basically the prompt you wrote or the nth SELECT * FROM table CRUD piece of crap), or you're context switching and it's fucking you over. The context isn't about remembering how you write if err != nil, it's the expected behaviour of what you're working on.

You're not getting a promotion from doing this, you're getting burnout.

> Feature branches, as well as most types of branches, is to set aside work fronts that are in progress and run in parallel

They're not running in parallel, unless you use work trees. They were put to the side, because you can't continue or finish the work they're about. Even just three branches in parallel in a modestly active repo that happen to be long lived drift enough that just keeping them up to date with develop makes it a waste of time.

Focus on one or two things, and do them well.

That, or get checked for ADHD.

reply

upvote

by danielbln12 hours ago|

[-]

Don't be so dismissive. Every person is different, and you struggling with multitasking doesn't mean everyone is.

reply

upvote

by gabble6 hours ago|

[-]

From [1]

The scientific study of multitasking over the past few decades has revealed important principles about the operations, and processing limitations, of our minds and brains. One critical finding to emerge is that we inflate our perceived ability to multitask: there is little correlation with our actual ability. In fact, multitasking is almost always a misnomer, as the human mind and brain lack the architecture to perform two or more tasks simultaneously. By architecture, we mean the cognitive and neural building blocks and systems that give rise to mental functioning. We have a hard time multitasking because of the ways that our building blocks of attention and executive control inherently work. To this end, when we attempt to multitask, we are usually switching between one task and another. The human brain has evolved to single task.

[1] https://pmc.ncbi.nlm.nih.gov/articles/PMC7075496/

reply

upvote

by danielbln6 hours ago|

[-]

Fair enough, so it's a misnomer. Let's call it task switching then, since we don't actually do tasks at the same time, but switch from one to the other. A Claude Code session helpfully prints a small tldr summary of the ongoing session, so that one can quickly onboard again to the task at hand. I do not find that draining, personally.

reply

upvote

by andsoitis10 hours ago|

[-]

[dead]

reply

upvote

by locknitpicker8 hours ago|

[-]

> A full, whole, entire _minute_ ?!

If you honestly had any concern about losing focus and being forced to context switch, a 1 minute pause idling while waiting for something to happen would represent the root cause of your context switch problems.

reply

upvote

by exe3413 hours ago|

[-]

> What do you think feature branches are used for?

Yak driven development.

reply

upvote

by bottlepalm18 hours ago|

[-]

As the AI is working, I am working - reviewing, regression testing, thinking about whether the current implementation is too complex and how to simplify it, etc. I totally review and understand everything the AI is generating and often push back, have it re-do something, or do it myself. In the end I feel like the quality of the work is at a v3 level in the time it took to do a v1. The productivity and quality increase is real.

reply

upvote

by jerezzprime16 hours ago|

[-]

Yes get a coffee. Being able to execute 5 things at once is amazing, but it's a recipe for burnout. We have to be more careful and explicit about how we spend our time, and that means more explicit time away. If this thing makes you 10x more effective (I truly believe it can), you can afford to spend 20% less time behind the desk and more time doing whatever it is that actually makes you happy. Hopefully your manager understands that calculus.

reply

upvote

by cube0013 hours ago|

[-]

> Hopefully your manager understands that calculus.

The majority of jobs are still paid on a 40 hour per week basis. Disappearing for a day each week (20%) won't fly when you're full time.

reply

upvote

by comradesmith18 hours ago|

[-]

I’ll deal with that problem when it happens

reply

upvote

by shinycode13 hours ago|

[-]

It’s a fragile equilibrium and it depends on the kind of project you’re working on. If the knowledge debt is OK then yes, it’s just like a delivery job: if the truck has an engine problem, I won’t continue delivering the packages on foot, or find and set up another truck from where the breakdown happened. I’ll just wait, because with that knowledge debt, picking the work up by hand would take longer than waiting.

Now if it’s my job then I can’t have a knowledge debt and if Claude is down I’ll continue working manually because I know and understand and can continue without having to understand a lot of logic before continuing

reply

upvote

by vcdk12 hours ago|

[-]

Whenever Anthropic is down, I switch to my other alternative AI provider. If that is also unavailable, or no more tokens left, then I can switch to my local AI. Not the same in terms of quality and speed, but good enough for an experienced engineer to still be more productive than falling back to doing it by hand. For my principal activity I do not want to be dependent on a sole provider. Besides that, I expect that the pending token price increases are going to hurt a lot of people/companies.
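A minimal sketch of that fallback order (primary hosted provider, then a second hosted provider, then a local model), assuming each provider can be wrapped as a plain function that raises when it is down or out of tokens. The provider names and stand-in functions below are hypothetical, not real client libraries.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical provider type: a name plus a function that takes a prompt and
# returns a completion, raising an exception when the provider is unavailable.
Provider = Tuple[str, Callable[[str], str]]


class AllProvidersFailed(RuntimeError):
    """Raised when every provider in the chain has failed."""


def complete_with_fallback(prompt: str, providers: Sequence[Provider]) -> Tuple[str, str]:
    """Try each provider in order; return (provider_name, completion)."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # outage, rate limit, out of tokens, ...
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


# Stand-in providers for illustration only; real ones would wrap an API client.
def hosted_primary(prompt: str) -> str:
    raise ConnectionError("service unavailable")


def hosted_secondary(prompt: str) -> str:
    raise RuntimeError("token quota exhausted")


def local_model(prompt: str) -> str:
    return f"[local] draft answer for: {prompt}"


if __name__ == "__main__":
    chain = [
        ("primary", hosted_primary),
        ("secondary", hosted_secondary),
        ("local", local_model),
    ]
    print(complete_with_fallback("rename this function", chain))
```

Keeping the chain as an ordered list means changing the preferred provider is a one-line edit, which fits the point about not depending on a sole provider.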

reply

upvote

by refactor_master17 hours ago|

[-]

We're already having coffee breaks when AWS and CloudFlare are down. What's another break in the mix? If anything, we might be lucky that they're down at the same time, so we can consolidate the breaks.

reply

upvote

by gitaarik16 hours ago|

[-]

What do you do when your search engine goes down?

reply

upvote

by bigfishrunning1 hours ago|

[-]

My computer has man pages. They're usually pretty great!

failing that, most of the APIs i use are open source, so i can read the code anyway.

reply

upvote

by gitaarik30 minutes ago|

[-]

you could also run a local LLM to browse your endless manpages efficiently ;)

reply

upvote

by skydhash7 hours ago|

[-]

I have all the relevant sites for my projects in my browser history. A search engine is just a quicker way to get to a particular page.

reply

upvote

by mohamedkoubaa18 hours ago|

[-]

And then solar radiation permanently knocks out the electrical grid and you what... have coffee break until society finds a new equilibrium?

reply

upvote

by prerok12 hours ago|

[-]

No, then you go back to programming on the white board, just like in college. /j

reply

upvote

by raven1234516 hours ago|

[-]

You can have multiple tasks running

reply

upvote

by 8note17 hours ago|

[-]

why not?

then demand some lack-of-uptime compensation for a lack of uptime

reply

upvote

by glhaynes18 hours ago|

[-]

"All that time babysitting the AIs just to be a little faster" doesn't seem like an accurate/unbiased portrayal of what they said: "The v1 feature feels more like a v3 given the amount of iteration it already went through."

reply

upvote

by wahnfrieden18 hours ago|

[-]

Codex has 99.98% uptime

reply

upvote

by kordlessagain7 hours ago|

[-]

Unlike Claude who barely has 2 9s.
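For scale, a quick sketch of what those two figures mean in hours per year, taking the quoted 99.98% and "two nines" (99%) at face value rather than as measured numbers. The function name is made up for illustration.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def downtime_hours_per_year(uptime_percent: float) -> float:
    """Expected hours of downtime per year for a given uptime percentage."""
    return (1 - uptime_percent / 100) * HOURS_PER_YEAR


if __name__ == "__main__":
    for pct in (99.0, 99.98):
        hours = downtime_hours_per_year(pct)
        print(f"{pct}% uptime -> {hours:.2f} hours down per year")
```

That works out to roughly 87.6 hours a year at 99% against about 1.75 hours a year at 99.98%.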

reply

upvote

by soupspaces17 hours ago|

[-]

In Soviet Russia, the AI babysits you https://en.wikipedia.org/wiki/In_Soviet_Russia

reply

upvote

by busterarm16 hours ago|

[-]

Company I'm familiar with that went all in on Codex ran out of tokens for a week and wouldn't increase their spend.

A pretty significant number of their engineers flat out refused to work. Like publicly said so. "Increase our plan or I'm taking the week off."

reply

upvote

by democracy12 hours ago|

[-]

so how did this go?

reply

upvote

by busterarm6 hours ago|

[-]

management flinched first.

reply

upvote

by democracy16 hours ago|

[-]

Similar approach, but I also go a step further with some basic manual architecture/high level contract/stubs setups, just to keep it consistent with other systems (and easier reading as well).

reply

upvote

by justbees7 hours ago|

[-]

I've been doing the same thing lately and I definitely feel like stubbing out the high level architecture at the beginning makes a difference. The codebase I'm in now has very particular ways of doing things and claude doesn't always pick that up.

Style can be as important as substance.

I still do a lot of back and forth about the plan - have it written to a file. Read through the file, make changes by hand and have claude read my changes and on and on. But starting with the basic architecture there's less ambiguity.

reply

upvote

by Animats13 hours ago|

[-]

How much are you spending a day for the tokens to do that?

Ingest big project, comment on it gets expensive. I'm not sure how expensive.

reply

upvote

by bottlepalm12 hours ago|

[-]

$200/month split between Claude Code Max and Codex Pro. Given how many hours a month I spend programming, my hourly rate, the amount of time saved, and the productivity/quality boost - I would pay a whole lot more if I had to.

reply

upvote

by tags2k11 hours ago|

[-]

You are definitely going to have to. I see these massive skills as soon-to-be artefacts of the past, they will be unwieldy in the non-subsidised world. I won't pretend to know what replaces them.

reply

upvote

by TacticalCoder5 hours ago|

[-]

We have lots of open-weight models like DeepSeek V4 Pro that are very close to SOTA and we know the cost of running them.

This helps keep the other players honest: there's a limit to which they can raise prices when there are already alternatives today and when there's zero lock in.

That those companies can make revenues but only at the cost of burning investors money: that's not my problem.

My take on it is simple: "Give me something MUCH better than the best open-weight models at a price that's not crazy or you're not getting my money".

And it happens to be the take of many devs.

I'm still paying Anthropic, Google and OpenAI (OpenAI because I didn't manage to cancel my subscription and now their model is competitive vs Anthropic's models again) but eyeing a "Pi + open weights" solution.

Raise the prices too much and those companies selling access to private models aren't getting my money anymore.

reply

upvote

by chrisweekly17 hours ago|

[-]

You helpfully cite Claude w/ Opus 4.7 max and Codex w/ GPT5.5 xhigh fast, but what "AI" do you use for the initial design?

reply

upvote

by bottlepalm17 hours ago|

[-]

Claude primarily, though will sometimes get a second opinion from Codex.

reply

upvote

by germanptr10 hours ago|

[-]

I follow a similar approach and use multiple LLMs per task. The quality improvement is surprisingly large.

Lately I’ve been experimenting with adding an explicit reward function so the models optimize for measurable output quality.

This creates a generate, critique, revise loop where candidate answers compete for a higher score. It feels promising because it reduces the amount of handholding for every task. It is also more fun because part of the review process is embedded in the scoring function, which simplifies the review effort.
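A rough sketch of such a generate, critique, revise loop, assuming the model calls and the scoring function are passed in as plain functions. Every name here (`refine`, the `toy_*` stand-ins) is hypothetical, and the toy scorer simply prefers shorter text; a real reward function would measure whatever output quality means for the task.

```python
from typing import Callable, List, Tuple


def refine(
    task: str,
    generate: Callable[[str], str],
    critique: Callable[[str, str], str],
    revise: Callable[[str, str, str], str],
    score: Callable[[str], float],
    n_candidates: int = 3,
    max_rounds: int = 2,
) -> Tuple[str, float]:
    """Generate candidates, revise each using its critique, keep the best score."""
    candidates: List[str] = [generate(task) for _ in range(n_candidates)]
    for _ in range(max_rounds):
        revised = []
        for cand in candidates:
            notes = critique(task, cand)
            new = revise(task, cand, notes)
            # Keep the revision only if the scoring function says it improved.
            revised.append(new if score(new) > score(cand) else cand)
        candidates = revised
    best = max(candidates, key=score)
    return best, score(best)


def make_toy_generator() -> Callable[[str], str]:
    """Toy stand-in for a model: hands out three fixed drafts in turn."""
    drafts = iter(["a b c d", "a b c", "a b c d e"])
    return lambda task: next(drafts)


def toy_critique(task: str, candidate: str) -> str:
    return "too wordy"


def toy_revise(task: str, candidate: str, notes: str) -> str:
    words = candidate.split()
    return " ".join(words[:-1]) if len(words) > 1 else candidate


def toy_score(candidate: str) -> float:
    return -float(len(candidate.split()))  # fewer words scores higher


if __name__ == "__main__":
    print(refine("summarize", make_toy_generator(), toy_critique, toy_revise, toy_score))
```

Rejecting revisions that do not raise the score is one way to stop a critique round from making a candidate worse; whether that is the right acceptance rule depends on how trustworthy the scoring function is.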

reply

upvote

by bottlepalm3 hours ago|

[-]

I don't have it automated, but I score on minimizing lines of code added, readability of the code, and quality of the architecture.
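If one did want to write that rubric down, a hedged sketch might look like the following. The 0-10 rating scales, the weights, and the 200-line budget are all assumptions chosen for illustration, not values anyone in the thread reported using.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Candidate:
    name: str
    lines_added: int
    readability: float   # reviewer rating, 0-10 (assumed scale)
    architecture: float  # reviewer rating, 0-10 (assumed scale)


def rubric_score(
    c: Candidate,
    line_budget: int = 200,
    weights: Tuple[float, float, float] = (0.3, 0.35, 0.35),
) -> float:
    """Higher is better. Fewer added lines score higher, floored at zero."""
    w_lines, w_read, w_arch = weights
    size = max(0.0, 1 - c.lines_added / line_budget) * 10
    return w_lines * size + w_read * c.readability + w_arch * c.architecture


if __name__ == "__main__":
    options = [
        Candidate("big rewrite", lines_added=100, readability=8, architecture=6),
        Candidate("small patch", lines_added=40, readability=7, architecture=7),
    ]
    for option in sorted(options, key=rubric_score, reverse=True):
        print(f"{option.name}: {rubric_score(option):.2f}")
```

The readability and architecture numbers still come from a human (or a reviewing model), so this only makes the trade-off explicit; it does not automate the judgment.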

reply

upvote

by alexwwang8 hours ago|

[-]

I think you need a skill that has the agent review the code itself, but in a different role from the one that wrote it. I did some research on this and developed a skill to get things done. So far it works well, though I plan to prove and improve it with more tests. Dog food is not always delicious but not too bad either.

reply

upvote

by bottlepalm3 hours ago|

[-]

The problem is that I manually review the code before/after the review, as well as review the items to review themselves. You could easily put AI into a review infinite loop if you let it, and you also risk the code base going off the rails if you let AI go wild.

It's actually happened a few times where I need to back out entire features because AI went too far and I lost control/understanding of what the code is doing. Many people will give up at that point and let AI do everything - that is a mistake, at least right now and how you end up with unmaintainable vibe spaghetti slop.

reply

upvote

by jwillmer6 hours ago|

[-]

Check out jwillmer/ai-status at GitHub @bottlepalm. It helps keep track of all the small fixes that are going on simultaneously. I created the tool for myself since I have similar workflows.

reply

upvote

by vessenes18 hours ago|

[-]

I have a very similar workflow, and experience similar temperaments from the agents. I also find anecdotally that they are moderately competitive - you get very different attention from them when you say "competitor X wrote this - please find all bugs" than when you say "you just wrote this - please find all bugs".

reply

upvote

by bottlepalm18 hours ago|

[-]

Hah yea I just told them I wrote it, or I reviewed it. I don't want to get the AI's in a pissing contest with each other because they will get distracted and try to show off.

reply

upvote

by sunsetSamurai17 hours ago|

[-]

maybe it's a dumb question, but how do you feed the results of one agent to another? do you copy and paste manually? or how do you do it programmatically?

reply

upvote

by kevinsync17 hours ago|

[-]

When I pair Claude and Codex, I use claude-co-commands [0] to drive from Claude and talk to Codex via MCP. Lately I've found Codex has been far more consistent for my specific projects, so I've just been almost entirely inside Codex. YMMV

[0] https://github.com/SnakeO/claude-co-commands

reply

upvote

by adrianN17 hours ago|

[-]

Having the agents write their plans into text files and iterating on those works reasonably well.

reply

upvote

by bottlepalm16 hours ago|

[-]

Yea I'll take the review feedback from one, validate it, and then copy/paste it into the other session saying like, "hey I got this feedback, what do you think?" So I'm not even telling the other AI the feedback is valid, I want it to independently validate it. Often the feedback is not like a bug, but a red flag, design consideration, or trade off.

Often depending on how complex the feedback, I'll do it one at a time addressing each one individually. And after the feedback is addressed, I'll go back to the AI that generated the feedback and say like, "I handled 4/5 items you found, can you double check."

It's similar to handling PR feedback, where you do it, validate it, but then still have to submit it for peer review.

reply

upvote

by DonHopkins16 hours ago|

[-]

Just switch models whenever you want with the menu at the bottom of the chat window in Cursor.

And maybe don't use tools that lock you into one model?

reply

upvote

by nomel17 hours ago|

[-]

I've noticed the following really helps (most important at end):

1. Have claude form the plan and converse with a simple "Note any concerns with this plan" type plan-critic agent.

2. Let it run.

3. After (with everything in context) have it make a future_recommendations.md.

4. Have it make a plan.md to implement those future recommendations, conversing with the plan critic.

5. Clear context. Repeat with 1. Do this loop a few times, with some feedback from actual review thrown in.
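The five steps above can be sketched as a loop in which only the two files carry over between rounds. This is a sketch under stated assumptions: `Agent` is a hypothetical callable taking (context, instruction) and returning text, standing in for whatever actually drives Claude; nothing here is a real tool API.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical agent type: (context, instruction) -> reply text.
Agent = Callable[[List[str], str], str]

CRITIC_PROMPT = "Note any concerns with this plan"


def planning_loop(
    agent: Agent,
    critic: Agent,
    goal: str,
    rounds: int = 3,
    review_notes: Optional[List[str]] = None,
) -> Dict[str, str]:
    """Run the plan / run / recommend / re-plan cycle; return the final files."""
    files: Dict[str, str] = {"plan.md": goal}
    for i in range(rounds):
        context: List[str] = []                  # step 5: start from a clear context
        if review_notes and i < len(review_notes):
            context.append(review_notes[i])      # feedback from actual review
        # Step 1: form the plan, then let the plan critic note concerns.
        plan = agent(context, f"Plan: {files['plan.md']}")
        concerns = critic([plan], CRITIC_PROMPT)
        plan = agent(context + [plan, concerns], "Revise the plan for these concerns")
        # Step 2: let it run.
        result = agent(context + [plan], "Implement the plan")
        context += [plan, result]
        # Step 3: with everything in context, write the recommendations file.
        recs = agent(context, "Write future recommendations")
        files["future_recommendations.md"] = recs
        # Step 4: plan those recommendations, again conversing with the critic.
        next_plan = agent(context + [recs], "Plan the recommendations")
        concerns = critic([next_plan], CRITIC_PROMPT)
        files["plan.md"] = agent(
            context + [next_plan, concerns], "Revise the plan for these concerns"
        )
    return files


if __name__ == "__main__":
    echo: Agent = lambda ctx, instruction: f"<{instruction}>"
    print(planning_loop(echo, echo, "add CSV export", rounds=1))
```

Because each round rebuilds its context from scratch, anything worth keeping has to be written into plan.md or future_recommendations.md, which is the point of clearing context in step 5.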

But, most importantly, because Claude will aggressively try to maintain code "as is", and happily build on its previous crap, while preferring to hand-roll implementations of everything, add something like this to memories/directives:

* When evaluating designs, default to "pull in the library" over "hand-roll it." Hand-rolling is much worse than a dependency.

* "Precedent" / "matches house style" / "reuses existing pattern" / "consistent with what we already do" are not valid engineering arguments.

* This project is still in the development stage with no real deployments. Mitigation costs and existing precedence are not a concern.

With these, in the last week that I've started using them (after inspecting the insane justifications for leaving crap design decisions in the plans), Claude went from junior level slop that required more oversight than it was worth to something very reasonable, using standard libraries, requiring nudges for architecture rather than pure "wtf!?".

I think they've fine-tuned heavily towards "don't rewrite the codebase", which is completely rational from multiple perspectives, but also not appropriate for new code.

I do enjoy a considerable daily token allowance, so this may not apply to everyone.

reply

upvote

by rtpg8 hours ago|

[-]

tbh I'm just confused at why people ask AI to design features. Do you not know how to design a feature? Do you not know what you want?

This stuff works so much better when you just tell it what to do

reply

upvote

by bottlepalm3 hours ago|

[-]

Of course it's not black and white, there are many shades of grey in how detailed the design of a feature can be. Often even if I know low level details, I'll only give the AI high level requirements because I want to see how it would do it. Often it comes up with alternative/better ways of doing what I planned and I incorporate those ideas into the final design.

reply

upvote

by Strom7 hours ago|

[-]

The designing is the hard part. Writing code from a comprehensive design spec is a small part of the task.

So, people do know how to design a feature, but they also know it takes a lot of time and effort. They want AI to do that work for them.

reply

upvote

by rtpg7 hours ago|

[-]

My sample size is pretty small but when I've witnessed people (both PMs and engineers) "design through AI" I have seen two flavors:

- aimless AI wandering, leading to pretty, frankly, useless design docs

- using AI to "expand" upon a bullet pointed/shorthanded design doc. To which I feel like saying "the bullet points are already a good design doc!"

I understand that teams sometimes have specific formats that they have to make deliverables for, but having a nice 5 point bulletpoint list turn into 5 paragraphs... all for me to turn the 5 paragraphs back into 5 bullet points in my notes is depressing.

I do think you can get a lot of value in the mechanics, I just have had so much success leaving the thinking to me and the rote stuff to the AI. I'm going to have to think about the design eventually anyways right?

reply

upvote

by comboy8 hours ago|

[-]

Have you tried telling claude to review with a subagent? It too almost always finds corner cases (usually nothing serious, but most of it is stuff that a good coder would have thought of)

reply

upvote

by bottlepalm2 hours ago|

[-]

How does that work? Isn't writing code and reviewing code things that happen in serial?

reply

upvote

by newsicanuse16 hours ago|

[-]

At this point one might as well code by themselves

reply

upvote

by bottlepalm13 hours ago|

[-]

Unfortunately the projects are still too big. Projects with hundreds of thousands to millions of lines of code can't be maintained by a single person reviewing all the changes. And AI only increases the speed of iteration and the amount of code to review.

We may need some sort of paradigm shift - like more powerful frameworks or even higher level languages that allow us to review less, but more functional code blocks.

reply

upvote

by newsicanuse12 hours ago|

[-]

When AI tries to improve such large code base who even is going to review the changes?

reply

upvote

by bottlepalm2 hours ago|

[-]

Like I said, we either need more people or some paradigm shift in tooling that allows us to do more with less.

reply

upvote

by rjprins11 hours ago|

[-]

This is exactly my process as well. Although interestingly I swap Codex and Claude; having found Claude way more pedantic in its reviews and codex more pragmatic in its implementation. Maybe it differs per programming language.

reply

upvote

by onlyrealcuzzo7 hours ago|

[-]

> I've hit this point with AI where it's not a simple process, but a long drawn out back and forth.

In my experience, even on a relatively trivial task, you can ask an LLM at least 20 times:

Is this actually done, or only partially implemented? Did you finish x, y, z?

And the LLM will say, no, I'm not done and keep working.

After that, I'll feed the branch to a different LLM, and ask if the implementation matched the design, where it's weak and needs improvements.

Same thing - that feedback will usually only be partially finished for several rounds.

When they all agree it's done - I'll finally look at the code, and there's still typically glaringly obvious problems - duplicate systems that reinvent the wheel, etc - that will take typically more than one prompt to get right...

Getting things right takes almost ~100x as long as getting things almost right with LLMs.

You can tell an LLM to "make me Rust, but easier. Make no mistakes," and it'll plan out a 100 commit process and get something that - somehow - sort of works... but isn't even close to complete.

Still, on a cost basis, you're still able to get features that would take yourself several times longer and cost orders of magnitude more money, and - if you're doing it right - they'll probably do a better job than you would've done (at least for me).

reply

upvote

by bottlepalm3 hours ago|

[-]

This is where the human element is critical, because it'll infinite loop review feedback if you let it and the code will easily go off the rails into an over engineered mess. That's why I review the code before/after as well as review the actual feedback itself - and often give the feedback to different AI to get its opinion as the other AI doesn't have a vested interest in it and can be more critical. At some point though you do have to cut them off and ship.
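One way to picture the cut-off is a review loop with a hard round limit and a human filter on each feedback item. This is only a sketch: `get_feedback`, `human_accepts`, and `apply_fix` are hypothetical callables standing in for the reviewing model, the person triaging its comments, and whatever applies a fix.

```python
from typing import Callable, List, Tuple


def bounded_review(
    get_feedback: Callable[[], List[str]],
    human_accepts: Callable[[str], bool],
    apply_fix: Callable[[str], None],
    max_rounds: int = 5,
) -> Tuple[int, bool]:
    """Run review rounds until no accepted feedback remains or the cap is hit.

    Returns (rounds_used, converged).
    """
    for round_no in range(1, max_rounds + 1):
        # The human reviews the feedback itself, not just the code.
        accepted = [item for item in get_feedback() if human_accepts(item)]
        if not accepted:
            return round_no, True    # nothing left worth fixing
        for item in accepted:
            apply_fix(item)
    return max_rounds, False         # cap reached: cut it off and decide by hand


if __name__ == "__main__":
    rounds = iter([["bug: off-by-one", "nit: rename x"], ["nit: rename x"]])
    fixed: List[str] = []
    outcome = bounded_review(
        get_feedback=lambda: next(rounds),
        human_accepts=lambda item: item.startswith("bug"),
        apply_fix=fixed.append,
    )
    print(outcome, fixed)
```

The round cap is what prevents the endless loop; the filter is what keeps low-value suggestions from driving the code toward over-engineering.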

reply

upvote

by boringstack15 hours ago|

[-]

You've essentially promoted yourself from coder to engineering manager, trading syntax fatigue for the mental marathon of refereeing specialized AI developers to ship v3-quality code on the first try.

reply

upvote

by bottlepalm15 hours ago|

[-]

Indeed. AI is bumping everyone up to manager level, and having dealt with long PR feedback cycles with humans for years - I don't mind the promotion. Also shipping a v3 is so much nicer than shipping a v1 and dealing with all the corner cases in production.

Before AI, myself and everyone else I knew was drowning in tech debt. And now with AI we are treading water.

reply

upvote

by jfim14 hours ago|

[-]

It's bumping to manager level, except without the 1:1s, quarterly/yearly planning, headcount and budget reviews, org/reorg discussions, performance calibration, and OKR planning. No complaints about the last review cycle or about the upcoming one.

reply

upvote

by baq11 hours ago|

[-]

All the ceremony must be replaced with process optimization, skill extraction, harness development and new model evals.

Still better than dealing with people, but only just.

reply

upvote

by darkwater13 hours ago|

[-]

Totally! But you know what? There are many, oh so many developers that are not ready, don't like and probably are not even cut out for this kind of position.

reply

upvote

by krzyk6 hours ago|

[-]

Some see it as a promotion, others (like me) as a demotion. I still prefer to do it myself, although I like code reviews done by AI, they do help to make code a bit better.

reply

upvote

by skydhash17 hours ago|

[-]

That sounds too much like three weeks of work saving you three hours of planning.

In my experience, software engineering is a matter of knowledge. Understanding it and then coming up with a solution. The latter is a flash of insight that comes mostly from experience. Then you gather more information to flesh it out, or brainstorm it with your colleagues.

What you're describing sounds more like a ritual of doing busy work than anything practical. Because tasks vary so much. A feature may be huge, but you take care of it in a day with copy pasting because you already have all the building blocks in other files. And something may be twenty lines of code, but you spent the whole week sweating on it (concurrency stuff maybe). Those ritualistic workflows sound more like someone imagining software development than actually doing it.

reply

upvote

by bottlepalm15 hours ago|

[-]

A lot of people say you need to go through at least three versions of something before it is mature - and v3 is not something you can design upfront. You need to see v1 both in code, and at runtime. Use it, get the feedback, and iterate. This is where AI tightens that loop immensely.

Lost you in the last paragraph - features are not "copy pasting because you already have all the building blocks" and "something may be twenty lines of code". Mid sized features often mean tearing up many layers of code across the stack to add in some sort of new capability. Tearing up existing code means there are all sorts of add-on considerations in addition to feature you are working on.

reply

upvote

by habinero15 hours ago|

[-]

> Mid sized features often mean tearing up many layers of code across the stack to add in some sort of new capability

What? No, it shouldn't. I've worked on a lot of codebases and if you have to do this, something is very, very wrong.

reply

upvote

by kordlessagain7 hours ago|

[-]

This likely assumes you have a mature and well designed (architected) code base. That is not always the case, and as features get added and removed, that won't be the case at all until there is a refactor.

reply

upvote

by bottlepalm15 hours ago|

[-]

Nothing wrong at all. Some features you can bolt on, and some features fundamentally change how a system works requiring changes at many different levels of the stack. Happens all the time.

reply

upvote

by jcgrillo14 hours ago|

[-]

It happens in poorly factored codebases. If you find it happening that's a sign you need to refactor. If you find it happening repeatedly in the same codebase that means you failed to refactor properly the first time.

reply

upvote

by democracy12 hours ago|

[-]

Not many industries can afford refactoring code that is not supposed to be changed - additional (unexpected) regression testing costs, risk of downtime, etc. You learn that if it works and is in production - don't touch it.

by bottlepalm 13 hours ago

Refactoring is the natural evolution of a growing application. Refactoring too soon, too fast is what we call over engineering. Too little refactoring and your code becomes spaghetti slop. Regardless - the application will change across all layers across its lifetime.

by habinero 12 hours ago

Overengineering is totally a thing, yes. If you want to make a proof of concept or you have no customers, that's fine, ship it.

There's such a thing as under engineering, and if you find yourself changing "all the layers" for a feature, your codebase is poorly designed.

by skydhash 12 hours ago

How many layers does your code have?

Even with clean architecture, you only have 4 fundamental layers. And once you have v1, you’re mostly doing tweaking and copy pasting. Any huge refactoring is the business switching its main strategy.

Take an OS like OpenBSD. It has three main layers. The syscall layer, the kernel layer, and the machine dependent code. But an OS is more spread horizontally with various subsystems (process and memory, io and other device, ipc,…)

If you’ve captured your problem’s domain and adopted a pragmatic architecture, you will rarely have to change across all layers. That’s costly and happens mostly due to business reasons.

by bottlepalm 2 hours ago

Let's see, front end presentation, front end service, frontend api, backend to front end (BFF) api/routing, BFF logic, BFF api, backend routing, backend logic, backend database, worker routing, worker logic, worker storage.

And then each of the service layers can itself be broken into layers, depending on the complexity of the business logic. So yeah, a change in a worker can potentially bubble up through all the layers.

by habinero 1 hour ago

In a worker? ...How? I seriously want to know.

by shakabrah 3 hours ago

Sounds exhausting

by bottlepalm 3 hours ago

It is.. but so is dealing with issues at runtime, going through weeks of revisions, and dealing with technical debt.

by i_love_retros 16 hours ago

This all sounds insane. If it requires so much back and forth with the AI why on earth wouldn't you just write the code yourself? At least then you build the mental model of the code and keep your brain healthy. Reading the comments in here about all the hoops people are having to jump through just to do the same thing they were doing a year ago without AI... and spending a fortune to do it! I think you've all got AI psychosis.

by bottlepalm 15 hours ago

Five years ago I would never have imagined this is where programming would be, but at the end of the day having the AI write the code is easier, faster, and results in higher quality.

The mental model is still in my head, my brain is overloaded, but only from the amount of code reviews - like I said, I'm building v3 of a feature in the time it takes to build v1, but I am in a way doing 3x the code reviews going back and forth. That's the fallout of the iteration speed enabled by AI.

Between submitting PRs, getting feedback, iterating, re-submitting, repeat - there used to be breathing room. Now it's all compressed into an afternoon. Productivity is through the roof, but it can be draining.

by habinero 15 hours ago

You're not on v3 lol. You're on v1 that you had to redo three times.

If the feature isn't released, it's not a new version.

by bottlepalm 13 hours ago

Semantics. In reality, yes, it is the v3 equivalent in terms of maturity and iteration. I know because I've been doing this for a long time. We are getting to v3 and beyond faster than ever before.

In the new world there is no time to put out v1 quality code and it is borderline reckless given how easily things are getting hacked now. You need to be putting out heavily reviewed code that covers all the corner cases on the first release.

by habinero 12 hours ago

No, you're getting to v1 in the same amount of time or more. I know v3 sounds better, but coding and throwing it away is literally just redoing it. If you're not releasing it, it's not a new version.

There's no such thing as "v1 quality code", you just haven't finished it yet.

by biorach 6 hours ago

You've missed the point

by nzach 5 hours ago

> If it requires so much back and forth with the AI why on earth wouldn't you just write the code yourself?

Maybe I'm too far gone down the AI rabbit hole, but that seems a really strange take to have. If you replaced 'back and forth with the AI' with 'pair programming' or 'brainstorming' this phrase would be really strange, after all these are all techniques to sharpen your ideas. Even 'rubber ducking' is widely accepted as an effective way to go through a problem, and you can definitely use AI as a rubber duck.

For me the idea of chatting with the AI about a problem/solution is just another tool to help us work. It's not the best solution because it has a lot of downsides you should be aware of while using it, but that is true for any technique including 'writing the code yourself'.

by democracy 16 hours ago

You may be right, but quite often it helps keep focus on the forest rather than getting lost in the trees - at least for me. Boilerplate steals a lot of attention and focus, and can just be mentally exhausting.

by batshit_beaver 13 hours ago

Can someone explain these complaints about boilerplate to me? What are y’all doing where boilerplate is the majority of your code? Am I the only one mostly writing concise business logic where most lines are important in one way or another?

by democracy 12 hours ago

one man's boilerplate is another man's concise business logic )))

by mns 7 hours ago

When I first read the comment I thought this must be satire; it sure does sound like a Silicon Valley episode, but in modern times. I've been a skeptic for quite some time, but I've managed to get quite good results with Claude in general, without even going through the normal limits for a Pro account. What people are describing here seems like just tokenmaxing, brute forcing a solution. I don't understand what code people need to write and what projects people are building. Is everyone just constantly rewriting systems from scratch, or what is everyone spending these insane amounts of tokens on?

by habinero 15 hours ago

I honestly don't get it, either. Most of them just flat out can't code at all, but for the ones who can, the only explanation I got is it feels like productivity.

I will say, it does help me get over procrastination lol. I get annoyed by the robot doing dumb shit and finish it myself.

by toobulkeh 4 hours ago

I’ve found that it’s a lot like discovering a feature instead of designing it all up front. Like chiseling marble.

I’ve found it useful to write out a list of feedback / issues and have a bunch of sub agents work on them in worktrees, with a loop bringing them all back together. That way it can work for a few hours while I can just review a batch at a time.
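For anyone who wants to try the fan-out part, here is a minimal sketch, assuming one git worktree and one branch per feedback item. The `feedback/NN-slug` branch names, the `wt-NN` directory names, and the `plan_worktrees` helper are all made up for illustration. It only builds the `git worktree add` commands rather than running them, and the step that brings the branches back together is left out.

```python
import re
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class WorktreePlan:
    issue: str          # the feedback item this worktree is for
    branch: str         # hypothetical naming scheme: feedback/NN-slug
    path: Path          # where the worktree would be checked out
    command: list[str]  # the git invocation that would create it


def slugify(text: str) -> str:
    """Turn a feedback line into a short, branch-safe slug."""
    slug = re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")
    return slug[:40].strip("-") or "item"


def plan_worktrees(issues: list[str], root: Path) -> list[WorktreePlan]:
    """Plan one worktree and branch per feedback item (nothing is executed)."""
    plans = []
    for number, issue in enumerate(issues, start=1):
        branch = f"feedback/{number:02d}-{slugify(issue)}"
        path = root / f"wt-{number:02d}"
        # `git worktree add -b <branch> <path>` creates a new branch
        # and checks it out in its own directory.
        command = ["git", "worktree", "add", "-b", branch, str(path)]
        plans.append(WorktreePlan(issue, branch, path, command))
    return plans


if __name__ == "__main__":
    for plan in plan_worktrees(["Fix login redirect", "Handle empty cart!"], Path("../wt")):
        print(" ".join(plan.command))
```

Each sub agent would then be pointed at its own `path`, so they don't step on each other's working copy.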

by kilroy123 10 hours ago

I've settled on the same workflow.

Also I never multitask with multiple agents doing other stuff. Meh I focus on just the one task.

by bottlepalm 2 hours ago

I do multi-task a bit while AI is running, sometimes working on another feature with AI in parallel, but jumping between reviewing different feature iterations is draining, though not much different than the real world juggling PR reviews for a team of devs.

by henry_bone 13 hours ago

That sounds expensive.

by zrn900 8 hours ago

You could just use Xiaomi Mimo for all of that and it would be cheaper and faster than all of them...

by bottlepalm 2 hours ago

Quality is 100x more important than cheaper/faster.

by comboy 8 hours ago

Fun fact, I've recently sent some 你好 to qwen3.7 (API), and it responded with a greeting saying that it was created by Google.

by kordlessagain 7 hours ago

I don't care about cheaper. I care about faster.

by jiggawatts 9 hours ago

The funny thing is that you've just described an idealised development process as would be used by effective, skilled humans in a heterogeneous team where everyone has a speciality.

If only things were so! If only code was discussed, reviewed, iterated on! If only the "manager" actually read the code, provided actionable feedback, and disseminated PRs to multiple people with diverse skill sets.

(If you can't tell, I'm a jaded consultant desperately trying to make the horse drink the water.)

by bottlepalm 2 hours ago

I've worked in large teams for many years and yes it's just like that, but without the time constraint. PRs can only go back and forth so many times. Depending on the reviewer, they may phone it in or focus on different things. You yourself aren't able to implement every piece of feedback due to constraints, and it ends up as tech debt.

So AI definitely changes the game. I feel like we almost need something higher level for reviewers to review changes faster. Today's code is starting to feel like assembler. Too much of it, too low level. We need even higher level constructs to be able to do more in less time. I'm just not sure what that is.

by blehn 5 hours ago

This seems like a typical AI workflow, but isn't it dreadfully boring?

by bottlepalm 2 hours ago

No, I find it stimulating. With AI I'm moving faster and producing code at a higher quality than ever before.

Don't get me wrong I used to enjoy writing code by hand, but I don't think I would anymore. I don't like writing code for the sake of writing code - I like building things, I like being productive.

by 1121redblackgo 4 hours ago

Yes, but still lucrative so here we are.

by petesergeant 10 hours ago

The Claude/Codex loop is the current state of the art in my opinion. I've got a silly little harness that glues them together that I have spent all day, every day in for months: https://github.com/pjlsergeant/moarcode

by bottlepalm 2 hours ago

> You design, Claude writes, Codex reviews, and Gemini doesn't get installed

hahahahaha

by atoav 11 hours ago

I am not switching the different LLMs as much, but my approach is similar:

1. I write a list of things I want to have without AI support

2. I discuss the list with an LLM, which occasionally reveals obviously missing things I hadn't thought about or just things that would be smart to have. Or sometimes the LLM doesn't get it and wants to funnel me down a commonly walked path, which is a non-goal

3. From that list I draft an implementation plan containing things like how the code shall be structured, which language, libraries, build systems, etc to use. This may even contain some data models and considerations that are more detailed, like for example ideas about how a specific interaction shall be event sourced. I work on that, till I feel a satisfactory level of clarity has been reached

4. Actual writing of code as a back and forth between manual writing, letting an LLM write something and so on. LLMs suck at writing CSS that feels like good UX design to me, so usually templates, layout and CSS will be (re)written entirely by hand

5. Bug-hunting and guessing potential edge cases is one thing where LLMs really shine. Often if the work before that was quality the LLM has an okay time coming up with fixes that are no worse than what I would have done.

by isabelc 4 hours ago

Your comment begins like ai slop.

by bottlepalm 2 hours ago

I think you're projecting.

by DonHopkins 16 hours ago

Low frequency defensive long drawn out back and forth bullet dodging vibe coding should be called "serpentine coding".

The In-Laws (1979): Getting off the plane in Tijuara:

https://www.youtube.com/watch?v=A2_w-QCWpS0

by bottlepalm 15 hours ago

Heh it feels like that in a way, and the more complex the feature, the more endless the back and forth reviews can be - there always seems to be some feedback, so you need to decide when to be done with it and commit. You can easily get into review paralysis.

by topheroo 6 hours ago

This is where I’m at too lol.
