by etothet 20 hours ago

by datsci_est_2015 20 hours ago

Bad engineers continue being bad, good engineers continue being good.

I personally don’t know any colleagues who were good engineers just because they wrote code faster. The best engineers I know were ones who drew on experience and careful consideration and shared critical insights with their team that steered the direction of the system positively.

> Claude, engineer a system for me, but do it good. Thanks!

by truncate 14 hours ago

>> Bad engineers continue being bad, good engineers continue being good.

I don't know if good engineers can necessarily continue to be good. There is a limit to how much careful consideration one can give if everything is on an accelerated timeline. Whether you're good or not, there is a limit on how much influence you have on setting those timelines. The whole playing field is changing.

by ori_b 12 hours ago

It's deeper. We used to mock architects that stepped back and stopped coding, because they generated trash.

There's a cycle that is needed for good system design. Start with a problem and an approach, and write some code. As you write the code, you reify the design and flesh out the edge cases, learning where you got the details wrong. As you learn the details, you go back to the drawing board and shuffle the puzzle pieces, and try again.

Polished, effective systems don't just fall out of an engineer's head. They're learned as you shape them.

Good engineers won't continue to be good when vibe-coding, because the thing that made them good was the learning loop. They may be able to coast for a while, at best.

by beeandapenguin 10 hours ago

Reminds me of Gall’s Law from his book Systemantics.

> A complex system that works is invariably found to have evolved from a simple system that worked. A complex system designed from scratch never works and cannot be patched up to make it work. You have to start over with a working simple system.

https://en.wikipedia.org/wiki/John_Gall_(author)#Gall's_law

by ori_b 7 hours ago

I find that the learning and iteration tend to lead to a simplified system, if you're willing to look hard enough at the shapes needed.

When there's a lot of complexity, it's often repetitive translation layers, and not something fundamental to the problem being solved.

by sothatsit 9 hours ago

You don’t need to write code by hand to learn from iterations and experiments. I run more experiments and try out more different solutions than I ever could before, and that leads to better decisions. I still read all the code that gets shipped, and don’t want to give that up, but the idea that all craft and learning is lost when you don’t is a bit silly. The craft/learning just moves.

by ori_b 7 hours ago

How much calculus do you think you could pick up skimming a textbook without doing exercises?

We mocked these "architects" from experience. We knew that if you weren't feeling the friction yourself, you wouldn't learn enough to do good design.

Maybe you don't care about engineering great systems. Most companies don't. It's good for profit. This isn't new, though AI enables less care.

by torginus 3 hours ago

Imo the biggest issue with these no-code architects has been that you could become one without ever having coded at any noteworthy level of skill (which meant most of them were like this).

In my experience, in a lot of organizations, a lot of people either lacked the ability or the willingness to achieve any level of technical competence.

Many of these people played the management game, and even if they started out as devs (very mediocre ones at best), they quickly transitioned out from the trenches and started producing vague technical guidance that usually did nothing to address the problems at hand, but could be endlessly recycled to any scenario.

by sothatsit 6 hours ago

The entire mistake you are making is comparing using AI to skimming textbooks, or taking shortcuts. Your entire premise is wrong.

People who care about craft will care about the quality of what they produce whether they use AI or not.

The code I ship now is better tested and better thought through now than before I used AI because I can do a lot more. That extra time goes into additional experiments, jumping down more rabbit holes, and trying out ideas I previously couldn’t due to time constraints. It’s freeing to be able to spend more time to improve quality because the ROI on time spent experimenting has gone up dramatically.

by necovek 5 hours ago

This is an unpopular take, but when I was in undergrad maths, in old-school two-semester courses with one exam (exercises + oral) to cover it at the end, I was able to get a 60-80% score on exercises when I did just theory as prep.

I couldn't get exercises done where there were tricks/shortcuts which are learned by doing a lot of exercises, but for many, these are still the same tricks/shortcuts used in proofs.

This was indeed rare among students, but let's not discount that there are people who _can_ learn from well systemized material and then apply that in practice. Everyone does this to an extent or everyone would have to learn from the basics.

The problem with SW design is that it is not well systemized, and we still have at least two strong opposing currents (agile/iterative vs waterfall/pre-designed).

by andai 13 hours ago

An old comic I like:

- I've taken a controversial new pill that accelerates my brain.

-- So you're smart now?

- I'm stupid faster!

That being said, being stupid faster can work if validation is cheap (and exists in the first place).

Turns out "eh close enough" for AGI is just stupidity in an "until done" loop. (Technically referred to as Ralphing.)
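A minimal sketch of the "until done" loop described above, assuming a toy problem (guessing a factor of 91) where validation is a single modulo check. The names `until_done`, `propose`, and `validate` are made up for illustration and are not from any agent framework.

```python
import random

def until_done(propose, validate, max_attempts=10_000):
    """Call propose() until validate() accepts a candidate or the budget runs out."""
    for attempt in range(1, max_attempts + 1):
        candidate = propose()
        if validate(candidate):
            return candidate, attempt
    return None, max_attempts

# Toy problem (an assumption for this sketch): find a non-trivial factor of N
# by blind guessing. The proposer is "stupid"; the validator is cheap and exact.
N = 91  # 7 * 13
rng = random.Random(0)

def propose():
    return rng.randrange(2, N)  # uniform guess in [2, N - 1]

def validate(candidate):
    return N % candidate == 0

factor, attempts = until_done(propose, validate)
print(f"found factor {factor} after {attempts} guesses")
```

The loop only works because `validate` is cheap and cannot be fooled. If the validator were expensive or approximate, the same loop would mostly produce confident wrong answers.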

by teddyh 12 hours ago

You mean this one: <https://knowyourmeme.com/photos/1567852-shen-comix>

by andai 1 hour ago

My favorite one:

https://iili.io/BZbHyP9.jpg

I've optimized my game's code and it finally runs at 1000 FPS.

--So your game is good now?

It's shit faster.

by sanderjd 13 hours ago

Yep, validation is key. The smartest thing I've heard on this, which has reoriented how I think about it, is that the objective function of a piece of software is now more important to get right than the implementation.

by dspillett 12 hours ago

> the objective function of a piece of software is now more important to get right than the implementation

That has always been the case. That is why weeks or even months of programming and other project busy work could replace a couple of days of time getting properly fleshed out requirements down.

by sanderjd 11 hours ago

Agreed, it has always been the case. But I've never thought of it that way so explicitly. And I might argue that the important distinction is that the objective function is programmatically verifiable (which the word "requirements" has not always implied).
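One way to picture a "programmatically verifiable" objective function, as opposed to a prose requirement: the requirement "sort the list" written as an executable predicate that any implementation, human or generated, either passes or fails. This is only a sketch. `meets_objective` and both candidates are hypothetical names invented here.

```python
from collections import Counter

def meets_objective(sort_fn, cases):
    """True if, for every case, the output is ordered and is a permutation of the input."""
    for case in cases:
        out = sort_fn(list(case))
        ordered = all(a <= b for a, b in zip(out, out[1:]))
        same_items = Counter(out) == Counter(case)
        if not (ordered and same_items):
            return False
    return True

CASES = [[], [1], [3, 1, 2], [2, 2, 1], [5, 4, 3, 2, 1]]

def candidate_ok(xs):
    return sorted(xs)

def candidate_lossy(xs):
    # Looks plausible and returns ordered output, but silently drops duplicates.
    return sorted(set(xs))

print(meets_objective(candidate_ok, CASES))     # True
print(meets_objective(candidate_lossy, CASES))  # False
```

The second candidate "looks like it works" on most inputs. Only the permutation half of the objective catches it, which is the sense in which getting the objective right matters more than the implementation.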

by andai 1 hour ago

Turns out what was being rewarded all along is "the code looks all right" and "it looks like it works".

by Salgat 8 hours ago

So the chimpanzees on the keyboard thing is real.

by datsci_est_2015 12 hours ago

> if everything is on an accelerated timeline

Good engineers are also capable of managing expectations. They can effectively communicate with stakeholders what compromises must be made in order to meet accelerated timelines, just as they always have.

We’ve already had conversations with overeager product people about the ramifications of introducing their vibe-coded monstrosities:

  - Have you considered X?
  - Have you considered Y?

Their contributions are quickly shot down by other stakeholders as being too risky compared to the more measured contributions of proper engineers (still accelerated by AI, but not fully vibe-coded).

If that’s not the situation where you work, then unfortunately it’s time to start playing politics or find a new place to work that knows how to properly assess risk.

by sanderjd 13 hours ago

Hmmm, I think I disagree with this.

I estimate that I'm now spending about 10 to 30 hours less time a week in the mechanical parts of writing and refactoring code, researching how to plumb components together, and doing "figure out how to do unfamiliar thing" research.

All of those hours are time that can now be spent doing "careful consideration" (or just being with my family or at the gym or reading a book, which is all cognitively valuable as well).

Now, I suppose I agree that if timelines accelerate ahead of that amount of regained time, then I'm net worse off, but that's not the current situation at the moment, in my experience.

by truncate 12 hours ago

Maybe we do different things. Not that you are wrong about spending less time on things that you don't care about, but at the same time all those mechanical things help you build a really good mental model of your product, from high-level design to individual classes. If I already have a good mental model, I can direct AI to make really good changes fast; if I don't, I will get things done ... but I end up with less-than-ideal changes that compound over time.

What you said: "figure out how to do unfamiliar thing" -- is correct, and will get things done, but overall quality, maintainability, or understanding how individual pieces work... that's what you don't get. One can argue: who cares about all that, since AI will take care of it or already can? I don't think that's true today, at least.

by sanderjd 11 hours ago

I guess I just don't really agree that doing the tedious mechanical things is all that helpful for building the necessary mental model. I mean, I do think it was useful (indeed, necessary) for me to actually type out very similar lines of code over and over again when I was building up the programming skillset, but I really think the marginal value of that is just very low for me at this point. I worry a lot about how we're going to train the next generation of people without there being any incentive to do this part of the process! But for me, I already did that part.

What I find is actually necessary for me to have a mental model of the system is not typing out the definitions of the classes and such, but rather operating and debugging the system. I really do need to try to do things, and dig into logs, and figure out what's going on when something is off. And that pretty much always ends up requiring reading and understanding a bunch of the implementation. But whether I personally typed out that implementation, or one of my colleagues, or an AI, is less important.

I mean, I already had to be able to build a mental model of a system that I didn't fully implement myself! I essentially never work on anything that I have developed in its entirety on my own.

by kdnxownxkwkd 12 hours ago

Yeah! I mean, who needs to LEARN how to do these things properly when you can just let an autocorrect on steroids hallucinate the closest thing to “barely working”. Right?

10 to 30 hours saved on not learning new things! Hurray!

by sanderjd 11 hours ago

I genuinely don't understand what you're talking about with this comment. Learn how to do what things properly? I've been writing software for two decades... I'm not primarily in a learning phase, I'm in a doing phase. I'll take advantage of tools that save me time and energy in my work (for the right price). Why wouldn't I?

What do you mean by "barely working"? I can now put more iterations into getting things working better, more quickly, with less effort. That seems good to me.

10 to 30 hours a week is 25% to 75% of my time working. Seems like a pretty good trade?

I do understand that the calculation is different for people who are new to this. And I worry a lot about how people will build their skills and expertise when there is no incentive to put in all the tedious legwork. But that just isn't the phase of my career that I'm in...

by vultour 3 hours ago

There is simply no chance that LLMs are saving you 30 hours of work a week, especially if they're doing something where you'd have to do the research yourself. Either you're just simply wrong, or you went from understanding the code you were writing to skimming whatever the magic box spits out and either merging it outright or pawning off the effort of review on someone else.

by skydhash 7 hours ago

My one question for you: What’s your level of editor fluency? Because I would really like to know if there’s a correlation between claiming these kinds of time savings and not using advanced features in your editor.

My time is spent more on editing code than writing new lines. Because code is so repetitive, I mostly copy-paste, use the completion and snippets engines, and reorganize code. If I need a new module, I just copy what’s most similar, remove everything and add the new parts. That means I only write 20 lines of that 200-line diff.

Also my editor (emacs) is my hub where I launch builds and tests, where I commit code, where I track todos and jot notes. Everything is accessible with a short sequence of keys. Once you have a setup like this, it’s flow state for every task. Using LLM tools is painful, like being in a cubicle reading reports when you could be mentally skiing on code.

by paulddraper 6 hours ago

There is no limit.

Or at least, the limit is increasing by the day.

by runarberg 9 hours ago

When there is all that crap out there, a good engineer may simply just carry out, call it good and leave the industry. Personally, seeing the proliferation of vibe-coded apps has made me hesitant about publishing and promoting my AI-free apps.

by embedding-shape 19 hours ago

> I personally don’t know any colleagues who were good engineers just because they wrote code faster

Same, if anything, the opposite seems to be true, the ones that I'd call "good engineers" were slower, less panicked when production was down and could reason their way (slowly) through pretty much anything thrown at them.

Opposite experience: I've sat next to developers who are trying their fastest to restore production and then make more mistakes that make it even worse, or developers who rush through the first implementation idea they had for a feature, failing to consider so many things, and so on.

by ryandrake 19 hours ago

> Same, if anything, the opposite seems to be true, the ones that I'd call "good engineers" were slower

Unfortunately, a lot of workplaces are ignoring this, believing their engineers are assembly line workers, and the ones who complete 10 widgets per minute are simply better than the ones who complete 5 widgets per minute.

by nathan_compton 17 hours ago

It isn't just that they believe this - they want a business model where this is how it works. For a big company a star coder is a liability - they have strong labor power, they can leave and they are hard to replace, etc.

Companies want workflows that work with mediocre programmers because they are more like interchangeable parts. This is the real secret to why AI programming will work in a lot of places. If you look at the externalities of employing talented people, shitty code actually looks better than great code.

by ryandrake 17 hours ago

To these kinds of companies, what's even better than a rack of mediocre programmers? AI agents that you can just conjure up and prompt. They take up no facility space, don't require lunch breaks or vacations, obey all commands and direction, and produce a predictable and consistent amount of output per dollar.

This is the earworm the leaders of these companies have allowed into their minds. Like Agent Mulder, they Want To Believe in this so badly...

by overfeed 14 hours ago

> This is the earworm the leaders of these companies have allowed into their minds. Like Agent Mulder, they Want To Believe in this so badly...

If you assume they are not idiots and analyze the FOMO incentives via a little game-theory, it becomes clear why.

Assuming the competition has adopted AI, leadership can ignore it, or pursue it. If they adopt it, then they are level with the competition whether AI actually succeeds or fails - they get to keep their executive job.

If leadership ignores AI, and it actually delivers the productivity gains to the competition, they will be fired. If they ignore AI and it's a bust, they gain nothing.
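The argument above can be written down as a small payoff table and checked for dominance. This is a sketch only: the numeric payoffs are assumptions invented here (0 for "keeps the job", -1 for "gets fired"), and they score the executive's personal outcome rather than the company's.

```python
# Ordinal payoffs for the executive personally, assuming competitors already adopted AI.
# The numbers are illustrative assumptions: 0 = keeps the job, -1 = gets fired.
OUTCOMES = ("ai_delivers", "ai_is_a_bust")

PAYOFF = {
    ("adopt", "ai_delivers"): 0,    # level with the competition
    ("adopt", "ai_is_a_bust"): 0,   # still level with the competition
    ("ignore", "ai_delivers"): -1,  # competitors pull ahead, executive is fired
    ("ignore", "ai_is_a_bust"): 0,  # nothing gained personally
}

def weakly_dominates(a, b, payoff=PAYOFF):
    """True if strategy a is never worse than b and strictly better in at least one outcome."""
    never_worse = all(payoff[(a, o)] >= payoff[(b, o)] for o in OUTCOMES)
    sometimes_better = any(payoff[(a, o)] > payoff[(b, o)] for o in OUTCOMES)
    return never_worse and sometimes_better

print(weakly_dominates("adopt", "ignore"))  # True
print(weakly_dominates("ignore", "adopt"))  # False
```

The conclusion is only as good as the table. If ignoring a bust is scored as a personal win (say, a payoff of 1 in the `("ignore", "ai_is_a_bust")` cell), adopting no longer dominates.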

by m4x 7 hours ago

If AI turns out to be a bust, ignoring it could become a significant win. One possible outcome of AI adoption is that existing code bases are degraded, and existing programmer capability is allowed to atrophy. In that situation, companies that adopt AI lose out relative to companies that eschew it.

by untrust 10 hours ago

What if the outcome is the competition burns their money on LLM usage for little to no gain? If you're an exec and you jumped into LLMs as well then you also lose any advantage you would have had by saving your money or hiring a few more humans.

by overfeed 7 hours ago

> What if the outcome is the competition burns their money on LLM usage for little to no gain?

The company does better than the money-burning competition, but the executives personally gain nothing; there are no bonuses just because the competition took a misstep.

by sanderjd 13 hours ago

Yeah but does this work? Are there companies doing this successfully?

by duskdozer 7 hours ago

It's also true that a lot of times, it doesn't even matter how shitty the code is. For example, I'm locked in to a company whose web "app" hasn't functioned for me for the vast majority of the last two to three years. I can't leave without effectively being required to leave my job. So, they still get my business.

by datsci_est_2015 17 hours ago

Glad I find myself employed under a division called Research and Development. Poaching and retaining highly compensated individuals is the entire purpose.

by anal_reactor 13 hours ago

Bingo. This is something that many people fail to understand.

by hyperadvanced 8 hours ago

I think you can understand that line of reasoning, but you can question its feasibility. You might not have any “star coders”, nor need them day-to-day, but I think the cost of not having one true expert, or having a completely vibe coded system that crashes in production will be extremely high.

by sanderjd 13 hours ago

Which workplaces?

by sanderjd 13 hours ago

This is true. But I find AI tools to be a huge help for all of this. Not to do any of it faster, but to remove a bunch of the tedium from the process of testing ideas and iterating on them. Instead of "I wonder if the problem is..." requiring half an hour of research, now I can do an initial check of that theory in less than a minute, and then dig further, or move onto the next one. Or say I estimate it's gonna take me an hour or more to test an idea, I might just decide I don't have time to invest in that. Well now maybe I can get a tentative answer on that by spending a minute laying out the theory and letting an agent spend ten or twenty minutes on it in the background. In this way I can explore space I just would have determined was not worth the effort previously.

To me, none of this feels like "going faster", it feels like "opening up possibilities to try more things, with a lot less tedious work".

by skydhash 6 hours ago

Have you ever wondered how people do it without it being a tedium for them?

For things that have visual elements, like UI and UX, you can start with sketches (analog or digital), eliminate the bad ideas, and refine the good ones with higher-quality rendering. Then choose one concept and implement it. By that time, the code is trivial. What I found with LLM usage is that people will settle on the first one, declaring it good enough, and not explore further (because that is tedious for them).

The other problems mostly fall into three categories (mathematical, logical, or data/information/communication). For the first type you have to find the formula, prove it is correct, and translate it faithfully to code. But we rarely have that kind of problem today unless you’re in a research lab or dealing with floating-point issues.

The second type is more common: you are enacting rules based on some axioms originating from the systems you depend on. That leads to the creation of constraints and invariants. Again, I’m not seeing LLMs helping there, as they lack internal consistency for this type of activity. (Learning Prolog helps in solving that kind of problem.)

The third type is about modeling real-world elements as data structures and designing how they transform over time and how they interact with each other. To do it well, you need deep domain knowledge about the problem. If an LLM can help you there, that means two things: a) Your knowledge is lacking and you ought to talk to the people you’re building the system for; b) The problem is solved and you’d do well to learn from the solution. (Basically what the DDD books are all about.)

Most problems are a combination of subproblems of those three categories (recursively). But from my (admittedly small amount of) interactions with pro LLM users, they don’t want to solve a problem, they want it to be solved for them. So it’s not about avoiding tediousness, it’s sidestepping the whole thing.

by notnullorvoid 7 hours ago

> Bad engineers continue being bad, good engineers continue being good.

Unfortunately I have seen some really good software engineering peers regress into bad engineers through an increasing reliance on AI.

Conversely some very bad engineers (undeserving of the title) have been producing better outputs than I ever expected possible of them.

by LtWorf 13 hours ago

Good engineers need to be allowed to be good. If they are told to pump features or lose their job, they might act like bad engineers as well.

by sanderjd 13 hours ago

Aren't they more likely to leave?

by LtWorf 4 hours ago

Depends. If they have a good salary, nice coworkers, WFH. If they manage to tolerate having to produce crap they might stick around if other factors are above average.

For someone with 3-4 kids who lives far from the city, WFH and time flexibility can be important motivators.

by jkaptur 18 hours ago

> I personally don’t know any colleagues who were good engineers just because they wrote code faster.

However, the best engineers I know are usually among the quickest to open an editor or debugger and use it fluently to try something out. It's precisely that speed that enables a process like "let's try X, hmm, how about Y, no... ok, Z is nice; ok team, here are the tradeoffs...". Then they remember their experience with X, Y, and Z, and use it to shape their thinking going forward.

Meanwhile, other engineers have gotten X to finally mostly work and are invested in shipping it because they just want to be done. In my experience, this is how a lot of coding agents seem to act.

It's not obvious to me how to apply the expert loop to agentic coding. Of course you can ask your agent to try several different things and pick the best, or ask it to recommend architectural improvements that would make a given change easier...

by datsci_est_2015 18 hours ago

Or: depth-first search of the solution space vs breadth-first (or balanced) search of the solution space.

> Of course you can ask your agent to try several different things and pick the best, or ask it to recommend architectural improvements that would make a given change easier

The ideal solution increasingly seems to be encoding everything that differentiates a good engineer from a bad engineer into your prompt.

But at that point the LLM isn’t really the model as much as the medium. And I have some doubts that LLMs are the ideal medium for encoding expertise.

by sanderjd 13 hours ago

I really don't relate to this...

The way you apply the expert loop is to be the expert. "Can we try this...", "have you checked that...", "but what about...".

To some degree you can try to get agents to work like this themselves, but it's also totally fine (good, actually) to be nudging the work actively.

by beacon29414 hours ago

As you practice it will be apparent, you simply keep working on the application architecture yourself.

by skydhash 18 hours ago

> However, the best engineers I know are usually among the quickest to open an editor or debugger and use it fluently to try something out

The Pragmatic Programmer book has whole chapters about this. Ultimately, you either solve the problem analogously (whiteboard, deep thinking on a sofa), or you get fast at trying out stuff AND keeping the good bits.

by Quekid5 12 hours ago

> However, the best engineers I know are usually among the quickest to open an editor or debugger and use it fluently to try something out.

That's not my experience... mostly it's about first interrogating the actual problem with the customer and conditions under which it occurs. Maybe we even have appropriate logging in our production application? We usually do, because you know, we usually need to debug things that have already happened.

(If it's new/unreleased code, sure fine, let's find a debugger.)

by nly 12 hours ago

The best paid engineers I know seem to be the super fast hackers who write unfathomable amounts of code in short order.

Unfortunately, thoughtful design and engineering doesn't get recognised.

by bdangubic 12 hours ago

in my experience this is because there are very very very very few thoughtful designers and engineers, especially compared to people that are cranking out code.

by galangalalgol 9 hours ago

Also, thoughtful code varies from the library that does the thing you need with an API so intuitive you don't even need autocomplete or docs (though it has docs), to the library that is extensible to every possible use case you will never need but is missing the obvious ones you do need, or at least makes them horribly unergonomic in the name of that extensibility and purity with regard to some random paradigm that is self-evidently the best one.

by jakevoytko 18 hours ago

Yeah, a lot of people came of age with a "we'll fix it when it's a problem" mindset. Previously their codebases would start to resist feature development, you'd fix the immediate bottlenecks, and then you could kick the can down the road a bit until you hit the next point of resistance. You kinda refactor as you do features. The frontier models have pushed the "it's a problem" moment further back. They can kinda work with whatever pile of code you give them... to a point. So it manifests as the LLM introducing extra regressions, or dropping more requirements than it used to, but it's not really manifesting as the job being harder for you. It's just not as smooth as it was from an empty repository. Then you hit the point where it just breaks too much and you need to fix it. And the whole codebase is just fractal layers of decisions that you didn't make. That's hard to untangle. And you're not editing the code yourself, so you don't have that visceral "adding this specific thing in this specific way has a lot of tension" reaction that allows you to have those refactoring breakthroughs.

by meridian-v13 hours ago

This is the sharpest observation in the thread. The "tension" you describe is proprioception for code — you feel where the abstractions leak, where the seams don't align, through the act of writing and refactoring. It's not a visual signal. You can't get it from reading a diff.

The risk isn't that agents write bad code. It's that developers lose the sense that tells them where code is bad. Code review is perception. Writing code is proprioception. They're different senses and one doesn't substitute for the other.

The question for the agent era isn't "is the code good enough to ship" — it's "do I still have enough coupling to the codebase to know when it isn't?"

by patrick-elmore 10 hours ago

[dead]

by layoric 12 hours ago

This is very true. I've found these tools that I am highly encouraged to use very hit and miss, which they are by nature. After using Matt Pocock's skills, I've come around to the idea that an LLM's main utility is to act as the ultimate rubber ducky. The `grill-me` feature is honestly the most useful, not for guiding the follow-up writing of code, but for making me write down and explore the idea I have more quickly. Its guesses at questions to ask are generally pretty good. I don't believe there is any 'understanding', so I feel the rubber ducky analogy works quite well. This isn't anything you couldn't do before with some discipline, but at least I find it helps me be more consistent.

by pydry 12 hours ago

The first time I used LLMs it was to try and refactor behind a solid body of tests I trusted.

I figure if it can't code when it has all of the necessary context available and when obscure failures are easily detected, then why would I trust it when building features and fixing bugs?

It never did get good enough at refactoring.

by layoric 7 hours ago

I agree, the mechanical refactoring of modern IDE tooling, especially with typed languages, is so much faster and safer, it's not even close. These tools can be useful for sure, but I think in general they are being way over-prescribed for different tasks.

by teeray 13 hours ago

Can’t wait for the next stage of escalation when teams start to feel code review is keeping them from vibe coding utopia. It’ll probably be “AI review only, keep your human opinions to yourself” just so they can continue to check the “all changes are reviewed” box on security checklists.

by tbrownaw 12 hours ago

> Vibe Coding (and LLMs) did not create undisciplined engineering organizations or engineers.

Loss of discipline can be a result of panic or greed.

Perhaps believing that your own costs or your competitors' costs are suddenly becoming 10x lower could inspire one of those conditions?

(Also for greenfield projects specifically, it can plausibly be an experiment just to verify what happens. Some orgs are big enough that of course they can put a couple people on a couple-month project that'll quite likely fall flat.)

by bitexploder 19 hours ago

Vibe-coded apps with barely any tests, invariants, etc. No wonder it turns into spaghetti. You can always refactor code, force agents to write small modular pieces and files. Good engineering is good engineering whether an agent or human wrote the code. Take time to force agents to refactor, explore choices. Humans must at least understand and drive architecture at this point still. Agents can help and do recon amazingly and provide suggestions.

by mleo 13 hours ago

I can’t understand this. The first thing I do with a new agent-driven project is set up quality checks. Linters, test frameworks, static analysis, etc… Whatever I would expect a developer to do, I would expect an agent to do. All implementation has to go through build success and mixed agent reviews before moving on. I might not do this with an initial research/throwaway prototype, but once I know what direction to go and expect code to go to production, it is vital to set guard rails.

by gck1 12 hours ago

> The first thing I do with a new agent-driven project is set up quality checks: linters, test frameworks, static analysis, etc.

I do this too, but then I sit and observe how the agent gets very creative by going around all of these layers just to get to the finish line faster.

Say, for example, I needlessly pass a mutable reference and the linter screams at me: I know that either the linter is wrong in this case, or I should listen to it and change the signature. If I make the lazy choice, I will be dissatisfied with myself, I might get scolded, or even fired if I keep making lazy choices.

An LLM doesn't get these feelings.

An LLM will almost always go for silencing the linter, because the warning prevents it from reaching the 'reward'. If you put up guardrails so that the LLM isn't allowed to silence anything, then you get things like 'ok, I'll just do foo.accessed = 1 to satisfy the linter'.
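One way to make the "silence the linter" shortcut visible is to scan a change for newly added suppression comments before accepting it. The following is only a minimal sketch: the function name `added_suppressions`, the pattern list, and the sample diff are assumptions for illustration, and it would not catch workarounds like the `foo.accessed = 1` trick, which still need a human look.

```python
import re

# Suppression markers for a few common tools (illustrative, not exhaustive).
SUPPRESSION_PATTERNS = [
    re.compile(r"#\s*noqa\b"),               # flake8 / ruff
    re.compile(r"#\s*type:\s*ignore\b"),     # mypy
    re.compile(r"#\s*pylint:\s*disable\b"),  # pylint
    re.compile(r"#\[allow\("),               # Rust lint attribute
    re.compile(r"//\s*nolint\b"),            # golangci-lint
]


def added_suppressions(diff_text: str) -> list[str]:
    """Return lines added by a unified diff that contain a suppression marker."""
    hits = []
    for line in diff_text.splitlines():
        # Added lines start with '+'. Lines starting with '+++' are treated as
        # file headers; this is a simplification that also skips added content
        # that itself begins with '++'.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        content = line[1:]
        if any(p.search(content) for p in SUPPRESSION_PATTERNS):
            hits.append(content.strip())
    return hits


# Hypothetical diff in which an agent silenced an unused-import warning.
example_diff = """\
--- a/app.py
+++ b/app.py
@@ -1,2 +1,3 @@
 import os
+import sys  # noqa: F401
+value = compute()
"""

print(added_suppressions(example_diff))
```

A check like this could run in CI and fail, or require explicit human sign-off, whenever the returned list is non-empty.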

Same story with tests. Who decides when it's the test that should be changed/deleted or the implementation?

by Daishiman 7 hours ago

> Same story with tests. Who decides when it's the test that should be changed/deleted or the implementation?

Claude is remarkably good at figuring this out. I asked it to look at a failing test in a large and messy Python codebase. It found the root cause, then asked whether the failure was a regression or an insufficiently specified test, performed its own investigation, and found that the test harness was missing mocks, a gap that the bug fix had exposed.

It has become amazingly good at investigating.

by Quekid5 12 hours ago

Generated tests... I mean... listen to yourself.

I can generate a lot of tests amounting to assert(true). Yeah, LLM-generated tests aren't quite that simplistic, but are you checking that all the tests actually make sense and test anything useful? If not, those tests are useless. If yes, I don't actually believe you.
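To make the assert(true) point concrete, here is a minimal sketch of the idea behind mutation testing: a test is only worth something if it fails when the code is deliberately broken. All names here (`add`, `mutant_add`, `kills_mutant`, and the two tests) are hypothetical and exist only for illustration.

```python
def add(a, b):
    return a + b


def mutant_add(a, b):
    # Deliberately broken variant ("mutant") of add.
    return a - b


def vacuous_test(fn):
    fn(2, 3)      # exercises the code but checks nothing about the result
    assert True


def meaningful_test(fn):
    assert fn(2, 3) == 5


def kills_mutant(test, original, mutant):
    """Return True if the test passes on the original and fails on the mutant."""
    test(original)  # must pass on the real implementation
    try:
        test(mutant)
    except AssertionError:
        return True
    return False


print(kills_mutant(vacuous_test, add, mutant_add))     # False
print(kills_mutant(meaningful_test, add, mutant_add))  # True
```

The vacuous test still counts toward coverage, since it runs `add`, yet it cannot tell the real function from the broken one.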

It's the typical 10 line diff getting scrutinized to death, 1000 line diff: Instant LGTM.

Pay attention to YOUR OWN incentives.

by adastra22 5 hours ago

LLMs are accelerants. They elevate great engineers to ever more dizzying heights of productivity. They also multiply massively the sloppy output of shit engineers.

by jillesvangurp 6 hours ago

It's also helping the engineers that do have standards. A lot of what I put in my guard rails (crafted to get better outcomes for my prompts) is not exactly rocket science. Those guard rails just impose some sane engineering processes and stuff I care about.

As models get better, they seem to be biased toward doing most of these things without needing to be told. Also, coding tools come with built-in skills and system prompts that achieve similar things.

Two years ago I was copy-pasting together a working Python FastAPI server for a client from ChatGPT. This was pre-agentic tooling. It could sort of do small systems and work on a handful of files. I'm not a regular Python user (most of my experience is Kotlin-based) but I understand how to structure a simple server product. Simple CRUD stuff. All we're talking about here was some APIs, a DB, and a few other things. I made it use async IO and generate integration tests for all the endpoints. It took me about a day to get it to a working state. Python is simple enough that I can read it and understand what it's doing. But I had never used any of the frameworks it picked.

That was two years ago. I could probably condense that into a simple prompt and achieve the same result in 15 minutes or so. And there would be no need for me to read any of that code. I would be able to do it in Rust, Go, Zig, or whatever as well. What used to be a few days of work gets condensed into a few minutes of prompt time. And that's excluding all the BS scrum meetings we'd have to have about this, that, and the other thing. The bloody meetings take longer than generating the code.

A few weeks ago I did a similar effort around banging together a Go server for processing location data. I've been working against a pretty detailed specification with a pretty large API surface and I wanted an OSS version of that. I have almost no experience with Go. I'd be fairly useless doing a detailed code review on a Go code base. So, how can I know the thing works? Very simple, I spent most of my time prompting for tests for edge cases, benchmarking, and iterating on internal architecture to improve the benchmark. The initial version worked alright but had very underwhelming performance. Once I got it doing things that looked right to me, I started working on that.

To fix performance, I iterated on figuring out what was on the critical path and why, asking it for improvements and asking pointed questions about workers, queues, etc. In short, I was leaning on my experience of having worked on high-throughput JVM-based systems. I got performance up to processing thousands of locations per second, up from tens or hundreds. This system is intended for processing high-frequency UWB data. There is probably some more wiggle room to get it up further. I'm not done yet. The benchmark I created works with real data, and I added generated scripts to replay that data at an accelerated rate with lots of interpolated position data. As a stress test it works amazingly well.
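The replay idea described above can be sketched roughly as follows. This is not the generated tooling from that project, just a small illustration under stated assumptions: positions are 2D, interpolation is linear, and the names `Fix`, `densify`, and `replay_schedule` are made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Fix:
    t: float  # seconds since the start of the recording
    x: float
    y: float


def densify(track: list[Fix], steps: int) -> list[Fix]:
    """Insert steps-1 linearly interpolated fixes between each recorded pair."""
    dense = []
    for a, b in zip(track, track[1:]):
        for i in range(steps):
            f = i / steps  # fraction of the way from a to b (b itself excluded)
            dense.append(Fix(a.t + f * (b.t - a.t),
                             a.x + f * (b.x - a.x),
                             a.y + f * (b.y - a.y)))
    if track:
        dense.append(track[-1])  # keep the final recorded fix
    return dense


def replay_schedule(track: list[Fix], speedup: float):
    """Yield (send_time, fix) pairs with the timeline compressed by speedup."""
    if not track:
        return
    t0 = track[0].t
    for fix in track:
        yield ((fix.t - t0) / speedup, fix)


recorded = [Fix(0.0, 0.0, 0.0), Fix(1.0, 10.0, 0.0), Fix(2.0, 10.0, 10.0)]
dense = densify(recorded, steps=4)
schedule = list(replay_schedule(dense, speedup=10.0))
print(len(dense), schedule[-1][0])
```

A load generator would then sleep until each `send_time` and post the fix to the server under test. Raising `steps` increases message volume and raising `speedup` increases message rate, so the two knobs can stress a system independently.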

This is what agentic engineering looks like. I'm not writing or reviewing code. But I still put in about a week plus of time here and I'm leaning on experience. It's not that different from how I would poke at some external component that I bought or sourced to figure out if it works as specified. At some point you stop hitting new problems and confidence levels rise to a point where you can sign off on the thing without ever having seen the code. Having managed teams, it's not that different from tasking others to do stuff. You might glance at their work but ultimately they do the work, not you.

by lumost 13 hours ago

Honestly, the problem is one of BS detection.

Lead engineer says something is not workable? The PM overrides, saying that Claude Code could do it. Problems are found months later at launch, and now the engineers are on the hook.

New junior onboardee declares that their new vision is the best and gets management onto it cuz it’s trendy -> broken app.

It’s made collaboration nearly unbearable as you are beholden to the person with the lowest standards.

by tom1337 13 hours ago

I hate how correct you are. I work at a company with only two engineers and a few sales and marketing people, and the amount of "hey, I made that feature with Claude, when can we ship it for the customer? I showed them and they really like it" is staggering, only for us to look at the code and find that it doesn't adhere to any of our standards and is not of good quality either. But if you say that, then it's "yeah, but everyone is AI-shipping now and we cannot be the ones not doing it, as we will lose customers...". Yeah, but now we are losing maintainability and understanding of our codebase, and making ourselves dependent on LLM providers who are getting more expensive every week.

by zxspectrumk48 6 hours ago

> It’s made collaboration nearly unbearable as you are beholden to the person with the lowest standards.

Exactly right.

by jsemrau 13 hours ago

The same applies to banks and lending standards. In the end it is a function of governance and professional conduct.
