I think this goes against what a lot of developers want AI to be (not me, to be clear).
With the right docs, I can lift every developer of every skill level up to a minimum "floor" and influence every line of code that gets committed to move it closer to "perfect".
I'm not writing every prompt so there is still some variation, but this approach has given us very high quality PRs with very minimal overhead by getting the initial generation passes as close to "perfect" as reasonably possible.
If they aren't willing to read what I put effort into, why should I be expected to read the ill-conceived and verbose response? I really don't want to get into a match of my AI arguing with your AI, but that's what they've told me I should be doing...
There's an asymmetry of effort in the above, and when combined with the power asymmetry - that's a really bad combo, and I don't think I'm alone.
I'm glad to see the appreciation of the enormous costs of complexity on this forum, but I don't think that has ascended to the managerial level.
> ...a manager who responds in the form of Claude guided PRs
I think the job of a dev in this coming era is to produce the systems by which non-engineers can build competently and not break prod or produce unmaintainable code. In my current role, I have shifted from lead IC to building the system that is used by other ICs and non-ICs.
From my perspective, if I can provide the right guardrails to the agent, then anyone using any agent will produce code that coalesces around a higher baseline of quality. Most of my IC work is now aligned with this direction.
That's the classic 2nd-system effect - "let's rewrite it from scratch, now that we know what we want". And you throw away all the hard-learned lessons.
> The general tendency is to over-design the second system, using all the ideas and frills that were cautiously sidetracked on the first one. The result, as Ovid says, is a "big pile". For example, consider the IBM 709 architecture, later embodied in the 7090. This is an upgrade, a second system for the very successful and clean 704. The operation set is so rich and profuse that only about half of it was regularly used. (p.55)
>
> The second-system effect has another manifestation somewhat different from pure functional embellishment. That is a tendency to refine techniques whose very existence has been made obsolete by changes in basic system assumptions. (p.56)
It's the exact opposite: by explicitly dictating what is correct, perfect, and standard in this codebase, we achieve very high consistency and quality with very little "embellishment" and excess, because the LLM is following a set of highly curated instructions rather than the whims of each developer on the team.

I suggest you re-read what Brooks meant by "second-system effect".
I don't think there's perfect code.
Code is automation - it automates human effort and humans themselves have error, hence not perfect.
So as long as code meets or exceeds the human output, it's "good enough" and meets expectations. That's what a typical customer cares about.
A customer will happily choose a tent made of tarp and plastic sticks that's within their budget and available right now, when it's raining outside, over an architectural marvel that will be available sometime in the future at some unknown price point.
Put another way, I don't think if you built CharlieGPT today, where the only differentiating factor over ChatGPT was that CharlieGPT was written using perfect code, you would have any meaningful edge.
I have yet to see any evidence that, everything else being equal, one company had an edge over another simply due to superior code.
In fact, I have overwhelming evidence of companies that had better code succumbing and vanishing against companies that had very little code, if any, because those dollars were instead invested in better customer discovery, segmentation, and analytics ("what should we build?", "if we did one thing that would give our customers an unfair advantage, what would that thing be?").
Software history is full of perfect OSes, editors, frameworks, and protocols that were lost over time because a provably inferior option won market share.
You are using a software-controlled SMPS to power your device right now. You have no idea what the quality of that code is. All you care about is whether that SMPS drains your battery prematurely or heats up your device unnecessarily. It's extremely unlikely that such an efficient, low-overhead control system was written using well-abstracted modules. It's more likely that the control system is full of gotos and repeated violations of DRY that would make a perfectionist shudder and cry.
> I don't think there's perfect code
Note I put "perfect" in quotes in my text. In this context, it means code that passes human PR review under our standard guidelines with minimal feedback/correction required.

> So as long as code meets or exceeds the human output, it's "good enough" and meets expectations. That's what a typical customer cares about.
Why settle for this when "perfect" is "free"? I understand this dichotomy when writing "perfect" code requires more expensive, more experienced human resources or more time, so you settle for "good enough"; but this is no longer the case, is it? The cost of "perfect" is only perhaps a few fractions of a cent higher than shitty.

You only need to accurately describe what "perfect" is to the LLM instead of allowing it to regress to the mean of its training set. There really is no cost difference between writing shitty code and "perfect" code now; it's just a matter of how good you are at describing "perfect" to the LLM.
For example, we very specifically want our agents to write code using C# tuple return types for private methods that return more than one value, instead of creating a class. The tuple return type is a stack-allocated value type and supports deconstruction by default. We also always want to use named tuple fields every time because it removes ambiguity for humans and increases efficiency for agents when re-reading the code.
We want the code to make use of pattern matching and switch expressions (not `switch-case`) because they help enforce exhaustive checks at compile time and make the code more terse.
If we simply tell the agent these rules ahead of time, we get "perfect", consistent code each time. Being able to do so requires "taste" and understanding why writing code one way or using a specific language construct or a specific design pattern is the "right" way.
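As a concrete sketch of the conventions described above (the method and type names here are hypothetical, invented only for illustration), this is roughly what code following those rules looks like: a named tuple return from a private method, and a switch expression instead of `switch-case`.

```csharp
using System;

enum OrderStatus { Pending, Shipped, Delivered }

static class OrderReport
{
    // Named tuple return type: a stack-allocated value type with no
    // one-off class, and field names that remove ambiguity at the
    // call site when the code is re-read.
    private static (int Count, decimal Total) Summarize(decimal[] amounts)
    {
        decimal total = 0;
        foreach (var a in amounts) total += a;
        return (Count: amounts.Length, Total: total);
    }

    // Switch expression (not switch-case): terser, and the compiler
    // warns on unhandled cases, which pushes toward exhaustiveness.
    private static string Describe(OrderStatus status) => status switch
    {
        OrderStatus.Pending   => "awaiting shipment",
        OrderStatus.Shipped   => "in transit",
        OrderStatus.Delivered => "complete",
        _ => throw new ArgumentOutOfRangeException(nameof(status)),
    };

    static void Main()
    {
        // Tuples deconstruct directly into locals.
        var (count, total) = Summarize(new[] { 19.99m, 5.00m });
        Console.WriteLine($"{count} orders totaling {total}, {Describe(OrderStatus.Shipped)}");
    }
}
```

The point of encoding rules like these is that any agent, prompted by any operator, emits this same shape rather than inventing a fresh DTO class per method.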
The consequent is at odds with the antecedent. It's a performative contradiction: if the output were truly "free", the skill of the operator would be a zero-value variable; yet by requiring skill, you acknowledge a cost, as I show below.
> The cost of "perfect" is only perhaps a few fractions of a cent higher than shitty.
Is your cost model accounting for the cost of specification, of review and additional cycles required if review fails or the specification itself needs to be adjusted?
> If we simply tell the agent these rules ahead of time, we get "perfect", consistent code each time
No, in the simplest case, your cost of perfection is simply moving up the chain of abstraction from implementation (coding) to design and specification. In reality it also splits and moves a part of that cost downstream to verification.
This isn't some special, magical insight I have, I'm reiterating Tesler's Law right back to you.
I also encourage you to read software history: for decades it has been trivial to spit out perfectly working CRUD from ER and UML diagrams, no LLM necessary. The insight is understanding why we continue to hire cheap human labor to spit out CRUD instead of using those tools.
The cost of software is, and always has been, in figuring out the intent, not in the generation of syntax.
I wish pg was more active on HN - I expect this is one of the reasons why he wanted founders to have and share the painpoints of their (potential) customers. Figuring out the intent is expensive. Mistake the intent and the best case scenario is a pivot.
> I wish pg was more active on HN - I expect this is one of the reasons why he wanted founders to have and share the painpoints of their (potential) customers. Figuring out the intent is expensive. Mistake the intent and the best case scenario is a pivot.
Your mistake is that you think the point is that only engineers participate in the production of code. In fact, the point is that the product team and the people closest to the customer can generate the code. And for that reason, the goal is to produce a framework on top of which "perfect" code can be produced with relative ease and consistency, regardless of whether the user is part of engineering or product.

> Is your cost model accounting for the cost of specification
This is the same cost no matter what. The LLM does not generate code on its own; some operator must provide instructions and a specification regardless, so you might as well give it good ones. But here, I would point out that there is a broad layer of general instructions that incur a one-time cost of specification ("Always write this code this way").
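To make the "one-time cost" concrete, a hypothetical repo-level rules file (the filename and wording are illustrative, not a specific product's format) that every agent session loads might look like this:

```markdown
# Coding standards (loaded by every agent session)

- Private methods returning more than one value use a named tuple
  return type, e.g. `(int Count, decimal Total)`, never a one-off class.
- Always name tuple fields; positional `Item1`/`Item2` access is banned.
- Prefer pattern matching and switch expressions over `switch`-`case`;
  cover every case so the compiler can check exhaustiveness.
- Match the existing naming and file layout; do not introduce new
  abstractions unless a rule above requires one.
```

Written once, a file like this is amortized across every subsequent generation pass, which is the sense in which the specification cost is "one-time" for the general rules even though per-task intent still has to be specified each time.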