I mean, if you define "do well" charitably, as getting a first draft of something interesting that may or may not be a step toward a solution, that's not completely wrong. A pass/fail test is verified feedback of a sort, one the AI can iterate on quickly. What's very wrong is expecting to get away with only checking that the tests pass, without even loosely surveying what the AI generated (which is invariably what people do when they submit a bunch of vibe-coded pull requests at 10k+ lines each and call that a "gain" in productivity).
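
To make the failure mode concrete, here's a toy sketch (the primality example and the test are hypothetical, not from any real PR) of how green tests can be a weak signal on their own:

```python
# Hypothetical example: the test suite below "passes", but the
# implementation just memorizes the tested cases instead of
# implementing the actual logic -- the kind of shortcut that
# satisfies pass/fail feedback without being correct.

def is_prime(n: int) -> bool:
    # Overfits to the exact inputs the tests check.
    return n in {2, 3, 5, 7}

def test_is_prime():
    assert is_prime(2)
    assert is_prime(7)
    assert not is_prime(4)
    assert not is_prime(9)

test_is_prime()          # passes
print(is_prime(11))      # False -- wrong for every untested prime
```

The suite goes green, yet the function is wrong for every prime it wasn't tested on. A thirty-second skim of the diff would catch this; a CI checkmark never will.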