undefined

points

[-]

First, it is important for these discussions that people include details like I did. We're all better off to not generalize.

RE: Claude Code, no I haven't used it, but I did do the Anthropic interview problem, beating all of Anthropic's reported Claude scores even with custom harnesses etc.

It's not a dunk that agents can't produce "as fast as can be" code; their code is usually still reasonably fast; it's just often 2-10x slower than can be.

by jacobr156 minutes ago|

parent|

[-]

There is a lot to be done with good prompting.

Early on, these code agents wouldn't do basic good hygiene things, like check if the code compiled, avoid hallucinating weird modules, writing unit tests. And people would say they sucked ....

But if you just asked them to do those things: "After you write a file lint it and fix issues. After you finish this feature, write unit tests and fix all issues, etc ..."

Well, then they did that, it was great! Later the default prompts of these systems included enough verbiage to do that, you could get lazy again. Plus the models are are being optimized to know to do some of these things, and also avoid some bad code patterns from the start.

But the same applies to performance today. If you ask it to optimize for performance, to use a profiler, to analyze the algorithms and systemically try various optimization approaches ... it will do so, often to very good results.