by NitpickLawyer 6 hours ago

by amelius 6 hours ago

> If pass -> give carrot :)

More like: give $$$, pass or not.

by onlyrealcuzzo 2 hours ago

The bad thing about software is that there are infinitely many ways to solve the same problem, and the vast majority of them are terrible and unmaintainable, so "working" is a prerequisite but not really "good enough".

It's good if no LLMs can find a bug. It certainly does not mean there isn't one...

I've found LLMs to be very disappointing at identifying overly complex code (including code they've written themselves) and at finding the architectural decisions that 1) make the code actually work and 2) keep it simple, maintainable, and future-proof.

They can certainly find some bugs, which definitely has value, but I've not had much success with them writing code that simply has no bugs...

That requires simplicity and architectural correctness, which LLMs are good at vaguely bullshitting about but not very good at getting right.

I think this can be solved by feeding them the right metrics, but I haven't found prior art for how to algorithmically pinpoint: 1) what is actually complex in a bad way (there are lots of rough ways to do this), 2) where exactly the problem is most acute (less prior art here, but some), and 3) what the viable solutions are.

If you can get better at 1 and 2, the LLMs can get much better at 3.
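To make 1 and 2 concrete, here's a minimal sketch of one of the rough approaches: count decision points per function (a cyclomatic-complexity-style score) and rank functions by it, so the worst offenders and their line numbers can be handed to the LLM. It assumes Python source and uses only the standard `ast` module; `branch_score`, `rank_functions`, and the set of node types counted are my own illustrative choices, not an established tool, and this says nothing about whether the complexity is "bad".

```python
import ast

# Node types counted as decision points. This set is an assumption;
# existing complexity tools differ on exactly what they count.
BRANCH_NODES = (
    ast.If, ast.For, ast.While, ast.ExceptHandler,
    ast.BoolOp, ast.IfExp, ast.comprehension,
)


def branch_score(func: ast.AST) -> int:
    """Rough cyclomatic-style score: 1 + number of decision points.

    Nested functions are counted toward their enclosing function too.
    """
    return 1 + sum(isinstance(n, BRANCH_NODES) for n in ast.walk(func))


def rank_functions(source: str) -> list[tuple[str, int, int]]:
    """Return (name, line, score) for every function, highest score first."""
    tree = ast.parse(source)
    rows = [
        (node.name, node.lineno, branch_score(node))
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    ]
    return sorted(rows, key=lambda row: row[2], reverse=True)
```

Calling `rank_functions(open(path).read())` on a module gives a "look here first" list. A raw branch count is a weak proxy, though: it flags where the control flow is dense, not whether a simpler design exists.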

Anybody who has ideas, I'd love to hear them, as this is what I'm working on now.