upvote
I'm expecting we'll likely end up back at agents making PRs, and having to review them. Either that, or giving up on quality and dealing with very messy code. I've been trying various automated testing/linting/etc. strategies, and they only work so well.
reply
That would be a nightmare. It's one thing to review a PR generated by a human who used AI and cares about the code; it's another to review wild agents, especially when they make changes everywhere.
reply
I'm not excited about it either, but the only ways I've been able to discover LLM-isms that sneak in are

1. seeing them flash by in the agent's window as it's making edits (i.e. manual oversight), or
2. running into an unexpected issue down the line.

If LLMs cannot automatically generate high-quality code, it seems like it may be difficult to automatically notice when they generate bad code.

reply
> I would love to figure out how to stop that from happening automatically.

AGENTS.md

reply
> AGENTS.md

-- which will be ignored just often enough that you can never quite trust it.

reply
Yup. No matter how much you tell it to keep things simple, modular, crisp, whatever, it generates tons of garbage much too often.
reply
Btw, it may be obvious, but afaik Claude by default only reads CLAUDE.md, not AGENTS.md.
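If that mismatch bites you, a common workaround (a sketch, assuming a Unix-like environment and that your instructions actually live in AGENTS.md) is to symlink one name to the other so both tools read the same file:

```shell
# Hypothetical repo setup: instructions live in AGENTS.md.
# Make CLAUDE.md a symlink so tools that read CLAUDE.md see the same content.
echo "Keep modules small; no speculative abstractions." > AGENTS.md
ln -sf AGENTS.md CLAUDE.md

# Both names now resolve to the same file
diff AGENTS.md CLAUDE.md && echo "in sync"
```

The upside of the symlink over copying is that there's only one file to edit, so the two can never drift apart.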
reply
And yet still less often than the average developer.
reply
I think the issue is deeper than prompts, AGENTS.md, smart flows, etc. The problem is that LLMs are searchers, trained to prefer some results. So if the dumb solution is in their training distribution and the smart solution is not, they won't spit it out.
reply