1. by glimpsing them as they scroll by in the agent's window while it's making edits (e.g. manual oversight), or 2. by running into an unexpected issue down the line.
If LLMs cannot automatically generate high-quality code, it seems it may also be difficult to automatically notice when they generate bad code.
AGENTS.md -- which will be ignored just often enough that you can never quite trust it.