> The full setup works with any project that has a benchmark and test suite.
So having a clear, measurable verification step is key. You can't simply give an AI agent a vague goal such as "improve the quality of the codebase", because it's too general.
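
To make "clear and measurable" concrete, here is a minimal sketch of what such a verification step could look like. It is an illustration, not the setup described in this post: `RunResult`, `accept_change`, and the 5% `min_speedup` threshold are hypothetical names and values. It assumes the benchmark reports a single wall-clock time where lower is better.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunResult:
    """Outcome of one test + benchmark run (hypothetical structure)."""
    tests_passed: bool         # did the full test suite pass?
    benchmark_seconds: float   # benchmark wall-clock time, lower is better


def accept_change(baseline: RunResult, candidate: RunResult,
                  min_speedup: float = 1.05) -> bool:
    """Return True only if the candidate passes tests and is measurably faster.

    min_speedup is an assumed threshold: 1.05 means "at least 5% faster".
    """
    if not candidate.tests_passed:
        return False  # correctness comes first: a fast but broken change is rejected
    if candidate.benchmark_seconds <= 0:
        return False  # a non-positive timing means the measurement itself is broken
    speedup = baseline.benchmark_seconds / candidate.benchmark_seconds
    return speedup >= min_speedup


baseline = RunResult(tests_passed=True, benchmark_seconds=10.0)
print(accept_change(baseline, RunResult(True, 8.0)))   # True: passing and 1.25x faster
print(accept_change(baseline, RunResult(False, 5.0)))  # False: faster, but tests fail
print(accept_change(baseline, RunResult(True, 9.9)))   # False: gain is below the threshold
```

The point of the sketch is that the agent's goal reduces to a yes/no check it can run after every change, which a goal like "improve the quality of the codebase" does not offer.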