You would think git history would be the first thing an agent looks at, given how many mistakes agents make before they get to the correct answer. They don't.

I haven't measured, but documenting bug fixes and architecture seems to help, along with TDD patterns, including integration tests.

I would probably add an instruction to CLAUDE.md telling the agent to look for all of the above when tackling a new bug.

reply
I made a harness that preserves memory for both user messages and task execution. One reason this works has to do with judge agents: they can't review information that was never written down, so I track everything in my harness. Based on my evals, the judge agents bring the most benefit. The coding agent can execute a task just as well without all the ceremony, but judging needs something to grasp besides the code. Adding new perspectives helps a lot; it is the most useful intervention. My flow: the user submits a task, the agent plans, judge agents review the plan, the main agent executes, and judge agents then review the execution. Tracking execution and judgements might consume more tokens, but it's worth it.
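To make the flow concrete, here is a rough sketch of that loop, not my actual harness. Every name in it (`Harness`, `record`, `review`) is made up for illustration, and the agents are plain callables standing in for real model calls. The point is only that each step is written to a log, and the judges read that log rather than just the code.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical stand-in: an "agent" is any function from text to text.
# In a real harness these would wrap LLM calls.
Agent = Callable[[str], str]


@dataclass
class Harness:
    planner: Agent
    executor: Agent
    judges: List[Agent]
    # The written record. Judges can only review what ends up in here.
    log: List[str] = field(default_factory=list)

    def record(self, kind: str, text: str) -> None:
        self.log.append(f"[{kind}] {text}")

    def review(self, stage: str) -> None:
        # Every judge sees the same transcript, so their verdicts are
        # independent perspectives on the same written record.
        transcript = "\n".join(self.log)
        for i, judge in enumerate(self.judges):
            self.record(f"judge{i}:{stage}", judge(transcript))

    def run(self, task: str) -> List[str]:
        self.record("user", task)                 # user submits a task
        self.record("plan", self.planner(task))   # main agent plans
        self.review("plan")                       # judges review the plan
        context = "\n".join(self.log)
        self.record("execution", self.executor(context))  # main agent executes
        self.review("execution")                  # judges review the execution
        return self.log
```

Adding a perspective is then just appending another callable to `judges`. The extra token cost shows up as the transcript growing with every recorded step.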
reply
My Claude Code frequently looks through git history, both when planning and when debugging.
reply
>Is searching through the current version of your code and possibly git history not enough?

While you can document everything and use git history, I think that keeping short entries in a kind of memory, recording past decisions and how issues were solved, would be much more token efficient than reading lots of documentation and digging through git history and past code.
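As a minimal sketch of what I mean by short entries, assuming nothing fancier than an in-memory list and keyword matching (the `Memory` and `MemoryEntry` names are hypothetical, and a real setup would likely persist to disk and use better retrieval):

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MemoryEntry:
    topic: str     # area the entry is about, e.g. a module or a bug class
    decision: str  # one-line summary of what was decided or how it was fixed


class Memory:
    def __init__(self) -> None:
        self.entries: List[MemoryEntry] = []

    def remember(self, topic: str, decision: str) -> None:
        self.entries.append(MemoryEntry(topic, decision))

    def recall(self, query: str, limit: int = 3) -> List[str]:
        # Naive case-insensitive keyword match, newest entries first.
        words = query.lower().split()
        hits = [
            e for e in reversed(self.entries)
            if any(w in f"{e.topic} {e.decision}".lower() for w in words)
        ]
        return [f"{e.topic}: {e.decision}" for e in hits[:limit]]
```

The agent would call `recall` with a few words describing the task and get back a handful of one-liners to put in context, instead of loading whole documents or diffs.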
