Yeah, I see it write to memory pretty regularly, and maybe that works sometimes. But for things I want it to always do or never do, I make the alternative impossible: a lint rule or some other style enforcement, or a test that fails if code violating the constraint shows up.
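A minimal sketch of that kind of guard test, assuming a pytest-style project with sources under `src/`. The banned pattern (bare `print(` calls), the `src` path, and the helper name `find_violations` are all hypothetical, picked only for illustration:

```python
import re
from pathlib import Path

# Hypothetical constraint: no bare print() calls in library code.
BANNED = re.compile(r"^\s*print\(")


def find_violations(root: Path, pattern: re.Pattern = BANNED) -> list[str]:
    """Return 'path:lineno' for every line under root that matches the pattern."""
    hits = []
    for path in sorted(root.rglob("*.py")):
        lines = path.read_text(encoding="utf-8").splitlines()
        for lineno, line in enumerate(lines, start=1):
            if pattern.search(line):
                hits.append(f"{path}:{lineno}")
    return hits


def test_no_banned_calls():
    # "src" is an assumed layout; point this at wherever the code lives.
    violations = find_violations(Path("src"))
    assert not violations, "banned print() calls:\n" + "\n".join(violations)
```

Because the failure message lists the offending file and line, the agent gets a concrete thing to fix instead of a rule it has to remember.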

It does do a good job of following existing conventions in a codebase, though, as long as they're really consistent. So the more actively you enforce that consistency, the more likely it is to do the right thing without memories or prompting.

I don't like "never do" or "always do" type rules in AGENTS.md or in memory, as it often over-interprets them and ties itself in knots trying to satisfy an impossible set of goals.

In my own multi-agent framework I use cheap models to check the responses of the expensive models, and I also run multiple expensive models adversarially in debate. The cheap models are great at spotting, e.g., the model getting stuck alternating between two broken ideas, not following code conventions, missing a step in the skill, and so on. I'm currently working on having them detect user corrections and police those going forward, intervening when the expensive models forget the thing you just corrected them about.
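To make the "stuck alternating between two broken ideas" case concrete, here is a rough sketch of the check itself. It is a plain heuristic standing in for the cheap-model checker, not the framework described above, and it assumes each attempt has already been reduced to a short fingerprint string (say, a normalized summary of the proposed fix). The function name `is_oscillating` and the `min_cycles` parameter are hypothetical:

```python
def is_oscillating(attempts: list[str], min_cycles: int = 2) -> bool:
    """True if the most recent attempts alternate A, B, A, B, ...

    `attempts` holds one fingerprint per attempt, oldest first.
    `min_cycles` is how many full A,B pairs must appear at the tail.
    """
    needed = 2 * min_cycles
    if len(attempts) < needed:
        return False
    tail = attempts[-needed:]
    a, b = tail[0], tail[1]
    if a == b:
        # Repeating one idea is a different failure mode, not oscillation.
        return False
    return all(x == (a if i % 2 == 0 else b) for i, x in enumerate(tail))
```

A supervisor loop could call this after each turn and interrupt the expensive model when it returns `True`. Producing good fingerprints is the hard part, and that is where a cheap model would actually earn its keep.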
I've explicitly banned Opus from creating memories unprompted, as it would often save incorrect info that would then propagate to future sessions until caught. Ugh x 10.