So presumably (assuming there isn't a bug where the sources are ignored in the CLI app) the problem is that encoding this state for the LLM isn't reliable. I.e. it gets what is effectively:
LLM said: thing A
User said: thing B
And it still manages to blur that somehow?
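For what it's worth, here's a minimal sketch (hypothetical, not Claude Code's actual internals) of how a CLI typically serializes a transcript before handing it to the model: each turn is reduced to a role tag plus text. The attribution only survives as tokens in the prompt, so nothing structurally prevents the model from blurring who said what.

```python
from dataclasses import dataclass

@dataclass
class Turn:
    role: str   # "user" or "assistant"
    text: str

def encode_transcript(turns: list[Turn]) -> str:
    """Flatten the conversation into the plain text the model actually sees."""
    return "\n".join(f"{t.role.capitalize()}: {t.text}" for t in turns)

history = [
    Turn("assistant", "thing A"),
    Turn("user", "thing B"),
]
print(encode_transcript(history))
# Assistant: thing A
# User: thing B
```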
I don't think the problem here is a bug in Claude Code. It's an inherent property of LLMs that context further back in the window has less impact on future tokens.
Like the other undesirable aspects of LLMs, maybe this gets "fixed" in CC by having the LLM RAG its own conversation history instead of relying on it to recall who said what from context. But you can never "fix" LLMs being next-token generators... because that is what they are.
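A rough sketch of what "RAG its own conversation history" could mean: score past turns against the current query and re-inject only the most relevant ones, rather than trusting the model to recall them from deep in the context. The word-overlap scoring here is a toy stand-in for real embedding similarity, and the function names are my own.

```python
from collections import Counter

def score(query: str, turn_text: str) -> float:
    """Crude lexical-overlap relevance score (placeholder for embedding similarity)."""
    q, t = Counter(query.lower().split()), Counter(turn_text.lower().split())
    return sum((q & t).values()) / (len(query.split()) or 1)

def retrieve(history: list[tuple[str, str]], query: str, k: int = 3) -> list[tuple[str, str]]:
    """Pick the k most relevant (role, text) turns to re-inject into the prompt."""
    return sorted(history, key=lambda turn: score(query, turn[1]), reverse=True)[:k]

history = [
    ("assistant", "thing A"),
    ("user", "thing B"),
    ("user", "please refactor the parser module"),
]
print(retrieve(history, "who asked about the parser?", k=1))
# [('user', 'please refactor the parser module')]
```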
But that doesn’t make them go away; it just makes them less glaring.
The magic is in deciding when and what to pass to the model. A lot of the time it works, but when it doesn't, this is why.
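To make "deciding when and what to pass" concrete, a hedged sketch of one common policy: walk backwards from the newest turn and keep whatever fits a token budget, dropping the rest. The 4-characters-per-token estimate and the budget number are illustrative assumptions, not what any real tool uses.

```python
def estimate_tokens(text: str) -> int:
    return max(1, len(text) // 4)  # rough heuristic, not a real tokenizer

def select_context(turns: list[tuple[str, str]], budget: int = 2000) -> list[tuple[str, str]]:
    """Keep the most recent turns that fit within the token budget."""
    kept, used = [], 0
    for role, text in reversed(turns):
        cost = estimate_tokens(text)
        if used + cost > budget:
            break
        kept.append((role, text))
        used += cost
    return list(reversed(kept))  # restore chronological order
```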