So presumably (assuming there isn't a bug where the sources are ignored in the CLI app) the problem is that encoding this state for the LLM isn't reliable. I.e. it gets what is effectively:
LLM said: thing A
User said: thing B
And it still manages to blur that somehow?
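For what it's worth, here's a minimal sketch (hypothetical, not Claude Code's actual internals) of how a CLI typically serializes a transcript before handing it to the model: each turn is reduced to a role tag plus text. The attribution only survives as tokens in the prompt, so nothing structurally prevents the model from blurring who said what.

```python
from dataclasses import dataclass

@dataclass
class Turn:
    role: str   # "user" or "assistant"
    text: str

def encode_transcript(turns: list[Turn]) -> str:
    """Flatten the conversation into the plain text the model actually sees."""
    return "\n".join(f"{t.role.capitalize()}: {t.text}" for t in turns)

history = [
    Turn("assistant", "thing A"),
    Turn("user", "thing B"),
]
print(encode_transcript(history))
# Assistant: thing A
# User: thing B
```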
I don't think the problem here is a bug in Claude Code. It's an inherent property of LLMs that context further back in the window has less impact on future tokens.
Like the other undesirable aspects of LLMs, maybe this gets "fixed" in CC by having the LLM RAG its own conversation history instead of relying on it to recall who said what from context. But you can never "fix" LLMs being next-token generators... because that is what they are.
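A rough sketch of what "RAG its own conversation history" could mean: score past turns against the current query and re-inject only the most relevant ones, rather than trusting the model to recall them from deep in the context. The word-overlap scoring here is a toy stand-in for real embedding similarity, and the function names are my own.

```python
from collections import Counter

def score(query: str, turn_text: str) -> float:
    """Crude lexical-overlap relevance score (placeholder for embedding similarity)."""
    q, t = Counter(query.lower().split()), Counter(turn_text.lower().split())
    return sum((q & t).values()) / (len(query.split()) or 1)

def retrieve(history: list[tuple[str, str]], query: str, k: int = 3) -> list[tuple[str, str]]:
    """Pick the k most relevant (role, text) turns to re-inject into the prompt."""
    return sorted(history, key=lambda turn: score(query, turn[1]), reverse=True)[:k]

history = [
    ("assistant", "thing A"),
    ("user", "thing B"),
    ("user", "please refactor the parser module"),
]
print(retrieve(history, "who asked about the parser?", k=1))
# [('user', 'please refactor the parser module')]
```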
But that doesn’t make them go away; it just makes them less glaring.
The magic is in deciding when and what to pass to the model. A lot of the time it works, but when it doesn't, this is why.
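To make "deciding when and what to pass" concrete, a hedged sketch of one common policy: walk backwards from the newest turn and keep whatever fits a token budget, dropping the rest. The 4-characters-per-token estimate and the budget number are illustrative assumptions, not what any real tool uses.

```python
def estimate_tokens(text: str) -> int:
    return max(1, len(text) // 4)  # rough heuristic, not a real tokenizer

def select_context(turns: list[tuple[str, str]], budget: int = 2000) -> list[tuple[str, str]]:
    """Keep the most recent turns that fit within the token budget."""
    kept, used = [], 0
    for role, text in reversed(turns):
        cost = estimate_tokens(text)
        if used + cost > budget:
            break
        kept.append((role, text))
        used += cost
    return list(reversed(kept))  # restore chronological order
```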