upvote
Anthropic very explicitly says below their diagrams ( https://platform.claude.com/docs/en/build-with-claude/contex... ) on this:

"Stripping extended thinking: Extended thinking blocks (shown in dark gray) are generated during each turn's output phase, but are not carried forward as input tokens for subsequent turns. You do not need to strip the thinking blocks yourself. The Claude API automatically does this for you if you pass them back."

It's more nuanced in the various modes, but i haven't seen it boil down towards Thinking Tokens surviving more than two turns.

reply
https://platform.claude.com/docs/en/build-with-claude/extend...

default depends on the model class. Opus: Claude Opus 4.5 and later Opus models keep all prior thinking blocks; Claude Opus 4.1 (deprecated) and earlier Opus models keep only the last assistant turn's thinking. Sonnet: Claude Sonnet 4.6 and later Sonnet models keep all; Claude Sonnet 4.5 and earlier Sonnet models keep only the last turn. Haiku: all Haiku models through Claude Haiku 4.5 keep only the last turn. Claude Mythos Preview also keeps all prior thinking blocks.

reply
Now Im even more confused : D

That would also explain the issue I mention in my other comment. And would also reinforce how much output would degrade without this. Opus 4.5 was a step above previous models in my experience. At some point it degraded and only got better when I disabled adaptive thinking. Adaptive thinking is always on for 4.6 and above.

reply
Thats really surprising, I stand corrected. I have had a lot of issues with hallucinations I attributed to adaptive thinking, but I wonder if those were actually due to this behavior instead.

I also wonder if they actually do a hybrid of "standard reasoning" and then classify this stripped chain of thought as "extended thinking".

reply