Wait... when some Claude 5x/20x users say they are getting "$2000 of tokens for $100," does the 2k value include cached tokens, counted at the same $/token either way?
We cannot be this dumb as a community, can we? I must be wrong/misunderstanding..
One spent 200,000 tokens, to produce 10,000.
The other spent 1.9 million.
It could have been a single LLM call (10k tokens). lmao
(I note that the latter was designed by a company whose main source of revenue is token spend...)
I have been "agentic coding" since Sonnet 3.5 and after this paper came out, it became my bible:
https://github.com/adobe-research/NoLiMa
Last I checked, all models suck as you fill the context window. "Context engineering" is how you do this whole thing.