It makes sense in scenarios where a model needs >200k tokens to answer a single prompt. You're shackled to a single session, and if the model hits compaction limits, it'll get lobotomized and give a shitty answer, so higher limits, even with degraded performance, are still an improvement.
They don't actually seem to charge more for >200k tokens on the API. Neither OpenRouter nor OpenAI's own API docs mention increased pricing for >200k context for GPT-5.4. I think the 2x limit usage for higher context is specific to using the model through a subscription in Codex.
I guess that you pay more for worse quality to unlock use cases that could maybe be solved by better context management.