upvote
It should be easy for a company like Anthropic to prove this beyond a doubt. Why don't they? Why don't they have a collection of prompts and side-by-side comparisons with other models showing how far ahead they are?
reply
I think it's mainly because the difference in models at the frontier isn't "response to prompt X", but rather "coherence with 500K tokens of context and instructions in play"
reply
Good morning to the Anthropic office good sir
reply