Some of it’s probably timing. Some of it is wanting to look good. That said, I just went to the claw-eval site, and neither 4.7 nor 5.5 from oAI are listed on the benchmarks. So there’s also just the time it takes others to get benchmarking done and published.
Opus-4.6 was probably the best model so far before it got nerfed. 4.7 is nowhere near the experience I had with 4.6. In fact, I stopped using 4.7 completely because, more often than not, its output is just dumber than what local models produce.
Still anecdotal, but the exact same coding task on the exact same repo (I clone from GitHub templates for projects) worked amazingly well in December with CC/Opus. By the end of March it couldn’t accomplish the goal anymore with essentially identical prompts, and 4.7 was just comically useless. Even recently I’ve tried repeatedly, and 4.6 still can’t do the thing it could in December.
Did you even use it? It was nerfed to hell and back. It stopped following instructions, forgot what sub-agents responded and so on. Stop spreading this pro-Anthropic narrative. They did a rug pull due to lack of compute.