Example 1, just off the top of my head: Composer 2.5 was released today. Go look at their benchmark.
Composer 2.5 and Opus 4.7 ranked around the same, while gpt-5.5 was miles ahead.
You wouldn’t have caught me dead using a gpt model 2 years ago.