The difference in progress in smaller models is far more impressive.
Compare Gemini 3.5 Flash to a ~16B parameter model from 24 months ago.
Compare GPT-5.5 to a frontier model 24 months ago.
Yes, GPT-5.5 got better. At orders of magnitude smaller parameter sizes (when factoring in ACTIVE parameters) the increase is far more pronounced.
And as good as 5.3 Codex is at writing code, 5.5 is easily just as good, if not better. But 5.5 is more than a one trick pony and it is much better at planning, writing copy, documentation, etc. I can choose to run 5.3-Codex instead of 5.5, but I never ever do.