Snarky, but true. It is truly astounding, and feels categorically different. But it's also perfectly useless at the moment. A digital fidget spinner.
Do you have the foresight of a nematode?
You don't actually need "frontier models" for Real Work (c).
(Summarization, classification and the rest of the usual NLP suspects.)
If we are going for accuracy, the question should be asked multiple times across multiple models to see whether the answers agree.
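For what it's worth, here is a rough sketch of that agreement check in Python. Everything in it is an assumption for illustration: `agreement`, `normalize`, and `samples_per_model` are made-up names, and each "model" is just a callable that takes the question and returns a string, so you would wrap whatever client you actually use. Exact-match voting only works for short, canonical answers (labels, numbers, yes/no); free-form text would need a fuzzier comparison.

```python
from collections import Counter
from typing import Callable, Iterable, Tuple


def normalize(answer: str) -> str:
    """Lowercase and collapse whitespace so trivially different answers match."""
    return " ".join(answer.lower().split())


def agreement(
    question: str,
    models: Iterable[Callable[[str], str]],  # hypothetical: wrap your own clients
    samples_per_model: int = 3,
) -> Tuple[str, float]:
    """Ask every model several times; return the top answer and its vote share."""
    votes = Counter(
        normalize(model(question))
        for model in models
        for _ in range(samples_per_model)
    )
    if not votes:
        raise ValueError("no answers collected")
    answer, count = votes.most_common(1)[0]
    return answer, count / sum(votes.values())
```

A low vote share is the signal to distrust the answer or escalate to a human; where to put the threshold is a judgment call.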
But I do think once you hit 80B parameters, you can struggle to see the difference between that and SOTA.
That said, GPT4.5 was the GOAT. I can't imagine how expensive that one was to run.