Not just that, but there’s really no way to come to an objective consensus of how well the model is performing in the first place. See: literally every thread discussing a Claude outage or change of some kind. “Opus is absolutely incredible, it’s one shotting work that would take me months” immediately followed by “no it’s totally nerfed now, it can’t even implement bubble sort for me.”
Funny: I’m literally, at this very moment, working on a way to monitor that across users. Wasn’t the initial goal, but it should do that nicely as well ^^
Suuuuuuure it was.
That said, I had way better experiences with old (but contemporary) Apple hardware than any other kind of old hardware.
Sometimes you have to keep starting new session until it works. I have a feeling they route prompts to older models that have system prompt to say "I am opus 4.6", but really it's something older and more basic. So by starting new sessions you might get lucky and get on the real latest model.