"personality issues" I was able to tell that Opus 4.7 would take instructions more literally, which I appreciated once I calibrated my phrasing to be more precise (often asking to investigate issues, pre-4.7 it'd start making code changes instead of just giving write up). But I can see contexts where handling vague prompts would've just been worse
Opus 4.8 is the first tangible improvement since Opus 4.5. And it doesn't seem to have the personality problems of the last release -- I've been enjoying using it.
That'll populate over the next couple of weeks -- those are the live games on the spectate tab, which take a while to generate statistically worthwhile data. I'm curious how it does. From using it all day, I can say Opus 4.8 is my new favorite model, hands down.