GPT-5.4 is currently the strongest model (this changes hourly)
Methodology: https://aistupidlevel.info/faq#methodology
False positives and poorly defined tasks or acceptance criteria have given some models insanely inflated scores on bad benchmarks.
And sure, you can say they're not disclosed to prevent gaming, but if you're the only one who can review them then they might as well be a random number generator display with an unreadable UI.
I'd also think that they would transparently degrade, just to prevent production outages for clients that are requesting Fable explicitly.
It did just use a small harness to run docker compose with different envs and other settings to validate a very small change, so... Feels like Fable
Opus 4.8 spams a lot more text. It'd be obvious.
> There's an issue with the selected model (claude-fable-5). It may not exist or you may not have access to it.
Claude Code v2.1.177
Fable 5 with low effort · Claude Max
~/testing
Never mind, it failed a few minutes later with:
There's an issue with the selected model (claude-fable-5). It may not exist or you may not have access to it. Run /model to pick a different model.
And now we're done. Oh well.
(I have never had an agent do enough to burn up the 5 hour quota on Max)
(edit: just switched my CC model to 4.8 and my 5-hr cycle reset back to 0%, even though it previously had 2 more hours to go)
> There's an issue with the selected model (claude-fable-5). It may not exist or you may not have access to it. Run /model to pick a different model.
edit: And... it's gone
> There's an issue with the selected model (claude-fable-5). It may not exist or you may not have access to it. Run /model to pick a different model.