If by "got caught" you mean "published it in their system card paper".

(Admittedly it was buried pretty deep in that 300+ page PDF, but they did at least disclose it. If they hadn't I imagine it would have taken quite some time for the research community to figure out what was going on.)

by afthonos 2 hours ago

It was in the announcement, too. I’m 99% sure they edited it after they changed their mind, because I knew about it from reading that, and never opened the model card.

by skavi 2 hours ago

On the earliest web archive snapshot I can find [0], I do not see any mention of the safeguard/sabotage under discussion [1].

And to be clear, this isn't the safeguard where the model is explicitly downgraded to Opus, but rather where the Fable/Mythos model's "effectiveness" is transparently "limited" via "prompt modification, steering vectors, or parameter-efficient fine-tuning (PEFT)".

[0]: https://web.archive.org/web/20260609173222/https://www.anthr...

[1]: https://simonwillison.net/2026/Jun/10/if-claude-fable-stops-...

by ajyoon 1 hour ago

It wasn't buried, it was on the third page after the ToC.

by bellowsgulch 2 hours ago

Yes, I actually do mean that. I skimmed the system card. Stating it openly, doing it, and then being called out on it isn't meaningfully different from getting caught.

They could have simply told people "we do not permit using Claude models to perform frontier AI research," which is defensible from a policy point of view. Restricting this particular usage of their products requires no deception, nor hiding information to prevent abuse.

Instead, for some reason, they chose to publicly display a morally poor way of executing a reasonable business decision (preventing abuse, defending your business interests, etc.).

by afthonos 2 hours ago

They didn’t get caught, they explicitly said they would do that in the announcement. I think it was both bad and a weird idea, but it certainly wasn’t sneaky.

by cyanydeez 2 hours ago

Is it a moat or just a way to implement the permanent underclass?