It triggered for me when I asked "Web search for your own model card (released today) and pick out your favourite highlights from the pdf"
reply
Did not trigger for me (Fable answered the question), so I guess the filters are either non-deterministic or are still being tweaked.
reply
Interesting, I assumed all model routing was done with an LLM (i.e. non-deterministically).
reply
It’s possible that there’s a set of words or phrases that route deterministically to save money on obvious stuff.

I kind of wonder, though, which model they’re using to do the routing. Running these kinds of checks on every request seems like a huge added cost.
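
To make the idea concrete, here's a rough sketch of what a two-tier setup could look like: a cheap regex pass catches the obvious cases, and only the rest go to a model-based classifier. Everything here is hypothetical. The patterns, the `route` function, the `"filtered"` label, and the `classify` callback are made up for illustration, and nothing in this thread confirms how any real router works.

```python
import re
from typing import Callable

# Hypothetical patterns, invented for illustration only.
OBVIOUS_PATTERNS = [
    re.compile(r"\bmodel card\b", re.IGNORECASE),
    re.compile(r"\bsystem prompt\b", re.IGNORECASE),
]


def route(prompt: str, classify: Callable[[str], str]) -> str:
    """Return a route label, trying the cheap deterministic pass first."""
    for pattern in OBVIOUS_PATTERNS:
        if pattern.search(prompt):
            # Deterministic hit: the expensive classifier is never called.
            return "filtered"
    # No obvious match: fall back to the (assumed) model-based classifier.
    return classify(prompt)
```

With something like this, the per-request cost is a few regex scans, and the expensive call only happens on prompts the patterns miss. If the fallback classifier is itself non-deterministic, that would also fit with the same prompt triggering for one person and not another.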

reply
Wasn't it leaked in the Claude Code source that it was all regex?
reply
Don't worry. They're just leaving the door open for OpenAI and other model makers.

They'll relax these safeguards once competition increases.

reply
IIRC Opus 4.7 had the same problem: safety filters were triggered way too easily at the beginning.
reply
sunglasses _are_ safety filters
reply