"don't do something that would make me get mad at you."

These prompts sound like abusive relationships.

> "NEVER FUCKING GUESS!"

"Oops, I guessed! I'm Sorry~~ uWu!!"

- Claude Opus 4.6, when asked to run a root cause analysis on itself

hmmmm ok, what if we add a bit more profanity to that? perhaps some extra exclamation marks? maybe that'll make the agents actually follow the rules?