Yes, and the previous approach Anthropic took was "allow anything that looks remotely benign". The only thing that got a refusal was a downright "write an exploit for me". Which is why I favored Anthropic's models.

It remains to be seen whether Anthropic's models are still usable now.

I know just how much of a clusterfuck their "CBRN filter" is, so I'm fearing the worst.

But this technology is out there now. The cat's out of the bag, and there's no going back to a world where people can't ask AI to write malware for them.

I'd argue that black hats will find a way to get uncensored models and use them to write malware either way. Further restricting generally available LLMs for cybersecurity use would hurt white hats and programmers pentesting their own code far more, which would in turn help the black hats by giving them an advantage in finding unpatched vulnerabilities.
