I think the two of you might be using different meanings of the word "safety".
You're right that it's dangerous for governments to have this new technology. We're all a bit less "safe" now that they can create weapons that are more intelligent.
The other meaning of "safety" is alignment - meaning, the AI does what you want it to do (subtly different than "does what it's told").
I don't think that Anthropic or any corporation can keep us safe from governments using AI. I think governments have the resources to create AIs that kill, no matter what Anthropic does with Claude.
So for me, the real safety issue is alignment. And even if a rogue government (or my own government) decides to kill me, it's in my best interest that the AI be well aligned, so that at least some humans get to live.
What line are we talking about?
You reckon?
Ok, so now every random lone wolf attacker can ask for help with designing and performing whatever attack with whatever DIY weapon system the AI is competent to help with.
Right now, what keeps us safe from serious threats is the limited competence of both humans and AI, including at removing alignment from open models, plus whatever safeguards exist in ChatGPT models specifically, given that ChatGPT is synonymous with LLMs for 90% of the population.
Used to be true, when facing any competent attacker.
When the attacker needs an AI in order to gain the competence to unlock an AI that would help it unlock itself?
I wouldn't say it's definitely a different case, but it certainly seems like it should be a different case.
There are several open source models with no built-in (or trivial to escape) safeguards. Of course they can afford that because they are non-commercial.
Anthropic can’t afford a headline like “Claude helped a terrorist build a bomb”.
And this whataboutism is completely meaningless. See: P. A. Luty’s Expedient Homemade Firearms (https://en.wikipedia.org/wiki/Philip_Luty), or the FGC-9 for 3D printing.
It’s trivial to build guns or bombs, and there’s a strong inverse correlation between people wanting to cause mass harm and those willing to learn how to do so.
I’m certain that _everyone_ looking for AI assistance even with your example would be learning about it for academic reasons, sheer curiosity, or would kill themselves in the process.
“What safeguards should LLMs have” is the wrong question. “When aren’t they going to have any?” is an inevitability. Perhaps not in widespread commercial products, but definitely widely-accessible ones.
Perhaps it won't flip. Perhaps LLMs will always be worse at this than humans. Perhaps all that code I just got was secretly outsourced to a secret cabal in India who can type faster than I can read.
I would prefer not to make the bet that universities continue to be better at solving problems than LLMs. And not just LLMs: AIs have been busy finding new dangerous chemicals since before most people had heard of LLMs.
Think of it this way. The hard part of a nuclear device is enriching the uranium. If you have it, a chimp could build the bomb.
But with bioweapons, yeah, that should be a solid zero. The ones actually doing it off an AI prompt aren't going to have access to a BSL-3 lab (or, more importantly, probably know nothing about cross-contamination), and just about everyone who has access to a BSL-3 lab should already have all the theoretical knowledge they would need for it.
a) Uncensored and simple technology for all humans; that's our birthright and what makes us special and interesting creatures. It's dangerous and requires a vibrant society of ongoing ethical discussion.
b) No governments at all in the internet age. Nobody has any particular authority to initiate violence.
That's where the line goes. We're still probably a few centuries away, but all the more reason to hone our course now.
Well, yeah I think that's a very reasonable worldview: when a very tiny number of people have the capability to "do what they want", or I might phrase it as, "effect change on the world", then we get the easy-to-observe absolute corruption that comes with absolute power.
As a different human species emerges such that many people (and even intelligences that we can't easily understand as discrete persons) have this capability, our better angels will prevail.
I'm a firm believer that nobody _wants_ to drop explosives from airplanes onto children halfway around the world, or rape and torture them on a remote island; these things stem from profoundly perverse incentive structures.
I believe that governments were an extremely important feature of our evolution, but are no longer necessary and are causing these incentives. We've been aboard a lifeboat for the past few millennia, crossing the choppy seas from agriculture to information. But now that we're on the other shore, it no longer makes sense to enforce the rules that were needed to maintain order on the lifeboat.