The Pentagon is thinking [1] about severing ties with Anthropic because of its terms of use, and in every prior case we've reviewed (I'm the Chief Investment Officer of Ethical Capital), the ethics policy was deleted or rolled back when that happens.
Corporate strategy is (by definition) a set of tradeoffs: things you do, and things you don't do. When Google (or Microsoft, or whoever) rolls back an ethics policy under pressure like this, what they reveal is that ethical governance was a nice-to-have, not a core part of their strategy.
We're happy users of Claude for similar reasons (perception that Anthropic has a better handle on ethics), but companies always find new and exciting ways to disappoint you. I really hope that Anthropic holds fast, and can serve in future as a case in point that the Public Benefit Corporation is not a purely aesthetic form.
But you know, we'll see.
[1] https://thehill.com/policy/defense/5740369-pentagon-anthropi...
Edit: the true "test" will really be whether Anthropic can maintain its AI lead _while_ holding to ethical restrictions on its usage. If Google and OpenAI can surpass them or stay close behind without the same ethical restrictions, the outcome for humanity will still be very bad. Employees at these places can also vote with their feet, and it does seem like a lot of folks want to work at Anthropic over the alternatives.
[1] https://www.wired.com/story/google-responsible-ai-principles... [2] https://classroom.ricksteves.com/videos/fascism-and-the-econ...
Codex quite often refuses to do "unsafe/unethical" things that Anthropic models will happily do without question.
Anthropic just raised $30bn... OpenAI wants to raise $100bn+.
Thinking any of them will actually be restrained by ethics is foolish.
The 'boy (or girl) who cried wolf' isn't just a story. It's a lesson for both the person and the village that hears them.
Global Warming, Invasion, Impunity, and yes Inequality
Also, the trajectories of celestial bodies can be predicted with a somewhat decent level of accuracy. Pretending societal changes can be predicted equally well is borderline bad faith.
Besides, you do realize that the film is a satire, and that the comet was an analogy, right? It draws parallels with real-world science denialism around climate change, COVID-19, etc. Dismissing the opinion of an "AI" domain expert based on fairly flawed reasoning is an obvious extension of this analogy.
I think "safety research" has a tendency to attract doomers. So when one of them quits while preaching doom, they are behaving par for the course. There's little new information in someone doing something that fits their type.
https://x.com/MrinankSharma/status/2020881722003583421
A slightly longer quote:
> The world is in peril. And not just from AI, or from bioweapons, but from a whole series of interconnected crises unfolding at this very moment.
In a footnote he refers to the "poly-crisis."
There are all sorts of things one might decide to do in response, including getting more involved in US politics, working more on climate change, or working on other existential risks.
Claude invented something completely nonsensical:
> This is a classic upside-down cup trick! The cup is designed to be flipped — you drink from it by turning it upside down, which makes the sealed end the bottom and the open end the top. Once flipped, it functions just like a normal cup. *The sealed "top" prevents it from spilling while it's in its resting position, but the moment you flip it, you can drink normally from the open end.*
Emphasis mine.
I can't really take this very seriously without seeing the list of these ostensible "unethical" things that Anthropic models will allow over other providers.
Bring on the cryptocore.
That's why I have a functioning brain, to discern between ethical and unethical, among other things.
It's more like a hammer which makes its own independent evaluation of the ethics of every project you seek to use it on, and refuses to work whenever it judges against that – sometimes inscrutably or for obviously poor reasons.
If I use a hammer to bash in someone else's head, I'm the one going to prison, not the hammer or the hammer manufacturer or the hardware store I bought it from. And that's how it should be.
Here's some rules about dogs: https://en.wikipedia.org/wiki/Dangerous_Dogs_Act_1991
How many people do frontier AI models kill each year, in circumstances nobody would justify?
The Pentagon has already received Claude's help in killing people, but the ethics and legality of those acts are disputed – when a dog kills a three year old, nobody is calling that a good thing or even the lesser evil.
Dunno, stats aren't recorded.
But I can say there are wrongful death lawsuits naming some of the labs and their models. And there was that anecdote a while back about raw garlic infused olive oil botulism, a search for which reminded me about AI-generated mushroom "guides": https://news.ycombinator.com/item?id=40724714
Do you count death by self-driving car in such stats? If someone takes medical advice and dies, is that reported like people who drive off an unsafe bridge while following Google Maps?
But this is all danger by incompetence. The opposite, danger by competence, is where they enable people to become more dangerous than they otherwise would have been.
With a competent planner with no moral compass, you only find out how bad it can be when it's much too late. I don't think LLMs are that danger yet; even with METR timelines that's 3 years off. But I think it's best to aim for where the ball will be, rather than where it is.
Then there's LLM-psychosis, which isn't on the competent-incompetent spectrum at all, and I have no idea if that affects people who weren't already prone to psychosis, or indeed if it's really just a moral panic hallucinated by the milieu.
Without safety features, an LLM could also help plan a terrorist attack.
A smart, competent terrorist can plan a successful attack without help from Claude. But most would-be terrorists aren't that smart and competent. Many are caught before hurting anyone or do far less damage than they could have. An LLM can help walk you through every step, and answer all your questions along the way. It could, say, explain to you all the different bomb chemistries, recommend one for your use case, help you source materials, and walk you through how to build the bomb safely. It lowers the bar for who can do this.
[1] https://www.theguardian.com/technology/2026/feb/14/us-milita...
For the bomb example, the barrier to entry is just sourcing some chemicals. Wikipedia has quite detailed descriptions of the manufacture of all the popular bombs you can think of.
The question is, at what point does some AI become competent enough to engineer one? And that's just one example, it's an illustration of the category and not the specific sole risk.
If the model makers don't know that in advance, the argument given for delaying GPT-2 applies: you can't take back publication, better to have a standard of excess caution.
I think the two of you might be using different meanings of the word "safety"
You're right that it's dangerous for governments to have this new technology. We're all a bit less "safe" now that they can create weapons that are more intelligent.
The other meaning of "safety" is alignment - meaning, the AI does what you want it to do (subtly different than "does what it's told").
I don't think that Anthropic or any corporation can keep us safe from governments using AI. I think governments have the resources to create AIs that kill, no matter what Anthropic does with Claude.
So for me, the real safety issue is alignment. And even if a rogue government (or my own government) decides to kill me, it's in my best interest that the AI be well aligned, so that at least some humans get to live.
What line are we talking about?
You reckon?
Ok, so now every random lone wolf attacker can ask for help with designing and performing whatever attack with whatever DIY weapon system the AI is competent to help with.
Right now, what keeps us safe from serious threats is limited competence of both humans and AI, including for removing alignment from open models, plus any safeties in specifically ChatGPT models and how ChatGPT is synonymous with LLMs for 90% of the population.
Used to be true, when facing any competent attacker.
When the attacker needs an AI in order to gain the competence to unlock an AI that would help it unlock itself?
I wouldn't say it's definitely a different case, but it certainly seems like it should be a different case.
There are several open source models with no built-in (or trivial-to-escape) safeguards. Of course they can afford that because they are non-commercial.
Anthropic can’t afford a headline like “Claude helped a terrorist build a bomb”.
And this whataboutism is completely meaningless. See: P. A. Luty’s Expedient Homemade Firearms (https://en.wikipedia.org/wiki/Philip_Luty), or FGC-9 when 3D printing.
It’s trivial to build guns or bombs, and there’s a strong inverse correlation between people wanting to cause mass harm and those willing to learn how to do so.
I’m certain that _everyone_ looking for AI assistance even with your example would be learning about it for academic reasons, sheer curiosity, or would kill themselves in the process.
“What safeguards should LLMs have?” is the wrong question. LLMs without any safeguards are an inevitability. Perhaps not widespread commercial products, but definitely widely-accessible ones.
Perhaps it won't flip. Perhaps LLMs will always be worse at this than humans. Perhaps all that code I just got was secretly outsourced to a secret cabal in India who can type faster than I can read.
I would prefer not to make the bet that universities continue to be better at solving problems than LLMs. And not just LLMs: AI have been busy finding new dangerous chemicals since before most people had heard of LLMs.
Think of it this way. The hard part of a nuclear device is enriching the uranium. If you have it, a chimp could build the bomb.
But with bioweapons, yeah, that should be a solid zero. The ones actually doing it off an AI prompt aren't going to have access to a BSL-3 lab (or more importantly, probably know nothing about cross-contamination), and just about everyone who has access to a BSL-3 lab, should already have all the theoretical knowledge they would need for it.
a) Uncensored and simple technology for all humans; that's our birthright and what makes us special and interesting creatures. It's dangerous and requires a vibrant society of ongoing ethical discussion.
b) No governments at all in the internet age. Nobody has any particular authority to initiate violence.
That's where the line goes. We're still probably a few centuries away, but all the more reason to home in on our course now.
Well, yeah I think that's a very reasonable worldview: when a very tiny number of people have the capability to "do what they want", or I might phrase it as, "effect change on the world", then we get the easy-to-observe absolute corruption that comes with absolute power.
As a different human species emerges such that many people (and even intelligences that we can't easily understand as discrete persons) have this capability, our better angels will prevail.
I'm a firm believer that nobody _wants_ to drop explosives from airplanes onto children halfway around the world, or rape and torture them on a remote island; these things stem from profoundly perverse incentive structures.
I believe that governments were an extremely important feature of our evolution, but are no longer necessary and are causing these incentives. We've been aboard a lifeboat for the past few millennia, crossing the choppy seas from agriculture to information. But now that we're on the other shore, it no longer makes sense to enforce the rules that were needed to maintain order on the lifeboat.
Thanks for the successful pitch. I am seriously considering them now.
I don't think that's what you're trying to convey.
Like, where Gemini or Claude will look up the info I'm citing and weigh the arguments made, ChatGPT will actually sometimes omit parts of or modify my statement if it wants to advocate for a more "neutral" understanding of reality. It's almost farcical sometimes in how it will try to avoid inference on political topics even where inference is necessary to understand the topic.
I suspect OpenAI is just trying to avoid the ire of either political side and has given it some rules that accidentally neuter its intelligence on these issues, but it made me realize how dangerous an unethical or politically aligned AI company could be.
Gemini and Claude have traces of this, but nowhere near the pit of atrocious tuning that OpenAI puts on ChatGPT.
Like grok/xAI you mean?
My concern is more over time if the federal government takes a more active role in trying to guide corporate behavior to align with moral or political goals. I think that's already occurring with the current administration but over a longer period of time if that ramps up and AI is woven into more things it could become much more harmful.
They nuked the internet by themselves. Basically they are the willing and happy instigators of the dead internet as long as they profit from it.
They are by no means ethical, they are a for-profit company.
I really hate this, and I'm not justifying their behaviour, but I have no clue how one can do without the other.
Game theory wise there is no solution except to declare (and enforce) spaces where leeching / degrading the environment is punished, and sharing, building, and giving back to the environment is rewarded.
Not financially, because it doesn't work that way, usually through social cred or mutual values.
But yeah, the internet can no longer be that space where people mutually agree to be nice to each other. Rather, utility extraction dominates (influencers, hype traders, social thought manipulators), and the rest of the world quietly leaves if they know what's good for them.
Lovely times, eh?
The user base of TikTok, Instagram, etc. has increased YoY. On average, people suck at making decisions for their own good.
Don't have a dog in this fight, haven't done enough research to proclaim any LLM provider as ethical but I pretty much know the reason Meta has an open source model isn't because they're good guys.
That's probably why you don't get it, then. Facebook was the primary contributor behind PyTorch, which basically set the stage for early GPT implementations.
For all the issues you might have with Meta's social media, Facebook AI Research Labs have an excellent reputation in the industry and contributed greatly to where we are now. Same goes for Google Brain/DeepMind despite Google's advertising monopoly; things aren't ethically black-and-white.
Say I'm your neighbor and I make a move on your wife, your wife tells you this. Now I'm hosting a BBQ which is free for all to come, everyone in the neighborhood cheers for me. A neighbor praises me for helping him fix his car.
Someone asks you if you're coming to the BBQ, you say to him nah.. you don't like me. They go, 'WHAT? jack_pp? He rescues dogs and helped fix my roof! How can you not like him?'
The same applies to tech. PyTorch didn't have to be FOSS, nor TensorFlow. In that timeline CUDA might have a total monopoly on consumer inference. Out of all the myriad ways that AI could have been developed and proliferated, we are very lucky that it happened in a public, friendly rivalry between two useless companies with money to burn. The ethical consequences of AI being monopolized by a proprietary prison warden like Nvidia or Apple would be comparatively apocalyptic.
My problem is you seem naive enough to believe Zuck decided to open source stuff out of the goodness of his heart and not because he did some math in his head and decided it's advantageous to him, from a game theoretic standpoint, to commoditize LLMs.
To even have the audacity to claim Meta is ETHICAL is baffling to me. Have you ever used FB / Instagram? Meta is literally the gangster selling drugs and also playing the philanthropist where it costs him nothing and might also just bring him more money in the long term.
You must have no notion of good and evil if you believe for a second that one person can create Facebook, with all its dark patterns and blatant anti-user tactics, and also be ethical... because he open sourced stuff he couldn't make money from.
As far as these model releases, I believe the term is “open weights”.
We may not have the full logic introspection capabilities, the ease of modification (though you can still do some, like fine-tuning), and reproducibility that full source code offers, but open weight models bear more than a passing resemblance to the spirit of open source, even though they're not completely true to form.
With fully open source software (say, under GPL3), you can theoretically change anything and you are also quite sure about the provenance of the thing.
With an open weights model you can run it, which is good, but the amount of stuff you can change is limited. It is also a big black box that could hide some surprises from whoever created it, which could possibly be triggered later by input.
And lastly, you don't really know what the open weight model was trained on, which can again be reflected in its output, not to mention potential liabilities later on if the authors were really carefree about their training set.
I would only use it for certain things, and I guess others are finding that useful too.
Why anyone would want a model that has "safety" features is beyond me. These features are not in the user's interest.
Any thread these days is filled with "@grok is this true?" low effort comments. Not to mention the episode in which people spent two weeks using Grok to undress underage girls.
Am I missing out?
I opted to upgrade my seat to premium for $100/mo, and I've used it to write code that would have taken a human several hours or days to complete, in that time. I wish I would have done this sooner.
Cline is not in the same league as Codex CLI, btw. You can use Codex models via Copilot OAuth in pi.dev. Just make sure to play with the thinking level. That gives roughly the same experience as Codex CLI.
I've just switched so haven't run into constraints yet.
You get vastly more usage at highest reasoning level for GPT 5.3 on the $20/mo Codex plan, I can't even recall the last time I've hit a rate limit. Compared to how often I would burn through the session quota of Opus 4.6 in <1hr on the Claude Pro $20/mo plan (which is only $17 if you're paying annually btw).
I don't trust any of these VC funded AI labs or consider one more or less evil than the other, but I get a crazy amount of value from the cheap Codex plan (and can freely use it with OpenCode), so that's good enough for me. If and when that changes, I'll switch again; having brand loyalty or believing a company follows an actual ethical framework based on words or vibes just seems crazy to me.
Anthropic are the only ones who emptied all the money from my account "due to inactivity" after 12 months.
Damning with faint praise.
Oddly enough, I feel pretty good about Google here with Sergey more involved.
• Can't pay with iOS In-App-Purchases
• Can't Sign in with Apple on website (can on iOS but only Sign in with Google is supported on web??)
• Can't remove payment info from account
• Can't get support from a human
• Copy-pasting text from Notes etc gets mangled
• Almost months and no fixes
Codex and its Mac app are a much better UX, and seem better with Swift and Godot than Claude was.
Claude is marginally better. Both are moderately useful depending on the context.
I don't trust any of them (I also have no trust in Google nor in X). Those are all evil companies and the world would be better if they disappeared.
I mean, what clown show are we living in at this point? Claims like this simply run rampant with 0 support or references.
Google, like Microsoft, Apple, Amazon, etc were, and still are, proud partners of the US intelligence community. That same US IC that lies to congress, kills people based on metadata, murders civilians, suppresses democracy, and is currently carrying out violent mass round-ups and deportations of harmless people, including women and children.
https://abc.xyz/investor/board-and-governance/google-code-of...