upvote
This is a price discrimination/upsell strategy. Sure, if you just want software, use our public model. Don’t worry; it’s safe.

But if you want your model to be secure, and you want to deal with dangerous stuff, contact us for pricing. BTW if you don’t pay for us to pentest you, maybe someone else will, idk.

Oh also you’re not allowed to pentest yourself with our public models anymore because it looks like hacking

reply
The fundamental tension is that the models are getting weirdly good at hacking while still sort of sucking at a bunch of economically valuable tasks.

So they've hit the point where the models are simultaneously too smart (dangerous hacking abilities) and too stupid (can't actually replace most employees). So at this point they need to make the models bigger, but they're already too big.

So the only thing left to do is to make them selectively stupider. I didn't think that would be possible, but it seems like they're already working on that.

reply
models are getting weirdly good at hacking while still sort of sucking at a bunch of economically valuable tasks

like most human hackers

reply
They are training them on decompilation and reverse engineering/black-box reimplementation/pentesting because it's one of the best ways to generate interesting and rare RL traces for agentic coding AND to teach them how lots of things work under the hood.

Just throw Claude at millions of binaries and you can get amazing training data. Oh wait, 4.7 gives you refusals for that now.
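The appeal of that loop is that the reward is automatically checkable: treat the binary as a black box, record its input/output behavior, and score a candidate reimplementation by how often it matches. A toy sketch of that idea — the reference function and scoring here are invented for illustration, not anyone's actual training pipeline:

```python
import random

def reference(x: int) -> int:
    # Stand-in for a decompiled binary's behavior (made-up example).
    return (x * 31 + 7) % 256

def score_reimplementation(candidate, blackbox, trials: int = 1000) -> float:
    """Fraction of random probes where the candidate matches the black box.

    In an RL setup this fraction could serve directly as the reward:
    behavioral equivalence is cheap to check even when source is unavailable.
    """
    rng = random.Random(0)  # fixed seed so the score is reproducible
    hits = 0
    for _ in range(trials):
        x = rng.randrange(-10**6, 10**6)
        if candidate(x) == blackbox(x):
            hits += 1
    return hits / trials

# A model-proposed reimplementation that happens to be exactly equivalent
# (in Python, x % 256 and x & 0xFF agree even for negative ints).
def proposed(x: int) -> int:
    return (31 * x + 7) & 0xFF

# score_reimplementation(proposed, reference) scores a perfect 1.0;
# a buggy attempt lands somewhere below that.
```

The point is that the grader needs no ground-truth source code, only the ability to run the original, which is exactly what millions of binaries provide.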

reply
Honestly, I sometimes feel like about the only thing they do successfully is hacking. Not just in the sense of breaking into systems that were assumed to be secure, although also in that sense. They're just highly effective at fumbling around with a hatchet until something works. We happen to have version control and automated testing, which generally makes that approach somewhat viable for programming. I've been genuinely impressed at how much they can get features into a workable state, but looking at the output, I've never been confident it will rise above POC quality at the current state of things. Still, they're pretty dang effective at that, given enough time and a safe space to hack away and reset until the product looks close enough.
reply
Yes, it's a losing strategy; no one else is going to do this. They are inviting parties to partner with them, so it's not totally in conflict. I'm sure there's genuine concern coming out of Anthropic, but I also think that at this point they've culturally internalized "Dangerous [think: powerful] AI" as a brand narrative.

"The Beware of Mythos!" reads to me as standard Anthropic/Dario copy. Is it more true now than it was before? Sure. Is now the moment that the world's digital infrastructure succumbs to waves of hackers using countless exploits; I doubt it.

reply
>Is now the moment that the world's digital infrastructure succumbs to waves of hackers using countless exploits? I doubt it.

I am not in cybersecurity, but the existing security "technical debt" has barely been exploited.

The issue is that practically all software has some vulnerability, whether you like it or not. And these LLMs can brute-force the possibilities faster than any human. Humans often ignore low-severity issues, while these LLMs may be capable of chaining several of them into a working exploit.

To me, they've understood the moat: cybersecurity is such an easy space to get into, and I'd guess they're investing heavily in it because, as someone else mentioned in other threads, it's obvious the models are too limited for other tasks.

Becoming a "mandatory" (SOC-2 etc, things like that) integrated part of your CI/CD pipeline would be a huge win for them. Imagine that.

reply
This is the company that allowed a vibe-release resulting in the leak of the entire Claude Code codebase. What is the bar you're expecting here, exactly?
reply
Curious how the safeguards work and what impact they will have.

In general I feel that over-engineering safeguards in training comes at a noticeable cost to general intelligence. Like asking someone to solve a problem on a whiteboard in a job interview: in that situation, the stress slices off at least 10% of my IQ.

reply
While I believe that Mythos is better than the models we have right now, "too dangerous to release" sounds largely like a marketing gimmick to me. Well, it's not for me to speculate; I'll simply wait for the huge wave of security patches to all software in the coming weeks, as per Anthropic's claims.
reply
I feel it’s fine as a short term solution, and probably a good thing. Gives the good guys some time to stay on top.

Always remember: a defender must succeed every time, an attacker only once.

reply
Given the list of very large companies in the "glasswing" project, it is likely that every competent state actor and criminal organization already has access to Mythos in one way or another. Meanwhile, the open-source volunteers responsible for the security of the entire internet don't have access.
reply
So then why expect that you're making the world safer by limiting the capability your vendor-locked customers have access to, while attackers go find the best de-censored model that works for them, wherever they can find it?
reply
Yeah, it is easier to destroy than to create. Models will always be better at hacking than at building.
reply
I'm not a security expert and don't know how to properly audit every GitHub repo I come across. Sometimes I want to build GNOME extensions or cool software projects from source, and I'd like some level of checking along the way for known vulnerabilities. They can't claim this is an obvious win for security when it centralizes security rather than democratizing it.
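For the known-vulnerabilities part specifically, you don't need a frontier model: OSV.dev exposes a free query API that takes a package name, ecosystem, and version. A standard-library-only sketch — the endpoint and request shape follow OSV's public docs, but verify the details yourself before relying on them:

```python
import json
from urllib import request

OSV_ENDPOINT = "https://api.osv.dev/v1/query"

def build_osv_query(name: str, ecosystem: str, version: str) -> dict:
    """Request body for OSV's /v1/query endpoint."""
    return {"package": {"name": name, "ecosystem": ecosystem}, "version": version}

def known_vulns(name: str, ecosystem: str, version: str) -> list:
    """POST the query and return matching advisories (makes a network call)."""
    body = json.dumps(build_osv_query(name, ecosystem, version)).encode()
    req = request.Request(OSV_ENDPOINT, data=body,
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return json.load(resp).get("vulns", [])

# Example (requires network):
#   known_vulns("lodash", "npm", "4.17.15")
```

It only covers *known* advisories in the dependency graph, not novel bugs in the project itself, which is exactly the gap the parent comment is worried about.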
reply
I interpreted their actions as providing time for vendors to protect themselves against the new model proactively, not to nerf the models themselves.

Although perhaps I am naive.

reply