
> It's restricted because it's genuinely good at finding vulnerabilities, and employees felt that it's not a good idea to give this capability to everyone without letting defenders front-run.

It's a possibility, but it doesn't eliminate the possibility that it's hype. If these claims were indeed serious, they would submit it for independent analysis somewhere.

This isn't some crazy process. Defense contractors are required to submit their systems (secret sauce and all) for operational test and evaluation before they're fielded.

by afthonos 16 hours ago

> If these claims were indeed serious, they would submit it for independent analysis somewhere.

They have. 40 different companies that have all committed resources to patching their systems based on vulnerabilities found by Mythos. One of them, Google, is a frontier AI lab that pointedly did not say that their own models have found similar vulnerabilities.

> Defense contractors are required to submit their systems (secret sauce and all) for operational test and evaluation before they're fielded.

Does this look something like having 40 separate companies look at the outputs of the system, deciding that it’s real and they should do something about it, and committing resources to it?

At some point, “cynicism” is another word for “lalala can’t hear you”.

by jerf 16 hours ago

Another cross-check I've run is: are the claims Anthropic is making for Mythos that out of line with the current status of AI coding assistants?

To which my answer is clearly, no, not even remotely. If Anthropic is outright lying about what Mythos can do, someone else will have it in a year.

In fact, the security world would have to seriously consider the possibility that, even if Mythos didn't exist, nation states have the equivalent in hand already. And of course, if Mythos does exist, nation states have it now. The odds that Anthropic (and every other AI vendor) isn't penetrated enough by every major intelligence agency such that they have access to their choice of model approach zero.

I wonder about the overlap between people being skeptical of Mythos' capabilities, and those who are too skeptical of AI to have spent any time with it because they assume it can't be any good. If you are not aware of what frontier models routinely do, you may not realize that Mythos is just an evolution of existing capabilities, not a revolution. Even just taking a publicly-available frontier model, pointing it at a code base and telling it to "find the vulnerabilities and write exploits" produces disturbingly good results. I can see the weaknesses referenced by the Mythos numbers, especially around the actual writing of the exploits, but it's not like the current frontier models fall on their face and hallucinate wildly for this task. Most everything they produce when I try this is at least a "yeah, that's worth thinking about" rather than an instant dismissal.

by rakejake 16 hours ago

Sure, I am not precluding the possibility that they've trained a genuinely great model. All I am saying is that the "this model is better than that model" argument is moot when on one side you have model weights, and on the other side a whitepaper and some accompanying comments on the danger.

I'm not that old but have been here long enough that I remember when GPT-3 was considered too dangerous to release. Now you have models that are 10x as good, 1/10th the size, and run on 8GB of VRAM.

by dmix 4 hours ago

That safety stuff is almost always either quacks whose job it is to exaggerate LLMs at their nonprofits, or marketing hype that "our models are so powerful you should fear them". Then they release them and the world moves on and adapts.

Mythos will benefit security in the long run more than hackers, if it can do what they claim. And there's nothing that will stop an LLM like it from being released in the near term, so it's very likely just resource constraints or marketing.

by louiereederson 15 hours ago

I don't think you can say this with confidence, outside-in. It's not just about safety. The additional unknown is cost - I don't just mean API cost, but fully loaded cost for a given task. Is the model cost effective for tasks such that it has product market fit?

We don't yet know if Mythos was a level shift in the capability/cost frontier, or a continued extension of the same logarithmic capability/cost curve.

by solenoid0937 15 hours ago

Some people have access to the model for red team purposes as part of Glasswing, and they came away quite spooked, according to what I heard.

by louiereederson 14 hours ago

I don't doubt it. I just mean the decision to release or not release generally may also be informed by the commercial/economic viability of the model for general usage patterns versus extremely high-value patterns like vulnerability assessment.

by jayd16 15 hours ago

If it wasn't marketing it wouldn't have fancy branding... It wouldn't even be announced.

by frank-romita 16 hours ago

Or, they created the illusion that it's restricted for security reasons, but in reality they just lack what is necessary for this to be used widely!

by zzzeek 16 hours ago

it seems likely it's both a better model to some unknown extent and doing this "we have to give it to the defenders first" thing is super great marketing material. it seems an entirely natural marketing campaign "announce that we can't even give the model to everyone at first, it's so great!", plus there's some truth to it, even better.

unless you are an employee at anthropic and shouldn't be talking about any of this at all, there's no way to know what the model's capabilities are.

by 2983592 16 hours ago

How do you know? If you have access you are not unbiased, otherwise you cannot know by definition.

AI companies routinely claim that something is too dangerous to release (I think GPT-2 was the first case) for marketing reasons. There are at least 10 documented high-profile cases.

They keep it secret because they now sell to the MIC with China and North Korea bullshit stories as well as to companies who are invested in the AI hype themselves.

by Glemllksdf 15 hours ago

I prefer a more cautious approach than the Musk style, where stuff gets fixed after.

And with GPT-2 the worry was mass emails that were a lot better, more detailed, and more personal, plus social media campaigns, etc.

How many bots are deployed today on X and influencing democracy around the globe?

It's fair to say it had an impact, and LLMs still have.

by afthonos 16 hours ago

> How do you know? If you have access you are not unbiased, otherwise you cannot know by definition.

The platonic ideal of how to dismiss any argument by anyone about anything.

by SpicyLemonZest 15 hours ago

GPT-2 was obviously too dangerous to release at the time! It's OK-ish now, when the knowledge that AI can produce arbitrary text is widely shared. It would have been a disaster for scammers and phishers to get GPT-2 at a time when almost everyone still assumed that large volumes of detailed text proved there's a real human being on the other end of the conversation.

by jayd16 15 hours ago

And, as we all know, humans can't be scammers. They need the robots to lie.