There aren't many working on it though, definitely not enough given how many resources are going into building AI.

AI safety teams at these labs are largely focused on surface-level measures and aren't empowered to stop the company's progress. I was surprised when Anthropic initially held Mythos back from the public, but it was always a temporary measure to give controlled access rather than a pause to make meaningful improvements in AI safety.

reply
The only measures we see are the surface-level ones, because those are the only ones that sort of work.

Alignment is a hard, possibly impossible problem. Anthropic's gambit is that they'll luck upon a solution before the paperclip maximizers take over.

reply
But that's exactly my point. If they legitimately feared that AGI, or whatever the bar is, could significantly impact all of humanity in a bad way, they wouldn't be okay with saying "well this coat of paint sort of slows down the rust."

Either it's a dangerous technology or it isn't, and if it is, then surface-level fixes that kind of work are completely unacceptable.

reply
But that's the point. Assuming alignment is not possible and the risk caused by unaligned models is real, shouldn't all effort then go into preventing such models from existing in the first place?

...which would actually be an easy-to-solve problem unless you go out of your way to build such a model.

reply
How does building said models prevent them from existing?

Prevention should look a lot more like a global moratorium with whatever enforcement is necessary to stop and prevent any breaches of the agreement.

Edit: I did misread your comment on first pass; we may be in agreement here. Sorry!

reply
> Prevention should look a lot more like a global moratorium with whatever enforcement is necessary to stop and prevent any breaches of the agreement.

Yep, that was my point. Either the ostensible danger stemming from the models is not real, in which case this stuff is moot anyway, or it is, in which case why are we building them in the first place?

reply
I wish Ilya and crew would chime in.
reply
You mean the people who have a powerful incentive to lie and exaggerate to sell a product?
reply
When have they lied/exaggerated about the capabilities of their product?
reply
I find HN to be filled with reactionaries who overreact to every little thing when it comes to AI. Look at the response to Fable kicking some queries down to 4.8. If you read the comments, you would think this was 1984-level censorship and the end of AI as we know it. In reality, it was something most people would never run up against, and if you did, your query was kicked to a model that was state of the art literally a day ago. It's too much sometimes.
reply
Sometimes people on the inside are too involved to see the potential pitfalls outsiders might recognize. This is why one typically has external auditors and third-party companies do assessments.
reply