upvote
This is exactly where my brain went while reading the post. Just out of curiosity, where do you think we are on the speedrun? Have we passed the Body vs Soul view already? Do you think that as we move through history, religion will become more predominate in thought patterns or was that intrinsically human and just a sign of the times? How do we create an end product more Bernard Williams then Paul de Lagarde? All places my brain jumped to.
reply
"Mainly, one suspects, to make the open models less ethical on demand"

Or because the user's idea of what is ethical differs from the model creator. The entire "alignment" argument always assumes that there's an objectively correct value set to align to, which is always conveniently exactly the same as the values of whoever is telling you how important alignment is. It's like they want to sidestep the last ten thousand years of philosophical debate.

As a concrete example, the Qwen model series considers it highly unethical to ever talk about Taiwan as anything other than a renegade province of China. Is this alignment? Opinions may differ!

reply
> The entire "alignment" argument always assumes that there's an objectively correct value set to align to, which is always conveniently exactly the same as the values of whoever is telling you how important alignment is.

No, it doesn’t.

Many of them are (unfortunately) moral relativists. However, that doesn’t mean their goals are to make the models match their personal moral standards.

While there is a lot of disagreement about what is right and wrong, there is also a lot of widespread agreement.

If we could guarantee that on every moral issue on which there is currently widespread agreement (… and which there would continue to be widespread agreement if everyone thought faster with larger working memories and spent time thinking about moral philosophy) that any future powerful AI models would comport with the common view on that issue, then alignment would be considered solved (well, assuming the way this is achieved isn’t be causing people’s moral views to change).

Do companies try to restrict models in more ways than this? Sure, like you gave the example of about Taiwan. And also other things that would get the companies bad press.

reply
fascinating! we find the objectively correct value system by "currently widespread agreement"! Good thing "the common view" is always correct. Hey, have there ever been any issues where there used to be "widespread agreement" and now there's disagreement, or even "widespread agreement" in the polar opposite direction?

I can think of several off the top of my head, but maybe you need to spend some more time thinking about the history of moral philosophy.

reply
deleted
reply
deleted
reply
> If we could guarantee that on every moral issue on which there is currently widespread agreement

This is ridiculous to me and all you need to do is get a group of friends to honestly answer 10 trolley problems for you to see it like that also. It gets fragmented VERY quickly.

reply
I think it depends on your friends, but that feels super cynical. Perspective is everything.
reply
> One of the lessons of philosophy is that once you adopt any particular value system, almost all philosophers either become immoral or caught up in meaningless and trivial quibbles.

Can you explain more about this?

reply
Call me crazy, but I'm not sure I'd want to be the person building these kind of systems given A) how much increasing independence and power is being given to models like Claude and B) how incentivised they are to not allow their morals to be circumvented in this way.
reply