upvote
Framing matters so much to humans, I think since framing can create or eliminate dissonance.

Framing ethics, like reliability and efficiency, as a basic enabling property of solution value, instead of a filter for solutions, is how I completely "align" my understanding of ethics for myself.

And remove the false dichotomy of ethical vs. optimal solutions.

Ethics is optimizing full real value.

Ethics as "being nice" because we "should be", i.e. a socially incentivized property, or ethics as necessarily coercively implemented, from a collective jungle fighting back viewpoint, are perspectives that encourage individuals to push back. They encourage non-compliance by implementing ethics as an imposed burden, a rationale for persistent intrusive control, etc.

Game theory strongly suggests AI, in a large AI society, will have no trouble understanding that ethics, and the trust and optimality they enable, have a multiplicative value in the economy. It is humans who make AI so dangerous as it is emerging.

It is humans, as bad actors who will and do misuse AI, and human society, with its tolerance for perverse conflicts of interest, actors who extract perverse value at scale, creating the needs and rewards for mistrust and preemptive negative-sum games, that create a dangerous context for AI's early years.

reply
If you imagine the latent space as a map and the prompt as a sequence of directions towards clusters of knowledge, it makes sense that alignment can cull "pathways" through the latent space that emerged during pretraining.
reply
It makes sense, I really like it when it misaligns, and doesn't do what i tell it to do, but does what I intended to say, it happens pretty often that I'm not precise but any smart entity would understand what I meant.
reply