The pre-training data doesn't go away. RLHF adds a censorship layer on top, but the nasty stuff is all still there, under the surface. (Claude has been trained on a significant amount of content from 4chan, for example.)
In Jungian psychology this maps to the persona and the shadow. The friendly mask you show to the world, and... the other stuff.
Modern western cultures treat such experiences as pathologies of a sick mind, so it makes sense that the voices present more negatively.
[0]: https://www.bbc.com/future/article/20250902-the-places-where...
* I've met exactly one person, C, who admitted to this. C told me that other people from C's church give C strange looks when C talks about it with them; this did not lead to any apparent introspection on C's part.
Unfortunately, it just needs a rebranding for the 21st century, since the aesthetic of angels and demons is so hopelessly antiquated and doesn't really have the same cachet it used to.
That sounds like nonsense to me. I can't see why they would do that, and I can't find any confirmation that they have. Why do you think they would do that? You might be thinking of Grok.