Your experience with Arabic in particular makes me think there's still a lot of training material to be mined in languages other than English. I suspect the reason that Arabic sounds 20 years ago is that there's a data labeling bottleneck in using foreign language material.
I wouldn't be surprised if Arabic in particular had this issue and if Arabic also had a disproportionate amount of religious text as source material.
I bet you'd see something similar with Hebrew.
In the middle of the conversation it randomly switched from English to Russian and clearly struggled to maintain the tone imposed by the built-in prompt.
My guess is, not as the single and most prominent factor. Pauperisation, isolation of individual and blatant lake of homogeneous access to justice, health services and other basic of social net safety are far more likely going to weight significantly. Of course any tool that can help with mass propaganda will possibly worsen the likeliness to reach people in weakened situation which are more receptive to radicalization.
The question that then follows is if suppressing that content worked so well, how much (and what kind of) other content was suppressed for being counter to the interests of the investors and administrators of these social networks?
You need to constrain token sampling with grammars if you actually want to do this.
I will die on this hill and I have a bunch of other Arxiv links from better peer reviewed sources than yours to back my claim up (i.e. NeurIPS caliber papers with more citations than yours claiming it does harm the outputs)
Any actual impact of structured/constrained generation on the outputs is a SAMPLER problem, and you can fix what little impact may exist with things like https://arxiv.org/abs/2410.01103
Decoding is intentionally nerfed/kept to top_k/top_p by model providers because of a conspiracy against high temperature sampling: https://gist.github.com/Hellisotherpeople/71ba712f9f899adcb0...
I always set temperature to literally zero and don't sample.
But it does that!
What kind of things did it tell you ?
Perhaps in Arabic or Chinese the AI gives a straight answer.