(royapakzad.substack.com)
Your experience with Arabic in particular makes me think there's still a lot of training material to be mined in languages other than English. I suspect the reason that Arabic sounds 20 years ago is that there's a data labeling bottleneck in using foreign language material.
I wouldn't be surprised if Arabic in particular had this issue and if Arabic also had a disproportionate amount of religious text as source material.
I bet you'd see something similar with Hebrew.
My guess is, not as the single and most prominent factor. Pauperisation, isolation of individual and blatant lake of homogeneous access to justice, health services and other basic of social net safety are far more likely going to weight significantly. Of course any tool that can help with mass propaganda will possibly worsen the likeliness to reach people in weakened situation which are more receptive to radicalization.
The question that then follows is if suppressing that content worked so well, how much (and what kind of) other content was suppressed for being counter to the interests of the investors and administrators of these social networks?
You need to constrain token sampling with grammars if you actually want to do this.
I will die on this hill and I have a bunch of other Arxiv links from better peer reviewed sources than yours to back my claim up (i.e. NeurIPS caliber papers with more citations than yours claiming it does harm the outputs)
Any actual impact of structured/constrained generation on the outputs is a SAMPLER problem, and you can fix what little impact may exist with things like https://arxiv.org/abs/2410.01103
Decoding is intentionally nerfed/kept to top_k/top_p by model providers because of a conspiracy against high temperature sampling: https://gist.github.com/Hellisotherpeople/71ba712f9f899adcb0...
I always set temperature to literally zero and don't sample.
But it does that!
What kind of things did it tell you ?
Perhaps in Arabic or Chinese the AI gives a straight answer.
I tried to point it at my Sharpee repo and it wanted to focus on createRoom() as some technical marvel.
I eventually gave up though I was never super serious about using the results anyway.
If you want a summary, do it yourself. If you try to summarize someone else’s work, understand you will miss important points.
This is called bias, and every human has their own. Sometimes, the executive assistant wields a lot more power in an organization than it looks at first glance.
What the author seems to be saying is that the system prompt can be used to instill bias in LLMs.
That's, like, the whole point of system prompts. "Bias" is how they do what they do.
Like if the title is a clickbait “this one simple trick to..” the ai summary right below will summarize all the things accomplished with the “trick” but they still want you to actully click on the video (and watch any ads) to find out more information. They won’t reveal the trick in the summary.
So annoying because it could be a useful time saving feature. But what actually saves time is if I click through and just skim the transcript myself.
The ai features are also limited by context length on extremely long form content. I tried using the “ask a question about this video” and it could answer questions about the first 2 hours in a very long podcast but not the last third hour. (It was also pretty obviously using only the transcript, and couldn’t reference on-screen content)
What would be interesting then would be to find out what the composite function of translator + executor LLMs would look like. These behaviors makes me wonder, maybe modern transformer LLMs are actually ELMs, English Language Models. Because otherwise there'll be, like, dozens of functional 100% pure French trained LLMs, and there aren't.
there must be a ranking of languages by "safety"
This is related to why current Babelfish-like devices make me uneasy: they propagate bad and sometimes dangerous translations along the lines of "Traduttore, traditore" ('Translator, traitor'). The most obvious example in the context of Persian is of "marg bar Aamrikaa". If you ask the default/free model on ChatGPT to translate, it will simply tell you it means 'Death to America'. It won't tell you "marg bar ..." is a poetic way of saying 'down with ...'. [1]
It's even a bit more than that: translation technology promotes the notion that translation is a perfectly adequate substitute for actually knowing the source language (from which you'd like to translate something to the 'target' language). Maybe it is if you're a tourist and want to buy a sandwich in another country. But if you're trying to read something more substantial than a deli menu, you should be aware that you'll only kind of, sort of understand the text via your default here's-what-it-means AI software. Words and phrases in one language rarely have exact equivalents in another language; they have webs of connotation in each that only partially overlap. The existence of quick [2] AI translation hides this from you. The more we normalise the use of such tech as a society, the more we'll forget what we once knew we didn't know.
[2] I'm using the qualifier 'quick' because AI can of course present us with the larger context of all the connotations of a foreign word, but that's an unlikely UI option in a real-time mass-consumer device.
All this time the Persian chants only signified polite policy disagreement? Hmmm, something fishy about this….
Edit: isn’t the alleged double-meaning exactly how radicalized factions drag a majority to a conclusion they actively disagree with? Some in the crowd literally mean what they say, many others are being poetic and only for that reason join in. But when it reaches American ears, it’s literally a death wish (not the majority intent) and thus the extremists seal a cycle of violence.
> isn’t the alleged double-meaning exactly how radicalized factions drag a majority to a conclusion they actively disagree with? Some in the crowd literally mean what they say, many others are being poetic and only for that reason join in. But when it reaches American ears, it’s literally a death wish (not the majority intent) and thus the extremists seal a cycle of violence.
This is plausible, and again a case for more comprehensive translation.
In Hindi and Urdu (in India and Pakistan) we have a variant of this retained from Classical Persian (one of our historical languages): "[x] murdaabaad" ('may X be a corpse'). But it's never interpreted as a literal death-wish. Since there's no translation barrier, everyone knows it just means 'boo X'.
> معلوم هم هست که مراد از «مرگ بر آمریکا»، مرگ بر ملّت آمریکا نیست، ملّت آمریکا هم مثل بقیّهٔ ملّتها [هستند]، یعنی مرگ بر سیاستهای آمریکا، مرگ بر استکبار؛ معنایش این است.
"It is also clear that 'Death to America' does not mean death to the American people; the American people are like other nations, meaning death to American policies, death to arrogance; this is what it means.
Translation by Claude; my Persian is only basic-to-intermediate but this seems correct to me.
[1] https://fa.wikipedia.org/wiki/%D9%85%D8%B1%DA%AF_%D8%A8%D8%B...
My wife is trilingual, so now I’m tempted to use her as a manual red team for my own guardrail prompts.
I’m working in LLM guardrails as well, and what worries me is orchestration becoming its own failure layer. We keep assuming a single model or policy can “catch” errors. But even a 1% miss rate, when composed across multi-agent systems, cascades quickly in high-stakes domains.
I suspect we’ll see more K-LLM architectures where models are deliberately specialized, cross-checked, and policy-scored rather than assuming one frontier model can do everything. Guardrails probably need to move from static policy filters to composable decision layers with observability across languages and roles.
Appreciate you publishing the methodology and tooling openly. That’s the kind of work this space needs.
The observation that guardrails need to move from static policy filters to composable decision layers is exactly right. But I'd push further: the layer that matters most isn't the one checking outputs. It's the one checking authority before the action happens.
A policy filter that misses a Persian prompt injection still blocks the action if the agent doesn't hold a valid authorization token for that scope. The authorization check doesn't need to understand the content at all. It just needs to verify: does this agent have a cryptographically valid, non-exhausted capability token for this specific action?
That separates the content safety problem (hard, language-dependent, probabilistic) from the authority control problem (solvable with crypto, language-independent, deterministic). You still need both, but the structural layer catches what the probabilistic layer misses.
I have been thinking about this a lot lately.
For me, the meaning lies in the mental models. How I relate to the new thing, how it fits in with other things I know about. So the elevator pitch is the part that has the _most_ meaning. It changes the trajectory of if I engage and how. Then I'll dig in.
I'm still working to understand the headspace of those like OP. It's not a fixation on precision or correctness I think, just a reverse prioritization of how information is assimilated. It's like the meaning is discerned in the process of the reasoning first, not necessarily the outcome.
All my relationships will be the better for it if I can figure out the right mental models for this kind of translation between communication styles.