This is literally what "prompt injection" is. The sooner people understand this, the sooner they'll stop wasting time trying to fix a "bug" that's actually the flip side of the very reason they're using LLMs in the first place.
And being coerced or convinced to bypass rules is exactly what prompt injection is, and very much not uniquely human any more.
Prompt injection is just exploiting the lack of separation, it's not 'coercion' or 'convincing'. Though you could argue that things like jailbreaking are closer to coercion, I'm not convinced that a statistical token predictor can be coerced to do anything.
Yes, that is exactly the point.
> Unless you were both faceblind and bad at recognizing voices, the same attack wouldn't work in-person, you'd know the attacker wasn't your boss.
Irrelevant, as other attacks works then. E.g. it is never a given that your bosses instructions are consistent with the terms of your employment, for example.
> Prompt injection is just exploiting the lack of separation, it's not 'coercion' or 'convincing'. Though you could argue that things like jailbreaking are closer to coercion, I'm not convinced that a statistical token predictor can be coerced to do anything.
It is very much "convincing", yes. The ability to convince an LLM is what creates the effective lack of separation. Without that, just using "magic" values and a system prompt telling it to ignore everything inside would create separation. But because text anywhere in context can convince the LLM to disregard previous rules, there is no separation.