> Why are we treating the former as a mere mistake, and the latter as a deliberate attack?

"Deliberate" is a red herring. That would require AI to have volition, which I consider impossible, but which is also entirely beside the point. We also aren't treating the fabricated quotes as a "mere mistake". It's obviously quite serious that a computer system would respond this way and that a human in the loop would take it at face value. Someone is supposed to have accountability in all of this.

I wrote 'treating' as a deliberate attack, which matches the description in the author's earlier blog post. Acknowledging this doesn't require attaching human-like volition to AIs.
reply
This would be an interesting case of semantic leakage, if that’s what’s going on.
reply
When it comes to AI, is there even a difference? It's an attack either way.