The article is from 2025 and tested ChatGPT 4o. I haven't read anything suggesting it was trained any differently, and command-style prompts indeed have higher signal.
"You poor creature, do you even know how to solve this?", "If you're not completely clueless, answer this:", and "I doubt you can even solve this", said to a human, would be considered quite rude and would get you flagged very quickly on HN.
I didn't cherry-pick. The article lists 5 categories, including rude and very rude. I omitted very rude comments because they are... very rude, and might get people flagged on sight.
Nevertheless, I've just realized I made a mistake: very rude comments are reported to slightly outperform rude comments. I misinterpreted the paper's intro and presumed they didn't.