undefined

points

[-]

In theory, wouldn't be too hard be to settle the question if whether he used ChatGPT to write it: get Olang to write a few paragraphs by hand, then have people judge (blindly) if it's the same style as the article. Which one sounds more like ChatGPT.

by embedding-shape1 days ago|

parent|

[-]

The times I've written articles, and those have gone through multiple rounds of reviews (by humans) with countless edits each time, before it ends up being published, I wonder if I'd pass that test in those cases. Initial drafts with my scattered thoughts usually are very different from the published end results, even without involving multiple reviewers and editors.

by jmalicki1 days ago|

parent|

prev|

[-]

When people judge blindly, the are more likely to think the human is the AI and the AI is the human.

73% judged GPT 4.5 (edit: had incorrectly said 4o before)to be the human.

https://arxiv.org/abs/2503.23674

Not only are people bad at judging this, but are directionally wrong.

by nothinkjustai1 days ago|

parent|

[-]

There is research showing the contrary that is far more convincing:

> Our experiments show that annotators who frequently use LLMs for writing tasks excel at detecting AI-generated text, even without any specialized training or feedback. In fact, the majority vote among five such “expert” annotators misclassifies only 1 of 300 articles, significantly outperforming most commercial and open-source detectors we evaluated even in the presence of evasion tactics like paraphrasing and humanization.

https://arxiv.org/html/2501.15654v2

by croemer1 days ago|

parent|

[-]

Great find, I've submitted this preprint as a standalone item: https://news.ycombinator.com/item?id=47678270