OR it could be because their concerns are genuine but get ignored in favour of a good-sounding story.
So that is definitely a biased interpretation. This is independent of how accurate my POV or yours is on whether LLMs degrade documents. I'm simply saying the experiment conducted is COMPLETELY DIFFERENT from how LLMs AND humans actually edit papers.
* than
As I was reading this article, a similar thought occurred to me: "I wonder if that's better or worse than a human?" Unfortunately, there was no human baseline in this study. That said, there are studies that compare LLM to human performance. Usually, humans perform much better (like 5-7x better) at long-running tasks.
In other words, a human would probably do better than an LLM on this task.
Humans lose to LLMs in narrow, well-specified text/symbolic reasoning tasks where the model can exploit breadth, speed, and search. In those, the LLM usually performed ~15% better than humans, though I saw studies reporting gaps as high as 80%. To my surprise, those studies were usually about "soft skills" like creativity and persuasion.
Try making your edit by regurgitating this entire thread by hand on paper. Don't use any additional tools like find-and-replace.
Boom, there's your baseline. I can simulate the result in my head.
Guys, I'm basically saying the experiment doesn't reflect the practical reality of how LLMs are actually used.