> People love to interpret the results in the most negative way possible because it's a threat to their occupation and identity.

OR it could be because their concerns are genuine but are being ignored in favour of a good-sounding story.

reply
But no one in this thread has addressed the inaccuracy of the experiment. The experiment did not test HOW LLMs are actually used in practice.

So that is definitely a biased interpretation. This is independent of how accurate my POV or your POV is on whether LLMs degrade documents. I am simply saying the experiment conducted is COMPLETELY DIFFERENT from how LLMs AND humans edit papers.

reply
> a human will DO worse then a 25% degradation.

* than

reply
See, that’s an example of degradation by a human. Not even an LLM will make that kinda mistake.
reply
> a human will DO worse then a 25% degradation

As I was reading this article, a similar thought occurred to me: "I wonder if that's better or worse than a human?" Unfortunately, there was no human baseline in this study. That said, there are studies that compare LLM to human performance. Usually, humans perform much better (like 5-7x better) at long-running tasks.

In other words, a human would probably do better than an LLM on this task.

Humans lose to LLMs in narrow, well-specified text/symbolic reasoning tasks where the model can exploit breadth, speed, and search. Usually, the LLM performed ~15% better than humans, but I saw studies that were as high as 80%. To my surprise, these studies were usually about "soft skills" like creativity and persuasion.

reply
You can do a baseline study right now. Read this entire thread and make an edit changing every E to an I.

Show your edit by regurgitating this entire thread by hand on paper. Don't use any additional tools like find-and-replace.

Boom, there's your baseline. I can simulate the result in my head.
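For contrast, the mechanical edit itself is trivial for software; the hard part is the by-hand transcription. As a rough sketch (in Python, with a made-up sample string standing in for the thread), the entire machine version of the task is:

```python
# Hypothetical sample text standing in for "this entire thread".
text = "Read this entire thread and change every E to an I."

# The proposed edit: every E (upper- and lowercase) becomes an I.
edited = text.replace("E", "I").replace("e", "i")
print(edited)  # -> Riad this intiri thriad and changi iviry I to an I.
```

A human copying pages of text by hand while applying the same rule would almost certainly slip somewhere, which is the point of the proposed baseline.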

Guys, I'm basically saying the experiment is inaccurate to the practical reality of how LLMs are actually used.

reply