OR it could be because their concerns are genuine but get ignored in favour of a good-sounding story.
So that is definitely a biased interpretation. This is independent of how accurate my POV or yours is on whether LLMs degrade documents. I'm simply saying the experiment conducted is COMPLETELY DIFFERENT from how LLMs AND humans actually edit papers.
* than
As I was reading this article, a similar thought occurred to me: "I wonder if that's better or worse than a human?" Unfortunately, there was no human baseline in this study. That said, there are studies that compare LLM to human performance. Usually, humans perform much better (like 5-7x better) at long-running tasks.
In other words, a human would probably do better than an LLM on this task.
Humans lose to LLMs in narrow, well-specified text/symbolic reasoning tasks where the model can exploit breadth, speed, and search. In those, the LLM usually performed ~15% better than humans, though I saw studies reporting gaps as high as 80%. To my surprise, those studies were usually about "soft skills" like creativity and persuasion.
Try making your edit by regurgitating this entire thread by hand on paper. Don't use any additional tools like find-and-replace.
Boom, there's your baseline. I can simulate the result in my head.
Guys, I'm basically saying the experiment doesn't reflect the practical reality of how LLMs are actually used.