Some days, I spend over 4 hours reading walls of text written by Claude. If I couldn't recognize Claude's default "voice" by now, something would be wrong. It would be like a Hemingway fan not being able to recognize Hemingway. Except more so, because Claude's writing style is getting worse from version to version, descending into self-parody.
On the statistical side, Pangram's model identifies AI-authored text with a 1-in-5,000 false positive rate, measured against hold-out texts from before 2022. My "ear" also agrees closely with Pangram. If I think something sounds AI-written, Pangram virtually always comes back with "AI, confidence: high."
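A rough way to see what a 1-in-5,000 false positive rate means in practice is to run it through Bayes' rule. The sketch below is only an illustration: the detection rate (`recall`) and the share of AI-written text (`prior_ai`) are made-up assumptions, not figures from Pangram, and the function names are hypothetical.

```python
def expected_false_positives(n_human_texts: int, fpr: float) -> float:
    """Expected number of human-written texts wrongly flagged as AI."""
    return n_human_texts * fpr


def posterior_ai(prior_ai: float, recall: float, fpr: float) -> float:
    """P(text is AI | detector flags it), computed with Bayes' rule."""
    flagged_ai = prior_ai * recall          # AI texts that get flagged
    flagged_human = (1.0 - prior_ai) * fpr  # human texts that get flagged
    return flagged_ai / (flagged_ai + flagged_human)


FPR = 1 / 5000  # the false positive rate quoted above

# Hypothetical inputs: 100,000 human-written texts scanned.
print(expected_false_positives(100_000, FPR))

# Hypothetical inputs: 10% of scanned text is AI-written, detector catches 90% of it.
print(posterior_ai(prior_ai=0.10, recall=0.90, fpr=FPR))
```

Under those assumed inputs a flag is almost always correct, but the answer depends on the prior: the rarer AI text is in whatever you are scanning, the larger the share of flags that are false.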
> On top of that, LLM writing is often bad in a very particular way: it's weak on actual things to say, but with an overheated style.
This point is interesting because it raises the question of what "LLM writing" actually is. If it is expanding a smaller prompt into a larger article, then yes, by construction the information density is low. But it can also be used to take a semi-coherent stream of consciousness and turn it into something readable, and the people using it that way might already have started to slip under the radar.
This is a lot like how criminals seem especially stupid because the ones who get caught are disproportionately the stupid ones. The easily detectable LLM writers are going to be the lazy ones.
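The selection effect here can be put in numbers. This is a toy sketch with invented values: the 50/50 split between lazy and careful users and the two detection rates are assumptions chosen only to show the shape of the bias.

```python
def share_lazy_among_detected(lazy_share: float,
                              detect_lazy: float,
                              detect_careful: float) -> float:
    """Fraction of *detected* LLM writers who are lazy ones."""
    detected_lazy = lazy_share * detect_lazy
    detected_careful = (1.0 - lazy_share) * detect_careful
    return detected_lazy / (detected_lazy + detected_careful)


# Hypothetical: half of LLM writers are lazy; lazy ones are caught 90% of
# the time, careful ones only 10% of the time.
print(share_lazy_among_detected(lazy_share=0.5,
                                detect_lazy=0.9,
                                detect_careful=0.1))
```

With those made-up rates, roughly 90% of the writers you notice are lazy ones even though they are only half of the population, so the sample you see overstates how sloppy LLM-assisted writing is in general.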
To an extent, true. There are a lot of lazy ones though. And even when people take steps, the result sometimes leaves enough of a trace to at least raise the question.
The end product is something much more polished than anything I'd write on my own, but still comes off as being genuinely from me. At least that's what people have told me when I've asked.
Also, FWIW, Pangram scores my writing as entirely human.
Claude's writing isn't easy to identify because it uses em-dashes and bulleted lists. Claude's distinctive style goes much deeper than that.
In other words, correlation != causation
Edit: You know how you can recognise someone just from their gait while they walk towards you? I would struggle to describe that for an individual person but it doesn't mean I can't identify them from that alone.
(As of now, that four-word low-effort comment has generated over a thousand words in response, none of which improve this article's discussion.)
But AI-written pieces do have a certain feeling. A sort of staccato in the succession of ideas that does not feel natural. They emphasize certain points, and you, as a reader, just wonder why. There is the “This thing, not just that thing”. There are also the three successive propositions (mostly in one sentence) to accentuate an idea and “Negation. Strong positive idea in the same direction”.
In general, try reading one aloud to yourself and it will feel really weird.
Every time someone claims that they have always written like this, I grab a pre-2022 post of theirs, give both to a few SOTA chatbots, and ask "did the same writer author both these texts?"
Thus far I have never gotten a "likely" response.
If the author truly did not use an AI to write something, then it is more likely that they have spent so much more time conversing with their LLM than reading human-authored material that they now sound like an LLM.
This specific article, though, doesn't look anything like LLM output.
PS. Isn't it odd how all LLMs have converged on the same speech patterns, patterns which resemble almost no human-authored material outside of high-pressure sales techniques?
And yes, I agree that most people who light up on AI tell scans are indeed using AI. That's not my point.
You get interested in an article because of its title, or in a book because of its cover, and then you start reading it and it’s just vapor. Nothing tangible to be gained. It’s like buying something expensive and finding a cheap trinket under the wrapping.
After a couple of times, you will develop a certain kind of heuristic for this kind of text. It will not be perfect and will have some false positives, but that’s the only way to keep your sanity.
I like your vapor term. For me it's about the content anyways. If it's just some sort of vapid, pointless drivel then I'm not going to like it regardless of who or what wrote it. And in my experience text that strongly correlates with AI tells also strongly correlates with having sparse substance and lots of fluff.
In other words, if people don't like something, just don't read it. Shouting at people for AI generated text just makes you look foolish if the text is not in fact AI generated. And the person shouting "AI slop!" has no way to prove it other than vibes.
LLMs were trained with a lot of synthetic data to transform them from a "complete this text" model into a chatbot. I suspect that these tons of synthetic data, which force the LLM to answer questions in a specific way, also forced them to have this "synthetic/robotic" language. Claude users would have noticed that the "belt and suspenders" phrase just started popping up after an update, and I am sure it is not because lots of developers used it in their blogs and Anthropic scraped those blogs for that update.