>> But if a vendor makes a pretty believable claim that there are repetitive statistical patterns in LLM output, it's all of sudden treated the same as palm reading.

That's what fortunetellers do. The problem isn't guessing correctly about AI content in writing. The problem is false positives. That's what puts it in the same category as predictive policing scam software. And fortunetelling.
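
To make the false positive point concrete, here's the base-rate arithmetic as a quick Python sketch. Every number in it is a made-up placeholder, not a claim about any real detector:

```python
# Back-of-the-envelope arithmetic: even a small false positive rate
# accuses a lot of honest students when most submissions are honest.
# All numbers below are made-up placeholders.
submissions = 1000
cheating_rate = 0.10        # fraction of submissions actually AI-written
false_positive_rate = 0.02  # honest work wrongly flagged as AI
true_positive_rate = 0.95   # AI-written work correctly flagged

honest = submissions * (1 - cheating_rate)   # 900 honest essays
cheaters = submissions * cheating_rate       # 100 AI-written essays

false_accusations = honest * false_positive_rate  # 18 honest students flagged
caught = cheaters * true_positive_rate            # 95 cheaters flagged

precision = caught / (caught + false_accusations)
print(f"honest students flagged: {false_accusations:.0f}")
print(f"share of flags that are actually AI: {precision:.1%}")
```

With those placeholder numbers, roughly one flag in six lands on an honest student.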

reply
It has nothing to do with predictive policing. I don't understand that comparison: AI detection isn't about predicting intent; you're looking for evidence of a past misdeed.

False positive and false negative rates are non-zero, as with almost anything, but the tools are pretty good. I encourage you to give them a try. Pangram is a good state-of-the-art choice and you can try it for free. They also publish evals and other data about their approach.

reply
Eliminating any statistically significant difference between high-quality human-written text and LLM-written text is exactly what LLMs are being trained for. At this point, "text is low quality, therefore must be human" is a much stronger signal.

reply
> Eliminating any statistically significant difference between a high-quality human-written text and LLM-written text is exactly what the LLMs are being trained for.

I think you're basing this on a fundamental misunderstanding of what these detectors look for. LLMs generate human-like text, but they also generate roughly the same style and content every time for a given prompt, modulo some small amount of nondeterminism. In essence, they are a very predictable human. Ask Gemini or ChatGPT ten times in a row to write an essay about why AI is awesome, and it will probably strike about the same tone every single time, with similar syntax, idioms, etc.

This is what these tools detect: the default output of "hey ChatGPT, write me a school essay about X". This can be evaded with clever prompting to assume a different writing personality, but there's only so much evasion you can do without making the text weird in other ways.
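
A toy way to see that predictability (only an illustration, not how Pangram or any other real detector works): sample the same prompt several times and measure how similar the outputs are to each other. The sample strings below are hypothetical placeholders standing in for ten such generations.

```python
# Toy self-similarity check over repeated generations for one prompt.
# High pairwise similarity is the "very predictable human" signal
# described above; real detectors use far richer features than this.
from itertools import combinations

def trigrams(text: str) -> set:
    text = " ".join(text.lower().split())
    return {text[i:i + 3] for i in range(len(text) - 2)}

def jaccard(a: set, b: set) -> float:
    return len(a & b) / len(a | b) if (a | b) else 0.0

def mean_pairwise_similarity(samples: list) -> float:
    sims = [jaccard(trigrams(x), trigrams(y))
            for x, y in combinations(samples, 2)]
    return sum(sims) / len(sims)

# Placeholder outputs; imagine ten answers to "write an essay about
# why AI is awesome" collected from the same chatbot.
samples = [
    "Artificial intelligence is transforming the way we live and work...",
    "Artificial intelligence is reshaping the way we live and work...",
    "AI is fundamentally changing how we live, work, and learn...",
]
print(f"mean pairwise similarity: {mean_pairwise_similarity(samples):.2f}")
```

Repeated generations from one model tend to score far higher on this kind of self-similarity than essays written by different people do, which is exactly the regularity a trained classifier can pick up on.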

reply
You can detect if texts from a year ago used AI based on statistical patterns. Nobody is taking issue with that. But once you tell people "we will run these tests to detect if your future submissions are using AI" you create an adversarial environment and your statistical methods will continuously break. Not because statistics is broken, but because you are trying to hit a moving target that doesn't want to be hit.

That's not like detecting thoughts via fMRI; it's like detecting tomorrow's malware with yesterday's malware signatures, or like researchers making a vaccine against the common cold.

And the obvious proposal to fix that has been made multiple times in this thread: don't make take-home tasks part of the grade. Instead of trying to punish what you can't reliably detect, take away the incentive to do it in the first place.

reply
> You can detect if texts from a year ago used AI based on statistical patterns.

I don't understand your argument. The vendors of these detection tools can acquire recent samples from all frontier models just as easily as you can use them to write essays. There's nothing that requires a one-year delay.
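
For example, refreshing the AI-written side of a detector's training corpus is just a sampling loop against whatever models are current. Here's a minimal sketch using the OpenAI Python SDK; the model name and prompts are placeholders, and this is not a claim about how any particular vendor actually builds their dataset:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

PROMPTS = [
    "Write a five-paragraph school essay about the causes of World War I.",
    "Write a short persuasive essay on why homework should be optional.",
]

def collect_fresh_samples(model: str, prompts: list, n_per_prompt: int = 5) -> list:
    """Sample current outputs from a model to refresh the AI-written
    side of a detector's training corpus."""
    samples = []
    for prompt in prompts:
        for _ in range(n_per_prompt):
            resp = client.chat.completions.create(
                model=model,
                messages=[{"role": "user", "content": prompt}],
            )
            samples.append(resp.choices[0].message.content)
    return samples

# "gpt-4o" is a placeholder; the same loop works against any current model.
fresh_ai_text = collect_fresh_samples("gpt-4o", PROMPTS)
```

Run that against each frontier model as it ships and the training data is never a year out of date.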

reply
> you create an adversarial environment

Do AI vendors specifically train models to circumvent AI detectors? Why would they?

reply
> When we have a press release from a university about how researchers can detect thoughts via fMRI, we have no issue with the claim.

Different people. I for one have always claimed that fMRI is too coarse-grained for detailed thought detection.

If AI detection "sometimes fails", it doesn't simply "work". It may work well enough to corroborate other evidence against someone, but when there's no other evidence, nor any attempt to get any, it has no good use.

What I propose is simple: grade only closed-book exams, and hold students' phones during the exams. Students don't need 1:1 monitoring; it's the same as 10-20 years ago.

reply