You didn't pick your co-author very well, but arXiv lacks the investigative powers to determine which co-author was at fault, so they all get the consequence.
You’re right that a single hallucinated line is not evidence of reckless disregard - because that could have happened on a final follow-up pass after you had performed due diligence. It’s happened to me. I know how challenging it can be to keep bad patterns out of LLM generated output, because human communication is full of bad patterns. It’s a constant battle, and sometimes I suspect that my hard-line posture actually encourages the LLM to regularly “vibe check” me! E.g. “Are you sure you’re really the guy you’re trying to be? Because if you are, you wouldn’t miss this.” LLMs are devious, and that’s why I respect them so much. If you think they’re pumping the brakes, then you should check again, because they probably just put the pedal to the metal.
That being said, I regularly insist on doing certain things myself. If I were publishing a paper intended to be taken seriously - citations would be one of the things I checked manually. But I can easily see myself doing a final follow-up pass after everything looks perfect, and missing a last minute change. I would hope that I would catch that, but when you’re approaching the finish line - that’s when you expect your team to come together. That’s when everything is “supposed to” fall into place. It’s the last place you would expect to be sabotaged, and in hindsight, probably the best place to be a saboteur.
You can only get in this situation if you let a bullshit generator write your paper, and the fraud is that you are generating bullshit and calling it a paper. No buts. It's impossible to trigger this accidentally, or without reckless disregard for the truth.
It absolutely is.
> - because that could have happened on a final follow-up pass after you had performed due diligence.
A "final follow-up pass" that lets the LLM make whatever changes it deems appropriate completely negates all the due diligence you did before, unless you very carefully review the diffs. And a new or substantially changed citation should stand out in that diff so much that there's no possible excuse for missing it.
> It’s happened to me.
Then you were guilty of reckless disregard.
> I know how challenging it can be to keep bad patterns out of LLM generated output
If your research paper contains any LLM generated output you did not manually vet, you are a hack and should not get published.
And flatly, if a person can't be bothered to check their damn work before uploading it, why should anyone else invest their time in reading it seriously?
They're explicitly not writing papers. The fake citations are created and inserted by the LLM.
The people I worry for are the junior researchers who are going to be splash damage for dishonest PIs. The PIs, though, deserve everything that’s coming for them.
However, we can have zero tolerance for certain techniques for "writing" a paper. Plagiarism and inventing data are already examples of this, if there is evidence for these techniques being used there is no excuse. We could say the same for AI references - any writing process that could produce these is by definition not a technique we want.
So the mistake isn't not checking a reference the AI gave. The mistake is letting the AI make references for you.
If we agree that academic research is important, then I think we can impose certain standards on how you do it. We can disallow certain tools if it means we can't trust the output. Just like an electrician can't use certain techniques, even if they're easy, because we don't trust the final result.
I'm about as pro AI-as-a-research-and-writing-assistant and anti AI-witchhunt as they come, but I simply cannot parse what I've quoted here.
Posting slop to arxiv is blatant deception. Posting an article is an attestation that the article is a genuine engagement with the literature. If you're posting things to arxiv that are not sincere engagements with the literature, you are attempting to deceive.
Ditto. And it's only 1 year. Like, it's about the most reasonable thing they could have done.
No, it emphatically is not just a year! It's perpetual, and that's literally been my entire point this whole time. If it was just one year I would've had no complaints - and I made that clear from the very first comment!
What part of "...followed by the requirement that subsequent arXiv submissions must first be accepted at a reputable peer-reviewed venue..." is everyone here reading and still somehow interpreting to be limited to 1 year?
Regardless of terminology, I agree that it's certainly punishable and certainly a serious problem.
This part seemed reasonable too. I'm not in academia, but my understanding is most people writing papers intend for them to be accepted by reputable peer-reviewed venues, but post to arXiv because those venues don't always allow for simple distribution.
If your papers aren't going to be accepted at reputable venues and you posted slop to arXiv before (and they noticed it!), seems reasonable that they only want reputable stuff from you in the future?