> This policy is not open to discussion, any content submitted that is clearly labelled as LLM-generated (including issues, merge requests, and merge request descriptions) will be immediately closed, and any attempt to bypass this policy will result in a ban from the project.
It's similar to how I can't implement a feature by copying-and-pasting the obvious code from some commercially licensed project. But somebody else could write basically the same thing independently without knowing about the proprietary-license code, and that would be fine.
Like, this should be enshrined as the quintessential “they simply, obstinately, perilously, refused to get it” moment.
Shortly, no one is going to care about anyone’s bespoke manual keyboard entry of code if it takes 10 times as long to produce the same functionality with imperceptibly less error.
If the submitter is prepared to explain the code and vouch for its quality then that might reasonably fall under "don't ask, don't tell".
However, if LLM output is either (a) uncopyrightable or (b) considered a derivative work of the source that was used to train the model, then you have a legal problem. And the legal system does care about invisible "bit colour".
For one simple reason. Intention.
Here's some code for example: https://i.imgur.com/dp0QHBp.png
Both sides written by an LLM. Both sides written based on my explicit prompts explaining exactly how I want it to behave, then testing, retesting, and generally doing all the normal software eng due diligence necessary for basic QA. Sometimes the prompts are explicitly "change this variable name" and it ends up changing 2 lines of code no different from a find/replace.
Also I'm watching it reason in real time by running terminal commands to probe runtime data and extrapolate the right code. I've already seen it fix basic bugs because an RFC wasn't adhered to perfectly. Even leaving a nice comment explaining why we're ignoring the RFC in that one spot.
Eventually these arguments are kinda exhausting. People will use it to build stuff and the stuff they build ends up retraining it so we're already hundreds of generations deep on the retraining already and talking about licenses at this point feels absurd to me.
There are plenty of good reasons why somebody might not want your PR, independent of how good or useful to you your change is.
CLEARLY, a lot of developers are not reasonable
Once identity is guaranteed, privileges basically come down to reputation — which in this case is a binary "you're okay until we detect content that is clearly labelled as LLM-generated".
[Added]
Note that identity (especially avoiding duplicate identity) is not easily solved.
This heuristic lets the project flag problematic slop with minimal investment avoiding the cost issues with reviewing low-quality low-effort high-volume contributions, which should be near ideal.
Much like banning pornography on an artistic photo site, the perfect application on the borderline of the rule is far less important than filtering power “I know it when I see it” provides to the standard case. Plus, smut peddlers aren’t likely to set an OpenClaw bot-agent swarm loose arguing the point with you for days then posting blogs and medium articles attacking you personally for “discrimination”.
Just require that the CLA/Certificate of Origin statement be printed out, signed, and mailed with an envelope and stamp, where besides attesting that they appropriately license their contributions ((A)GPL, BSD, MIT, or whatever) and have the authority to do so, that they also attest that they haven't used any LLMs for their contributions. This will strongly deter direct LLM usage. Indirect usage, where people whip up LLM-generated PoCs that they then rewrite, will still probably go on, and go on without detection, but that's less objectionable morally (and legally) than trying to directly commit LLM code.
As an aside, I've noticed a huge drop off in license literacy amongst developers, as well as respect for the license choices of other developers/projects. I can't tell if LLMs caused this, but there's a noticeable difference from the way things were 10 years ago.
What do you mean by this? I always assumed this was the case anyway; MIT is, if I'm not mistaken, one of the mostly used licenses. I typically had a "fuck it" attitude when it came to the license, and I assume quite a lot of other people shared that sentiment. The code is the fun bit.
No, it wasn't that way in the 2000s, e.g., on platforms like SourceForge, where OSS devs would go out of their way to learn the terms and conditions of the popular licenses and made sure to respect each other's license choices, and usually defaulted to GPL (or LGPL), unless there was a compelling reason not to: https://web.archive.org/web/20160326002305/https://redmonk.c...
Now the corporate-backed "MIT-EVERYTHING" mindvirus has ruined all of that: https://opensource.org/blog/top-open-source-licenses-in-2025