I am going to assume it's the latter.
If you in your house take an AGPL program, host it for yourself, and use it yourself, nothing in the AGPL obligates you to publish the source changes.
In fact, even if you take AGPL software and put it behind a paywall and modify it, the only people who the license mandates you to provide the source code for are the people paying.
The AGPL is basically the GPL with the definition of "user" broadened to include people interacting with the software over the network.
And the GPL, again, only requires you to provide the source code, upon request, to users. If you only distribute GPL software behind a paywall, you personally only need to give the source to people paying.
Although in both these cases, nothing stops the person receiving that source code from publishing it under its own terms.
Google “examples of GPL enforced in court” for a few
Yeah it requires finding out, but how do you prove a whistleblower broke their NDA?
I'm missing something there, that's precisely what I'm arguing again. How can it do a clean-room reimplementation when the open source code is most likely in the training data? That only works if you would train on everything BUT the implementation you want. It's definitely feasible but wouldn't that be prohibitively expensive for most, if not all, projects?
But we'd be able to look at his clone code and see it's different, with different algorithms, etc. We could do a compare and see if there are any parts that were copied. It's certainly possible to clone GNU grep without copying any code and I don't think it would fail any copyright claims just because the GNU grep code is in the wild.
If that was the case, the moment any code is written under the GPL, it could never be reimplemented with a different license.
So instead of a human cloner, I use AI. Sure, the AI has access to the GPL code - every intelligence on the planet does. But does that mean that it's impossible to reimplement an idea? I don't think so.
Just because something is trivial enough to copy does not mean it was trivial to conceive of and codify. Mens rea really does matter when we are talking about defrauding intellectual property holders and stealing their opportunity.
But then how can the FSF reimplement AT&T utilities? The FSF didn't invent grep. They wrote a new version of it from scratch under a different license.
The "clean room" aspect for that came in the way that the people writing the new implementation had no knowledge of the original source material, they were just given a specification to implement (see also Oracle v. Google).
If you're feeding an LLM GPL'd code and it "creates" something "new" from it, that's not "clean room", right?
At the end of the day the supposed reimplementation that the LLM generates isn't copyrightable either so maybe this is all moot.
I didn’t RTFA but I suppose that by clean room here they mean you feed the code to ”one” LLM and tell it to write a specification. Then you give the specification to ”another” LLM and tell it to implement the specification.