But whether you can actually be compelled to do that isn't well tested in court. Challenging that the GPL is enforcable in that way leads you down the path that you had no valid license at all, and for past GPL offenders that would have been the worse outcome. AI companies could change that
This is true when talking about the infringement of the copyrights of others. But when discussing the infringement of GPL copyleft, making a potentially infringing artifact publicly available likely satisfies the license conditions.
The evil is that this case was settled, and before being settled was decided in a way contrary to all previous copyright decisions. The courts decided that rap records had to clear every single sample, thereby basically destroying the art form, but now you can literally feed every book into a blender, piece another book together out of the pieces, and sell it.
Hip-hop when it peaked with the Bomb Squad was such a frenetic mix of so many recognizable, unrecognizable, and transformed sources that it doesn't resemble anything that was made after the decisions against Biz Markie and De La Soul. Afterwards, you just licensed one song, slightly cut it up, and rapped over it. It was just a new way to sell old shit to young people unfamiliar with it.
Now you can literally just train a machine on the same stuff, and it's legal. A machine transformation was elevated over human creativity, simply because rich people wanted it.
Ignoring the fact that the statement doesn't talk about FSF code in the training data at all, [0] are you sure about that? From the start of the last of three paragraph in the statement:
Obviously, the right thing to do is protect computing freedom: share complete training inputs with every user of the LLM, together with the complete model, training configuration settings, and the accompanying software source code. Therefore, we urge Anthropic and other LLM developers that train models using huge datasets downloaded from the Internet to provide these LLMs to their users in freedom.
This seems to me to be consistent with the FSF's stance of "You told the computer how to do it. The right thing to do is to give the humans operating that computer the software, input data, and instructions that they need to do it, too.".[0] In fact, it talks about the inclusion of a book published under the terms of the GNU FDL, [1] which requires distribution of modified copies of a covered work to -themselves- be covered by the GNU FDL.