Minor correction: in the US the rule isn't about works that are "100% by AI"; it's that LLM output itself is not copyrightable. Human elements injected into LLM output are.

Raw LLM output lacks human authorship, and it has been ruled that it cannot be registered for copyright protection. Raw LLM output is automatically public domain (which is also why it's silly for Anthropic to be in such a tizzy about China using Claude's output: Claude's output is public domain).

Only the parts of a work that are human-authored can be registered for copyright. If a work was created with AI assistance, the parts that were purely AI-generated cannot be registered.

The US Copyright Office also ruled that prompt engineering does not count as human authorship.

So all those people using Suno to generate AI slop music and flooding the streaming services, their output is almost certainly public domain.

reply
>(which is also why it's silly for Anthropic to be in such a tizzy about China using Claude's output, Claude's output is public domain).

I don't see how it's any more weird than reddit/stackoverflow/linkedin trying to clamp down on AI scrapers, even though they don't own the copyright to the UGC that they're preventing the bots from accessing.

reply
The difference is in licensing. Those platforms are protecting (or rather, monetizing) a database of human-authored assets which those humans have given them a license to exploit.

Anthropic (and others) are trying to protect a stream of uncopyrightable, public-domain machine outputs.

reply
I don't see how that's relevant. They have a license to redistribute my comments, but that's the extent of their legal rights with respect to my work. They're not my agent or my publisher. Moreover, I don't have any say in the matter. If I'm pro AI scraping, I can't tell them "yeah it's fine to scrape my comment, don't put up any captcha walls". Finally, what if I dedicate my comments to the public domain? Does that mean they're in the wrong to put up scraping walls?
reply
The license goes beyond redistribution. You are granting a sublicensable and transferable right to your content, giving the platform the legal authority to sell or license it (or to not license it) to AI scrapers and other entities. The platform's right to block said scrapers comes from possession rights.

It's like if you made a painting and put it in a museum. You still technically own the copyright, but the museum owns the building. They can lock the door, charge admission, kick out anyone they want, prevent anyone they want from seeing it, etc. You licensing it to them makes it their private property to do with what they wish.

> I can't tell them "yeah it's fine to scrape my comment, don't put up any captcha walls".

Correct, because you signed away that control.

> what if I dedicate my comments to the public domain?

That means you forfeit copyright, but you cannot waive the platform's rights regarding their servers.

But because you still retain copyright (or in the case that it's public domain), you can and are welcome to submit it to AI companies yourself. Just because Reddit may not allow a scraper, that doesn't remove your right as the copyright holder to re-submit your comment to another platform that does allow the scraper.

The difference with Anthropic/LLM output is that there are zero intellectual property rights over the outputs once they leave the API endpoint.

reply
>The license goes beyond redistribution. You are granting a sublicensable and transferable right to your content, giving the platform the legal authority to sell or license it (or to not license it) to AI scrapers and other entities. The platform's right to block said scrapers comes from possession rights.

They don't need to sublicense it because the license was already granted by you. Stack Overflow comments are licensed under Creative Commons, which means you don't need to seek a license from Stack Overflow to use them. It's the same if you found some random MIT-licensed repo on GitHub. It's not GitHub granting you a sublicense; the license comes from the original author.

>You still technically own the copyright, but the museum owns the building. They can lock the door, charge admission, kick out anyone they want, prevent anyone they want from seeing it, etc.

And Anthropic can't decide who gets to use their service, and for what purpose?

reply
Anthropic can decide who gets to use their service. They have complete control over their service.

It still breaks down once the output has left the system, though. Anthropic cannot tell you what you can and cannot do with the LLM's output; they do not own it, it's public domain. Anthropic can pursue breach of contract, maybe, but they can't do anything regarding your use of the model's output. If China can't access Claude directly, they can just pay some other user in the States to run some prompts and paste the output on a public website, then use that output, and there is nothing Anthropic can do about it.

Fair point on Stack Overflow, but they are the exception rather than the norm. Most social media doesn't license the content under Creative Commons.

reply
>Anthropic cannot tell you what you can and cannot do with the LLM's output, they do not own that, it's public domain.

And are they actually doing this? For instance, if you read their press releases about distillation attacks[1], they're not asserting copyright over the outputs, only alleging "fraudulent accounts". So far as I can tell they're not even engaging in legal action.

[1]https://www.anthropic.com/news/detecting-and-preventing-dist...

reply
That's honestly so dumb. If I use a non-AI computerized tool to generate sequences of notes or sequences of characters, I own the output. AI is just that. It's a fancy computer program that cost billions to build.

This gives AI a weird independent moral grounding, as something more than a computer program, that has never existed before. And what kind of AI does it count for? Does it also count for image classifiers? For image quality improvers? etc.

reply
The USCO's decision hinges on whether or not a human has predictive, mechanical control over the final output. The ruling applies to generative AI; the USCO made a separate distinction for assistive AI, which image classifiers would fall under.

> “Authors have long used such tools to create their works or to recast, transform, or adapt their expressive authorship. … what matters is the extent to which the human had creative control over the work's expression and actually formed the traditional elements of authorship.”

The USCO doesn't care what type of algorithm is used; it cares who determined the traditional elements of authorship. If a human dictates the expression, and then uses a computer to clean, translate, or refine it, it is copyrightable. If a human just provides an idea and a generative algorithm creates the specific expression, the output is public domain. One is using spellcheck; the other is telling the computer "Write me a novel" and letting the computer generate it.

reply
Nothing is “created 100% by AI” though, because AIs don’t create things without human instructions.
reply
How much instruction do you need though?

What if I prompt Claude to go prompt Suno? What if the same chain happens internally at Suno? Easy to imagine the human input being very dilute and a small part overall.

reply
Claude is a computer program, and so is Suno. Someone has to pay Anthropic to run Claude. AI does not have special moral grounding in our society.
reply
The US Copyright Office ruled that the instructions do not count. Prompt engineering does not constitute human authorship. The prompt is the command, but the machine determines the specific expressive elements of the output (according to the USCO).

Raw LLM output is automatically public domain.

reply
The prompt is yours to copyright, the algorithm belongs to Google or Suno or whoever, but not the output. It is not your creation.
reply