If I understand your argument, it's ethically OK to distill huge swathes of copyrighted work into a model without compensation, but ethically wrong to use that model without compensation (well, actually, at reduced pricing)?

I don't get the moral framework that you're applying. Could you elaborate?

reply
Over the air TV also isn’t public domain. It’s licensed to a station for broadcast. The output of an LLM has been deemed ineligible for copyright. Until you square that pickle your circle isn’t circling.
reply
Why is the ethical line specifically on model distillation for you?

Was it ethical for Anthropic/OpenAI to train their models by gobbling a treasure trove of copyrighted material?

reply
Free over-the-air network TV is (generally) copyrighted.

The output of LLMs cannot be copyrighted. This isn't a semantic game; it's literally the case that Anthropic cannot seek relief for people duplicating the output of an LLM.

reply
With you, but I suppose they could have a case for circumventing access restrictions under the DMCA aka leet hacking.
reply
The relief available to a licensor for violating a license use restriction is cancellation of the license. And they're free to do that, just like Alibaba is free to pay somebody in Hyderabad $20 to make another one.

DMCA can't apply in this case because (this is the "C" in its initialism) it is based on copyright protections, which the output of Claude is not eligible for.

reply
I won’t go too far into the weeds, because I’m not an internet lawyer, and I basically agree with you, but I do believe there are access restriction laws that are not limited to copyright violation. People have gone to prison for enumerating sequential identifiers in URLs to access records they shouldn’t have been able to. I don’t know if Anthropic could actually make a case there, but it seems plausible at least.
reply
An endless list of tangential analogies isn’t really a valid argument.

The DMCA has as little to do with this as streaming copyrighted content does.

reply
Using a bunch of nonsensical/irrelevant analogies to somehow make a point seems worse than these “word games”? What does streaming copyrighted content have to do with LLM outputs (which are public domain)?
reply
> clearly ethically wrong

Ethics are subjective. That’s why we have courts judge based on the law and not on ethics.
