Derivative works can also run afoul of copyright. An LLM trained on a corpus of copyrighted code is creating derivative works no matter how obscure the process is.
reply
This actually isn't what legal precedent currently says. Courts are looking at actual output, not at models being "tainted" by their training data. If you think this is morally wrong, look into getting the laws changed (serious).
reply
What about a human with 30 years of experience working with copyrighted codebases?
reply
Said human would likely not be able to create a clean-room implementation of any of the codebases they worked on.
reply
U.S. District Judge William Alsup ruled that Anthropic made "fair use" of the books, deeming the training "exceedingly transformative."

"Like any reader aspiring to be a writer, Anthropic's LLMs trained upon works not to race ahead and replicate or supplant them — but to turn a hard corner and create something different"

reply
I disagree that information flow is required. Do you have a reference for that? Certainly it is an important consideration. But consider all the real literary works contained in the infinite Library of Babel.[1] Are they original works just because no copy was used to produce them?

[1]: https://libraryofbabel.info/

reply
Yes; the works are original.

However, describing the path you need to get there requires copyright infringement.
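That point can be made concrete. A minimal sketch (assumptions: Python, a made-up 29-symbol alphabet; the real libraryofbabel.info uses its own scheme) of the Library of Babel idea: every text over a fixed alphabet has a unique integer "address" in the enumeration of all texts, and the address is just a re-encoding of the text, so naming the address conveys exactly as much information as handing over the text itself.

```python
# Bijective base-k numbering between texts and integer "shelf addresses".
# ALPHABET is a hypothetical choice for illustration.
ALPHABET = "abcdefghijklmnopqrstuvwxyz ,."
BASE = len(ALPHABET)

def address_of(text: str) -> int:
    """Map a text to its unique index in the enumeration of all texts."""
    # "" -> 0, "a" -> 1, "b" -> 2, ..., "aa" -> BASE + 1, ...
    n = 0
    for ch in text:
        n = n * BASE + ALPHABET.index(ch) + 1
    return n

def text_at(address: int) -> str:
    """Invert address_of: recover the text stored at a given address."""
    chars = []
    while address > 0:
        address, r = divmod(address - 1, BASE)
        chars.append(ALPHABET[r])
    return "".join(reversed(chars))

# Round-trip: the address encodes the text exactly.
assert text_at(address_of("hello world")) == "hello world"
```

Because `address_of` and `text_at` are inverses, "the path to the work" and "a copy of the work" are the same data in different notation.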

reply
Well, discovery might be a fun exercise to see whether the code is in the LLM's training dataset.
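A hedged illustration of the kind of check that exercise might involve (all names and the line-overlap heuristic are my own, not anything a court or vendor actually uses): does a disputed snippet appear verbatim, line for line, in a training-corpus document?

```python
# Toy membership check: what fraction of a snippet's non-blank lines
# appear verbatim (ignoring indentation) in a corpus document?
def normalize(code: str) -> list[str]:
    """Strip whitespace and drop blank lines so formatting doesn't hide a match."""
    return [line.strip() for line in code.splitlines() if line.strip()]

def overlap_ratio(snippet: str, corpus_doc: str) -> float:
    """Fraction of the snippet's lines found in the corpus document (0.0-1.0)."""
    snippet_lines = normalize(snippet)
    corpus_lines = set(normalize(corpus_doc))
    if not snippet_lines:
        return 0.0
    hits = sum(1 for line in snippet_lines if line in corpus_lines)
    return hits / len(snippet_lines)
```

Real provenance analysis would need fuzzier matching (renamed identifiers, reformatting), but even a crude ratio like this shows what "the code is in the dataset" could mean operationally.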
reply
if?
reply