It was shown, in this case, that the LLMs wouldn’t generate accurate quotes longer than 60 words.
This is not comparable to encoding a full video file.
Then what if their memory is so good, they repeat entire sections verbatim when asked. Does that violate it? I’d say it’s grey.
But that’s a very specific case - reproducing large chunks of copyrighted work is something that can be quite easily detected and prevented, and I’m almost certain the frontier labs are already doing this.
So I think it’s just really not clear - the reality is this is a novel situation, and the job of the courts is now basically to decide what’s allowed and what’s not. But the rationale shouldn’t be ‘this can’t be fair use, it’s just compression’. Because it’s clearly something fundamentally different, and existing laws just aren’t applicable imo.
> Then what if their memory is so good, they repeat entire sections verbatim when asked. Does that violate it? I’d say it’s grey.
That's an unambiguous "yes". Performing a copyrighted play or piece of music without the rights to do so is universally considered a copyright violation, even if the performers are performing from memory. It's still a copyright violation if they don't remember their parts perfectly and have to ad-lib sometimes, or if they don't perform the entire work from start to finish.
There's a whole related topic here in the realm of news (since it's shorter form), but news also has a much shorter half-life. Not sure what I think there yet.