That's only a fraction of the training data.
reply
Quite amusing that the LibGen library is worth $1.5 billion for unlimited access.

It's about the same valuation as Bun, lol.

reply
$3,000 per title.
reply
Do you think many authors would give you rights to create derivative works en masse for that money?
reply
For endlessly reselling the whole work verbatim? Well, where can I buy such a license in the real world, because then I would like to buy a couple of those!
reply
Meta/Facebook got away with it though, right?
reply
That's a great cost-benefit ratio. Can you and I steal and do illegal things and pay the same cost?
reply
Sure, but only if you get the same benefits.
reply
Looks like we can't today. Man, it would be great to figure out how to be above the law, just like these rich people in a different social class.
reply
Being logically consistent isn’t as profitable as being aggressive and loud.
reply
And if it includes at least one GPL source, they should release the weights under the GPL.
reply
While I love the sentiment, I feel like the odds of this actually ever reaching a trial are low, given the international positioning of the parties, and the... um... complex relationships involved.

Anthropic's actions seem performative. Others have already speculated on the likely audience(s).

reply
> While I love the sentiment, I feel like the odds of this actually ever reaching a trial are low ...

As cited in a peer comment here[0]:

  In June 2025, Judge William Alsup of the U.S. District 
  Court for the Northern District of California ruled on 
  summary judgment that using books without permission to 
  train AI was fair use if they were acquired legally, but he 
  denied Anthropic’s request for summary judgment related to 
  piracy—finding that the piracy was not fair use.[1]
Of note in the judge's finding: "the piracy was not fair use".

0 - https://news.ycombinator.com/item?id=48667411

1 - https://authorsguild.org/advocacy/artificial-intelligence/wh...

reply