Many of us here are software developers by choice or hobby, and we know better than regular folks that scale changes everything: design something for the wrong scale and it can break your assumptions and your business.
So why do we still insist that a human and a machine are the same, and that the same rules apply when it comes to AI, even though we know they operate at different speeds and scales?
An LLM is just a really, really big, really, really elaborate "choose your own adventure" book.
You aren't a book.
But that's what makes the usual analogies with humans fail from the start. The laws were made on the assumption that they apply to humans, who are a known quantity. That assumption breaks down when you apply the laws to systems with vastly increased (and ever-increasing) capabilities.
> Anyway, I don't think that scanning is any different than photons hitting my retina.
If I ask you 10 years from now to give me a completely accurate depiction of what your retina registered yesterday at 5:52 PM, will you be able to? And can you give me a copy?
Let’s switch up your scenario. Let’s say the subject isn’t a human with machine-like qualities but instead a computer with human-like limitations. All the books were fed to that one computer, and for technical reasons it cannot be duplicated and can only answer one question at a time. Suddenly the infringement isn’t as problematic and the ways to commercially exploit that data are minimal.
Furthermore, even with perfect memory it would take time to read all those books; you’d never keep up with everything released in a single year. Nor would you be able to reproduce everything perfectly, given the time required and the lack of ability (perfectly recalling a painting or photograph does not mean you have the skills to make an exact copy).
All these comparisons are silly and useless anyway (though in your particular case I think you are arguing in good faith). Computers are not human. If a person were caught killing animals of an endangered species and offered the defence “but what about the natural predators in that habitat? I’m just doing the same as them”, we’d rightfully see through the bullshit and scoff at such an obviously flawed comparison.
And the systematic nature of the excerpt service makes the excerpts different from fair use quotes. A reference quote is not a service that can reproduce the entire work, and the reference quote cites the actual source of the insight/wisdom/research/poetry/etc.
The only thought experiment is why someone might even try to excuse this activity. I can think of a few reasons.