It’s only the former definition that would allow an AI model to have been trained on someone else’s data.
There are yet more definitions of "theirs". For example, data whose provenance can be traced back to Anna's Archive.
So the data is legally owned by the book authors, possessed by Anna's Archive, and downloaded for training usage by the AI companies. Every person in that chain could, linguistically speaking, correctly refer to the data as "theirs", or refer to the data of a different entity as "theirs".
You are being granted a license to use the data.
But no one else is obligated to ignore the definitions of words that you're choosing to ignore, so the rest of us will go on saying it's their data.
We're not talking about abstract language concepts; this is a specific case. The data was taken without license/rights/approval. It's stolen. AA calling it "our data" is disingenuous. Legally it isn't theirs. While you could use "ours"/"theirs" loosely in English, they knew it wasn't true in a legal sense when publishing this.
I found an abandoned bicycle 10 years ago. I have since replaced nearly all of its parts. I would give it back if you could prove it is yours, but who owns the bicycle of Theseus is more a matter of opinion.
I refer to it as my bicycle.
The chop shop well might.
Or, if I steal your car, and then go on to use it daily for the next 10 years, at some point everyone I know will refer to it as "my" car even if they're all entirely aware it was stolen.
> they knew it wasn't true in a legal sense when publishing this
I'm not sure why you're expecting the operators of a pirate site to use legally rigorous terms to refer to themselves in a blog post. This is an error in your expectations, not their terminology.
That's incorrect. A license violation isn't theft. Theft deprives others of their property; that's not what's going on here. Intellectual property is a fictional "ownership" that provides value to society, but it is much newer than, and different from, the actual ownership of property.
No one actually owns a collection of words or ideas or thoughts.
Possession is 9/10 of the law - if you have a copy, you have possession, and thus you have SOMETHING, and LEGALLY it is considered yours (now, whether you legally obtained it is a different story, and THAT is where charges stem from).