> We found our data in the outputs of their models but who can do anything about it...

If the crawlers refuse to voluntarily respect your robots.txt, then you are well within your rights to poison their data.

by hajile1 hours ago|

robots.txt seems like it should be a legally binding terms of service, which would make them outright copyright infringers.

Sue for $180,000 per infringement, with each illegal API call counted as a separate infringement.

by throw12345678911 hours ago|

Was your robots.txt written by a lawyer? Does it hold up in court?

by wang_li7 minutes ago|

It doesn't matter. Robots.txt is not a license; it's a set of computer-parsable directives for how programs should access your site. The actual license doesn't have to be written for computers to parse in order to be legally binding.

A person should be able to write in a terms of use or license page on their website: "Do not include any content from this website in your AI training data. If you do, you will be billed $100 billion." And it should be enforceable. It just turns out that nerds like to say "oh, that would be too hard or too expensive, so we're going to ignore it."
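
To make the "directives, not a license" point concrete, here is a minimal sketch of how a crawler might parse those directives, using Python's standard `urllib.robotparser`. The robots.txt contents, the `is_allowed` helper, and the example URLs are all made up for illustration; nothing here says anything about what a given crawler actually does.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one named crawler, allow everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(is_allowed("GPTBot", "https://example.com/articles/1"))
    print(is_allowed("Mozilla/5.0", "https://example.com/articles/1"))
```

Note that the check is entirely on the crawler's side: a program that never calls something like `can_fetch` is not slowed down by the file at all, which is why the legal terms have to live somewhere else.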

by shimman1 hours ago|

Why hasn't your company sued OpenAI and tried to argue that they're violating the Computer Fraud and Abuse Act? Would it really be impossible to argue this?

Unauthorized access, system damage, and maybe even extortion all apply here.

by rastrojero20001 hours ago|

Lawyers can. As long as that data is actually yours, I mean, in a strictly legal sense.

by telotortium1 hours ago|

I mean, did you check the IPs and make sure they're from OpenAI? Obviously a fly-by-night AI company is going to set its User-Agent to impersonate a big player.
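
A rough sketch of that check, assuming you have a list of CIDR ranges the crawler operator publishes. The ranges below are placeholders from the RFC 5737 documentation blocks, not real crawler ranges, and the function names and the `GPTBot` token are just illustrative choices.

```python
import ipaddress

# Placeholder ranges (RFC 5737 documentation blocks), NOT real crawler ranges.
# Substitute whatever list the crawler operator actually publishes.
PUBLISHED_RANGES = ["192.0.2.0/24", "198.51.100.0/28"]


def ip_in_published_ranges(ip: str, ranges=PUBLISHED_RANGES) -> bool:
    """Return True if ip falls inside any of the given CIDR ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(cidr) for cidr in ranges)


def looks_spoofed(user_agent: str, ip: str, token: str = "GPTBot") -> bool:
    """Flag requests that claim the crawler's token but come from other IPs."""
    claims_crawler = token.lower() in user_agent.lower()
    return claims_crawler and not ip_in_published_ranges(ip)


if __name__ == "__main__":
    ua = "Mozilla/5.0 (compatible; GPTBot/1.0)"
    print(looks_spoofed(ua, "203.0.113.5"))   # claims the token, IP outside the ranges
    print(looks_spoofed(ua, "198.51.100.3"))  # claims the token, IP inside the ranges
```

Running your access logs through something like `looks_spoofed` would at least separate traffic that plausibly comes from the named company from traffic that is only borrowing its User-Agent string.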