by pta2002 8 hours ago

I assume a selfish benefit is that OpenAI and Google don't want the models to train on their own output. There is just /so much/ AI-generated content online that they definitely need to filter it out somehow when assembling the training data. This is a pretty effective way to do that, with the nice bonus of being mostly good from a PR standpoint.

by sgc 4 hours ago

I immediately thought that was the real reason. Their models will quickly break without some sort of consensus on how to reliably exclude them.