So, mostly to fraudulent AI spam?
AI makes this problem worse in both directions. It makes it fantastically easy to produce "content". So if you're scraping content, or browsing content, you're going to run into increasing amounts of AI. Micropayments make this worse, because they then become a means of getting paid to produce spam. The problem comes when you want the "content" to be connected to real questions like "how does my dryer work" or "what is going to happen to oil availability six months from now".
AI trainers didn't pay book authors until forced to. $3,000 ended up being a pretty high value! But it was also a one-off. Everyone writing books from now on is going to have to deal with being free grist to the machine.
Spotify does not pay out mostly to AI spam.
Their pay scales by listens. The AI spam doesn’t collect many listens. The spammers do it because they can automate it and make it low effort, but it’s not a cash cow for the spammers.
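To put rough numbers on "pay scales by listens", here is a minimal sketch assuming a simple pro-rata model, where a fixed royalty pool is split by each track's share of total streams. The function name, the pool size, and the stream counts are all made up for illustration; they are not Spotify's actual figures or its exact formula.

```python
def pro_rata_payouts(pool, streams):
    """Split a royalty pool in proportion to each track's share of total streams."""
    total = sum(streams.values())
    if total == 0:
        # Nobody listened to anything, so nobody gets paid.
        return {name: 0.0 for name in streams}
    return {name: pool * count / total for name, count in streams.items()}


# Hypothetical numbers, for illustration only.
streams = {
    "popular_artist": 900_000,
    "indie_artist": 99_000,
    "ai_spam_track": 1_000,
}
payouts = pro_rata_payouts(1_000.0, streams)

for name, amount in payouts.items():
    print(f"{name}: {amount:.2f}")
```

Under these assumed numbers a single spam track with few listens earns a tiny slice of the pool, which is why it only makes sense for spammers when uploading is automated at volume.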
As others said, Spotify pays artists shit, but maybe that's the problem with the whole thing here. It should be more like how Bandcamp pays artists (80% to the artists, 20% for Bandcamp), but then the rapacious economy supporting the largest LLM providers would collapse and (wipes away a single tear) we'd all have to use simpler, cheaper, most likely local models.
That's probably not the best comparison. Spotify only benefits the big players, or rather those with the most bots. If you actually want to support specific artists, you'd have to use Bandcamp or similar sites.
Would love to know exactly what the latest process is to keep slop out of training data.
There's way more value, if seeking out answers, in following the links to external sources, scraping books, and other sources that aren't "unwashed masses saying whatever they want".
> ...
> scraping books, and other sources that aren't "unwashed masses saying whatever they want".
The problem is there's a lot of knowledge that only exists as Reddit comments, blog posts, or social Q&A.
:)
It has already done so, and we can be confident in saying that.
Verified content will always be relatively expensive when compared to AI content.
Visits to Wikipedia and most sites have dropped. Rtings has gone full paywall. Ad revenue for producing verified content will be too meager to allow for public consumption.
There are jokes about GenAI being the great filter; while I doubt this, I do hope this is the final push that makes us think about how we want our information commons to be nurtured.
> Visits to Wikipedia and most sites have dropped. Rtings has gone full paywall. Ad revenue for producing verified content will be too meager to allow for public consumption.
AI is a technology that's going to further entrench inequality, by warping incentives to push us further away from democratization. Unless you've got $$$ to drop on verified content, you'll be served prolefeed slop and be that much more ignorant.
I'd argue that this is more about the state of play than the tech itself.
As a software user I wish I could do the same for all the software I use.
This system is usually called taxes.
Which then pay for universal healthcare, free education, affordable housing, libraries, parks, and so on.
LLMs don't need to invent it; we should stop allowing the people and companies behind LLMs to avoid it.