Not necessarily, as exhibited by the massive success of artificial data.
reply
Could you elaborate?
reply
EDIT: probably not relevant, after re-re-reading the comment in question.

Presumably littlestymaar is talking about all the LLM-generated output that's publicly available on the Internet (of varying quality, but in significant quantity) and there for the scraping.
