I will upgrade the "why it matters" to "and now AI output is part of the training data". A day is coming when the punched-up AI verbiage will be the norm and hard to distinguish unless you're from the previous generation. Sort of in the way that I miss some aspects of Usenet.
I could only follow up with, "that is a genuine insight."
Not a single person visibly flinched in pain.
Seems stifling. We'll need some way to reward human creativity and out-of-bounds thinking before our greatest corpus of human intellect is bounded by whenever and whatever was trained on.
As a much more immediate practical matter, training LLMs on LLM output makes them worse overall; they degrade from doing that. So the more LLM-produced content fills the web, the less useful it is as a data source for future LLM training, in addition to just being increasingly boring and vapid.
It's like staring down the barrel of a gun and taking the time to make quips about the type of paper the gun advertisement was printed on.
I can agree that snark probably isn't the type of comment that we generally value or encourage here on Hacker News, but neither is posting blatant advertisements and press releases, and yet here we are discussing one, so shrug?
And obviously it's a problem that it's so much cheaper to produce writing without underlying substance, but I think when one of the leading Internet security/infrastructure companies is writing about the leading cybersecurity model, it's excessively flippant to say the writing on top is "the real question."
This is also why Claude Code is full of weird bugs and why their support says that it issued refunds when it didn't, and so on and so forth.