As far as I understand RL scaling (we've already maxed out RLVR), these machines only get better as long as they have expert reasoner traces available.
Having an expert work with an LLM and successfully solve a problem is high-signal data; it may be the only path forward?
My prior is that these companies will take as much of this data as they can without asking you.
And importantly, this can be cross-lab/model too. I suspect there's a reason why e.g. Google has been offering me free Claude inference in Google Antigravity on a free plan...
Wouldn't this lead to model collapse?
Presumably littlestymaar is talking about all the LLM-generated output that's publicly available on the Internet (of varying quality but significant quantity) and there for the scraping.