Agreed, there's no doubt this will happen. It's likely already happening (it seems safe to assume Anthropic is curating training data from what it records via Claude Code?)

As far as I understand RL scaling (we've already maxed out RLVR), these machines only get better as long as they have expert reasoner traces available.

Having an expert work with an LLM and successfully solve a problem is high-signal data; it may be the only path forward?

My prior is that these companies will take as much of this data as they can without asking you.

reply
Exactly, or, functionally equivalently, asking you in paragraph 37 of a 120-page PDF (bonus points: in an agreement update).

And importantly, this can be cross-lab/model too. I suspect there's a reason why e.g. Google has been offering me free Claude inference in Google Antigravity on a free plan...

reply
The site arena.ai does exactly this already, as far as I can tell. (In addition to the whole ranking thing.)
reply
> Data sharing agreements permitting, today's inference runs can be tomorrow's training data. Presumably the models are good enough at labeling promising chains of thought already.

Wouldn't this lead to model collapse?

reply
Not necessarily, as demonstrated by the massive success of synthetic data.
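The usual argument for why filtered synthetic data sidesteps collapse is that a verifier injects outside signal: you only train on outputs that pass a check, rather than on the raw model distribution. A toy sketch of that rejection-sampling loop (all names here are hypothetical stand-ins, not any lab's actual pipeline):

```python
import random

random.seed(0)

def model_sample(problem):
    # Stand-in for LLM sampling: a noisy candidate answer (hypothetical).
    return problem + random.choice([-1, 0, 1])

def verifier(problem, answer):
    # Stand-in for a verifiable reward, e.g. unit tests or exact match.
    return answer == problem

def collect_training_data(problems, samples_per_problem=8):
    # Rejection sampling: keep only traces the verifier accepts.
    # Training on this filtered subset is what distinguishes useful
    # synthetic data from blind self-training, which can collapse.
    kept = []
    for p in problems:
        for _ in range(samples_per_problem):
            a = model_sample(p)
            if verifier(p, a):
                kept.append((p, a))
    return kept

data = collect_training_data(range(10))
# Every kept trace passed the verifier, so the training set is
# biased toward correct behavior instead of the model's raw errors.
assert all(verifier(p, a) for p, a in data)
```

The same shape applies to the thread's scenario: "expert worked with the model and the problem got solved" acts as the accept/reject signal.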
reply
Could you elaborate?
reply
EDIT: probably not relevant, after re-re-reading the comment in question.

Presumably littlestymaar is talking about all the LLM-generated output that's publicly available on the Internet (of varying quality but significant quantity) and there for the scraping.

reply