You talk as if problem solving were a supervised (imitation) learning problem. It isn't; it's a reinforcement learning problem: models learn by solving problems and getting rated, so they generate their own training data. Optimal budget allocation is roughly 1/3 of the cost on pre-training, 1/3 on RL, and 1/3 on inference.
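The "generate their own training data" loop can be sketched in miniature. This is a hypothetical toy, not any lab's actual setup: the "model" is just a probability distribution over candidate answers, the `verifier` is a stand-in for an automated grader, and the update is a crude REINFORCE-style reinforcement of rewarded behavior.

```python
import random

random.seed(0)

# Toy "model": a distribution over candidate answers to "what is 2 + 2?"
candidates = ["2", "3", "4", "5"]
weights = [1.0, 1.0, 1.0, 1.0]  # uniform prior

def verifier(answer):
    """Rates a solution attempt: 1.0 if correct, 0.0 otherwise."""
    return 1.0 if answer == "4" else 0.0

LEARNING_RATE = 0.5
for step in range(200):
    # The model generates an attempt (its own training data)...
    i = random.choices(range(len(candidates)), weights=weights)[0]
    # ...and gets it rated; no human-written solutions are involved.
    reward = verifier(candidates[i])
    # Reinforce rewarded behavior (crude REINFORCE-style update).
    weights[i] += LEARNING_RATE * reward

probs = [w / sum(weights) for w in weights]
print(f"P(correct answer) after training: {probs[candidates.index('4')]:.2f}")
```

After a few hundred iterations, probability mass concentrates on the answer the verifier rewards — the point being that no supervised examples of correct solutions are ever provided, only ratings.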
> The difference is that when a human reasoner goes to solve a problem, they'll think "this kind of proof usually goes this way" - following an explicit rule enforcement.
How is this different from "probabilistic pattern selection"?