At one level, this training data provides examples of specific, static reasoning chains.

Given exposure to enough reasoning chains, and with training data designed around adversarial reasoning and teaching models to reason, this kind of data might be key to teaching models to reason beyond what they could gather from static data.

> these types of training data might be key to teaching models to reason beyond what they could gather from static data.

I was under the impression that whenever LLMs try to be truly novel and have to make assumptions in areas where they weren't trained on enough data points, the results are not good. Has that changed?

If LLMs were already good at it, the AI labs wouldn't be paying this insane amount of money for people to generate training data to teach them.