> If LLMs cannot learn to beat not-that-difficult of games better than young teens, they are not intelligent.

I agree, with unresolved questions. Does it count if the LLM writes code which trains a neural network to play the game, and that neural network plays the game better than people do? Does that only count if the LLM tries that solution without a human prompting it to do so?

I disagree that LLMs cannot solve "unsolved problems." This is already happening, and at fundamental mathematical and medical levels (the fields that are the most demanding when it comes to quality).

The idea that we haven't taught LLMs to come up with new answers... That doesn't even sound plausible. Just crank up the temperature, and an LLM will throw out so many ideas you'll exhaust yourself trying to sort through them.

So what haven't we taught LLMs?

- Have we not taught them to "filter"? We just haven't equipped them with experience and intuition, because we only feed them either "absolute fakes" or "verified facts." We don't feed them the actual path of problem-solving and research; those datasets simply don't exist.

- Have we not taught them to "double-check"? They are already excellent at verifying the credibility of our work.

- Have we not taught them to "defend" their ideas? They can justify ironclad logic and spot potentially "flaky" logic better than any human.

- Have we not taught them to "publish" and "present to the scientific community"? It's just that the previous steps aren't fully polished yet.

And if you look at the question of "creating completely new ideas" from this angle and in this level of detail... To me personally, it doesn't seem at all like LLMs are incapable of this kind of work.

We simply haven't taught them how to do it yet, purely because we don't have a sufficient volume of the right training materials.

Solving an unsolved problem does not necessarily require learning; it may just require effort.

ARC is trying to test if LLMs can actually learn how to play the game.

So your definition of intelligence is exactly equal to a human, or to some subset of humans you choose? Could a dog solve ARC-AGI? Probably not, yet I would not say dogs lack intelligence. Same with a fruit fly. What if the calculator is powered by actual living neurons? I think you need to know where you actually draw the line between an organic machine and intelligence before making blanket statements.

A modern LLM in a loop with a harness for memory and behavior modification in a body would probably fool me.

"A harness for memory": so it still requires external tools to work well. The whole point of this benchmark is to validate that the systems can solve problems without any sort of outside help.

> Airplanes don't have wings

???
