The set of 11-digit numbers with any given failure mode (or even successful output) has no discernible pattern, merely whatever randomness the training process baked into the model.
You can't predict ahead of time when they will fail spectacularly, nor draw a clear boundary around the failure cases. An early major example of this was the "glitch tokens" introduced into most LLMs by training on Reddit data.
But there is an "in general"/"average failure rate across all inputs of a given size" answer: LLM performance drops off a cliff once the input gets too complex (a "┐"-shaped curve). This is in contrast to humans: ask a child to add two N-digit numbers and the error rate will be roughly linear in N.
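One way to put numbers on that curve (a sketch of my own, assuming access to some `model` callable that maps a prompt string to an answer string; the `toy_model` below is a made-up stand-in, not any real model): sample random N-digit addition problems and measure the error rate per N.

```python
import random

def error_rate(model, n_digits, trials=100, seed=0):
    """Fraction of random n-digit addition problems the model gets wrong."""
    rng = random.Random(seed)
    errors = 0
    for _ in range(trials):
        lo, hi = 10 ** (n_digits - 1), 10 ** n_digits - 1
        a, b = rng.randint(lo, hi), rng.randint(lo, hi)
        answer = model(f"{a} + {b} =").strip()
        if answer != str(a + b):
            errors += 1
    return errors / trials

# Toy stand-in "model": exact below 8 digits, garbage above (mimics a cliff).
def toy_model(prompt):
    a, b = [int(t) for t in prompt.rstrip(" =").split(" + ")]
    return str(a + b) if max(a, b) < 10 ** 7 else "96913456789"

for n in (4, 8, 12):
    print(n, error_rate(toy_model, n))  # → 0.0 at 4 digits, 1.0 at 8 and 12
```

Swapping in a real model and sweeping N would show whether the curve is cliff-shaped or linear.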
Also, there exist autistic savants who prove that a human brain can perform rote calculations on large numbers much faster than a typical human with a calculator can.
For instance, the current high-score model (311 params [0]), when given 12345678900 + 1, responds with 96913456789.
An interesting experiment would be: what's the minimum number of parameters required to handle unbounded addition (without offloading it to tool calls)?
Of course, memory constraints would preclude a truly unbounded experiment. So a sensible proxy would be: what kind of neural-net architecture and training would allow a model to handle number lengths it hasn't been trained on? I suspect this may not be possible.
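For context on why length generalization is at least plausible in principle (my own aside, not a claim about any existing model): long addition is computed by a constant-size machine, a single carry bit scanning the digits right to left. Any network that learned this automaton exactly would generalize to arbitrary lengths; the open question is whether gradient training ever finds it.

```python
def add_by_carry(a: str, b: str) -> str:
    """Add two decimal strings using one carry bit of state,
    regardless of how long the inputs are."""
    a, b = a.zfill(len(b)), b.zfill(len(a))  # pad to equal length
    carry, out = 0, []
    for da, db in zip(reversed(a), reversed(b)):  # right to left
        s = int(da) + int(db) + carry
        out.append(str(s % 10))
        carry = s // 10
    if carry:
        out.append("1")
    return "".join(reversed(out))

print(add_by_carry("12345678900", "1"))  # → 12345678901
```

The per-digit state is just the carry, so the "parameters" needed are constant in input length; what blows up in practice is getting a learned model to execute this loop faithfully.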