Your “and then” is doing a lot of work there. The steps between may or may not include some form of “learn to understand humans”, but you can’t just hide them behind “and then” if what we are doing is claiming some particular thing is not in the list.

Through training on human text, we are building implicitly in the weights a statistical model of what humans might write in response when presented with arbitrary pieces of text. It turns out that we can make these incredibly accurate.
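To make "a statistical model of what humans might write" concrete, here is a deliberately tiny sketch: a bigram counter that estimates the probability of the next word given the previous one. It is an illustration only. The corpus and the function names `train_bigram` and `next_word_probs` are made up for this example, and a real LLM is a neural network conditioning on long token contexts, not a count table.

```python
from collections import Counter, defaultdict

def train_bigram(corpus):
    """Count, for every word, which words follow it in the corpus."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts

def next_word_probs(counts, prev):
    """Turn the raw follow-counts for `prev` into a probability distribution."""
    following = counts.get(prev.lower())
    if not following:
        return {}  # word never seen with a successor
    total = sum(following.values())
    return {word: n / total for word, n in following.items()}

# Hypothetical toy corpus, standing in for "human text".
CORPUS = [
    "the cat sat on the mat",
    "the cat ate the fish",
    "the dog sat on the rug",
]

model = train_bigram(CORPUS)
probs = next_word_probs(model, "the")
most_likely = max(probs, key=probs.get)
print(most_likely, probs[most_likely])
```

Scaling the same idea from "previous word" to "everything written so far" is where the question of whether the model has to understand anything starts to bite.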

If building an accurate internal model of something then using it to predict that thing’s behaviour is different to gaining understanding of that thing, we will need to pin down exactly what “understanding” means, or we are forever doomed to talk at cross purposes.

reply
It can't even natively understand how many letters there are in words - how will it understand the meaning?
reply
I wish people would do even the most basic amount of research into LLMs before opining about what they can or cannot do. There are very principled reasons why LLMs do not know how many letters are in words, and it says nothing about their facility for understanding meaning.

Tokens are the most basic input unit of an LLM. But tokens don't generally correspond to words or letters, rather sub-word sequences. So Strawberry might be broken up into two tokens 'straw' and 'berry'. An LLM has trouble distinguishing "sub-token" features like specific letter sequences because it doesn't see letter sequences, just each token as a single atomic unit. 'Straw' and 'r' are two tokens, but an LLM is entirely blind to the fact that 'straw' has one 'r' in it.
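A minimal sketch of that point, assuming a made-up vocabulary and a greedy longest-match tokenizer (real tokenizers such as BPE work differently and have far larger vocabularies; `VOCAB`, `tokenize`, and `encode` are hypothetical names for this example). What the model receives is the list of integer IDs, and nothing in those IDs says which letters each token contains.

```python
# Hypothetical vocabulary, chosen only so "strawberry" splits into 'straw' + 'berry'.
VOCAB = {"straw": 0, "berry": 1, "r": 2, "s": 3, "t": 4,
         "a": 5, "w": 6, "b": 7, "e": 8, "y": 9}

def tokenize(text, vocab=VOCAB):
    """Greedy longest-match: at each position, take the longest piece in the vocab."""
    tokens = []
    i = 0
    while i < len(text):
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in vocab:
                tokens.append(piece)
                i = j
                break
        else:
            raise ValueError(f"no token covers {text[i]!r}")
    return tokens

def encode(text, vocab=VOCAB):
    """Map text to the integer IDs that a model would actually receive."""
    return [vocab[t] for t in tokenize(text, vocab)]

word = "strawberry"
pieces = tokenize(word)                  # sub-word pieces
ids = encode(word)                       # what the model sees
letter_count = word.count("r")           # needs access to characters
r_token_count = ids.count(VOCAB["r"])    # the 'r' token never appears in the IDs

print(pieces, ids, letter_count, r_token_count)
```

Counting the letter is trivial on the character string and impossible from the IDs alone, unless the relationship between each token and its spelling has been learned separately.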

As an analogy, I might ask you to identify the relative activations of each of the three cone types on your retina as I present some solid color image to your eyes. But of course you can't do this, you simply do not have cognitive access to that information. Individual color experiences are your basic vision tokens.

The widespread mistake people keep making is assuming the development of intelligence in LLMs should follow the same trajectory that human intelligence takes as it develops into adult levels of intelligence. Thus a deficiency in some capacity that we take for granted in humans is treated as an indictment of LLM intelligence. But this is specious. LLMs are entirely alien; their developmental paths do not and should not look anything like ours. Your intuition from human intelligence just works against understanding the potential for intelligence in LLMs.

reply
>The widespread mistake people keep making is assuming the development of intelligence in LLMs should follow the same trajectory that human intelligence takes as it develops into adult levels of intelligence.

To be fair, almost everyone who claims LLMs are conscious tends to claim that they are conscious in exactly the way that humans are, to the point of stating that human brains are also just complex next-token prediction machines with a random seed. It's basically religious arguments on both sides.

reply
Really? That has not been my experience.

I have seen people say "you're a next token prediction machine" but only in a similar way one might say "you're a cup of old lard". Not actually meaning it literally.

I have seen people treat a request to show that they are not next token prediction machines as a claim that they are, but such a request is almost always an argument that certainty is difficult in this area.

People like Hinton have declared that they believe them to be conscious, but clearly indicate that they do not mean just like us.

reply
Eh, I’ve seen it. I’m not sure it’s entirely wrong either. Humans are certainly more than just next token predictors but it’s not clear that our typical language behavior is significantly different. We call it “stream of consciousness” when we just spew words out without thinking and that seems to be the default operating mode.
reply
Given the fact that large language models are trained on human language, it shouldn't be surprising that the text they output resembles human language. That is what they're designed to do after all. But similarity in output doesn't necessarily map to similarity in process.

And it seems obvious to me that language behavior does differ significantly between humans and LLMs based on the frequency and nature of failure states. LLMs routinely hallucinate, or get "AI strokes" or get obsessed about not talking about goblins, etc. This isn't typical language behavior for humans unless they have severe neurological or psychological impairment.

People tend not to "spew words out without thinking" and certainly not all the time by default - we call that glossolalia and (outside of some fringe Christian sects) it's considered a "bug" not a "feature" of the human brain. Human language by default always has intent behind it, even if that intent isn't readily apparent to the speaker. People can recite by rote memory, but that isn't blind token prediction, it's the neurological equivalent of muscle memory. People can have conversations then forget about them because their attention was focused elsewhere, but that doesn't indicate that they were simply "spewing words out without thinking" at the time.

reply
> LLMs routinely hallucinate, or get "AI strokes" or get obsessed about not talking about goblins, etc. This isn't typical language behavior for humans unless they have severe neurological or psychological impairment.

People imagine details all the time. Eyewitness testimony is notoriously untrustworthy.

Our brains seem wired to confidently fill in gaps. We all have a literal blind spot we aren’t aware of because our brains convincingly lie to us and fill in the gap.

I don’t know what an “AI stroke” is, but I’ve definitely seen human beings in good health be in the middle of talking and suddenly forget what they are going to say.

> People tend not to "spew words out without thinking" and certainly not all the time by default - we call that glossolalia and (outside of some fringe Christian sects) it's considered a "bug" not a "feature" of the human brain.

Glossolalia is spouting gibberish, not comprehensible speech.

Kind of weird that you speak so confidently when you apparently don’t know the difference between stream of consciousness and “speaking in tongues”. Almost like you’re AI hallucinating.

reply
> There are very principled reasons why LLMs do not know how many letters are in words, and it says nothing about their facility for understanding meaning. … Tokens are the most basic input unit of an LLM. But tokens don't generally correspond to words or letters, rather sub-word sequences. So Strawberry might be broken up into two tokens 'straw' and 'berry'.

This sounds like a description of a child who has not learned to read yet. If you ask a child who is not aware of the alphabet and of "words" how many r's are in strawberry, you'd get a nonsense answer too. So what you're really pointing out is that the LLMs have not been trained on "the English language" and how words are constructed and what they are composed of. That they operate by tokens that don't correspond to words or letters is irrelevant as an answer to why they can't count the letters in a word. It's not that I know how many r's are in strawberry because of how I'm understanding the word "strawberry", I know how many r's are in strawberry because I know how to spell strawberry. The LLM needs to be trained on this the same way someone who is learning to read would be trained on it. No one should be surprised that an LLM can't "read" in the same way no one should be surprised that a child can't "read".

reply
>That they operate by tokens that don't correspond to words or letters is irrelevant as an answer to why they can't count the letters in a word.

This interpretation takes things too far away from how LLMs are constituted and so misses important explanatory power. The issue of counting letters in a word isn't about an ability to spell, it's about the nature of one's perception. We perceive words as sequences of individual letters. LLMs do not. I can ask you to tell me how many r's are in some nonsense word sequence and you're fully capable of doing that. LLMs do not see sequences of letters so they are intrinsically at a disadvantage for this kind of question. But this says nothing about their capacity for intelligence, any more than not naturally being able to distinguish frequencies of photons hitting your retina has anything to say about human intelligence.

reply
This is kind of like assuming someone with bad spelling is stupid.

Counting letters in a word seems to have little to do with understanding the word. Young kids can’t spell or count well at all but no one says that means they can’t understand.

reply
This is like saying because humans can't multiply 23472 by 1836736 in less than 5 nanoseconds that they can't possibly understand anything about maths.
reply
You can't natively understand how many of your photoreceptor cells are activated by the period at the end of this sentence. How could you possibly understand the sentence's meaning?
reply