And in fact LLMs can very well "reason based on prior data points". That's what a chat session is. It's just that this is transient for cost reasons.
That's just a claim. Why so? Who said that's the case?
>When you go about your day doing your tasks, do you require terajoules of energy?
That's the definition of irrelevant. ENIAC needed 150 kW to do about 5,000 additions per second. A modern high-end GPU uses about 450 W to do around 80 trillion floating-point operations per second. That’s roughly 16 billion times the operation rate at about 1/333 the power, or around 5 trillion times better energy efficiency per operation.
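As a quick sanity check on those ratios, here is a minimal sketch that recomputes them from the figures quoted above. The constant names are mine, the inputs are the comment's round numbers rather than measurements, and an ENIAC addition is not strictly the same unit of work as a GPU floating-point operation, so treat the result as an order-of-magnitude comparison only.

```python
# Rough figures taken from the comment above; not measured values.
ENIAC_POWER_W = 150_000      # 150 kW
ENIAC_OPS_PER_S = 5_000      # additions per second
GPU_POWER_W = 450            # watts
GPU_OPS_PER_S = 80e12        # floating-point operations per second

# How many times more operations per second the GPU performs.
rate_ratio = GPU_OPS_PER_S / ENIAC_OPS_PER_S
# How many times more power ENIAC drew.
power_ratio = ENIAC_POWER_W / GPU_POWER_W
# Improvement in operations per joule (rate gain times power saving).
efficiency_ratio = rate_ratio * power_ratio

print(f"operation rate: {rate_ratio:.2e}x")
print(f"power draw:     1/{power_ratio:.0f}")
print(f"ops per joule:  {efficiency_ratio:.2e}x")
```

The three values come out at about 1.6e10, 333, and 5.3e12, which matches the "16 billion", "1/333", and "around 5 trillion" figures in the comment.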
Given that such an increase was possible, one can expect a future computer to run calculations at the level of our mental tasks, with efficiency similar to or better than ours.
Furthermore, a "Turing machine" is an abstraction. Modern CPUs/GPUs aren't Turing machines either in a pragmatic sense; they have a totally different architecture. And our brains have yet another architecture (more efficient at the kind of calculations they need).
What's important is computational expressiveness, and nothing you wrote proves that the brain's architecture can't be modelled algorithmically and run on an equally efficient machine.
Even "equally efficient" is a red herring. If the machine were only 1/10,000 as efficient, would it matter for whether the brain can be modelled or not? No, it would just speak to the effectiveness of our architecture.
You are a fool if you think otherwise. Are we conscious beings? Who knows, but we’re more than a neural network outputting tokens.
Firstly, and most obviously, we aren’t LLMs, for Pete’s sake.
There are parts of our brains which are understood (kinda) and there are parts which aren’t. Some parts are neural networks, yes. Are all? I don’t know, but the training humans get is coupled with the pain and embarrassment of mistakes, the ability to learn while training (since we never stop training, really), and our own desires to reach our own goals for our own reasons.
I’m not spiritual in any way, and I view all living beings as biological machines, so don’t assume that I am coming from some “higher purpose” point of view.
That's just stating a claim though. Why is that so?
Mine refers to the established "brain as prediction machine" theory, plus all we know about the brain's operation (neurons, connections, firings, etc.).
>There are parts of our brains which are understood (kinda) and there are parts which aren’t. Some parts are neural networks, yes. Are all?
What parts aren't? Can those parts still be algorithmically described and modelled as some information exchange/processing?
>but the training humans get is coupled with the pain and embarrassment of mistakes
Those are versions of negative feedback. We can do similar things to neural networks (including human preference feedback, penalties, and low scores).
>the ability to learn while training (since we never stop training, really)
I already covered that: "The main difference is the training part and that it's always-on."
We do have NNs that are continuously training and updating weights (even in production).
For big LLMs it's impractical because of the cost, otherwise totally doable. In fact, a chat session kind of does that too, but it's transient.
They're biological neural networks. Brains are made of neurons (which Do The Thing... mysteriously, somehow. Papers are inconclusive!), glial cells (which support the neurons), and also several other tissues for (obvious?) things like blood vessels, which you need to power the whole thing, and other such management hardware.
Bioneurons are a bit more powerful than what artificial intelligence folks call 'neurons' these days. They have built-in computation and learning capabilities. For some of them, you need hundreds of AI neurons to simulate their function even partially. And there are still bits people don't quite get about them.
But weights and prediction? That's the next emergence level up, we're not talking about hardware there. That said, the biological mechanisms aren't fully elucidated, so I bet there's still some surprises there.
How exactly? Except via handwaving? I refer to the "brain as prediction machine" theory, which is the dominant one atm.
>you can even ask an LLM and it will tell you our brains work differently to it
It will just tell me platitudes based on weights derived from the millions of books and articles and such in its training data. Kind of like what a human would tell me.
>and that’s not even including the possibility that we have a soul or any other spiritual substrate.
That's good, because I wasn't including it either.
It isn’t, because humans and current LLMs have radically different architectures:
LLMs: training and inference are two separate processes; weights are modifiable during training, static/fixed/read-only at runtime.
Humans: training and inference are integrated and run together; weights are dynamic, continuously updated in response to new experiences.
You can scale current LLM architectures as far as you want; they will never compete with humans because they architecturally lack that dynamism.
Actually scaling to human level is going to require fundamentally new architectures, which some people are working on, but it isn’t clear whether any of them have succeeded yet.
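To make the frozen-versus-dynamic distinction concrete, here is a toy sketch. It uses a one-weight linear model, nothing like a real LLM or a brain, and every name in it (`sgd_step`, `train`, `old_world`, `new_world`) is hypothetical. It only illustrates the difference between weights that are read-only after training and weights that keep updating with each new observation.

```python
def sgd_step(w, x, y, lr=0.1):
    """One gradient step on squared error for the model y_hat = w * x."""
    grad = 2 * (w * x - y) * x
    return w - lr * grad

def train(w, data, epochs=50):
    """Offline training: repeated passes over a fixed dataset."""
    for _ in range(epochs):
        for x, y in data:
            w = sgd_step(w, x, y)
    return w

# Training phase: the "world" follows y = 2x.
old_world = [(x, 2 * x) for x in (0.5, 1.0, 1.5)]
w_trained = train(0.0, old_world)

# Deployment phase: the world has shifted to y = 3x.
new_world = [(x, 3 * x) for x in (0.5, 1.0, 1.5)] * 20

w_frozen = w_trained   # read-only at runtime
w_online = w_trained   # keeps learning at runtime
for x, y in new_world:
    _ = w_frozen * x                      # inference only, no update
    w_online = sgd_step(w_online, x, y)   # inference plus a weight update

print(f"frozen weight: {w_frozen:.3f}")
print(f"online weight: {w_online:.3f}")
```

The frozen weight stays at the value learned from the old data, while the online weight drifts toward the new relationship. Whether anything like this can be made to work cheaply and safely at LLM scale is exactly the open question.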
True, but we have RAG to offset that.
> it architecturally lacks their dynamism
We'll get there eventually. Keep in mind that the brain is now about 300k years into fine-tuning itself as this species classified as Homo sapiens. LLMs haven't even been around for 5 years yet.
In practice that doesn’t always work… I’ve seen cases where (a) the answer is in the RAG but the model can’t find it because it didn’t use the right search terms (embeddings and vector search reduce the incidence of that but cannot eliminate it); (b) the model decided not to use the search tool because it thought the answer was so obvious that tool use was unnecessary; (c) the model doubts, rejects, or forgets the tool call results because they contradict the weights; (d) contradictions between data in the weights and data in the RAG produce contradictory or ineloquent output; (e) the data in the RAG is overly diffuse and the tool fails to surface enough of it to produce the kind of synthesis you’d get if the same info were in the weights.
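Failure mode (a) is easy to reproduce even in a toy setting. The sketch below is a deliberately naive keyword retriever, not any real RAG library, and the documents and queries are made up for illustration. It shows the answer sitting in the store while a query phrased in different vocabulary retrieves nothing.

```python
def tokenize(text):
    """Lowercase whitespace tokenizer (toy; no stemming or synonyms)."""
    return set(text.lower().split())

def keyword_search(query, docs):
    """Return the docs that share at least one token with the query."""
    q = tokenize(query)
    return [d for d in docs if q & tokenize(d)]

# Hypothetical document store.
docs = [
    "automobile maintenance schedule: change oil every 10000 km",
    "office wifi password rotation policy",
]

# Same information need, two different phrasings.
hit = keyword_search("automobile oil", docs)
miss = keyword_search("car servicing interval", docs)

print("matching terms:   ", hit)
print("mismatched terms: ", miss)
```

The second query comes back empty even though the first document answers it. Embedding-based search narrows this gap by matching on meaning rather than exact tokens, but as noted above it does not close it.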
This is especially the case when the facts have changed radically since the model was trained, e.g. “who is the Supreme Leader of Iran?”
> We'll get there eventually. Keep in mind that the brain is now about 300k years into fine-tuning itself as this species classified as Homo sapiens. LLMs haven't even been around for 5 years yet.
We probably will eventually, but I doubt we’ll get there purely by scaling existing approaches. More likely, novel ideas nobody has even thought of yet will prove essential, and a human-level AI model will have radical architectural differences from the current generation.
AlphaZero isn’t an LLM. There are feed-forward networks, recurrent networks, convolutional networks, transformer networks, generative adversarial networks.
Brains have many different regions each with different architectures. None of them work like LLMs. Not even our language centres are structured or trained anything like LLMs.
Language came after conceptual modeling of the world around us. We're surrounded by social species with theory of mind and even the ability to recognise themselves and communicate with each other, but none of them have language. Even the communications faculties they have operate in completely different parts of their brains than ours with completely different structure. Actually we still have those parts of the brain too.
Conceptual representation and modeling came first, then language came along to communicate those concepts. LLMs are the other way around: linguistic tokens come first and they just stream out more of them.
This is why Noam Chomsky was adamant that what LLMs are actually doing in terms of architecture and function has nothing to do with language. At first I thought he must be wrong, he mustn't know how these things work, but the more I dug into it the more I realised he was right. He did know, and he was analysing this as a linguist with a deep understanding of the cognitive processes of language.
To say that brains are language models you have to ditch completely what the term language model actually means in AI research.
That's irrelevant though, since all the above are still prediction machines based on weights.
If you're ok with the brain being that, then you just changed the architecture (from LLM-like), not the concept.
An LLM is a specific neural architecture and training process. Brains are also neural networks, but beyond that they are nothing at all like LLMs and don't function architecturally the way LLMs do.
We do not have all the answers or a complete understanding of everything.
I'm not claiming that to be the case, merely pointing out that you don't appear to have a reasonable claim to the contrary.
> not even including the possibility that we have a soul or any other spiritual substrate.
If we're going to veer off into mysticism then the LLM discussion is also going to get a lot weirder. Perhaps we ought to stick to a materialist scientific approach?
If by “functionally equivalent” you mean “can produce similar linguistic outputs in some domains,” then sure, we’re already there in some narrow cases. But that’s a very thin slice of what brains do, and thus not functionally equivalent at all.
There are a few non-mystical, testable differences that matter:
- Online learning vs. frozen inference: brains update continuously from tiny amounts of data. LLMs do not.
- Grounding: human cognition is tied to perception, action, and feedback from the world. LLMs operate over symbol sequences divorced from direct experience.
- Memory: humans have persistent, multi-scale memory (episodic, procedural, etc.) that integrates over a lifetime. LLM “memory” is either weights (static) or context (ephemeral).
- Agency: brains are part of systems that generate their own goals and act on the world. LLMs optimize a fixed objective (next-token prediction) and don’t have endogenous drives.
Both have mass, both are carbon-based, both contain DNA/RNA, both are surprisingly over 50% water, both are food, and both can be tasty when served right.
From other aspects they are not.
In many cases, one or the other would do. In other cases, you want something more special (e.g. more protein, or less fat).
The person I replied to made a definite claim (that we are "very obviously not ...") for which no evidence has been presented and which I posit humanity is currently unable to definitively answer in one direction or the other.