The discretization of those tokens can be manipulated to get any result you want. If it meaningfully benefits the AI to have a more fine-grained discretization, then you can do that. AI only compresses as much as we want it to. I understand your sentiment, but the logical conclusion of what you’re saying is that no form of compression is ever valuable. That’s just not a defensible argument.

All information gets compressed. Even your own perception of reality gets compressed.

reply
Do you exist in reality? Or just in a virtual world made up of sensory signals? Do you have access to the Ding an sich any more than a (multimodal) LLM?
reply
> Do you exist in reality?

Yes.

> Or just in a virtual world made up of sensory signals?

No, definitely reality. Things affect my thought whether I sense them or not.

reply
How would you know? You have no external frame of reference; a virtual world of sensory signals would be identical from your perspective. (I agree that "reality" is the most parsimonious explanation by far, btw, but that's never been the point of the simulation thought experiment.)

I think the more interesting corollary of this article is that if we're living in a simulation, it's an impossibly, improbably detailed one. I really want some compute time on the HPC that's running it.

reply
> How would you know? You have no external frame of reference; a virtual world of sensory signals would be identical from your perspective.

Okay, let's go with that :-)

I might be living in a virtual reality, correct, I have no way of knowing.

What I do know is that the reality I am in is many thousands of times higher in resolution than the reality of the LLM.

As an analogy, the LLM is seeing a downscaled 32x32 pixel image while I see the original 8k image. Whether there is a larger 1b^2 image that I cannot see is not relevant to the question of whether the LLM can see my reality or not - it can't.
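
A toy sketch of that analogy, under stated assumptions: pure Python, a hypothetical `downscale` helper that block-averages, and 4x4 grids standing in for the 8k image. It only illustrates that downscaling is many-to-one: two different originals can collapse to the same small image, so the lost detail cannot be recovered from the small one.

```python
def downscale(image, factor):
    """Block-average a square 2D list; `factor` is assumed to divide the side length."""
    n = len(image)
    out = []
    for r in range(0, n, factor):
        row = []
        for c in range(0, n, factor):
            # Collect the factor x factor block and replace it with its mean.
            block = [image[r + i][c + j] for i in range(factor) for j in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


# Two different "originals": a checkerboard and a flat grey (illustrative data only).
checker = [[(r + c) % 2 for c in range(4)] for r in range(4)]
flat = [[0.5] * 4 for _ in range(4)]

small_checker = downscale(checker, 2)
small_flat = downscale(flat, 2)

# The originals differ, but their downscaled versions are identical.
print(checker != flat, small_checker == small_flat)
```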

reply
Things affect LLMs besides tokens, like ECC errors or cosmic rays? …
reply
Come on, now. That's irrelevant.

Reality is by definition our physical reality, which is about an infinite number of levels more detailed than the, you know, _virtual_ digital world computers exist in.

Whatever world we construct for LLMs, no matter how detailed we make it, will always be a blocky projection of the real domain onto a virtual one.

It follows then that any insight gained in the virtual world is at best a rough approximation which can be quite useful at times but also utterly faulty on occasion.

How often it is useful vs. wrong is (partially) a function of how complete the real-to-virtual approximation is for a given domain.

Certain domains, given their limited degrees of freedom, can be quite accurately modeled, such as a subway map.

But many domains cannot, and it's important to be aware of that inherent limitation in digital models, including but not limited to LLM """reasoning""".

reply
>Whatever world we construct for LLMs, no matter how detailed we make it, will always be a blocky projection of the real domain onto a virtual one.

I don't know exactly why but I never really understood this argument. Might be some kind of control thing? Because for me it's pretty simple, it's basically free to give access to reality. Just add "sensory organs" as it were. I can argue you can make them perceive reality even better than we (humans) do, just enlarge the audio/video spectrums. Bam...more reality. The whole point of the argument is we're missing information.

Again, I get the need for controlling the environment for what LLM/AI/AGI/whatever will be, but that will always cost more than giving them access to like...reality. Same reason I don't really believe in the whole simulation argument, it's just more expensive all around, loses resolution, let alone control. I don't doubt there will be some people that would indulge in neverending hedonism but not all people. You need to give up control for that.

reply
There are two reasons.

First, reality is continuous whereas the digital world is discrete.

Second, data in the real world is many orders of magnitude more detailed than what we're able to model with today's computers.

reply
> Because for me it's pretty simple, it's basically free to give access to reality. Just add "sensory organs" as it were.

I dunno what you mean by "free". The model is trained on text. To "give" the model sensory organs it would need to be trained on those sensory organs.

Current models can predict text, because that's what the weights represent. Models with sensory organs will need to be trained on the output of those sensory organs.

That sounds close to impossible in the foreseeable future.

reply
>I dunno what you mean by "free".

Reality is free. You don't have to waste any resources to model it, you just need to capture it.

>The model is trained on text.

See in my previous reply:

>LLM/AI/AGI/whatever will be

LLMs don't even have a sense of time because they work differently to a human brain.

reply
Vision and audio are already in use in multimodal LLMs. So it's possible in the past.
reply
Who said anything about vision and audio?
reply