
> The model definitely remembers previous exchanges within the same conversation.

No, it doesn't. They get added to its context, and it reads them afresh when answering the next question. That's not remembering.

If your short-term memory completely malfunctioned one day, so you had no ability to remember what was said to you a minute ago, then you would have to find workarounds. For example, you could write down everything someone says to you, then read your notes of the previous exchanges in that conversation in order to continue the conversation. That would be a good way to work around the fact that your short-term memory was broken. And if your notes were invisible to other people and you could read them really fast, then you could even make most people believe that you remembered what they said a minute ago. But you don't actually have a working memory; you're just writing down what they said and re-reading it while coming up with your next response.

That's exactly what LLMs do. That's not memory.

by ACCount371 hours ago


Continuous learning allows past behavior and past inputs to influence future inputs and future behavior. In humans.

Attention over KV cache allows past behavior and past inputs to influence future inputs and future behavior. In LLMs.

Until the cache runs out, that is. But even then, you could totally use any of 9000 methods of cache compression, truncation, dropping or streaming and get away with it.
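To make one of those options concrete, here is a minimal sketch of plain truncation: a sliding window that keeps only the most recent entries. The class name `SlidingWindowCache` and its `max_entries` parameter are made up for illustration, and the keys and values are stand-ins rather than real attention tensors.

```python
from collections import deque


class SlidingWindowCache:
    """Toy cache that keeps only the most recent `max_entries` items."""

    def __init__(self, max_entries: int) -> None:
        # A deque with maxlen silently drops the oldest item when full.
        self.entries = deque(maxlen=max_entries)

    def append(self, key, value) -> None:
        self.entries.append((key, value))

    def keys(self):
        return [k for k, _ in self.entries]


cache = SlidingWindowCache(max_entries=3)
for step in range(5):
    cache.append(step, step * 10)

# Steps 0 and 1 have been evicted; only the last three remain.
print(cache.keys())
```

Whatever falls out of the window can no longer influence later outputs, which is the capacity limit this comment is talking about.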

The difference between continuous learning and in-context learning seems to be in capacity, not in principle. Both are doing a similar thing, but one has more length and depth to it.

by nomel 17 minutes ago


Maybe, every night, you send the AI off to "sleep", where it uses those in-cache "memories" to influence the long-term weights [1].

[1] https://www.pnas.org/doi/10.1073/pnas.2220275120

by in-silico 15 hours ago


This is really semantics, but I wouldn't call attending to the KV cache re-reading the context.

The model takes in the context, encodes it into a "memory" (the KV cache), and accesses that memory later. That fact doesn't change just because the KV cache grows in size with the context.

I don't know what memory would look like other than an encode-retrieve loop.

Relevant: Transformers are Multi-State RNNs - https://arxiv.org/abs/2401.06104
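A rough sketch of that encode-retrieve loop, in pure Python with a single attention head. Everything here is a toy assumption: `KVCache`, `attend`, and `recompute_from_scratch` are hypothetical names, and the "projections" (key = the token vector, value = the vector doubled) stand in for learned weight matrices.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attend(query, keys, values):
    """Softmax attention of one query over the stored keys and values."""
    scores = [dot(query, k) / math.sqrt(len(query)) for k in keys]
    peak = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) / total
            for i in range(dim)]


class KVCache:
    """Encode each token once, then retrieve by attention later."""

    def __init__(self):
        self.keys = []
        self.values = []

    def encode(self, token_vec):
        # Toy projections (assumption): key = vector, value = vector * 2.
        self.keys.append(list(token_vec))
        self.values.append([2.0 * x for x in token_vec])

    def retrieve(self, query):
        return attend(query, self.keys, self.values)


def recompute_from_scratch(context, query):
    """Re-encode the whole context on every call, with no stored state."""
    keys = [list(t) for t in context]
    values = [[2.0 * x for x in t] for t in context]
    return attend(query, keys, values)


context = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
cache = KVCache()
for token in context:
    cache.encode(token)

query = [1.0, 0.0]
cached = cache.retrieve(query)
fresh = recompute_from_scratch(context, query)
print(all(abs(a - b) < 1e-12 for a, b in zip(cached, fresh)))
```

In this toy, the cached path and the full re-read produce the same output, which may be why the disagreement here comes down to what one chooses to call memory.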

by fipar 16 hours ago


Not the model, though. The model really only takes input text and produces output text. Memory within a conversation is achieved by the harness adding the conversation (or parts of it) to the input text. The LLM itself has no memory; it's the augmented system of several orchestrated LLM calls that does.
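As a minimal sketch of that split, assuming a toy stand-in for the model: `toy_model` and `ChatHarness` are hypothetical names, not any provider's API. The model function is stateless and sees only the prompt string it is handed, while the harness stores the transcript and re-sends it on every call.

```python
from typing import Callable

Message = tuple[str, str]  # (role, text)


def toy_model(prompt: str) -> str:
    """Stand-in for a stateless LLM call: output depends only on `prompt`."""
    # Hypothetical behaviour: count the user turns visible in the prompt.
    turns = sum(1 for line in prompt.splitlines() if line.startswith("user:"))
    return f"I can see {turns} user message(s)."


class ChatHarness:
    """Keeps the transcript and rebuilds the full prompt for each call."""

    def __init__(self, model: Callable[[str], str]) -> None:
        self.model = model
        self.history: list[Message] = []

    def send(self, text: str) -> str:
        self.history.append(("user", text))
        # The whole conversation so far becomes the model's input text.
        prompt = "\n".join(f"{role}: {msg}" for role, msg in self.history)
        reply = self.model(prompt)
        self.history.append(("assistant", reply))
        return reply


harness = ChatHarness(toy_model)
print(harness.send("hello"))
print(harness.send("do you remember me?"))
# Called directly, the model knows nothing about the earlier turns.
print(toy_model("user: do you remember me?"))
```

All conversational state lives in `ChatHarness.history`; drop the harness and the same question gets answered as if it were the first.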

by nomel 14 minutes ago


Wait until you hear about the hippocampus!!! [1]

I don't think physical integration within one container is relevant to system-level behavior.

[1] https://en.wikipedia.org/wiki/Neuroanatomy_of_memory

by CommieBobDole 16 hours ago


Right, but that's still external to the LLM. It's just a KV cache stored on the provider side for performance reasons, so that the client doesn't have to re-send the whole chat history with every subsequent call in the conversation.

It still generates every response using the model's pristine state with every new API call; whether the context is provided from the client or from a colocated cache server doesn't really change that.

by nprateem 16 hours ago


> The model definitely remembers previous exchanges within the same conversation.

Christ HN isn't what it used to be

by in-silico 15 hours ago


Care to elaborate?