I'd argue that the context window is analogous to short-term memory. It's functional but limited in duration, and if you overload it, it starts to fail.

It's the long-term memory (i.e. learned experiences feeding back and directly altering the content of the core brain, or model) that is missing.

reply
The context window is so flawed that I wouldn't consider it memory.

It feels more like notes about the situation than something held in memory. Memory has more "attention". I think "it starts to fail" is load-bearing here.

I feel like memory has like 5 parts, and LLMs are missing 2 of them:

working: what you are currently holding in "RAM"

short term: what is immediately happening, without it being in "RAM". I distinguish this from working memory along the lines of Thinking, Fast and Slow. Keeping things in working memory is work! Short term memory, by contrast, you can just vibe through. I had excellent short term memory while I was messed up; I could keep time well. I think LLMs can do this with notes.

mid term: vague awareness of things like what day of the week it is or what you did 2 hours ago. This is where my memory personally failed.

long term: memory of experiences. You can fake this with memory.md

generalized wisdom: pattern matching over long term memories

LLMs seem to be missing the part I was missing. I'm probably projecting and anthropomorphizing. But I relate: I would confabulate a ton and didn't know anything was wrong for a while, though things seemed off.

Context is like working memory, but not short term or mid term memory. I think you can approximate short term with a big enough context.
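
To make that split concrete, here's a toy sketch in Python. It's purely illustrative and not how any real LLM runtime works: `ToyAgentMemory`, `observe`, `remember`, and `recall_notes` are names I made up, the bounded deque stands in for the context window, and the file stands in for memory.md.

```python
from collections import deque
from pathlib import Path
import tempfile


class ToyAgentMemory:
    """Hypothetical illustration only; not a real library or agent API."""

    def __init__(self, notes_path, window_size=4):
        # "Working memory": a bounded window. Once it is full,
        # the oldest entries silently fall out.
        self.context = deque(maxlen=window_size)
        # Faked "long term memory": a notes file that has to be
        # written and re-read explicitly.
        self.notes_path = Path(notes_path)

    def observe(self, message):
        """Add a message to the bounded context window."""
        self.context.append(message)

    def remember(self, fact):
        """Append a fact to the notes file as a bullet line."""
        with self.notes_path.open("a", encoding="utf-8") as f:
            f.write(f"- {fact}\n")

    def recall_notes(self):
        """Read back every bullet line from the notes file."""
        if not self.notes_path.exists():
            return []
        text = self.notes_path.read_text(encoding="utf-8")
        return [line[2:].strip() for line in text.splitlines()
                if line.startswith("- ")]


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        mem = ToyAgentMemory(Path(tmp) / "memory.md", window_size=3)
        for i in range(5):
            mem.observe(f"message {i}")  # messages 0 and 1 get evicted
        mem.remember("user prefers short answers")
        return list(mem.context), mem.recall_notes()


if __name__ == "__main__":
    context, notes = demo()
    print("context window:", context)
    print("notes file:", notes)
```

In this sketch nothing sits between the window and the file: whatever falls out of the deque is gone unless someone explicitly wrote it down. That gap is roughly what I mean by the missing mid term layer.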

My categories are purely anthropomorphic, but I just wanted to say where I disagree.

reply
It’s nonzero, because they carry state while performing inference, and in the surrounding processes like chain-of-thought and mixture-of-experts.
reply
I think they have working memory but not short term memory. I suppose that's pedantic or anthropomorphizing, but it feels like how I felt, tbh.