upvote
I've never heard anyone say we can solve long-term memory by extending context to infinity. Curious about sources for this?
reply
Your conclusion touches on this, but I think the brain analogy is stronger than the hardware/software dichotomy.

It is also my very uninformed intuition: https://news.ycombinator.com/item?id=44910353

Also interesting to think about: could a single system be generally intelligent, or is a certain bias actually a strength? Could we have billions of models, each with their own "experience"?

reply
I think both views have their merits. In my mind the hardware vs software analogy for weights vs context holds better, because in most modern computing systems the hardware is fixed and the software changes. What the system can do efficiently, in practice, is a function of the limitations/capabilities of both the hardware and the software: each sets its own capability ceiling.

The brain analogy also kind of says the same thing, but it's hard to say what stays fixed vs what changes with experience in the brain, I guess.

reply
> Let me know how you think about this.

Well, I think of every Large Language Model as if it were a spectacularly faceted diamond.

More along these lines in a recent-ish "thinking in public" attempt by yours truly, a lay programmer, to interpret what an LLM-machine might be.

Riff: LLMs are Software Diamonds

https://www.evalapply.org/posts/llms-are-diamonds/

reply
lol nice analogy. LLMs are frozen diamonds forged in compute. We need them to be malleable in production, changing with experience.
reply