Does this imply that a completely untrained model (random weights) should show intelligent behavior merely by being given enough context?
I spent the last weekend thinking about continual learning. A lot of people think we can solve long-term memory and learning in LLMs simply by extending the context length to infinity. I explore a different perspective that challenges this assumption.
Let me know what you think about this.
It is also my very uninformed intuition: https://news.ycombinator.com/item?id=44910353
Also interesting to think about: could a single system be generally intelligent, or is a certain bias actually a strength? Could we have billions of models, each with their own "experience"?
The brain theory also kind of says the same thing, but it's hard to say what stays fixed vs. what changes with experience in the brain, I guess.
Well, I think of every Large Language Model as if it were a spectacularly faceted diamond.
More along these lines in a recent-ish "thinking in public" attempt by yours truly, a lay programmer, to interpret what an LLM-machine might be.
Riff: LLMs are Software Diamonds