It’s interesting to me how similar attempting to understand LLMs is to neuroscience.
“When we turn this bit off, this other thing happens… if we change these weights, the Eiffel Tower is now in Rome.”
We’re basically just probing around and trying to reverse engineer an emergent system.
To your point, this system may be quite different from model to model (human to human) although some similarities likely occur.
The comment I was responding to tried to belittle the OP’s understanding of transformers by pointing out that running an LLM at scale is much harder than the simple whiteboard diagram suggests.
My point was simply that we don’t know why they work, and all the extra optimizations aren’t the “thing” that makes it emergent.
Simply scaling the “GPT” is good enough to see it, so the OP’s awe should stand.
(On a side note, what other architectures can we scale to find similar emergent behavior?)
Adults are expected to have approximately correct world models of the physical environment, so they won’t accidentally kill themselves by falling off a cliff. Then there are the social norms adults are expected to conform to, so that everyone is kinda predictable to everyone else and adults don’t kill each other too often over food or mates. Children aren't expected to understand either.
I think they're right that kids (at least in the US) are generally treated as less capable than they are, and it ends up slightly delaying their development.
My son has been very worried about black holes lately, ever since he learned that anything that goes into one can't get out. He's pretty concerned astronauts could get stuck in one some day. So I explained to him that Hawking radiation does actually mean you can eventually get out; it just takes some time.
I didn't think it pertinent to mention spaghettification, the fact that anywhere near a black hole will be really hot, or that cosmic censorship means whatever Hawking-radiates from a black hole wouldn't be an astronaut anymore.
It was also fun to hear Hawking speak. My son wanted to know if Hawking was a robot. I said no, but he has a robot talk for him. Not quite true, but close enough.