This is essentially an open research question. ML theory is unfortunately very weak relative to where the empirics are. A relatively optimistic paper was posted here a while back, but I would take it with a grain of salt.

https://arxiv.org/abs/2604.21691

There are of course empirical results and relatively weak theoretical results like the universal approximation theorem (UAT), but I don't think those answer your question fully, especially since it seems impossible to definitively answer the questions the industry seems to be betting on, like whether there is a lower bound on LLM error rates or whether hallucination as a problem can be solved. We have a much stronger idea of what linear regression is doing than of what LLMs are doing.

by sheeshkebab 16 hours ago

Considering they work with any architecture/configuration given enough compute, just more or less efficiently, maybe it's fundamental, in the same sense as why electricity works...

by krackers 14 hours ago

See Tegmark's "Why does deep and cheap learning work so well?" (well, not so cheap anymore...)

https://www.youtube.com/watch?v=5MdSE-N0bxs is remarkably prescient given that it predates LLMs.

by soupspaces 15 hours ago

Universal approximation theorem, embeddings, self-attention, gradient descent. And empirically, scaling laws.
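For anyone who hasn't seen self-attention written out, here is a minimal sketch in plain Python. It assumes the query, key, and value projections are the identity (real models learn separate weight matrices for each), and the three toy token vectors are made up for illustration. It only shows the mechanics: each token's output is a softmax-weighted average of all the value vectors.

```python
import math

def softmax(row):
    # Subtract the max before exponentiating for numerical stability.
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(queries, keys, values):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        w = softmax(scores)
        weights.append(w)
        # Output is the weighted average of the value vectors.
        outputs.append([sum(wj * v[c] for wj, v in zip(w, values))
                        for c in range(len(values[0]))])
    return outputs, weights

# Toy "embeddings" (assumed for illustration); identity projections,
# so queries, keys, and values are all the same vectors.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(tokens, tokens, tokens)
for w in weights:
    print([round(x, 3) for x in w])
```

Each printed row is one token's attention distribution over all tokens. In a trained model, gradient descent adjusts the projections so these distributions pick out useful relations.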

by qsera 8 hours ago

Because there are patterns everywhere!

by skydhash 15 hours ago

Why does linear regression work? Why do computers work? Because it's about math and the encoding of information. If we can encode words as numbers, then why can't we encode their order as a relation? It's just that neural networks are very good at finding that relation even if it's noisy.
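To make the "finding a relation even if it's noisy" point concrete, here is a rough sketch using ordinary least squares on synthetic data. The true relation (slope 3, intercept 1), the noise level, and the seed are all made up for the example. Nothing here is specific to neural networks; it's just the simplest case of the same idea.

```python
import random

def fit_line(xs, ys):
    """Closed-form ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept

rng = random.Random(0)  # fixed seed so the run is repeatable
xs = [i / 10 for i in range(200)]
# Synthetic data: assumed relation y = 3x + 1 plus Gaussian noise.
ys = [3.0 * x + 1.0 + rng.gauss(0.0, 0.5) for x in xs]

slope, intercept = fit_line(xs, ys)
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
```

The recovered slope and intercept land close to the values used to generate the data, even though no single point lies exactly on the line.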