It's a necessary assumption for the universal approximation property; if you assume some structure, then your LLM can no longer solve problems that don't fit that structure as effectively.
Neural nets are structured as matrix multiplication, yet they are universal approximators.
You're missing the non-linear activations.
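To make that concrete, here is a minimal sketch (toy weights picked by hand, helper names are my own, not from any library): two stacked linear layers collapse into a single matrix, while putting a ReLU between the same two layers computes |x|, which no purely linear map can do.

```python
def matvec(W, x):
    # Multiply matrix W (list of rows) by vector x.
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def matmul(A, B):
    # Plain matrix product A @ B for lists of rows.
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def relu(v):
    # Elementwise non-linearity.
    return [max(0.0, t) for t in v]

# Hand-picked toy weights (an assumption for illustration).
W1 = [[1.0], [-1.0]]  # 1 input -> 2 hidden units
W2 = [[1.0, 1.0]]     # 2 hidden units -> 1 output

def linear_net(x):
    # Two layers, no activation: equivalent to (W2 @ W1) x.
    return matvec(W2, matvec(W1, x))

def relu_net(x):
    # Same weights with a ReLU in between: relu(x) + relu(-x) = |x|.
    return matvec(W2, relu(matvec(W1, x)))

# The two linear layers collapse into one matrix.
collapsed = matmul(W2, W1)

for x in ([3.0], [-3.0]):
    print(x, linear_net(x), matvec(collapsed, x), relu_net(x))
```

Without the activation, depth buys nothing: the stack is still one matrix multiplication. The universal approximation results depend on the non-linearity.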
But language does have structure, as do logic and reasoning. Universal approximation is great when you don't know the structure and want to use brute-force search to find an approximate solution. That's not optimal by any stretch of the imagination, though.