upvote
The embedded programs can be connected to the other weights during training, in whatever way the training process finds useful. It doesn't just have to be arithmetic calculation. You can put any hard-coded algorithm in there, make the weights for that algorithm static, and let the training process figure out how to connect the other trillion weights to it.
reply
One of the big appeals of this is it gives a mechanism for "teaching" models a geometric intuition and better spacial reasoning.

Not necessarily pure number crunching but the boundary between rote algorithms and fuzzy intuition based models that humans in particular excel at.

reply
> Why would that be desirable?

If we never try, we'll never know. I wouldn't be surprised if there is something to gain from a form of deterministic computation which is still integrated with the NN architecture. After all, tool calls have their own non-trivial overhead.

reply
Trying, sure. That's what hackers do.

I'm asking whether it's a desirable end state.

reply
[flagged]
reply