My vision for what might happen: an LLM emits a "neural constraint satisfaction task" in latent space, issues a "neural tool call" to a non-LLM architecture, runs that architecture, gets a latent answer back, and attends to that answer to generate better text for problems that benefit from improved constraint satisfaction.

But that's a very hard thing to implement, and the gains are uncertain. Thus "might".
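
To make the loop concrete, here is a toy sketch in plain Python. Everything in it is an assumption made up for illustration: the names `emit_task`, `neural_solver`, and `attend`, the fixed weights in `W_TASK`, and the choice of a box-constrained projected gradient descent as the stand-in "non-LLM architecture". A real system would learn all of these, and nothing here says how well it would work.

```python
import math

# Hypothetical fixed weights; a real model would learn this projection.
W_TASK = [[1.5, 0.0, 0.0], [0.0, -2.0, 0.0], [0.0, 0.0, 0.5]]

def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vec)) for row in matrix]

def emit_task(hidden):
    """LLM side: map a hidden state to a latent task vector."""
    return matvec(W_TASK, hidden)

def neural_solver(task, steps=100, lr=0.2):
    """Stand-in for the non-LLM architecture (an assumption, not a proposal).

    Projected gradient descent on ||z - task||^2 subject to the box
    constraint -1 <= z_i <= 1.
    """
    z = [0.0] * len(task)
    for _ in range(steps):
        # Gradient step on the squared distance to the task vector.
        z = [zi - lr * 2.0 * (zi - ti) for zi, ti in zip(z, task)]
        # Projection step: clamp back into the feasible box.
        z = [max(-1.0, min(1.0, zi)) for zi in z]
    return z

def attend(query, slots):
    """Scaled dot-product attention of one query over a list of latent slots."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, s)) / scale for s in slots]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # numerically stable softmax
    total = sum(exps)
    weights = [e / total for e in exps]
    mixed = [sum(w * s[i] for w, s in zip(weights, slots))
             for i in range(len(query))]
    return mixed, weights

hidden = [1.0, 1.0, 1.0]          # pretend LLM hidden state
task = emit_task(hidden)          # the latent "tool call" payload
answer = neural_solver(task)      # latent answer that satisfies the constraint
mixed, weights = attend(hidden, [hidden, answer])  # fold the answer back in
```

The hard parts are exactly what this sketch skips: training the projection and the solver end to end, and showing that the attended answer actually improves the generated text.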
