Each text token already represents the activation of certain neurons. There is nothing "more direct." And you cannot fully separate data and metadata if you want them to influence the output. At best you can clearly distinguish them and hope that this is enough for the model to learn to treat them differently.

by perlgeek 10 hours ago

Are there tokens reserved for tool calls? If yes, I can see the equivalence. If not, not so much.

by yorwba 9 hours ago

Yes, typically the tags used for tool calls get their own special tokens, e.g. https://huggingface.co/google/gemma-4-E4B-it/blob/main/token...
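To make the "own special tokens" point concrete, here is a minimal toy sketch (not any real model's tokenizer). The tag names `<tool_call>` and `</tool_call>`, their IDs, and the helpers `split_pieces`, `build_vocab`, `encode`, and `extract_tool_call_ids` are all assumptions made up for illustration; real models define their own tags and IDs. The idea is only that each tag maps to a single reserved ID, so the runtime can spot a tool call by ID instead of matching strings.

```python
import re

# Hypothetical reserved tokens and IDs; real models define their own.
SPECIAL_TOKENS = {"<tool_call>": 0, "</tool_call>": 1}

# Pattern that keeps each special tag as one unsplit piece.
_SPLIT = re.compile("(" + "|".join(re.escape(t) for t in SPECIAL_TOKENS) + ")")


def split_pieces(text):
    """Split text into pieces: special tags stay whole, the rest becomes words and symbols."""
    pieces = []
    for chunk in _SPLIT.split(text):
        if chunk in SPECIAL_TOKENS:
            pieces.append(chunk)
        else:
            pieces.extend(re.findall(r"\w+|[^\w\s]", chunk))
    return pieces


def build_vocab(texts):
    """Assign IDs to ordinary pieces after the reserved IDs."""
    vocab = dict(SPECIAL_TOKENS)
    for text in texts:
        for piece in split_pieces(text):
            vocab.setdefault(piece, len(vocab))
    return vocab


def encode(text, vocab):
    """Map each piece to its ID."""
    return [vocab[p] for p in split_pieces(text)]


def extract_tool_call_ids(ids):
    """Return the IDs between the reserved open and close tokens, or None if absent."""
    start, end = SPECIAL_TOKENS["<tool_call>"], SPECIAL_TOKENS["</tool_call>"]
    if start in ids and end in ids:
        i, j = ids.index(start), ids.index(end)
        if i < j:
            return ids[i + 1 : j]
    return None


text = 'Sure. <tool_call>get_weather("Paris")</tool_call>'
vocab = build_vocab([text])
ids = encode(text, vocab)
call_ids = extract_tool_call_ids(ids)
```

Here the call body is still made of ordinary tokens; only the delimiters are reserved, which is what lets the surrounding system tell "this is a tool call" apart from normal text.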

by dontlikeyoueith 7 hours ago

> LLMs are based on neural networks, so one could create an interface where activating certain neurons triggers tool calls, with other neurons encoding the inputs; another set of neurons could be triggered by the tokenized result from the tool call.

You can do this. It's just sticking a different classifier head on top of the model.

Before foundation models it was a standard Deep RL approach. It probably still is within that space (I haven't kept up on the research).

You don't hear about it here because if you do that then every use case needs a custom classifier head which needs to be trained on data for that use case. It negates the "single model you can use for lots of things" benefit of LLMs.
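A rough sketch of what "a different classifier head" means, in plain Python rather than a real framework. Everything here is made up for illustration: the `ACTIONS` list, the `tool_head` function, and all the numbers. The hidden vector stands in for the final hidden state of a pretrained model, and the head weights would normally be learned from data for this specific action set.

```python
import math

# Hypothetical action set, fixed when the head is trained.
ACTIONS = ["no_tool", "search", "calculator"]


def softmax(xs):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def tool_head(hidden, weights, bias):
    """Linear layer plus softmax: one probability per action."""
    logits = [
        sum(w * h for w, h in zip(row, hidden)) + b
        for row, b in zip(weights, bias)
    ]
    return softmax(logits)


# Stand-in for a pretrained model's final hidden state (made-up numbers).
hidden = [0.2, -1.0, 0.5, 0.3]

# Head parameters, one row per action (made up; normally trained per use case).
weights = [
    [0.1, 0.0, -0.2, 0.0],
    [1.0, -0.5, 0.3, 0.0],
    [-0.3, 0.8, 0.0, 0.1],
]
bias = [0.0, 0.0, 0.0]

probs = tool_head(hidden, weights, bias)
choice = ACTIONS[probs.index(max(probs))]
```

Note that `weights` has one row per entry in `ACTIONS`, so adding or changing a tool changes the head's shape and it has to be retrained. That is the per-use-case cost described above.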

by zbentley 7 hours ago

I'm a novice in this area, but my understanding is that LLM parameters ("neurons", roughly?) don't each map to a specific "if activated, then do X" outcome. When processed, they jointly encode a probability for token selection/generation, and that mapping is much more complex and many-to-one than "parameter A is used in layer B, therefore suggest token C". Given that, how would this work?
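For what it's worth, the many-to-one picture can be sketched with a toy example. This is a simplification under stated assumptions: the three-token `VOCAB`, the `<tool_call>` tag name, the `next_token_probs` helper, and all numbers are made up. It only shows the last step, where every token's score is a weighted sum over all hidden activations before a softmax.

```python
import math

# Toy vocabulary; the tag name is hypothetical.
VOCAB = ["the", "cat", "<tool_call>"]


def next_token_probs(hidden, unembed):
    """Each logit is a weighted sum over every hidden activation, followed by softmax."""
    logits = [sum(w * h for w, h in zip(row, hidden)) for row in unembed]
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up activations and parameters (one row of parameters per token).
hidden = [1.0, -0.5, 2.0]
unembed = [
    [0.5, 0.1, -0.2],
    [0.0, 1.0, 0.3],
    [0.2, -0.4, 0.6],
]

probs = next_token_probs(hidden, unembed)

# Nudging a single activation shifts the probability of every token, not just one.
nudged = next_token_probs([1.0, -0.5, 2.5], unembed)
```

In this toy setup no single activation "is" the tool call: the `<tool_call>` token just ends up with the highest probability because of how all the activations combine.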