upvote
> In our construction, each instruction maps to only a handful of tokens (at most 5).

I don't see how this could work as an LLM given that, but the article is missing a huge amount of other crucial details too.

reply