There are tons of unexplained terms that some people (like me) might not understand. I think it's fine for an article to have a particular audience in mind and write specifically for it. In this case, the audience seems to be "Apple mobile developers who make LLM inference engines", so it's not surprising that there are terms I (and others) don't understand.