Google has now provided small models for each of the previous Gemma 4 models, e.g. "gemma-4-26B-A4B-it-assistant" for "gemma-4-26B-A4B-it".
The difference vs. Qwen is that here each small model is not some general-purpose smaller model, but a model that has been optimized specifically for this task, to predict the output of the bigger model with which it is paired.
This specialization and optimization of the Google "gemma-4-*-assistant" models ensures that they are much smaller and thus much faster than general-purpose small models.
Interesting, must try tomorrow.