I don't think that many people have built apps against these models.
I mean, I use a heavily quantized version of qwen3 for image classification, caption generation, prompt expansion for image generation, instruction-driven edits, and so on. You can go a long way when you don't need a lot.
A model that can do tool calls - any tool calls at all - can look reasonably cool once you put it in a harness where there's enough immediate context to take action. It's easy to get carried away just because anything is happening at all. But golly gosh it's a long way short of the intelligence available in the bigger models.
And the lighter you make your harness, giving the model freer rein and more autonomy, the bigger the jump in capability, and the bigger the jump in failure modes when the model is dumb.
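To make the harness trade-off concrete, here's a rough sketch of what I mean, not anyone's actual framework. The tool names (`caption_image`, `delete_file`), the JSON shape, and the `allowed` parameter are all made up for illustration. With `allowed=None` the model can call anything registered (light harness); with a set of names it can only do what the current step permits (tight harness).

```python
import json
from typing import Callable, Dict, Optional, Set

# Hypothetical tool registry; names and behavior are illustrative only.
TOOLS: Dict[str, Callable[..., str]] = {
    "caption_image": lambda path: f"caption for {path}",
    "delete_file": lambda path: f"deleted {path}",
}


def run_tool_call(raw: str, allowed: Optional[Set[str]] = None) -> str:
    """Run one model-proposed tool call given as a JSON string.

    allowed=None  -> light harness: any registered tool may run.
    allowed={...} -> tight harness: anything outside the set is refused.
    """
    try:
        call = json.loads(raw)
    except json.JSONDecodeError:
        return "error: output was not valid JSON"
    if not isinstance(call, dict):
        return "error: expected a JSON object"

    name = call.get("tool")
    args = call.get("args", {})
    if not isinstance(name, str) or name not in TOOLS:
        return f"error: unknown tool {name!r}"
    if allowed is not None and name not in allowed:
        return f"refused: {name!r} is not permitted at this step"
    if not isinstance(args, dict):
        return "error: args must be an object"

    try:
        return TOOLS[name](**args)
    except TypeError as exc:
        # Wrong or missing argument names, a common small-model failure.
        return f"error: bad arguments ({exc})"


if __name__ == "__main__":
    call = '{"tool": "delete_file", "args": {"path": "a.png"}}'
    print(run_tool_call(call))                             # light harness
    print(run_tool_call(call, allowed={"caption_image"}))  # tight harness
```

Same model output, two different outcomes: the light harness happily deletes the file, the tight one refuses. That's the capability and the failure mode arriving together.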