I wouldn't use Gemini 3 Flash or GPT 5.4 mini for anything except the most trivial work, although both are useful for basic exploratory work.
So I'm using a heavy model for the bulk of the work, and its cost so far outweighs the light model's that the light model's cost is effectively irrelevant.
If one likes a model, then it's capable of one-shotting entire apps.
Otherwise it's "only suitable for the most trivial tasks".
Never in between.
Personally, my opinion in this regard is highly consistent over time.