Nowadays I also feel model performance matters less than the design of the tool harness, inference speed, and the other systems that surround a typical coding model.
I thought so myself, but after burning a lot of money on OpenRouter in a few days I just subscribed to Z.ai's Coding Pro plan and using the subscription is much, much friendlier with my wallet.
And? They aren't as good as SOTA models. Even the SOTA model provider's small models aren't worth using for many of my coding tasks.