These models are dumber and slower than API SoTA models and will always be.
My time and sanity is much more expensive than insurance against any risk of sending my garbage code to companies worth hundreds of billions of dollars.
For most, it's a downgrade to use local models in multiple fronts: total cost of ownership, software maintenance, electricity bill, losing performance on the machine doing the inference, having to deal with more hallucinations/bugs/lower quality code and slower iteration speed.
Sure but you're paying per-token costs on the SoTA models that are roughly an order of magnitude higher than third-party inference on the locally available models. So when you account for per-token cost, the math skews the other way.