I am trying super hard to use cheap models, and outside SOTA models, they have been more trouble than they are worth.
I really recommend trying the Qwen models - 3 coder next is really incredible. GLM 4.7 flash is also incredibly performant on modest hardware. Important things to consider is setting the temperature and top_p and top_k values etc based on what is recommended by the provider of the model - a thing as simple as that could result in a huge difference in performance.
The other big leap for me was switching to Zed editor and getting its agent stuff just seamlessly integrated. If you run LM Studio on your local machine it's super easy and even setting it up on a remote machine and calling out to LM Studio is dead simple.
Use case means everything. I doubt this model would fare well on a large codebase, but this thing is incredible.