What models and quantizations have you been trying? I've had great success with the larger Qwen 3.x models at 6-bit levels. Using 6 bit quantization is really the bare minimum to give local models a fair shot at agentic flows. Once you start pushing below that the models become more "dumb" from the limited bit space.
Cloud has tremendous value for speed, plug and play, and performance. You need to decide how those compete with the benefits of local - both today, and a year from now, e.g.