Full disclosure: I use OpenRouter and pay for models most of the time, since it's more practical than 5-10 tokens per second, but having the option to run it "if I had to, worst case" is good enough for me. We're also in a rapidly developing technology space where models keep getting smaller and better; every year the smaller ones improve.