Contrary to the popular opinion here, there are other services beyond Claude Code. These usage limits might even prompt (har har) people to notice that Gemini is cheaper and often better.
Fixed costs, exact model pinning, outage resistant, enshittification resistant, better security, better privacy, etc...
There are just so many compelling reasons to be on-prem instead of dependent on a 3rd party hoovering up all your data and prompts and selling you overpriced tokens (which eventually they MUST be, because these companies have to make a profit at some point).
If the only counterbalance is "well the api is cheaper than buying my own hardware"...
That's a short term problem. Hardware costs are going to drop over time, and capabilities are going to continue improving. It's already pretty insane how good of a model I can run on two old RTX-3090s locally.
Is it as good as modern claude? No. Is it as good as claude was 18 months ago? Yes.
Give it a decade to see companies really push into the "diminishing returns" of scaling and new models... combined with new hardware built with these workloads in mind... and I think on-prem is the pretty clear winner.
1/ https://github.com/google-gemini/gemini-cli/issues?q=is%3Ais...
It might be acceptable for some general tasks, but I haven’t EVER seen it perform well on non trivial programming tasks.
Has that BS stopped?
Oh well.
It's possible some people offload too much to LLMs but personally, my brain is still doing a lot of work even when I'm "vibecoding".
“Can you give me an example of how to read a video file using the Win32 API like it’s 2004?” - me trying to diagnose a windows game crashing under wine
On the other hand, there's people that generate tokens to feed into a token generator that generates tokens which feeds its tokens to two other token generators which both use the tokens to generate two different categories of tokens for different tasks so that their tokens can be used by a "manager" token generator which generates tokens to...
And so on. It's all so absurd.
"Thinking is the hardest work there is, which is why so few people do it" — attrib Henry Ford
Now we have tools that can appear to automate your thinking for you. (They don't really think, but they do appear to, so...)
There's many things to worry about but which LLM provider you choose doesn't really lock you in right now.
Note the word "any." Like cloud services there will be unique aspects of a tool, but just like cloud svc there is a shared basic value proposition allows for migration from one to another and competition among them. If Gemini or OpenAI or Ollama running locally becomes a better choice, I'll switch without a care.
Subscription sprawl is likely the more pressing issue (just remembered I should stop my GH CoPilot subscription since switching to Claude).