I think local AI will win in its niche by repurposing users' existing hardware, especially as cloud hardware itself gets increasingly bottlenecked in all sorts of ways and the price of cloud tokens rises. You don't have to care about "bad" performance when you've got dedicated hardware that runs your workloads 24/7. Time-critical work that also requires the latest and greatest model can stay in the cloud, but a vast amount of AI work just isn't that critical.
There will never be a monthly subscription for LLM tokens. The economics aren't there.
Local tokens will always be cheaper.
Well, your thinking is completely vibes-based and not grounded in any reality I exist in.
They're not smarter, they just know more stuff.
You probably don't need knowledge about Pokemon or the Diamond Sutra in your enterprise coding LLM.
The "smarts" come from post-training, especially around tool use.