I'm running a 248B model on a paltry amount of hardware and getting plenty of good use out of it.
Sure, the most demanding tasks will demand the best models (and always will). There are still less demanding tasks for other models.
I think some people are fooling themselves that coding, of all tasks, is always going to require the biggest models ever. Again, maybe some coding tasks will, but the majority of business CRUD apps probably don't. Same goes for virtually any other type of task. The biggest models are really only useful for the most complex tasks.
I primarily use it with my own harness for coding. I'm not going to say it will compete with Opus in the most challenging domains, because it won't, but I will say that there's a reasonable likelihood that Opus is used for tasks that a model like Flash could comfortably handle at 1/100th the cost.
So far I've only seen it struggle at tasks that I myself would struggle with. Tasks that I can describe the shape of the solution for, it has a high success rate at implementing.
Useful is going to be different for everyone. I'm not working on the hardest problems, I don't need the best models.
I do not see how being experienced in engineering, or having higher studies in computer science and economics should make that view less common.
Now this might not be the most cost-effective (and may require a bit of extra power), but you only need a datacenter for training or cost optimization.
I'd agree except that Big AI has made sure that most of us can't afford the hardware (RAM, NVMe, etc) to run it.
Some will take greater risks and win (or lose); others will play it safer and slowly accumulate wins (or be obsoleted).
Never mind the threat of letting these models write code that runs your business, or operate it agentically: models trained by actors (corporate or nation-state) diametrically opposed to your interests.
Lots to take into account now, interesting time to be in business.
If a government entity bans an LLM provider due to a jailbreak concern, they can also ban an on-prem solution under the same guise. The jailbreak risk exists regardless of where it's hosted. You could defensibly argue the on-prem risk is higher, since frontier model companies can justify safety spend due to their size. It's more difficult to combat bad actors if your company is the only one using the model and you don't have economies of scale.
Private models in a low-trust society mean the government will come and seize the models. Competitive business will only be allowed through cronyism.
The better option is to opt for high trust. Yes, the G-man can rip your servers apart, but they know they'll face consequences, legal and political. Laws and regulations are the answer, not locking down into smaller fiefdoms.
And infrastructure dominance is really the big picture here. Chinese models are going to become the standard setters because they're going to be what people are using. That means more research, more tooling, and a whole ecosystem developing around them.
And that was already starting to happen even before this fiasco with Chinese models now being the most used ones globally. https://www.indiatoday.in/amp/technology/features/story/clau...
Remember that there are degrees of banning. Slower tokens, dumber models, token caps, KYC for each model consumer, hurting specific companies that are not capitulating in a deal with a Chinese company, etc.
I see absolutely no reason why the CPC would choose to kneecap themselves the way the USG just did. Keeping open access to the models means that the whole world will be using a Chinese-based AI stack going forward. Only a government run by absolute imbeciles would do what the US did.
The tricky part with banning Chinese models is that they're open. It'll be easy to ban access to service providers, but preventing people from running these models on-prem is going to be really tough. Like, are they going to go after Cursor, for example, given that their model is based on Kimi?
I very much agree it's going to be a futile endeavour in the end. It kind of reminds me of the time Microsoft tried to get Linux and open source banned when Linux started encroaching on the Windows server market. This is going to end the same way.