I’d also say it keeps the frontier shops competitive while costing R&D in the present is beneficial to them in forcing them to make a better and better product especially in value add space.
Finally, particularly for Anthropic, they are going for the more trustworthy shop. Even ali is hosting pay frontier models for service revenue, but if you’re not a Chinese shop, would you really host your production code development workload on a Chinese hosted provider? OpenAI is sketchy enough but even there I have a marginal confidence they aren’t just wholesale mining data for trade secrets - even if they are using it for model training. Anthropic I slightly trust more. Hence the premium. No one really believes at face value a Chinese hosted firm isn’t mass trolling every competitive advantage possible and handing back to the government and other cross competitive firms - even if they aren’t the historical precedent is so well established and known that everyone prices it in.
Everything they have done so far indicates this.
Running your own is the only option unless you really trust them or unless you have the option to sue them like some big companies can.
Or if you don't really care then you can use the chineese one since it is cheaper.
What makes you trust Anthropic more than Alibaba?
That's a cryptic way to say "Only for vibe-coding quality at the margin matters". Obviously, quality is determined first and foremost by the skills of the human operating the LLM.
> No one really believes at face value a Chinese hosted firm isn’t mass trolling every competitive advantage possible
That's much easier to believe than the same but applied to a huge global corp that operates in your own market and has both the power and the desire to eat your market share for breakfast, before the markets open, so "growth" can be reported the same day.
Besides, open models are hosted by many small providers in the US too, you don't have to use foreign providers per se.
2) I think there is a special case for Chinese providers due to the philosophical differences in what constitutes fair markets and the regulatory and civil legal structure outside China generally makes such things existentially dangerous to do; hence while it might happen it is extraordinarily ill advised, while in China is implicitly the way things work. However my point is Ali has their own hosted version of Qwen models operating on the frontier that are at minimum hosted exclusively before released. Theres no reason to believe they won’t at some point exclusively host some frontier or fine tuned variants for purposes for commercial reasons. This is part of why they had recent turnover.
Also, have you considered that your trust in Anthropic and distrust in China may not be shared by many outside the US? There's a reason why Huawei is the largest supplier of 5G hardware globally.
Most code is not P99, but companies pay a premium to produce code that is. That’s my point.
But yes this is a non-sequitor. The original question was "What competitive advantage does OpenAI/Anthropic has when companies like Qwen/Minimax/etc are open sourcing models that shows similar (yet below than OpenAI/Anthropic) benchmark results?"
Even if you don't trust Chinese companies, and you want a hosted model, you can always pay a third party to host a Chinese open weight model. And it'll be a lot cheaper than OpenAI.
And in world where code generation costs are trending to zero, goodluck commanding a premium to produce any kind of code.
There is a whole bunch of P99 code that is open-source. What makes code P99 is not the model that produces it, but the people who verify/validate/direct it.
For some problems, sure, and when you are stuck, throwing tokens at Opus is worthwhile.
On the other hand, a $10/month minimax 2.7 coding subscription that literally never runs out of tokens will happily perform most day-to-day coding tasks
Claude also has other models which use less tokens.
If I build a super high quality context for something I'm really good at, I can get great results. If I'm trying to learn something new and have it help me, it's very hit and miss. I can see where the frontier models would be useful for the latter, but they don't seem to make as much difference for the former, at least in my experience.
The biggest issue I have is that if I don't know a topic, my inquiries seem to poison the context. For some reason, my questions are treated like fact. I've also seen the same behavior with Claude getting information from the web. Specifically, I had it take a question about a possible workaround from a bug report and present it as a de-facto solution to my problem. I'm talking disconnect a remote site from the internet levels of wrong.
From what I've seen, I think the future value is in context engineering. I think the value is going to come from systems and tools that let experts "train" a context, which is really just a search problem IMO, and a marketplace or standard for sharing that context building knowledge.
The cynic in me thinks that things like cornering the RAM market are more about depriving everyone else than needing the resources. Whoever usurps the most high quality context from those P99 engineers is going to have a better product because they have better inputs. They don't want to let anyone catch up because the whole thing has properties similar to network effects. The "best" model, even if it's really just the best tooling and context engineering, is going to attract the best users which will improve the model.
It makes me wonder of the self reinforced learning is really just context theft.
OpenAI & Anthropic are just lying to everyone right now because if they can't raise enough money they are dead. Intelligence is a commodity, the semiconductor supply chain is not.
Slower and worse is still useful, but not as good in two important dimensions.
It’s ludicrous to believe a small parameter count model will out perform a well made high parameter count model. That’s just magical thinking. We’ve not empirically observed any flattening of the scaling laws, and there’s no reason to believe the scrappy and smart qwen team has discovered P=NP, FTL, or the magical non linear parameter count scaling model.
It's kinda like saying a car with a 6L engine will always outperform a car with a 2L engine. There are so many different engineering tradeoffs, so many different things to optimize for, so many different metrics for "performance", that while it's broadly true, it doesn't mean you'll always prefer the 6L car. Maybe you care about running costs! Maybe you'd rather own a smaller car than rent a bigger one. Maybe the 2L car is just better engineered. Maybe you work in food delivery in a dense city and what you actually need is a 50cc moped, because agility and latency are more important than performance at the margins.
And if you're the only game in town, and you only sell 6L behemoths, and some upstart comes along and starts selling nippy little 2L utility vehicles (or worse - giving them away!) you should absolutely be worried about your lunch. Note that this literally happened to the US car industry when Japanese imports started becoming popular in the 80s...
That's an interesting analogy.
The point of open source models is that you host them locally. I trust neither Chinese nor American providers with this.
If it were to happen, Chinese law does offer recourse, including to foreign firms. It's not as if China doesn't have IP law. It has actually made a major effort over the last 10+ years to set up specialized courts just to deal with IP disputes, and I think foreign firms have a fairly good track record of winning cases.
> No one really believes at face value
This says a lot more about the prejudices and stereotypes in the West about China than it does about China itself.
Meanwhile I'm over here solving real world business problems with a model that I can securely run on-prem and not pay out the nose for cloud GPU inference. And then after work I use that same model to power my personal experiments and hobby projects.
There are no Chinese labs with different financial and political motivations, there's only "China" the monolith. The last thread for Qwen's new hosted model was full of folks talking about how "China" is no longer releasing open weights models, when the next day Moonshot AI releases Kimi 2.6. A few days later and here's Qwen again with another open release.
For some reason this country gets what I assume are otherwise smart Americans to just completely shut off their brains and start repeating rhetoric.
As opposed to an US-american shop? Yup, sure, why not? It's the same ballpark.
For coding, quality is not measurable and is based entirely on feels (er, sorry, "vibes").
Employers paying for SOTA models is nothing but a lifestyle status perk for employees, like ping-pong tables or fancy lunch snacks.
Wait five years and come back. Right now AI is 100% FOMO and lifestyle signaling and nothing more.
Now there's a word I haven't heard in a long, long time.
If you want to compare to a hosted model, look toward the GLM hosted model. It’s closest to the big players right now. They were selling it at very low prices but have started raising the price recently.
For coding $200 month plan is such a good value from anthropic it’s not even worth considering anything else except for up time issues
But competition is great. I hope to see Anthropic put out a competitor in the 1/3 to 1/5 of haiku pricing range and bump haiku’s performance should be closer to sonnet level and close the gap here.
Also, they are not exactly as good when you use them in your daily flow; maybe for shallow reasoning but not for coding and more difficult stuff. Or at least I haven't found an open one as good as closed ones; I would love to, if you have some cool settings, please share
The thing is the new OpenAI/Anthropic models are noticeably better than open source. Open source is not unusable, but the frontier is definitely better and likely will remain so. With SWE time costing over $1/min, if a convo costs me $10 but saves me 10 minutes it's probably worth it. And with code, often the time saved by marginally better quality is significant.
This is the competitive advantage. Being better.