There's still a lot of naivety about the difference between models and platforms, and it's easier for a lot of these big companies to just make a blanket statement like "nothing DeepSeek" than for their procurement teams to try to understand and negotiate with each vendor. They don't think the potential benefit outweighs the risk of somebody misinterpreting the policy or getting it wrong, so they ban it outright.
Most people who approve or buy software also simply don't understand how models are trained, or whether (and how far) a model could go to "introduce backdoors." From a business perspective, a backdoor could be a model trained to give answers that hurt Western businesses in a "strict text mode," or one trained to produce payloads in a programmatic mode that intentionally introduce software vulnerabilities.
Anyone can argue against these concerns for a variety of reasons (comparing the transparency of both sides, etc.), but today, for better or worse, many Chinese models are being banned on big software contracts, which gets back to the title of the article.
China is leading in open source frontier models, so I don't really see how the US wins this one. At some point, companies and people will start running their own models in the cloud and locally, and Chinese models will be everywhere.
That's not what anyone means when they say frontier models; don't change the definition. It's almost as bad as open weight being subsumed by open source when it comes to local models.
I've tried both Opus and GPT 5.4; they hallucinate just like the rest, at a much higher cost.
The more you use a model over time, the better you become with it. It's really hard to measure; my main metric lately has been tokens per second/time to complete a task.
At this point I have the feeling that frontier models are being optimized for benchmarks and one-shot prompts.
Because the models hosted in China are not trusted. This is 100% a part of what makes up commercialization.
Spoiler alert - they are all towards the bottom of the leaderboard. People come up with a wide variety of excuses for why they are not used despite being offered for significantly lower cost, but the answer is simply because they don't perform well enough for now.
I'd rather trust the LLM arena leaderboard, which puts it on par with Sonnet.
The ARCPrize leaderboard does have DeepSeek V3.2, which only scored 4% on ARC-AGI 2 (while the top models score over 80%). It also has Kimi and Qwen, but they didn't perform well either.
You agree they are winning though, right? China is known for not playing fair, stealing industrial secrets, etc. That reputation matters, and it's a good reason why the US is winning. Is the US perfect? No. Does the US play fair? No. Spare me the whataboutism in the comments. The bottom line is that most people think the US is a safer bet, and that's why we're winning. I personally wouldn't trust either government, but if I had to choose, I feel like I at least have a chance at secrecy and due process with the US. Obviously that is being eroded day by day, but you literally have no due process in China.
You'd be surprised how useful it can be to fine-tune it in an enterprise setting.