There's enough money and scale on the line that software affinity like CUDA is no longer the deciding factor and there's margin for custom stacks.
Even more so after the US GPU export ban, which is proving to have backfired by speeding up China's tech growth.
If the West's AI gets too advanced, it can take over the world. So better to go to war now on a level playing field than later, when you would need to fight against an SGI.
Instead, the US banned China from chips and lithography machines, giving China the legal excuse to start producing them domestically without violating WTO rules. Now China produces cheap chips and uses them with cheap electricity.
This was a dumb move by the US. Brought upon it by dumbf*ck aristocratic elites who grew up in isolated mansions and then received law degrees, with absolutely no understanding of technology and technology ecosystems. They thought they'd just make the rules and everybody would have to obey. It turns out in technology, they don't have to...
Nowhere near the value of having access to chips, at any cost. They have extremely deep pockets. They already pay 6x the cost per FLOP.
> Instead, the US banned China from chips and lithography machines, giving China the legal excuse to start producing them domestically without violating WTO rules. Now China produces cheap chips and uses them with cheap electricity.
You think without export restrictions China wouldn't be doing the exact same thing? China needs absolutely zero legal excuse. I mean sure they have compute available on grey market / domestically but at 6x the cost per FLOP. Access to NVIDIA chips would make it dramatically cheaper for them. Yes you get chip income but that is not even close to what you lose. The strategy is doing what it was always supposed to do: slow them down, bleed their resources to force them to spin their wheels catching up. China is doing a great job with this but they are fundamentally constrained by these export controls.
You are right that this greases the wheels, they are further along than they would have been without export restrictions, but they are still delayed even with the reduced friction. The alternative is that they move slightly slower _while having the same compute infrastructure available_ and at dramatically lower energy costs. That is a far worse position for the US to be in.
> This was a dumb move by the US. Brought upon it by dumbf-ck aristocratic elites who grew up in isolated mansions and then received law degrees, with absolutely no understanding of technology and technology ecosystems. They thought they'd just make the rules and everybody would have to obey. It turns out in technology, they don't have to...
I think this is too cynical. Neither one of us is in the room to actually observe the real decision making, but export restrictions as a strategy are not some "dumbf-ck aristocratic elite" thing. They are perfectly rational from a strategic standpoint and arguably doing what they're supposed to do.
The H200 was released Nov 2024.
Even allowing for Jensen exaggerating the risk, there is no way China is 7-10 years behind.
Looking at manufacturing process nodes, SMIC N+3 is a 5nm process. 5nm was introduced by Samsung and TSMC in 2020, so that is at most 6 years.
But the chips they can produce on it are "roughly level with Android flagships from three years ago"[2]
TL;DR: China is more like 2-4 years behind than 7-10 years. If China developed EUV lithography then all bets are off.
[1] https://www.reddit.com/r/LocalLLaMA/comments/1kxw6b9/nvidia_... - see video.
[2] https://www.tomshardware.com/tech-industry/semiconductors/se...
The only difference between using a slower chip such as H100 (or Huawei's Ascend 750) vs NVIDIA's newer Blackwell chips (B200 etc) is that you need more of the slower chips to achieve the same total FLOPs in your cluster. It has zero effect on what models you can run on it.
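To put rough numbers on that, here is a back-of-the-envelope sketch. The per-chip throughput figures are made up for illustration, not real specs for any of the chips named above, and it deliberately ignores interconnect, memory, and power, which is the simplification the comment itself makes.

```python
def chips_needed(target_total_flops: int, per_chip_flops: int) -> int:
    """Smallest number of chips whose combined throughput reaches the target."""
    # Ceiling division on integers avoids floating point rounding.
    return -(-target_total_flops // per_chip_flops)


# Hypothetical throughput numbers (FLOP/s), chosen only to show the scaling.
FAST_CHIP_FLOPS = 2 * 10**15
SLOW_CHIP_FLOPS = 5 * 10**14
TARGET_CLUSTER_FLOPS = 10**18

fast_count = chips_needed(TARGET_CLUSTER_FLOPS, FAST_CHIP_FLOPS)
slow_count = chips_needed(TARGET_CLUSTER_FLOPS, SLOW_CHIP_FLOPS)

print(f"fast chips: {fast_count}, slow chips: {slow_count}")
print(f"ratio: {slow_count / fast_count:.1f}x more slow chips")
```

With a 4x per-chip gap you need 4x the chips for the same total, so under these assumptions the cost shows up as chip count rather than as a cap on which models you can run.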
To quote Pat Toulme:
There’s a big misconception about how GLM 5.2 was trained. Yes, they distilled Claude and GPT 5.5 — but distillation is not how they matched Opus quality. Distillation only fixed the cold start problem in RL.
RLing an agentic coding model isn’t rocket science. In simplified terms:
1. RL needs trajectories — rollouts where the model actually completed a task in some env
2. No successful trajectory on a task = zero gradient = you can’t RL it. This is the cold start problem
3. Distillation solves it. You seed your model with knowledge from a smarter one (Claude, GPT) on tasks it can’t do yet
4. Now it produces positive trajectories on those tasks
5. RL on those trajectories and hill climb agentic coding
6. At that point you no longer need to distill and can solely hill climb RL to better models
This is an interesting curve. I’d argue it’s harder to get to Opus 4.8 from scratch than to go from Opus 4.8 → Fable/Mythos tier.
GLM 5.2 is already producing positive trajectories, so they have plenty to RL on — they’ll keep climbing to Mythos quality without distilling any further. They no longer need American models.
https://x.com/PatrickToulme/status/2069211575437627743
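Step 2 of the quoted list (no successful trajectory = zero gradient) is easy to see in a toy example. This is a minimal sketch assuming plain REINFORCE with a 0/1 reward and no baseline on a 4-action softmax policy; it is not how GLM or anyone else actually trains, and the rollouts are hand-written rather than sampled.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_gradient(logits, rollouts):
    """Mean of reward * d(log pi(action))/d(logits) over (action, reward) rollouts."""
    probs = softmax(logits)
    grad = [0.0] * len(logits)
    for action, reward in rollouts:
        for i in range(len(logits)):
            indicator = 1.0 if i == action else 0.0
            # Gradient of log-softmax w.r.t. logit i is indicator - probs[i].
            grad[i] += reward * (indicator - probs[i])
    return [g / len(rollouts) for g in grad]


logits = [0.0, 0.0, 0.0, 0.0]

# Cold start: the model never completes the task, so every reward is 0.
failed_rollouts = [(0, 0.0), (1, 0.0), (2, 0.0)]
cold_start_grad = reinforce_gradient(logits, failed_rollouts)

# After seeding (standing in for distillation), some rollouts succeed.
seeded_rollouts = [(0, 0.0), (3, 1.0), (3, 1.0)]
seeded_grad = reinforce_gradient(logits, seeded_rollouts)

print("cold start gradient:", cold_start_grad)
print("seeded gradient:    ", seeded_grad)
```

In the cold-start case every term is multiplied by a zero reward, so there is nothing to climb. Once a few rollouts succeed, the gradient pushes up the logit of the action that worked, which is the "hill climb" part of steps 4 to 6.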
Not exactly sure what the finish line in "the race to superintelligence" looks like, and even more so, it's unclear why you think being there first is a critical benefit.