I'm extremely left leaning myself but I'd rather not be able to tell who won the last election cycle by looking at HN and seeing whether comments containing phrases like this for the president are upvoted or [dead]. The only thing it aids is convincing people the guidelines are for selective application. Everyone who doesn't like ${currentPresident} will be unchanged and those who do aren't going to be convinced by constant casual name calling across the site - probably the opposite.
I usually also expect to get called out as only saying this when ${currentParty} is in power or when it only benefits ${awfulThingsAboutCurrentParty}, regardless of which party that is and what those things are at the moment. I've started including this note and the searchable token "reallynotpartyrelated" when commenting such things for later reference - this paragraph can otherwise be ignored :).
First the US blocked China from buying NVIDIA's H100, but allowed NVIDIA to sell them a China-special nerfed H100, the H800.
Then the US blocked the H800.
Then the US realized that China was indeed accelerating its independence from the US, so it did a U-turn and has now approved the H200 (more powerful than both the H100 and H800) for sale to China, on a case-by-case basis.
However - and here is the real kicker - China itself is now blocking H200 purchases, since it wants the acceleration towards Chinese homegrown solutions to continue. So now we have Chinese models being served on Huawei Ascend chips, with next-generation Ascend 750 chips (using CXMT-made memory), aimed at training, currently in testing.
Now we have Apple asking the US government for permission to buy memory from CXMT given the global shortage!
Of course they have ways around this -- you can get black market GPUs and also API costs are SUPER cheap there -- they hack the subscription model, bundle a bunch of user accounts, and route API requests through them.
And yes, they are closing in on parity with US technology and will get there in a few years. They have decent chips, but still not at NVIDIA's quality.
It's really a very complex situation
Without access to ASML EUV machines, the Chinese will be stuck on older, less dense chip manufacturing nodes, but in terms of building a cluster this is just a cost/efficiency issue - it means you need more chips, more electricity!
DeepSeek and Kimi are writing paper after paper with substantial architecture improvements for efficiency, because they can't just throw more hardware at the problem.
And China is now doing something on the hardware axis, which it might never have explored were it not for the sanctions.
Gaining parity on the semiconductor fab front has been official government policy as part of their Five Year Plan for at least the last decade, straight from the Politburo. They were always going to go down this path, and with AI playing front and center on their upcoming plan, there’s even more pressure.
There was never a possibility of them not exploring it.
They have always been able to do this, but this time they did have the option to pass.
They’ve also invested in AI separately (before LLMs) in that time period but I’m less familiar with that sector.
The tradeoff is worth it. They're even publishing papers, which blows me away: their efficiency gains quickly become incorporated into frontier models because they are open sourcing them. They would be aggressively pursuing the same chip pipeline strategy as they are today.
The US is lagging in efficiency work because the ROI is better elsewhere for us. We have the same tier of talent; once the script flips, so can the research.
Let's face it - all bans were dumb. They just gave China the legal (per WTO rules) justification to start producing everything domestically. The bans work as a reverse tariff, as a protectionist measure that actually protects your competitor. If China did those, others could bring China to court at the WTO. But the US did that, so nobody can sue China.
There's enough money and scale on the line that software affinity like CUDA is no longer the deciding factor and there's margin for custom stacks.
Even more so after the US GPU export ban, which is proving to have backfired by speeding up China's tech growth.
If Western AI gets too advanced, it can take over the world. So better to go to war now, on a level playing field, than later, when you need to fight against an SGI.
Instead, the US banned China from chips and lithography machines, giving China the legal excuse to start producing them domestically without violating WTO rules. Now China produces cheap chips and uses them with cheap electricity.
This was a dumb move by the US. Brought upon it by dumbf*ck aristocratic elites who grew up in isolated mansions and then received law degrees, with absolutely no understanding of technology and technology ecosystems. They thought they'd just make the rules and everybody would have to obey. It turns out in technology, they don't have to...
Nowhere near the value of having access to chips, at any cost. They have extremely deep pockets. They already pay 6x the cost per FLOP.
> Instead, the US banned China from chips and lithography machines, giving China the legal excuse to start producing them domestically without violating WTO rules. Now China produces cheap chips and uses them with cheap electricity.
You think without export restrictions China wouldn't be doing the exact same thing? China needs absolutely zero legal excuse. I mean sure they have compute available on grey market / domestically but at 6x the cost per FLOP. Access to NVIDIA chips would make it dramatically cheaper for them. Yes you get chip income but that is not even close to what you lose. The strategy is doing what it was always supposed to do: slow them down, bleed their resources to force them to spin their wheels catching up. China is doing a great job with this but they are fundamentally constrained by these export controls.
You are right that this greases the wheels, they are further along than they would have been without export restrictions, but they are still delayed even with the reduced friction. The alternative is that they move slightly slower _while having the same compute infrastructure available_ and at dramatically lower energy costs. That is a far worse position for the US to be in.
> This was a dumb move by the US. Brought upon it by dumbf-ck aristocratic elites who grew up in isolated mansions and then received law degrees, with absolutely no understanding of technology and technology ecosystems. They thought they'd just make the rules and everybody would have to obey. It turns out in technology, they don't have to...
I think this is too cynical. Neither one of us is in the room to actually observe the real decision making, but export restrictions as a strategy are not some "dumbf-ck aristocratic elite" thing. They are perfectly rational from a strategic standpoint and arguably doing what they're supposed to do.
The H200 was released Nov 2024.
Even allowing for Jensen exaggerating the risk there is no way China is 7-10 years behind.
Looking at manufacturing process nodes, SMIC N+3 is a 5nm process. 5nm was introduced by Samsung and TSMC in 2020, so that is at most 6 years.
But the chips they can produce on it are "roughly level with Android flagships from three years ago"[2]
TL;DR: China is more like 2-4 years behind than 7-10 years. If China developed EUV lithography then all bets are off.
[1] https://www.reddit.com/r/LocalLLaMA/comments/1kxw6b9/nvidia_... - see video.
[2] https://www.tomshardware.com/tech-industry/semiconductors/se...
The only difference between using a slower chip such as H100 (or Huawei's Ascend 750) vs NVIDIA's newer Blackwell chips (B200 etc) is that you need more of the slower chips to achieve the same total FLOPs in your cluster. It has zero effect on what models you can run on it.
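To put the "you just need more of them" argument into rough numbers, here is a minimal back-of-the-envelope sketch. The throughput and power figures are made-up placeholders in arbitrary units, not real specs for any NVIDIA or Huawei part, and it deliberately ignores interconnect, memory capacity and utilization, which would change the real answer.

```python
import math


def chips_needed(target_flops: float, per_chip_flops: float) -> int:
    """Chips required to reach a target aggregate throughput (ideal scaling assumed)."""
    if target_flops <= 0 or per_chip_flops <= 0:
        raise ValueError("throughput values must be positive")
    return math.ceil(target_flops / per_chip_flops)


def cluster_power_watts(n_chips: int, watts_per_chip: float) -> float:
    """Total chip power draw, ignoring cooling and networking overhead."""
    return n_chips * watts_per_chip


# Hypothetical numbers for illustration only (arbitrary throughput units).
TARGET = 1_000.0          # aggregate throughput the cluster should reach
FAST_CHIP = 2.5           # placeholder per-chip throughput of a newer chip
SLOW_CHIP = 1.0           # placeholder per-chip throughput of an older chip
WATTS_PER_CHIP = 700.0    # placeholder, assumed equal for both chips

fast_count = chips_needed(TARGET, FAST_CHIP)
slow_count = chips_needed(TARGET, SLOW_CHIP)

print(f"fast chips: {fast_count}, power: {cluster_power_watts(fast_count, WATTS_PER_CHIP):.0f} W")
print(f"slow chips: {slow_count}, power: {cluster_power_watts(slow_count, WATTS_PER_CHIP):.0f} W")
```

Under these assumptions the slower chip costs you a larger cluster and a larger power bill for the same aggregate throughput, which is the cost/efficiency trade the comment is describing.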
To quote Pat Toulme:
There’s a big misconception about how GLM 5.2 was trained. Yes, they distilled Claude and GPT 5.5 — but distillation is not how they matched Opus quality. Distillation only fixed the cold start problem in RL.
RLing an agentic coding model isn’t rocket science. In simplified terms:
1. RL needs trajectories — rollouts where the model actually completed a task in some env
2. No successful trajectory on a task = zero gradient = you can’t RL it. This is the cold start problem
3. Distillation solves it. You seed your model with knowledge from a smarter one (Claude, GPT) on tasks it can’t do yet
4. Now it produces positive trajectories on those tasks
5. RL on those trajectories and hill climb agentic coding
6. At that point you no longer need to distill and can solely hill climb RL to better models
This is an interesting curve. I’d argue it’s harder to get to Opus 4.8 from scratch than to go from Opus 4.8 → Fable/Mythos tier.
GLM 5.2 is already producing positive trajectories, so they have plenty to RL on — they’ll keep climbing to Mythos quality without distilling any further. They no longer need American models.
https://x.com/PatrickToulme/status/2069211575437627743
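The cold start point in step 2 of that quote can be shown with a toy policy gradient. This is only a sketch: it assumes plain REINFORCE with a 0/1 task reward and a softmax policy over four abstract "actions", which is my simplification and not a description of how GLM or anyone else actually trains. The function names are made up for the example.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_gradient(logits, rollouts):
    """REINFORCE estimate: mean of reward * d(log pi(action)) / d(logits).

    rollouts is a list of (action_index, reward) pairs. For a softmax policy,
    d(log pi(a)) / d(logit_k) = 1[k == a] - pi(k).
    """
    probs = softmax(logits)
    grad = [0.0] * len(logits)
    for action, reward in rollouts:
        for k in range(len(logits)):
            indicator = 1.0 if k == action else 0.0
            grad[k] += reward * (indicator - probs[k])
    return [g / len(rollouts) for g in grad]


logits = [0.0, 0.0, 0.0, 0.0]  # uniform policy over four hypothetical actions

# Cold start: every rollout failed, so every reward is 0.
cold_rollouts = [(0, 0.0), (1, 0.0), (3, 0.0)]
# After seeding: one rollout (action 2) completed the task.
seeded_rollouts = [(0, 0.0), (2, 1.0), (3, 0.0)]

cold_grad = reinforce_gradient(logits, cold_rollouts)
seeded_grad = reinforce_gradient(logits, seeded_rollouts)

print("cold:  ", cold_grad)
print("seeded:", seeded_grad)
```

With no successful rollout the gradient is zero in every coordinate, so there is nothing to climb. A single success is enough to push probability toward the action that worked, which is the role the quote assigns to distillation.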
Not exactly sure what the finish line in "the race to superintelligence" looks like, and it's even less clear why you think being there first is a critical benefit.