If you could steal all the designs at TSMC, and you had exactly the process that TSMC uses, you could definitely make counterfeits. If you didn't have TSMC's specific process, you could adapt the designs (to Intel or Samsung) with serious but not epic effort. If you couldn't make the processes similar (i.e., you want to fab at SMIC), you are basically back to RTL, and can look forward to the most expensive and time-consuming part of chip design.
This is nothing like copying a comparatively simple item like a car. Copying a modern jet engine is starting to get close (single-crystal turbine blades, for instance), but even jet engines are much simpler. I mention them because the largest, best-resourced countries in the world have tried to copy them and are still trying.
Even if you had 'AI tools' guessing at component blocks, you would still need some way to evaluate the result.
And that's assuming NVDA hasn't pulled a Masatoshi Shima-style play on their designs (i.e., complex traps that require lots of analysis to determine whether they are real or fake).
I'm not sure how much of a speedup even modern tooling/workflows could provide reliably.
Even then, the elephant in the room is that China is working on its own AI accelerators/etc, so while there can be benefit from -studying- the existing designs, I don't think they want to clone them regardless.
Same with chips: efficiency, speed, etc. all depend on good design and cutting-edge process factors. If the main reason your chip isn't faster is that the distance between your L1 cache and your core is too large, then moving to a bigger node with a bigger die won't make it quicker.
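To make the distance point concrete, here's a back-of-envelope sketch (all numbers are illustrative assumptions, not measurements of any real chip): even ignoring RC wire delay, the speed of light caps how far a signal can go and return within one clock cycle, and scaling a die up in area stretches those distances.

```python
# Back-of-envelope speed-of-light bound on cross-chip signaling.
# Real on-chip wires are RC-limited and considerably slower than this;
# it's just the hard physical ceiling. Numbers are assumptions.
C_MM_PER_PS = 0.3  # speed of light in vacuum, ~0.3 mm per picosecond

def max_round_trip_mm(clock_ghz: float, velocity_fraction: float = 0.5) -> float:
    """Farthest point reachable there-and-back in one cycle, assuming
    signals propagate at some fraction of c (0.5 here, optimistic)."""
    cycle_ps = 1000.0 / clock_ghz
    return cycle_ps * C_MM_PER_PS * velocity_fraction / 2.0

# At 3 GHz the cycle is ~333 ps, so even at half of c the one-cycle
# round-trip budget is ~25 mm -- and RC-limited wires eat most of that.
print(max_round_trip_mm(3.0))  # 25.0
```

Blow the die up ~3x in area (about 1.7x in linear dimensions) and every cache-to-core path grows with it, so the bigger-node-bigger-die clone gets slower cycles or extra pipeline stages, not free parity.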
They have alternatives; for example, the Tianhe-2 supercomputer was originally built with Xeon Phi chips that have since been replaced with their own domestic alternatives.
A big limitation is getting access to fab slots. Nvidia and Apple are very aggressive about buying up capacity at TSMC and the like, and China's own domestic fabs are improving fast but are still not a real match, particularly at volume.
But there's a distinct time/value of investment equation with the current AI boom. The jury is at best still out on what that equation is for the goals of capital (it's increasingly looking like there's no moat), but if you're a national government trying to encourage local bleeding edge expertise in new fields like this it's quite a bit more clear.
With processors it's customary to use the "fan-out of 4" (FO4) metric as a measurement of critical paths. It's the notional delay of a gate driving four copies of itself, which is the typical case when moving between latches/registers. Microprocessor critical paths are usually on the scale of ~10 FO4.
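A rough sketch of how FO4 translates into a clock estimate, using a commonly cited rule of thumb (hedged: the constant varies by process and by whether you use drawn or effective gate length; the 25 nm figure below is a made-up illustration):

```python
# Rough clock estimate from FO4 pipeline depth.
# Rule of thumb (approximate, process-dependent):
#   FO4 delay ~= 360 ps * effective gate length in microns.
def fo4_delay_ps(l_eff_um: float) -> float:
    return 360.0 * l_eff_um

def clock_ghz(fo4_per_stage: float, l_eff_um: float) -> float:
    period_ps = fo4_per_stage * fo4_delay_ps(l_eff_um)
    return 1000.0 / period_ps

# e.g. 10 FO4 per stage at a notional 25 nm effective length:
# 10 * 9 ps = 90 ps per cycle, ~11 GHz before wire delay and margins.
print(clock_ghz(10, 0.025))
```

The point of the metric is that it's process-normalized: a 10-FO4 design stays a 10-FO4 design across nodes, while the absolute FO4 delay (and hence the clock) moves with the process.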
The largest chip at the moment is Cerebras's wafer-scale accelerator. There the tile is basically at the reticle limit, and they worked with TSMC to develop a method to wire across the gaps between reticles.
They can copy it. And no, the software moat is not there if someone chooses the blatant-copy route. They just can't build it at the scale they want yet.
> what if they just use 12nm and create GPUs with much bigger size but comparable performance
Physics doesn't work that way :/
you could certainly use a larger process and clone chips at an area and power penalty. but area is the main factor in yield, and talking about power is really talking about "what's the highest clock rate you can still cool".
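The area-vs-yield point can be sketched with the simple Poisson die-yield model, Y = exp(-A * D0). All inputs below are assumed, illustrative numbers (defect density and die size are not real process data):

```python
import math

def poisson_yield(area_cm2: float, d0_per_cm2: float) -> float:
    """Simple Poisson die-yield model: Y = exp(-A * D0)."""
    return math.exp(-area_cm2 * d0_per_cm2)

# Illustrative: a ~6 cm^2 GPU-class die at an assumed D0 of 0.1 /cm^2,
# vs the same design blown up ~(12/7)^2 ~= 2.9x in area on an older node.
base = poisson_yield(6.0, 0.1)                # ~0.55
scaled = poisson_yield(6.0 * (12 / 7) ** 2, 0.1)  # ~0.17
print(base, scaled)
```

So even with identical defect density, tripling the area collapses yield; and a ~17 cm^2 die would also blow past the reticle limit, forcing a multi-die design with its own interconnect penalties.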
so: a clone would be physically workable, but slow, hot, and expensive (low yield). I think issues like propagation delay would be second- or third-order (the whole point of GPUs is to be latency-tolerant, after all).