You'd be surprised how much bias exists in easily extractable information. Now imagine how much more gets baked in during training, where you can't easily extract it.
So this is largely a moot point. Yes, Chinese models will likely have some weird things injected into them. But so will the US models. Do I care? Not in the slightest. Models are my code monkeys, and if the code leaves my machine, I assume the IP is leaked, whether it goes to a Chinese model that clearly tells me they do use the data or to US models that pinky promise they don't.