PRs are not like this because a single bad PR can be catastrophic for your business in a way that a single bad e-vape cannot.
I would also argue that current AI output, when software engineers sample it, regularly fails to meet the quality bar we want in our product, hence the need to review every PR and fix a substantial fraction of them.
If you can start to bound the impact of changes, and the outputs become generally acceptable unsupervised, so that all you're doing is double-checking that nothing has regressed in the factory, then the sampling approach can work.
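To put rough shape on that argument, here is a minimal sketch of the expected cost of defects that slip past a sampled review. The function name and every figure in it are hypothetical, chosen only for illustration, and it assumes a reviewed item never ships a defect while an unreviewed defective item always does.

```python
def expected_escape_cost(n_changes, defect_rate, sample_rate, cost_per_defect):
    """Expected cost of defects that slip past a sampled review.

    Assumption (not from any real data): review catches every defect in
    the items it looks at, and every unreviewed defect ships.
    """
    if not 0.0 <= sample_rate <= 1.0:
        raise ValueError("sample_rate must be between 0 and 1")
    # Defective items that were never sampled for review.
    escaped = n_changes * defect_rate * (1.0 - sample_rate)
    return escaped * cost_per_defect


# Hypothetical factory line: rare defects, small bounded cost per bad unit.
factory = expected_escape_cost(
    n_changes=10_000, defect_rate=0.01, sample_rate=0.05, cost_per_defect=20
)

# Hypothetical PR stream: frequent defects, very large cost per bad merge.
prs = expected_escape_cost(
    n_changes=200, defect_rate=0.30, sample_rate=0.05, cost_per_defect=500_000
)

print(f"factory: {factory:,.0f}  PRs: {prs:,.0f}")
```

With the same sample rate in both calls, the difference comes entirely from the defect rate and the cost per escaped defect, which are the two things that would have to come down before sampling PRs makes sense.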