Mapping the physical world onto a computer representation introduces idiosyncratic measurement issues for every data point. The bias, error, and non-repeatability change dynamically at every point in space and time, so they can be modeled neither globally nor statically. Some of the bias is also coupled across space and time.
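
As a rough illustration of the kind of effect described here, the toy sketch below wraps a deterministic field in a measurement layer whose bias drifts over time and leaks between neighboring sensors, with a noise scale that depends on position and time. Everything in it is an assumption made up for illustration: the `simulated_truth` field, the `DriftingSensorGrid` class, and the `drift` and `coupling` parameters. Real measurement pathologies have far more structure than this.

```python
import math
import random


def simulated_truth(x, t):
    # Deterministic stand-in for a simulated field: same inputs, same output.
    return math.sin(x) * math.cos(t)


class DriftingSensorGrid:
    """Hypothetical measurement model with per-sensor bias that drifts in
    time and is coupled to neighboring sensors on a ring."""

    def __init__(self, n_sensors, seed=0, drift=0.05, coupling=0.3):
        self.rng = random.Random(seed)
        self.bias = [self.rng.gauss(0.0, 0.1) for _ in range(n_sensors)]
        self.drift = drift        # assumed random-walk step size per tick
        self.coupling = coupling  # assumed fraction of bias shared with neighbors

    def step(self):
        # Random-walk each bias, then blend it with the mean of its two
        # neighbors so the bias is correlated across space as well as time.
        n = len(self.bias)
        walked = [b + self.rng.gauss(0.0, self.drift) for b in self.bias]
        self.bias = [
            (1 - self.coupling) * walked[i]
            + self.coupling * 0.5 * (walked[(i - 1) % n] + walked[(i + 1) % n])
            for i in range(n)
        ]

    def measure(self, i, x, t):
        # The noise scale varies with position and time, so no single
        # global sigma describes the sensor.
        sigma = 0.02 + 0.05 * abs(math.sin(3 * x + t))
        return simulated_truth(x, t) + self.bias[i] + self.rng.gauss(0.0, sigma)


grid = DriftingSensorGrid(n_sensors=8)
xs = [i * 0.5 for i in range(8)]
readings = []
for tick in range(5):
    t = tick * 0.1
    grid.step()
    readings.append([grid.measure(i, xs[i], t) for i in range(8)])

# Two reads of the same sensor at the same point do not repeat.
first = grid.measure(0, xs[0], 0.0)
second = grid.measure(0, xs[0], 0.0)
```

The simulated field returns the same value every time it is queried, while the wrapped measurement does not, and its error cannot be summarized by one static offset or one global variance.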

Reconstructing the ground truth from these measurements is a difficult, open inference problem, and the ground truth is what you really want to train on. The idiosyncratic effects induce large changes in the relationships learnable from the data model. Many measurements map to things that aren't real, and how badly that non-reality can break your inference is context-dependent. Because the samples are sparse and irregular, you have to constantly model the noise floor to make sure there is actually some signal in the synthesized "ground truth".
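
One simple way to sketch the "is there any signal above the noise floor" check is a permutation test on sparse, irregularly timed samples. This is a swapped-in textbook technique, not a method described in this comment, and the `roughness` statistic, the `exceeds_noise_floor` helper, and the thresholds are all hypothetical choices for illustration. It also assumes the noise is exchangeable across samples, which the drifting, coupled bias described above would violate.

```python
import math
import random


def roughness(values):
    # Sum of squared differences between time-adjacent samples.
    # Temporal structure makes this smaller than it would be by chance.
    return sum((b - a) ** 2 for a, b in zip(values, values[1:]))


def exceeds_noise_floor(times, values, n_shuffles=500, alpha=0.05, seed=0):
    """Permutation test: is the time-ordered series smoother than the
    same values in random order? Returns (detected, p_value)."""
    rng = random.Random(seed)
    order = sorted(range(len(times)), key=lambda i: times[i])
    ordered = [values[i] for i in order]
    observed = roughness(ordered)

    shuffled = ordered[:]
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        if roughness(shuffled) <= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    p_value = (hits + 1) / (n_shuffles + 1)
    return p_value < alpha, p_value


rng = random.Random(1)
times = sorted(rng.uniform(0.0, 10.0) for _ in range(40))  # irregular sampling
with_signal = [math.sin(t) + rng.gauss(0.0, 0.2) for t in times]
noise_only = [rng.gauss(0.0, 0.2) for _ in times]

detected, p_signal = exceeds_noise_floor(times, with_signal)
_, p_noise = exceeds_noise_floor(times, noise_only)
```

A check like this only says whether some temporal structure survives the noise. It says nothing about whether that structure is real or an artifact of the measurement process, which is the harder problem.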

In simulated physics, there are no idiosyncratic measurement issues. Every data point is deterministic, repeatable, and well-behaved. There is also much less algorithmic information, so learning is simpler. It is a trivial problem by comparison. Training physical world models on simulations skips over all the hard parts.

I've worked in HPC, including on physics models. Taking a standard physics simulation and introducing representative idiosyncratic measurement effects into it seems difficult. I don't think we've ever built a physics simulation with remotely the quantity and complexity of fine structure this would require.

Is this like some scale-independent version of Heisenberg's Uncertainty Principle?