And? The entire hallucination problem with text generators is "plausible sounding yet incorrect", so how does a human eyeballing it help at all?
You can probably still use it for some kinds of evaluation as well, since you can presumably detect whether two point clouds intersect.
In much the same way, LLMs are not perfect at translation but are widely used for NMT anyway.