Sure, we should indeed expect them to do that. But look at enough data and you'll learn that those expectations are a path to never-ending frustration. I've been there: I spent >100 hours cleaning data... that never got published, because I was too damn focused on fixing decades of errors that many, many people had introduced.
To be clear, I'm not saying that we should accept messy data. It's just that reality is messy, and it's naive to think we can catch and remove all of reality's messiness -- which includes the bureaucratic slop that led to the data being published in the first place.