Sorry, I meant to say that it's not always possible to clean the data if it was corrupt in the first place because it was collected in a buggy manner. And having a few inexplicable outliers in a dataset can often erode confidence in the rest.
Since this is not data you collected, I understand you have to work with what you have. By the way, very interesting post, and nice job!