So in the article you write that you found 20% errors in the data, but at what point do you conclude whether it's "an error in the data" or "an error in the prediction"?
Is that done manually?
Also, do you have a strategy for finding errors where the model learned to mislabel items in order to increase its score? (E.g., red trucks are labeled as red cars in both train and test.)
There was indeed a manual review of the "potential errors" highlighted by our algorithm to determine if it was an error in the data or an error in the prediction. The 20% corresponds to the proportion of objects that were corrected during this manual review. So it's actually likely that some errors (ones our algorithm did not find) remain in our cleaned version of the dataset.
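For anyone curious what such an algorithm can look like: here is a minimal sketch (not the authors' actual method, and the function name and 0.9 threshold are just illustrative). It flags items where a model confidently disagrees with the annotated label, so a human can then decide whether the label or the prediction is wrong.

```python
import numpy as np

def flag_potential_label_errors(probs, labels, threshold=0.9):
    """Flag items where the model confidently disagrees with the given label.

    probs:  (n_samples, n_classes) predicted class probabilities
    labels: (n_samples,) integer labels as annotated in the dataset
    Returns the indices of items to send to manual review.
    """
    predicted = probs.argmax(axis=1)     # model's predicted class per item
    confidence = probs.max(axis=1)       # model's confidence in that class
    disagrees = predicted != labels      # prediction contradicts the annotation
    confident = confidence >= threshold  # and the model is fairly sure about it
    return np.where(disagrees & confident)[0]

# Toy usage: 3 samples, 2 classes
probs = np.array([[0.95, 0.05],   # confidently class 0, labeled 1 -> flagged
                  [0.55, 0.45],   # low confidence -> not flagged
                  [0.10, 0.90]])  # agrees with the label -> not flagged
labels = np.array([1, 1, 1])
print(flag_potential_label_errors(probs, labels))  # [0]
```

In practice you would get the probabilities from out-of-fold predictions (cross-validation), so the model isn't scored on items it memorized during training.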