

Karliss on Feb 12, 2020 | on: A popular self-driving car dataset is missing labe...

Isn't training against a subset of the data and then validating against the rest a common practice? It wouldn't detect all of the mislabeling, but it should detect some of it, indicating that manual inspection is required, assuming the errors aren't very systematic.

yeldarb on Feb 12, 2020

It is, and there are some interesting techniques published recently to help mitigate things like this. But if you don't have a good ground truth you're at the very least flying blind and at worst feeding garbage in and getting garbage out; your models will learn what you tell them to learn.
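For illustration, the hold-out check Karliss describes could be sketched roughly like this. This is a minimal sketch under stated assumptions, not one of the published techniques yeldarb alludes to: the data is synthetic (two well-separated 2-D clusters with a few labels flipped at random), a toy nearest-centroid classifier stands in for a real model, and every function name here (`flag_suspect_labels`, `make_demo`, and so on) is made up for the example. Each sample is predicted by a model that never saw it during training, and samples whose given label disagrees with that prediction are flagged for manual inspection.

```python
import random


def nearest_centroid_fit(points, labels):
    """Return {label: (mean_x, mean_y)} for a list of 2-D points."""
    sums = {}
    for (x, y), lab in zip(points, labels):
        sx, sy, n = sums.get(lab, (0.0, 0.0, 0))
        sums[lab] = (sx + x, sy + y, n + 1)
    return {lab: (sx / n, sy / n) for lab, (sx, sy, n) in sums.items()}


def nearest_centroid_predict(centroids, point):
    """Predict the label whose centroid is closest to the point."""
    x, y = point
    return min(
        centroids,
        key=lambda lab: (x - centroids[lab][0]) ** 2 + (y - centroids[lab][1]) ** 2,
    )


def flag_suspect_labels(points, labels, k=5, seed=0):
    """Return sorted indices whose label disagrees with a model trained on the other folds."""
    indices = list(range(len(points)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k disjoint validation subsets
    suspects = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in indices if i not in held]
        centroids = nearest_centroid_fit(
            [points[i] for i in train], [labels[i] for i in train]
        )
        for i in held_out:
            if nearest_centroid_predict(centroids, points[i]) != labels[i]:
                suspects.append(i)
    return sorted(suspects)


def make_demo(seed=1, n_per_class=100, n_flipped=10):
    """Synthetic data (an assumption for this sketch): two clusters, a few random label flips."""
    rng = random.Random(seed)
    points, labels = [], []
    for lab, (cx, cy) in enumerate([(0.0, 0.0), (10.0, 10.0)]):
        for _ in range(n_per_class):
            points.append((rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)))
            labels.append(lab)
    flipped = sorted(rng.sample(range(len(points)), n_flipped))
    for i in flipped:
        labels[i] = 1 - labels[i]  # simulate random, non-systematic mislabeling
    return points, labels, flipped


points, noisy_labels, flipped = make_demo()
suspects = flag_suspect_labels(points, noisy_labels)
print("flagged for manual inspection:", suspects)
print("actually mislabeled:          ", flipped)
```

On this easy toy data the flagged indices line up with the flipped ones, because the errors are rare and random. The caveat from both comments still applies: if the mislabeling is systematic (for example, two classes consistently swapped), the model learns the wrong labels and the disagreement check flags nothing.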
