
The paper says: "Specifically, our experiments establish that state-of-the-art convolutional networks for image classification trained with stochastic gradient methods easily fit a random labeling of the training data"

What is a generalization of a random labeling? Does the network become psychic? Is this saying "garbage works fine by these measures"?



The network learns to map each input into an intermediate representation, and then memorizes an essentially arbitrary ("hash-like") output for each of those intermediate values. So with random labels it has to learn two things, whereas with real labels the intermediate representation itself already corresponds to the task.
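A minimal sketch of the phenomenon, using a tiny numpy MLP (the sizes, learning rate, and step count here are illustrative choices, not from the paper): with more hidden units than training points, plain gradient descent drives training accuracy on purely random labels toward 100%, even though nothing about the labels is learnable.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, h = 32, 10, 128          # 32 samples, 10 features, 128 hidden units (overparameterized)
X = rng.standard_normal((n, d))
y = rng.integers(0, 2, size=n).astype(float)   # completely random binary labels

# One-hidden-layer tanh network with a sigmoid output.
W1 = rng.standard_normal((d, h)) * 0.5
b1 = np.zeros(h)
W2 = rng.standard_normal(h) * 0.5
b2 = 0.0

def forward(X, W1, b1, W2, b2):
    H = np.tanh(X @ W1 + b1)
    p = 1.0 / (1.0 + np.exp(-(H @ W2 + b2)))
    return H, p

lr = 0.05
for step in range(4000):
    H, p = forward(X, W1, b1, W2, b2)
    g = p - y                          # gradient of summed cross-entropy w.r.t. logits
    gW2 = H.T @ g
    gb2 = g.sum()
    gH = np.outer(g, W2) * (1.0 - H**2)   # backprop through tanh
    gW1 = X.T @ gH
    gb1 = gH.sum(axis=0)
    W2 -= lr * gW2; b2 -= lr * gb2
    W1 -= lr * gW1; b1 -= lr * gb1

_, p = forward(X, W1, b1, W2, b2)
train_acc = ((p > 0.5) == (y > 0.5)).mean()
print(f"training accuracy on random labels: {train_acc:.2f}")
```

The network typically fits the noise labels nearly perfectly; any held-out accuracy would of course stay at chance, since there is no signal to generalize from.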


Random labeling can’t generalize. But in the usual case, the labels are nonrandom and the learned map generalizes.





