The paper says: "Specifically, our experiments establish that state-of-the-art convolutional networks for image classification trained with stochastic gradient methods easily fit a random labeling of the training data"
What would generalization of a random labeling even mean? The network can't become psychic, so is this just saying "garbage fits fine by these measures"?
Note that "fit" here means training accuracy, not generalization: the network memorizes the random labels rather than predicting them on new data. It learns to map each input into an intermediate representation, and then memorizes an essentially arbitrary (hash-like) output for each of those intermediate values. So it has to learn two things, whereas when the intermediate representation actually corresponds to the task, the output mapping comes for free.
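This memorization effect is easy to reproduce at toy scale. The sketch below (not from the paper; a minimal numpy illustration under assumed sizes and hyperparameters) trains a small two-layer ReLU network on inputs with purely random binary labels. There is no signal to generalize from, yet an over-parameterized net driven by plain gradient descent fits the training set:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, h = 20, 10, 100                       # few samples, wide hidden layer
X = rng.standard_normal((n, d))             # random inputs
y = rng.integers(0, 2, n).astype(float)     # random labels: zero signal

# Small two-layer ReLU network with a sigmoid output.
W1 = rng.standard_normal((d, h)) * 0.5
b1 = np.zeros(h)
W2 = rng.standard_normal(h) * 0.5
b2 = 0.0

def forward(X):
    z = np.maximum(X @ W1 + b1, 0)          # ReLU hidden activations
    p = 1 / (1 + np.exp(-(z @ W2 + b2)))    # sigmoid probability
    return z, p

lr = 0.1
for step in range(5000):
    z, p = forward(X)
    g = (p - y) / n                         # grad of mean cross-entropy wrt logits
    gW2 = z.T @ g
    gb2 = g.sum()
    gz = np.outer(g, W2) * (z > 0)          # backprop through ReLU
    gW1 = X.T @ gz
    gb1 = gz.sum(axis=0)
    W1 -= lr * gW1; b1 -= lr * gb1
    W2 -= lr * gW2; b2 -= lr * gb2

_, p = forward(X)
acc = ((p > 0.5) == y).mean()
print(f"train accuracy on random labels: {acc:.2f}")
```

Training accuracy ends up near 1.0, but test accuracy on fresh random labels would of course stay at chance — which is exactly the paper's point: fitting the training data perfectly tells you nothing about generalization.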