In my experience, analytical results have very little impact on practical deep learning. Examples:

- Everyone “knows” that the convergence proof in the original Adam paper is flawed, but we still use Adam because we don’t want to redo hyperparameter search with a different optimizer that’s provably convergent but probably performs worse (see the first sketch below).

- Everyone “knows” that the Wasserstein loss for GANs comes with stronger convergence guarantees, but nobody uses it because the generated images look like crap compared to what you get from stylegan* with their default configs (see the second sketch below).

It’d be nice if ML proofs led to better performance, but that’s not often the case. I see far more progress from better data preprocessing and from bringing in knowledge from other fields like signal processing.
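
A minimal sketch of the Adam point, assuming PyTorch: AMSGrad is the variant with a corrected convergence argument (Reddi et al. 2018), and switching to it is a one-flag change, but the hyperparameters you already tuned for plain Adam don’t necessarily transfer. The toy model is a hypothetical stand-in.

    import torch

    # Hypothetical toy model, just so there are parameters to optimize.
    model = torch.nn.Linear(128, 10)

    # Stock Adam: the original convergence proof is known to be flawed,
    # but it remains the practical default.
    opt_adam = torch.optim.Adam(model.parameters(), lr=1e-3)

    # AMSGrad: same optimizer family with a corrected convergence argument.
    # A one-flag switch in PyTorch, but hyperparameters tuned for plain Adam
    # don't automatically carry over, which is the point above.
    opt_amsgrad = torch.optim.Adam(model.parameters(), lr=1e-3, amsgrad=True)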
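
And a minimal sketch of what “Wasserstein loss” refers to in the GAN point, assuming PyTorch and the gradient-penalty variant (WGAN-GP, Gulrajani et al. 2017); critic, real, and fake are hypothetical placeholders for a critic network and two image batches of shape (N, C, H, W).

    import torch

    def wgan_gp_critic_loss(critic, real, fake, gp_weight=10.0):
        # Wasserstein critic objective: minimize D(fake) - D(real), plus a
        # gradient penalty nudging the critic towards the 1-Lipschitz
        # constraint that the Wasserstein duality requires.
        real, fake = real.detach(), fake.detach()
        eps = torch.rand(real.size(0), 1, 1, 1, device=real.device)
        interp = (eps * real + (1.0 - eps) * fake).requires_grad_(True)
        d_interp = critic(interp)
        grads = torch.autograd.grad(
            outputs=d_interp, inputs=interp,
            grad_outputs=torch.ones_like(d_interp),
            create_graph=True)[0]
        penalty = ((grads.flatten(1).norm(2, dim=1) - 1.0) ** 2).mean()
        return critic(fake).mean() - critic(real).mean() + gp_weight * penalty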


