This is interesting. Is it possible to run training for 10 minutes and check the loss, then do the same again with another set of random initial weights? A bit like bitcoin mining. Then continue training from whichever set of starting weights gave the best loss.
The study referenced (2018) is:
The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks
https://arxiv.org/abs/1803.03635
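
To make the question concrete, here is a rough sketch of the "try several random starts, keep the best" loop. Everything in it is an assumption for illustration: the toy linear model, the function names (`make_data`, `train`, `search_initializations`), and the step counts, where `short_steps` stands in for the "10 minutes" budget. It is plain random-restart selection, not the pruning procedure from the paper, and because the toy problem is convex every start ends up in the same place anyway, so it only shows the control flow.

```python
import random


def make_data(n=200, seed=0):
    # Hypothetical toy dataset: y = 3x + 0.5 plus a little Gaussian noise.
    rng = random.Random(seed)
    xs = [rng.uniform(-1, 1) for _ in range(n)]
    ys = [3.0 * x + 0.5 + rng.gauss(0, 0.1) for x in xs]
    return xs, ys


def loss(params, xs, ys):
    # Mean squared error of the linear model w * x + b.
    w, b = params
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train(params, xs, ys, steps, lr=0.05):
    # Plain full-batch gradient descent for a fixed number of steps.
    w, b = params
    n = len(xs)
    for _ in range(steps):
        gw = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        gb = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * gw
        b -= lr * gb
    return (w, b)


def search_initializations(xs, ys, n_candidates=8, short_steps=20,
                           long_steps=500, seed=1):
    # Assumption: short_steps plays the role of the short trial budget.
    rng = random.Random(seed)
    trials = []
    for _ in range(n_candidates):
        init = (rng.gauss(0, 5), rng.gauss(0, 5))    # random starting weights
        warmed = train(init, xs, ys, short_steps)    # short trial run
        trials.append((loss(warmed, xs, ys), init, warmed))

    # Keep the candidate with the lowest loss after the short run...
    best_loss, best_init, best_warmed = min(trials, key=lambda t: t[0])
    # ...and spend the remaining budget only on that one.
    final = train(best_warmed, xs, ys, long_steps)
    return best_init, best_loss, final, [t[0] for t in trials]


xs, ys = make_data()
best_init, best_loss, final, trial_losses = search_initializations(xs, ys)
print("loss after short run, per candidate:", trial_losses)
print("chosen starting weights:", best_init)
print("final loss:", loss(final, xs, ys))
```

On a real network the short-run loss is a noisy signal, so the candidate that looks best after a few minutes is not guaranteed to be the best one after full training. The total cost is also `n_candidates` short runs on top of the long one.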