I don't think MobileNetV2 is designed to train on GPUs - according to this https://azure.microsoft.com/en-us/blog/gpus-vs-cpus-for-depl... MobileNetV2 gets bigger gains from GPUs vs several CPUs than ResNet. You could argue the batch size doesn't fully use the V100, but these comparisons are tricky, and this looks like fairly normal training to me.
It's pretty surprising to me that an M1 performs anywhere near a V100 on model training, and I guess the most striking thing is the M1's energy efficiency.