
This is on a model designed to run fast on CPUs. It's like dropping a bowling ball on your foot and excitedly announcing, a few days later, that it's bruised.

There may well be something interesting here, but the overhyped title undercuts whatever credit I'd give the authors for the research. If you find something interesting, say what it is, and stop making vapid generalizations for the sake of clicks.

Remember, doing this only feeds the AI hype bubble. The results might be good, but we need to be realistic about them, or nobody will listen to genuine innovation in the future, because they'll have tuned it out along with all the crap marketing that came before it.

Thanks for coming to my TED Talk!



I don't think MobileNetV2 is designed to train on CPUs. According to this https://azure.microsoft.com/en-us/blog/gpus-vs-cpus-for-depl... MobileNetV2 gets bigger gains from GPUs (versus several CPUs) than ResNet does. You could argue the batch size doesn't fully utilize the V100, but these comparisons are tricky, and this looks like fairly normal training to me.
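
For anyone who wants to sanity-check this on their own hardware, here's a minimal per-step timing sketch, assuming PyTorch and torchvision are installed. The batch size, step count, and 1000-class dummy labels are arbitrary placeholders, not the paper's setup.

  import time
  import torch
  import torchvision

  def time_training_step(device, batch_size=64, steps=10):
      # Fresh MobileNetV2 plus a dummy classification batch.
      model = torchvision.models.mobilenet_v2().to(device).train()
      opt = torch.optim.SGD(model.parameters(), lr=0.01)
      loss_fn = torch.nn.CrossEntropyLoss()
      x = torch.randn(batch_size, 3, 224, 224, device=device)
      y = torch.randint(0, 1000, (batch_size,), device=device)

      # Warm-up step so lazy init / kernel selection isn't timed.
      loss_fn(model(x), y).backward()
      opt.step()
      opt.zero_grad()
      if device == "cuda":
          torch.cuda.synchronize()

      start = time.perf_counter()
      for _ in range(steps):
          opt.zero_grad()
          loss_fn(model(x), y).backward()
          opt.step()
      if device == "cuda":
          torch.cuda.synchronize()
      return (time.perf_counter() - start) / steps

  print("cpu :", time_training_step("cpu"))
  if torch.cuda.is_available():
      print("cuda:", time_training_step("cuda"))

On an M1 you'd pass "mps" instead of "cuda" (and synchronize with torch.mps.synchronize()); the shape of the comparison is the same.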

It's pretty surprising to me that an M1 performs anywhere near a V100 on model training, and I guess the most striking thing is the M1's energy efficiency.


MV2 is memory-bandwidth-limited: the depthwise, grouped, and 1x1 convs each incur kernel-launch overhead on a GPU. A model shattered into many small kernels is fine on a CPU, but not on a GPU.
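
To make the "shattered kernels" point concrete, here's a rough sketch of the expand -> depthwise -> project pattern MobileNetV2 uses (its inverted residual block), assuming PyTorch; the channel counts are illustrative, not the paper's exact configuration. Each thin layer below is a separate kernel launch on a GPU, and none does enough arithmetic per launch to amortize that overhead.

  import torch.nn as nn

  def inverted_residual(in_ch=32, expand=6):
      mid = in_ch * expand
      # Several thin ops instead of one dense 3x3 conv;
      # each op is its own (cheap) kernel launch on a GPU.
      return nn.Sequential(
          nn.Conv2d(in_ch, mid, kernel_size=1, bias=False),    # 1x1 expand
          nn.BatchNorm2d(mid),
          nn.ReLU6(inplace=True),
          nn.Conv2d(mid, mid, kernel_size=3, padding=1,
                    groups=mid, bias=False),                   # 3x3 depthwise
          nn.BatchNorm2d(mid),
          nn.ReLU6(inplace=True),
          nn.Conv2d(mid, in_ch, kernel_size=1, bias=False),    # 1x1 project
          nn.BatchNorm2d(in_ch),
      )

On a CPU these small ops pipeline cheaply; on a GPU the device can sit idle between launches, which would help explain why a V100 looks less dominant here than on ResNet-style dense convolutions.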

Though per your note on the benchmark numbers, those are really interesting empirical results. I'll have to look into that; thanks for passing it along.



