
MV2 is memory-limited, and the depthwise + grouped + 1x1 convs have a long launch time on GPU. Shattered kernels are fine on CPU, but not on GPU.
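To put a rough number on the memory-limited part, here's a back-of-the-envelope sketch of arithmetic intensity (FLOPs per byte moved) for a dense 3x3 conv versus the depthwise and 1x1 pieces. The layer shapes (56x56 feature map, 144 channels, 24 output channels) and the helper name `conv_arithmetic_intensity` are my own illustrative assumptions, and the model is idealized: fp32, stride 1, same padding, every tensor read or written exactly once. It says nothing about launch overhead, which is a separate cost.

```python
def conv_arithmetic_intensity(h, w, c_in, c_out, k, groups=1, bytes_per_elem=4):
    """FLOPs per byte for a stride-1, same-padded conv (idealized estimate).

    Assumes each of input, output, and weights crosses the memory bus once.
    """
    # Each output element needs (c_in / groups) * k * k multiply-accumulates.
    macs = h * w * c_out * (c_in // groups) * k * k
    flops = 2 * macs  # one multiply plus one add per MAC

    input_elems = h * w * c_in
    output_elems = h * w * c_out
    weight_elems = c_out * (c_in // groups) * k * k
    bytes_moved = bytes_per_elem * (input_elems + output_elems + weight_elems)
    return flops / bytes_moved


# Hypothetical shapes, loosely in the style of an inverted-residual block.
dense = conv_arithmetic_intensity(56, 56, 144, 144, k=3)
depthwise = conv_arithmetic_intensity(56, 56, 144, 144, k=3, groups=144)
pointwise = conv_arithmetic_intensity(56, 56, 144, 24, k=1)

for name, value in [("dense 3x3", dense), ("depthwise 3x3", depthwise), ("1x1", pointwise)]:
    print(f"{name:>14}: {value:7.2f} FLOPs/byte")
```

Under these assumptions the depthwise conv does only a couple of FLOPs per byte, roughly two orders of magnitude less than the dense 3x3, so it has far less arithmetic to hide its memory traffic behind.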

Though per your note on the scales, those are really interesting empirical results. I'll have to look into that, thanks for passing it along.


