
The title gives the impression that handwritten assembly is what produced this performance improvement, when in fact it was the use of AVX-512 SIMD instructions that really made the difference.

I've got to wonder how gcc's AVX-512 auto-vectorization would compare.
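One rough way to check is to hand gcc a plain scalar loop and see whether it vectorizes it. The kernel below is a hypothetical stand-in (the name `saxpy_sketch` and the loop itself are assumptions, not code from the article), so it only shows the shape of the experiment, not the actual comparison.

```c
#include <stddef.h>

/* Hypothetical kernel, not from the article: y[i] = a * x[i] + y[i].
   `restrict` promises the compiler that x and y do not overlap,
   which removes one common obstacle to auto-vectorization. */
void saxpy_sketch(size_t n, float a, const float *restrict x, float *restrict y)
{
    for (size_t i = 0; i < n; i++) {
        y[i] = a * x[i] + y[i];
    }
}
```

Compiling with something like `gcc -O3 -march=skylake-avx512 -mprefer-vector-width=512 -fopt-info-vec -c` should report whether the loop was vectorized. Whether gcc actually picks 512-bit vectors, and how close it gets to hand-tuned code, depends on the gcc version, the target, and the real loop.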


