
		HarHarVeryFunny on Nov 4, 2024 \| on: A 94x speed improvement demonstrated using handwri...

		The title gives the impression that handwritten assembly code is the cause of this performance improvement, when in fact it was the use of AVX-512 SIMD instructions that really made the difference. I've got to wonder how gcc's AVX-512 auto-vectorization would compare?
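		To make the comparison concrete, here is a rough sketch of the kind of plain C loop gcc's auto-vectorizer can typically turn into SIMD code. The kernel and the name `scale_add` are hypothetical and are not taken from the linked article; they only illustrate what "letting the compiler do it" would look like.

```c
#include <stddef.h>

/* Hypothetical kernel (not from the linked article): out[i] = a * x[i] + y[i].
 * The `restrict` qualifiers promise the compiler that the buffers do not
 * overlap, which removes one common obstacle to auto-vectorization. */
void scale_add(float *restrict out, const float *restrict x,
               const float *restrict y, float a, size_t n)
{
    /* Simple counted loop with no cross-iteration dependencies, so each
     * iteration can in principle be mapped onto one SIMD lane. */
    for (size_t i = 0; i < n; i++) {
        out[i] = a * x[i] + y[i];
    }
}
```

		Building with something like `gcc -O3 -mavx512f -mprefer-vector-width=512 -fopt-info-vec` should make gcc report whether the loop was vectorized. Whether that gets anywhere near the handwritten version would have to be measured on the actual code in question.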