
Neither of these bit-twiddling tricks is useful on ARM NEON, since bit operations in vector form are very limited... (plus there are pipeline stalls)

Also, multiply-adds can be fused, so if you're doing that it can be faster to just multiply by a different number instead of bit twiddling.




I'm not sure this is true. Many of the bit-twiddling hacks can be used in NEON, and it has a few unusual instructions I'm dying to play with.

I'm not sure which bit hack you're talking about that's done with a multiply or multiply-add. There's a nice use of De Bruijn sequences for computing log2 of a value with a single bit set that's very instructive. Is that what you meant?
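
For anyone who hasn't seen it, here's a rough scalar sketch of the De Bruijn trick as I understand it, not a NEON version. The function name log2_pow2 is just mine for illustration; the constant 0x077CB531 and the lookup table are the commonly published 32-bit ones. Multiplying by a power of two is a left shift of the constant, so the top 5 bits of the product are a unique 5-bit window of the De Bruijn sequence, and the table maps that window back to the bit position.

```c
#include <assert.h>
#include <stdint.h>

/* Maps each 5-bit window of the De Bruijn constant back to the shift
 * amount (i.e. the bit index) that produced it. */
static const int debruijn_pos[32] = {
    0,  1,  28, 2,  29, 14, 24, 3,  30, 22, 20, 15, 25, 17, 4,  8,
    31, 27, 13, 23, 21, 19, 16, 7,  26, 12, 18, 6,  11, 5,  10, 9
};

/* log2 of a 32-bit value that has exactly one bit set.
 * (Hypothetical helper name; the result is meaningless for other inputs,
 * so the precondition is asserted.) */
static int log2_pow2(uint32_t v)
{
    assert(v != 0 && (v & (v - 1)) == 0); /* exactly one bit set */
    /* v == 1 << k, so the multiply shifts the constant left by k;
     * the top 5 bits then identify k uniquely. */
    return debruijn_pos[(uint32_t)(v * UINT32_C(0x077CB531)) >> 27];
}
```

For an arbitrary nonzero x you'd isolate the lowest set bit first with x & -x and then call this, which gives you count-trailing-zeros. The whole thing is one multiply, one shift, and one table load, which may be the multiply-based hack meant above.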



