Hacker News

Wouldn't it be possible to optimize floating point multiplication by 2.0f (or any power of two) to an addition or subtraction on the exponent?
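For reference, here's a minimal sketch of what that optimization would look like done by hand on IEEE 754 single-precision floats. The function name `mul_pow2` is made up for illustration; it only works for normal numbers where the exponent doesn't overflow or underflow:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch: multiply a float by 2^k by adding k directly to
 * the 8-bit biased exponent field (bits 23..30 of the IEEE 754
 * single-precision encoding). Only valid when x is a normal number and
 * the result stays within the normal exponent range - no handling of
 * zero, subnormals, infinity, or NaN here. */
static float mul_pow2(float x, int k) {
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);     /* type-pun without UB */
    bits += (uint32_t)k << 23;          /* bump the exponent field */
    memcpy(&x, &bits, sizeof bits);
    return x;
}
```

For example, `mul_pow2(3.0f, 1)` gives `6.0f` and `mul_pow2(6.0f, -1)` gives `3.0f`.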



Only on architectures with such instructions. x86 doesn't have anything like that (well, maybe in x87), and I don't know of any others that do.


So would moving the floating point value into an integer register, shifting, incrementing, ORing, etc. ever be faster than a full FP addition? I imagine that modern desktop CPUs have effectively-single-cycle multiply, but what about something like ARM with VFP but no Neon?
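For what it's worth, the portable way to ask the compiler for this is `ldexpf` from `<math.h>`, which scales a float by a power of two; whether that lowers to exponent arithmetic, a plain multiply, or a library call depends entirely on the target:

```c
#include <math.h>

/* Scale x by 2^1 using the standard library. On most hard-float
 * targets this compiles to a single FP multiply; on soft-float
 * targets the library may do integer exponent manipulation. */
static float twice(float x) {
    return ldexpf(x, 1);
}
```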


Only on architectures where FP and integer values are kept in the same registers. That's true of SSE2 and AltiVec, so you can do integer operations there - but since the SIMD integer operations are limited too, it's pretty much only useful for tricks like flipping the sign bit.
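Concretely, here's the sign-bit trick with SSE2 intrinsics (the helper name `negate_ps` is just for illustration): negate four packed floats with an integer XOR, never leaving the XMM register file:

```c
#include <emmintrin.h>  /* SSE2 */

/* Sketch: negate four packed floats by XORing each sign bit using an
 * integer SIMD op. The casts are reinterpretations only - no data
 * moves between register files. */
static __m128 negate_ps(__m128 x) {
    const __m128i sign = _mm_set1_epi32((int)0x80000000u);
    return _mm_castsi128_ps(
        _mm_xor_si128(_mm_castps_si128(x), sign));
}
```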

Moving values between different register sets is incredibly slow, since it typically involves at least two memory operations (a store and a load).

Faking FP operations with specialized integer code on a soft-float target (like ARM without VFP) might be worth it. I've never done it, so I can't really say.



