
I think it's interesting how much time is spent moving data around or preserving state rather than computing. With so many cycles dedicated to that, less real work is getting done than we might think.

This can be useful when building up expected performance models of a system.
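
As a rough illustration of such a model, here is a minimal roofline-style sketch in Python. The function name and all hardware numbers are made up for illustration; the only assumption is that compute and data movement overlap perfectly, so the slower of the two sets the runtime.

```python
def estimate_runtime_s(flops, bytes_moved, peak_flops_per_s, bandwidth_bytes_per_s):
    """Roofline-style lower bound: the slower of compute and data movement wins."""
    compute_s = flops / peak_flops_per_s
    movement_s = bytes_moved / bandwidth_bytes_per_s
    return max(compute_s, movement_s), compute_s, movement_s


# Hypothetical machine: 100 GFLOP/s peak, 20 GB/s memory bandwidth.
# Hypothetical kernel: element-wise add of two arrays of 10 million float64 values.
n = 10_000_000
total_s, compute_s, movement_s = estimate_runtime_s(
    flops=n,                 # one add per element
    bytes_moved=3 * 8 * n,   # two loads and one store, 8 bytes each
    peak_flops_per_s=100e9,
    bandwidth_bytes_per_s=20e9,
)
print(f"compute {compute_s * 1e3:.2f} ms, movement {movement_s * 1e3:.2f} ms")
```

With these invented numbers the kernel is bound by data movement, not arithmetic, which is the kind of gap the comment above is pointing at.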



Hardware designers are aware of this, though. Take register-to-register movs: they can be handled without actually copying the data, via register renaming.

The x86-64 ABI also switched to passing arguments in registers. And with the old stack-based style, the top of the stack will almost certainly be in cache anyway.


Plus, x86 cores execute most instructions out of order (hence "out-of-order superscalar"). So your "mov" will usually take (effectively) zero cycles, unless there is a dependency. But compilers are usually smart enough to schedule the generated code around that, so the problem is really not a problem at all!


Don't forget that reg->reg movs still take up instruction bandwidth and require decoding.

They are nowhere near as expensive as a naive analysis would suggest - but they are nowhere near a free lunch either.
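
A toy model of that trade-off, as a sketch only: assume a core whose front end decodes a fixed number of instructions per cycle and whose back end executes a fixed number per cycle, and assume movs eliminated at rename cost front-end slots but no execution slots. The function name and the widths are hypothetical, and the model ignores dependencies, caches, and everything else a real core does.

```python
import math


def cycles_per_iteration(total_instrs, eliminated_movs, frontend_width, backend_width):
    """Toy bound: an iteration takes as long as the slower of decode and execute."""
    # Every instruction, including eliminated movs, has to be fetched and decoded.
    frontend = math.ceil(total_instrs / frontend_width)
    # Eliminated movs are assumed to need no execution slot.
    backend = math.ceil((total_instrs - eliminated_movs) / backend_width)
    return max(frontend, backend)


# Hypothetical loop body: 12 "real" instructions, optionally plus 4 reg->reg movs,
# on a made-up core that decodes 4 and executes 4 instructions per cycle.
with_movs = cycles_per_iteration(16, 4, frontend_width=4, backend_width=4)
without_movs = cycles_per_iteration(12, 0, frontend_width=4, backend_width=4)
print(with_movs, without_movs)
```

In this toy setup the movs never execute, yet the loop still takes one more cycle per iteration because they consume decode bandwidth.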



