Hacker News new | past | comments | ask | show | jobs | submit login

Thread scheduling (or waking up cores) is slow. Because of this, mutexes will look better on dumb benchmarks, as the contending threads keep going to sleep, while the single succesful owner has practically uncontended access



There are various degrees of slow, in addition to kernel being smarter about multiple cores and SMT siblings than your application.

Kernel can run your code on a cooled core, giving it higher clock, for example. Ultimately making it run faster.

Of course this won't show in a benchmark where all the threads do mostly calculation rather than contention, but that's not the typical case. That mostly shows up in compute such as multithreaded video where latency does not matter one bit.

Typically you have more of a producer/consumer pattern where consumer sleeps, and it's beneficial to run it on a cold CPU, assuming the kernel woke it up beforehand.

Source: hit some latency issues with an ancient kernel on a nastily hacked big little architecture ARM machine. It liked to overheat cores and put heavy tasks on the overheated ones for alleged power saving. (Whereas running a task quicker saves power.)




Consider applying for YC's Summer 2025 batch! Applications are open till May 13

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: