
Two things come to mind when reading this:

1. WTF::Lock is using sequential consistency for all of the slow path operations. Surely this isn't necessary and it could still use acquire/release for most of them (like it does in the fast path).

2. Ditching OS-provided mutexes has one possibly important drawback on OS X and iOS: lack of priority donation. Recent versions of OS X and iOS have a Quality of Service implementation in the kernel that classifies threads (and libdispatch queues) according to one of several QOS levels, and then handles normal thread priorities within those levels. For this reason you cannot safely implement your own spinlock, as I documented a few months ago (http://engineering.postmates.com/Spinlocks-Considered-Harmfu...). It'll appear to work under many normal workloads, but once you start using the spinlock from threads of different QOS levels at the same time, you risk hitting a priority inversion livelock where a lower-QOS thread holds the spinlock but is never scheduled, because higher-QOS threads spinning on the lock are always given priority. The OS does have a safe spinlock (which does priority donation), but it's not public API and so can't be used outside of system code. Of course, this doesn't apply to adaptive spinlocks, since they'll just end up in the slow path of parking their thread. It does still highlight the issue they have: if a high-QOS thread is parked waiting for a lock that's owned by a low-QOS thread, the high-QOS thread effectively has its priority lowered to the low QOS until the owning thread releases the lock. This is obviously not what you want. The system-provided pthread_mutex_t does priority donation specifically to fix this issue, but WTF::Lock doesn't.
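For readers unfamiliar with the idea, here is a minimal sketch of asking a POSIX mutex for priority inheritance explicitly. It uses the standard `pthread_mutexattr_setprotocol` call with `PTHREAD_PRIO_INHERIT`. This is the generic POSIX mechanism and only an illustration of the concept. It is not a description of how the Darwin QOS donation mentioned above is implemented. The helper name `initMutexPreferringInheritance` is hypothetical, and the fallback to a default mutex is an assumption made so the sketch also runs on systems that reject the protocol.

```cpp
#include <cassert>
#include <pthread.h>

// Hypothetical helper (not from WebKit or any OS SDK): initializes `mutex`,
// requesting POSIX priority inheritance where the platform accepts it.
// Returns 0 on success, or a pthread error code if no mutex could be created.
// `*inheritsPriority` reports whether the inheritance protocol was applied.
int initMutexPreferringInheritance(pthread_mutex_t* mutex, bool* inheritsPriority) {
    assert(mutex != nullptr && inheritsPriority != nullptr);
    *inheritsPriority = false;

    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0)
        return rc;

    // Ask for priority inheritance: while a low-priority thread holds the
    // mutex, it is boosted to the priority of the highest-priority waiter.
    rc = pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT);
    if (rc == 0)
        rc = pthread_mutex_init(mutex, &attr);
    pthread_mutexattr_destroy(&attr);

    if (rc == 0) {
        *inheritsPriority = true;
        return 0;
    }

    // Assumption for this sketch: if the protocol is rejected, fall back to a
    // default mutex rather than failing outright.
    return pthread_mutex_init(mutex, nullptr);
}
```

A hand-rolled lock built on a bare atomic has no owner that the scheduler knows about, which is why it cannot get this kind of boost without kernel cooperation.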



Weakening the consistency of operations on the slow path is not going to be a speed-up. Those paths are not dominated by pipeline effects inside the CPU. They are dominated by context switches and memory contention. So, SC is exactly right: it's harder to get wrong and exactly the same speed.
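To make the ordering debate concrete, here is a toy lock in which every atomic operation carries an explicit memory order. It is a sketch only: `ToyLock` and `countWithToyLock` are hypothetical names, the slow path just yields instead of parking, and none of this reflects the actual WTF::Lock code. Replacing the `std::memory_order_acquire` and `std::memory_order_release` arguments with `std::memory_order_seq_cst` gives the stronger variant being discussed. A real parking slow path touches more shared state, so the right ordering there may differ.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical toy lock for illustration; this is not WTF::Lock.
class ToyLock {
public:
    void lock() {
        bool expected = false;
        // Fast path: a single CAS with acquire ordering on success.
        if (locked_.compare_exchange_strong(expected, true,
                                            std::memory_order_acquire,
                                            std::memory_order_relaxed))
            return;
        lockSlow();
    }

    void unlock() {
        assert(locked_.load(std::memory_order_relaxed));
        // Release ordering publishes the critical section to the next owner.
        locked_.store(false, std::memory_order_release);
    }

private:
    // Slow path: retry the CAS and yield between attempts. The yield is an
    // OS call, which is the kind of cost the comment above says dominates.
    void lockSlow() {
        for (;;) {
            bool expected = false;
            if (locked_.compare_exchange_weak(expected, true,
                                              std::memory_order_acquire,
                                              std::memory_order_relaxed))
                return;
            std::this_thread::yield();
        }
    }

    std::atomic<bool> locked_{false};
};

// Increments a plain counter from several threads under the lock and
// returns the final value.
long countWithToyLock(int threadCount, int iterations) {
    ToyLock lock;
    long counter = 0;
    std::vector<std::thread> threads;
    for (int t = 0; t < threadCount; ++t) {
        threads.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                lock.lock();
                ++counter;
                lock.unlock();
            }
        });
    }
    for (auto& thread : threads)
        thread.join();
    return counter;
}
```

Acquire on lock and release on unlock are enough for mutual exclusion here, so the counter ends up at `threadCount * iterations`. Whether the weaker orders are measurably faster on a given architecture is exactly what the thread disagrees about.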

It's true that WTF::Lock doesn't support priority inheritance. That's fine for WebKit's threads.


I strongly disagree: overly strong barriers on non-TSO hardware (such as ARMv7 or ARMv8) will just cause even more of that memory contention for no good reason.

Also on a contended lock, there's a very strong possibility that you will decide that the lock can be taken right before you block.
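The "lock can be taken right before you block" case can be sketched with a spin-then-park lock that re-checks the lock word immediately before sleeping. Everything here is an assumption for illustration: `SpinThenParkLock`, the spin limit of 40, and the use of a `std::mutex` plus `std::condition_variable` as the parking mechanism are stand-ins, not how WTF::Lock or its parking lot actually work. For simplicity `unlock()` always takes the parking mutex, which a production lock would avoid.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical adaptive lock for illustration; this is not WTF::Lock.
class SpinThenParkLock {
public:
    void lock() {
        // Phase 1: spin a bounded number of times, yielding between tries.
        for (int i = 0; i < kSpinLimit; ++i) {
            if (tryLock())
                return;
            std::this_thread::yield();
        }
        // Phase 2: park. The predicate runs before the first sleep and after
        // every wakeup, so a lock released just before blocking is taken
        // without sleeping at all.
        std::unique_lock<std::mutex> guard(parkMutex_);
        ++waiters_;
        parkCondition_.wait(guard, [this] { return tryLock(); });
        --waiters_;
    }

    void unlock() {
        assert(locked_.load(std::memory_order_relaxed));
        locked_.store(false, std::memory_order_release);
        // Taking the parking mutex orders this wakeup after any waiter's
        // failed re-check, so a parked thread cannot miss the notification.
        std::lock_guard<std::mutex> guard(parkMutex_);
        if (waiters_ > 0)
            parkCondition_.notify_one();
    }

private:
    static constexpr int kSpinLimit = 40; // arbitrary value for this sketch

    bool tryLock() {
        bool expected = false;
        return locked_.compare_exchange_strong(expected, true,
                                               std::memory_order_acquire,
                                               std::memory_order_relaxed);
    }

    std::atomic<bool> locked_{false};
    std::mutex parkMutex_;
    std::condition_variable parkCondition_;
    int waiters_ = 0; // guarded by parkMutex_
};

// Increments a plain counter from several threads under the lock and
// returns the final value.
long countWithSpinThenParkLock(int threadCount, int iterations) {
    SpinThenParkLock lock;
    long counter = 0;
    std::vector<std::thread> threads;
    for (int t = 0; t < threadCount; ++t) {
        threads.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                lock.lock();
                ++counter;
                lock.unlock();
            }
        });
    }
    for (auto& thread : threads)
        thread.join();
    return counter;
}
```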

On Intel, of course, it doesn't matter.


It also doesn't matter on all other architectures. The slow path is dominated by OS operations like yielding, which already execute the strongest barriers possible.


1) Indeed, there are a couple of barriers that could be optimized.

2) In addition to what Phil already said (there is no priority inversion for Safari/WebKit to suffer, since all threads taking these locks have the same priority to begin with), the private lock you're talking about, named the handoff lock, has terrible unlock patterns, which is why it performed so poorly in the WTF::Lock benchmarks.



