Two things come to mind when reading this: 1. WTF::Lock is using sequential cons...

pizlonator · on May 6, 2016

Weakening the consistency of operations on the slow path is not going to be a speed-up. Those paths are not dominated by pipeline effects inside the CPU. They are dominated by context switches and memory contention. So, SC is exactly right: it's harder to get wrong and exactly the same speed.
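To make the fast-path/slow-path distinction concrete, here is a minimal sketch. It is not WTF::Lock's actual implementation: `SketchLock` and `contendedCount` are hypothetical names, and the slow path simply yields rather than parking the thread. The uncontended fast path uses acquire ordering, while the contended slow path keeps the default sequentially consistent ordering, on the assumption that the yield dominates its cost anyway.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical illustration only; not the real WTF::Lock.
class SketchLock {
public:
    void lock() {
        bool expected = false;
        // Fast path: one uncontended CAS. Acquire ordering is enough to
        // keep the critical section from moving above the lock.
        if (held_.compare_exchange_weak(expected, true,
                                        std::memory_order_acquire,
                                        std::memory_order_relaxed))
            return;
        lockSlow();
    }

    void unlock() {
        assert(held_.load(std::memory_order_relaxed));
        // Release ordering publishes the critical section's writes.
        held_.store(false, std::memory_order_release);
    }

private:
    void lockSlow() {
        for (;;) {
            bool expected = false;
            // Slow path: default ordering, i.e. std::memory_order_seq_cst.
            if (held_.compare_exchange_weak(expected, true))
                return;
            // Assumption: yielding stands in for whatever the real lock
            // does when it has to wait (e.g. blocking in the OS).
            std::this_thread::yield();
        }
    }

    std::atomic<bool> held_{false};
};

// Hypothetical helper: several threads increment a plain counter under
// the lock and the final total is returned.
long contendedCount(int threads, int iterations) {
    SketchLock lock;
    long counter = 0;
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                lock.lock();
                ++counter;
                lock.unlock();
            }
        });
    }
    for (auto& worker : workers)
        worker.join();
    return counter;
}
```

Calling `contendedCount(4, 10000)` exercises both paths. Whether relaxing the slow-path CAS to acquire ordering would be measurable on a given ARM core is exactly the point under dispute here, so the sketch takes no position on it.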

It's true that WTF::Lock doesn't support priority inheritance. That's fine for WebKit's threads.

phabouzit · on May 10, 2016

I strongly disagree: overly strong barriers on non-TSO hardware (such as ARMv7 or ARMv8) will just cause even more of that memory contention for no good reason.

Also, on a contended lock, there's a very strong possibility that you will decide that the lock can be taken right before you block.

On Intel, of course, it doesn't matter.

pizlonator · on May 17, 2016

It also doesn't matter on any other architecture. The slow path is dominated by OS operations like yielding, which already execute the strongest barriers possible.

phabouzit · on May 10, 2016

1) Indeed, there are a couple of barriers that could be optimized.

2) In addition to what Phil already said (there's no priority inversion for Safari/WebKit to suffer from, since all threads taking these locks have the same priority to begin with), the private lock you're talking about, named the handoff lock, has terrible unlock patterns, which is why it performed so poorly in the WTF::Lock benchmarks.