I thought "distributed" would mean distributed over the network, which I could definitely make use of! (Or perhaps I should just use zookeeper or equivalent for that ;)
Anyway, in this case the term "distributed" is being used to describe a mechanism which reduces memory contention when Go is utilizing multiple cores on one machine.
I'd love to see more exhaustive analysis of the performance implications for this technique across a wide variety of usage scenarios.
N00b question here but when people refer to mutexes, they are usually talking about multiple cores on a single machine right? Isn't there a different terminology for locks taken across the network in distributed systems?
I don't think I've ever heard the word mutex used for synchronizing access to data over a network. Mutex kind of implies that there's the synchronization primitive and there's the data, and both can be touched separately; it's just an agreement that we touch the mutex first. However, network protocols usually just transfer updates to/from the data and then let the server do the synchronization and access pairing for each connection.
I don't think it would make sense at all to run a "mutex service" on one server and ask clients to grab that mutex before accessing a separate "data service", since the data service itself would happily accept updates from anyone without any synchronization at all.
¹ or just multiple tasks, in general. Even a single-core machine needs mutexed access to shared data because several tasks could try to concurrently use that memory location and the scheduler could switch tasks at a critical point. However, if you're strictly limited to a single core, and the system allows it, you can just disable interrupts while you're manipulating the data.
> I don't think it would make sense at all to run a "mutex service" on one server and asking clients to grab that mutex before accessing another "data service"
I first came across one in the context of a distributed cache which sat on top of a database. I'm not at all convinced it was a good design, but it was there!
No. They are usually talking about two threads accessing the same data.
Let's suppose we have a shared int and want to add 2 to it. We would do:
1. int i = sharedInt
2. i = i + 2
3. sharedInt = i
(That's what the compiler actually does when you write sharedInt += 2.)
If the runtime decides to stop the current thread after step 2 and lets a second thread run all three steps, we would lose an update unless we used a mutex.
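Here's a minimal Go sketch of the same thing, with the race fixed by a mutex (the counter name and goroutine count are just for illustration):

```go
package main

import (
	"fmt"
	"sync"
)

// sharedInt and its mutex; names are illustrative only.
var (
	mu        sync.Mutex
	sharedInt int
)

func addTwo() {
	// Without the lock, the read/add/write below can interleave
	// between goroutines and updates get lost.
	mu.Lock()
	i := sharedInt // 1. read the shared value
	i = i + 2      // 2. compute locally
	sharedInt = i  // 3. write it back
	mu.Unlock()
}

func main() {
	var wg sync.WaitGroup
	for n := 0; n < 1000; n++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			addTwo()
		}()
	}
	wg.Wait()
	fmt.Println(sharedInt) // always 2000 with the mutex; racy without it
}
```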
I don't know but I think so. I just wanted to give him an oversimplified example (you could also use any complicated computation).
I'm not sure, though, if every compiler/interpreter makes use of them (what about VMs)?
This is available only under certain architectures (x86 is one of them as you've noted), but I've also seen scenarios where the atomic increment instruction is actually slower than using a mutex. Don't consider this feature to be a magic bullet and always test your use cases!
As for the parent discussion, usually a mutex is talking about the threading construct while a "lock service" is how you'd refer to something like etcd or zookeeper.
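On the "always test" point: a minimal Go benchmark sketch for comparing an atomic add against a mutex-protected increment under varying parallelism (the benchmark names and setup are illustrative, not from the article):

```go
package counter_test

import (
	"sync"
	"sync/atomic"
	"testing"
)

// Run with: go test -bench=. -cpu=1,4,16
// Measure your real workload; a toy counter only shows the raw primitives.

func BenchmarkAtomicAdd(b *testing.B) {
	var n int64
	b.RunParallel(func(pb *testing.PB) {
		for pb.Next() {
			atomic.AddInt64(&n, 1)
		}
	})
}

func BenchmarkMutexAdd(b *testing.B) {
	var (
		mu sync.Mutex
		n  int64
	)
	b.RunParallel(func(pb *testing.PB) {
		for pb.Next() {
			mu.Lock()
			n++
			mu.Unlock()
		}
	})
}
```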
Mutexes are typically implemented over atomic instructions. So you'd do something like an atomic compare/exchange to acquire the mutex, and if there's no contention you've got it. If there is contention you fall back to the OS's synchronization constructs, which are typically much slower. An atomic increment should always be faster than acquiring a mutex and incrementing.
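To make that fast path concrete, here is a toy spinlock built on a compare-and-swap from sync/atomic. This is only a sketch of the idea; real mutexes (including Go's sync.Mutex) park the thread via the OS under contention instead of spinning:

```go
package spinlock

import (
	"runtime"
	"sync/atomic"
)

// SpinLock is a toy lock built on an atomic compare-and-swap.
type SpinLock struct {
	state int32 // 0 = unlocked, 1 = locked
}

func (l *SpinLock) Lock() {
	// Fast path: an uncontended CAS acquires the lock in one atomic op.
	for !atomic.CompareAndSwapInt32(&l.state, 0, 1) {
		// Contended: yield and retry (a real mutex would park the thread).
		runtime.Gosched()
	}
}

func (l *SpinLock) Unlock() {
	atomic.StoreInt32(&l.state, 0)
}
```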
So what happens when you find yourself running on a CPU that wasn't in your initial affinity mask?
Also, the "sleep for 1 ms" approach used in `init()` looks wrong -- if `sched_setaffinity()` doesn't guarantee that the calling task has been migrated to one of the target CPUs on return (which I suspect it does), I don't think sleeping for a millisecond is going to change anything.
You'll acquire CPU 0's lock (since a map lookup with an invalid key yields the zero value). I agree this isn't optimal when you change the affinity of a process after it has started. You could imagine a scheme where, if this happens, you create a new lock, but that would significantly complicate things as you would now potentially need to take a read lock on the map in case it changes under you. It's annoying that CPU ID values aren't guaranteed to be without holes, but that's what we're stuck with.
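For context, a rough sketch of the per-CPU lock pattern being discussed, with the zero-value fallback to CPU 0's lock. This is not the article's actual code; it assumes Linux and that golang.org/x/sys/unix exposes a Getcpu wrapper around getcpu(2):

```go
package cpulock

import (
	"sync"

	"golang.org/x/sys/unix"
)

// CPULocks holds one lock per CPU in the initial affinity mask.
// lockIndex maps a CPU ID to a slot in locks; looking up a CPU ID that
// wasn't in the mask yields the zero value 0, i.e. CPU 0's slot.
type CPULocks struct {
	lockIndex map[int]int
	locks     []sync.Mutex
}

func New(cpus []int) *CPULocks {
	c := &CPULocks{
		lockIndex: make(map[int]int, len(cpus)),
		locks:     make([]sync.Mutex, len(cpus)),
	}
	for slot, id := range cpus {
		c.lockIndex[id] = slot
	}
	return c
}

// Lock acquires the lock for the CPU the goroutine is currently running on.
// unix.Getcpu wraps getcpu(2) on Linux; error handling is simplified here.
func (c *CPULocks) Lock() *sync.Mutex {
	cpu, _, err := unix.Getcpu()
	if err != nil {
		cpu = 0
	}
	l := &c.locks[c.lockIndex[cpu]] // unknown CPU ID -> slot 0 -> CPU 0's lock
	l.Lock()
	return l
}
```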
Yeah, the sleep is a leftover from an earlier version of the code that didn't use sched_setaffinity. I've removed it now.