> It also consumes large amounts of memory and in the very unlikely worst case, may require 2x MAX_THREADS replicas of the sequential object.
Depending on how many threads are involved, I have a simple construction [0] that is likely to scale much better. It uses multiple copies of the data structure, but each successive copy is used only half as often, resulting in logarithmic amortized complexity.
IMO these types of constructions only make sense when reads are far more common than writes. Using this for an std::set is ridiculous — if you really want to apply one of these constructions to a mapping, you should build a whole new optimized set structure for each write. Or maybe you could cleverly use immutable data structures to improve memory and bandwidth usage.
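To make the halving schedule concrete, here's a minimal single-threaded Java sketch. All the names are made up and it ignores the concurrency details of the linked construction entirely; it only illustrates the update schedule where each successive copy is touched half as often:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical single-threaded sketch of the amortized schedule (not the
// linked paper's algorithm): replica i replays the operation log only
// every 2^i writes, so each successive copy is touched half as often.
class HalvingReplicas<T> {
    private final List<T> replicas = new ArrayList<>();
    private final List<Integer> applied = new ArrayList<>(); // ops already applied to replica i
    private final List<Consumer<T>> log = new ArrayList<>(); // full operation log
    private long writes = 0;

    HalvingReplicas(List<T> initialCopies) {
        for (T r : initialCopies) {
            replicas.add(r);
            applied.add(0);
        }
    }

    void write(Consumer<T> op) {
        log.add(op);
        writes++;
        // Replica i catches up only when the write count is a multiple of 2^i.
        for (int i = 0; i < replicas.size(); i++) {
            if (writes % (1L << i) == 0) {
                for (int k = applied.get(i); k < log.size(); k++) {
                    log.get(k).accept(replicas.get(i));
                }
                applied.set(i, log.size());
            }
        }
    }

    // Replica 0 syncs on every write, so it is always current.
    T freshest() {
        return replicas.get(0);
    }
}
```

Each replica adds only amortized O(1) replay work per write (replica i does 2^i work every 2^i writes), which is where the logarithmic amortized bound would come from if the number of replicas is logarithmic in the thread count.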
Yeah, I remember it. Basically, you compare-and-swap to enqueue an operation. Under contention, threads cooperate to help each other finish their operations. With this you bound the blocking and achieve wait-free concurrent operation.
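For reference, here's a stripped-down Java sketch of the enqueue-by-CAS part (names made up). Note this version is only lock-free, not wait-free — the CAS loop can retry forever under contention; Herlihy's full construction adds an announce array so that other threads help a stalled thread's operation complete:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.UnaryOperator;

// Stripped-down sketch of "CAS to enqueue an operation" (hypothetical
// names). Lock-free only: the helping protocol that makes Herlihy's
// universal construction wait-free is omitted.
class LogBasedObject<T> {
    // Immutable log node: one enqueued operation plus the previous node.
    private static final class Node<U> {
        final UnaryOperator<U> op;
        final Node<U> prev;
        Node(UnaryOperator<U> op, Node<U> prev) { this.op = op; this.prev = prev; }
    }

    private final T initial;
    private final AtomicReference<Node<T>> head = new AtomicReference<>(null);

    LogBasedObject(T initial) { this.initial = initial; }

    // Enqueue an operation by CASing it onto the head of the shared log.
    void apply(UnaryOperator<T> op) {
        while (true) {
            Node<T> h = head.get();
            if (head.compareAndSet(h, new Node<>(op, h))) {
                return;
            }
        }
    }

    // Any thread can compute the current state by replaying the log in
    // order, so no thread ever waits on a lock holder.
    T read() {
        List<UnaryOperator<T>> ops = new ArrayList<>();
        for (Node<T> n = head.get(); n != null; n = n.prev) {
            ops.add(n.op);
        }
        T state = initial;
        for (int i = ops.size() - 1; i >= 0; i--) {
            state = ops.get(i).apply(state);
        }
        return state;
    }
}
```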
(BTW, Clojure atoms were initially thought of as the simple case of refs, where you have a full STM behind you. In practice, atoms end up being "good enough" in nearly all cases....)
I think (?) the difference between this and an atomic reference (of which a Clojure atom is a specific example) is that it is geared towards mutable data structures rather than immutable ones. Although as far as I can tell it's more or less emulating immutable data structures by copying the mutable data structure before modification.
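In Java terms, that "copy the mutable structure before modification" emulation would look something like this toy sketch — an atom-style reference over a mutable TreeSet, not Clojure's actual implementation (which swaps immutable persistent structures):

```java
import java.util.TreeSet;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

// Hypothetical atom over a *mutable* TreeSet: copy the current set,
// mutate the copy, publish it with compareAndSet. This emulates an
// immutable structure by never mutating a published snapshot.
class CopyOnWriteAtom {
    private final AtomicReference<TreeSet<Integer>> ref =
            new AtomicReference<>(new TreeSet<>());

    void swap(Consumer<TreeSet<Integer>> mutation) {
        while (true) {
            TreeSet<Integer> current = ref.get();
            TreeSet<Integer> copy = new TreeSet<>(current); // copy before modifying
            mutation.accept(copy);
            if (ref.compareAndSet(current, copy)) {         // publish atomically
                return;
            }
        }
    }

    // Readers never block; they just see some published snapshot.
    TreeSet<Integer> deref() {
        return ref.get();
    }
}
```

The cost is a full copy per write, which is exactly why persistent (structurally shared) data structures are the nicer fit for this pattern.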
As for the ultimate lack of widespread use of STM in real-world Clojure code, I think in addition to the nice interface provided by atoms, it's also in part because Clojure's STM lacks a couple of combinators that limit its composability, and so a lot of other concurrency approaches make more sense. See the combinators provided by https://github.com/skejserjensen/eclojure and http://www.thattommyhall.com/2013/02/15/stm-clj-vs-haskell/
Is "Universal Construction" a well-known term? I don't recall ever encountering it in my Comp. Sci. degree, and some quick searching only leads me to more academic papers.
I only know them from category theory and abstract algebra. You could think of them as polymorphic constructions from some set of assumptions. As an example, given some type of elements, the type of lists of those elements is a monoid (the free monoid) on those elements.
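For the list example, in Java terms (a toy sketch, with Haskell-ish names like mappend/mempty borrowed for flavor):

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration of the free monoid: lists over some element type,
// with concatenation as the operation and the empty list as identity.
class FreeMonoid {
    static <A> List<A> mappend(List<A> x, List<A> y) {
        List<A> r = new ArrayList<>(x);
        r.addAll(y);
        return r;
    }

    static <A> List<A> mempty() {
        return new ArrayList<>();
    }
}
```

Concatenation is associative and the empty list is a two-sided identity, which is all the monoid laws ask; "free" means lists satisfy no equations beyond those laws.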
> How would this compare with RCU techniques, as used in Linux? Is it about getting finer-grained access without compromising speed?
I'm wondering this, too. To me it sounds the same. From the "bad news" section:
> CX serializes all mutative operations, which means it's flat at best if the workload consists of only mutations on the sequential object. ... It also consumes large amounts of memory and in the very unlikely worst case, may require 2x MAX_THREADS replicas of the sequential object.
That's roughly the same text I'd expect to see on an RCU implementation. (Maybe the exact number of replicas differs; I think with some RCU schemes it's GCed asynchronously and is theoretically unbounded.) At first glance, it's hard to understand what's new here. They say they used this to make "the world's first wait-free balanced binary tree". How does that description apply to this but not to RCU?
...but that's not wait-free, is it? What happens if a process crashes or a thread is de-scheduled while holding the lock? Or just isn't scheduled again very quickly?
Glancing at OP, it appears CX itself uses an RW lock.
"Several years ago, when Andreia first came up with the idea that we could use a reader-writer lock to achieve wait-freedom, I thought that was crazy, but cool. After joining up forces with Pascal and a lot of work, we were able to turn that into a universal construction, CX."
I guess they must be using the lock in a careful way to avoid any contention, or to distribute people who need the lock across many versions of a data structure.
That wouldn't be the same as a 'simple mutex', would it? A simple mutex is absolutely not wait-free (even if I don't understand how the OP is using locks).
Yes - a lock can redirect people trying to acquire it, rather than queueing them. So processes need a lock to access a resource, but there are enough resources for everyone to get their own without waiting, and enough that if a process crashes holding one it doesn't cause anyone else to wait. That's not a 'simple mutex' though.
[0] https://link.springer.com/chapter/10.1007%2F978-3-540-92221-...