Structured Concurrency (250bpm.com)
89 points by vmorgulis on Feb 14, 2016 | 21 comments



Great series of articles, I very much enjoy reading Martin Sustrik's thoughts on concurrency, mainly because he has the creds to back up his criticisms. Reminds me a lot of reading Poul-Henning Kamp's writing (the author of Varnish cache).

What's most interesting to me personally is that if you squint, Martin is essentially making an argument for a low-level parallel to Java's Thread.stop method (although in much more detail). I always thought that Java's deprecation of Thread.stop was a huge cop-out from actually solving the underlying causes of its issues, and very disappointing. But from a high-level view, there is often a real difference between stopping a thread once it's ready and terminating a thread immediately because it's no longer useful.

It's also interesting that Go's solution to this is to pass a context with a 'done' channel through every function call dealing with a specific session, and every time a channel is waited on, you also wait on the done channel. While it's an effective method, it's a bit of a shame that the concept wasn't built into Go more deeply.
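To make that concrete, a minimal sketch of the pattern (names are illustrative; the import path assumes the modern standard-library context package):

  package main

  import (
      "context"
      "fmt"
      "time"
  )

  // worker waits on its work channel and on ctx.Done() at the same
  // time, so a cancel anywhere up the call chain unblocks it.
  func worker(ctx context.Context, c <-chan string) {
      for {
          select {
          case msg := <-c:
              fmt.Println("got:", msg)
          case <-ctx.Done():
              fmt.Println("stopping:", ctx.Err())
              return
          }
      }
  }

  func main() {
      ctx, cancel := context.WithCancel(context.Background())
      c := make(chan string)
      go worker(ctx, c)
      c <- "hello"
      cancel() // every goroutine selecting on ctx.Done() is released
      time.Sleep(10 * time.Millisecond)
  }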

Some relevant articles:

- Why Thread.stop was deprecated: http://docs.oracle.com/javase/1.5.0/docs/guide/misc/threadPr...

- Go concurrency patterns: https://blog.golang.org/context


I'm not sold on Martin's proposal; however, I recently spent some time getting clean cancellation into some code I was working on, so I can sympathize with what he is going through. Here are a few tips that will hopefully help others who face a similar situation:

1. The simplest way to cancel something is to have a "cancelled" channel and close it -- don't send a cancel message! This is important because closing the channel acts as a broadcast. Multiple goroutines can listen for the same signal and you don't have to spend time managing who's sending what message to whom and when. Close the channel and everybody is immediately notified. (See the example just below.)
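Example broadcast (a minimal runnable sketch; names are illustrative):

  package main

  import (
      "fmt"
      "sync"
  )

  func main() {
      cancelled := make(chan struct{})
      var wg sync.WaitGroup

      // Several goroutines all block on the same channel.
      for i := 0; i < 3; i++ {
          wg.Add(1)
          go func(id int) {
              defer wg.Done()
              <-cancelled // a receive on a closed channel returns immediately
              fmt.Println("goroutine", id, "notified")
          }(i)
      }

      close(cancelled) // one close() acts as a broadcast to every listener
      wg.Wait()
  }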

2. You can select on both send and receive. Check the cancelled channel on sends too, so that your goroutines don't block forever: it's incredibly difficult to guarantee that somebody will still be listening on the other end of the channel after cancellation happens.

Example receive:

  select {
  case message := <-c:
      // do something with message
  case <-cancelled:
      // cleanup and exit goroutine
  }
Example send:

  select {
  case c <- message:
  case <-cancelled:
      // cleanup and exit goroutine
  }
3. One commenter asked Martin why he doesn't use the standardized context package. Martin replied that there is no way to distinguish between hard and soft cancels. While that's true of a single context, nothing says you can't use more than one context or cancellation channel. I'm not saying this is a particularly good idea, but contexts are not singletons and you can have as many as you need.
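For illustration only -- two made-up channels, one per kind of cancel, extending the selects above:

  softCancel := make(chan struct{}) // close() to request a graceful stop
  hardCancel := make(chan struct{}) // close() to demand an immediate stop

  select {
  case message := <-c:
      // do something with message
  case <-softCancel:
      // finish in-flight work, flush, then exit the goroutine
  case <-hardCancel:
      // drop everything and exit immediately
  }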

Doing cancellation cleanly is still a tough problem, but with some focus -- especially on #1 and #2 -- it's doable. You do have to roll up your sleeves and put some effort into it.


Thanks for making the above clear. I should probably have made that explicit in the article but I was so focused on hard cancellation that all I've said about soft cancellation was, basically, "do it by hand", without going into details.

In fact, being able to share a single channel between multiple goroutines and use "close" as a broadcast mechanism is one of the best -- if rarely mentioned -- ideas in Go.


> The protocol will be a simple sequence of messages, each message consisting of 32-bit size field in network byte order and the data. How hard it can be to implement such a simple specification? It turns out it can take months.

It could also take years, if not decades. It doesn't mean it should. This is a good article but this sort of straw-man shenanigans only serves to hurt its credibility.


It comes from experience. I've tried to implement such a simple protocol and it literally takes months to get all the details right. Also see the quote from Joe Armstrong in the article.
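The deceptive part is that the happy path is tiny. A sketch of the kind of code that looks finished (illustrative Go, not the actual implementation):

  package framing

  import (
      "encoding/binary"
      "io"
  )

  // readMessage reads one message: a 32-bit size field in network
  // byte order, followed by that many bytes of payload.
  func readMessage(r io.Reader) ([]byte, error) {
      var size uint32
      if err := binary.Read(r, binary.BigEndian, &size); err != nil {
          return nil, err
      }
      buf := make([]byte, size)
      if _, err := io.ReadFull(r, buf); err != nil {
          return nil, err
      }
      return buf, nil
  }

The months go into everything that isn't here: a size limit against hostile peers, timeouts, half-closed connections, buffering on the write side and, above all, clean cancellation.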


You aren't the only person who's well-versed in network development, and that statement is patently absurd. If it did in fact take you several months to get a framing layer right, then it only means that the issue stemmed from the specifics (and complexity) of your implementation approach. Hence my original credibility remark -- you are clearly over-extrapolating and over-generalizing your anecdotal data.


Does anyone know what the 2nd "processes" is referring to, in the 2nd paragraph here?

---------------------------

In a sane world, specification 30 words long, like the one above, would be implementable in 100 lines of C code. If it requires several thousand lines of code, there's clearly something broken with our programming model.

As I've argued in part I of this essay, to get a simple programming model you have to spawn a thread per TCP connection. But given that threads are expensive (and processes even more so) and that sharing state between threads is insane, bordering on suicidal, what you need are lightweight green threads and a simple communication mechanism to send messages between them. In UNIX of old they were called processes and pipes. In Go they are called goroutines and channels. In Erlang they are known as processes and mailboxes.

-----------------------------

I understand this:

"But given that threads are expensive (and processes even more so)"

but I am confused by this 2nd use of "processes":

"In UNIX of old they were called processes"

Was there an old version of UNIX that had lightweight threads? What were they? How were they implemented? How were they different from the processes we have now?

When I think of processes on Unix I think "threads are expensive (and processes even more so)". Was this different at some point in the past?

As to implementing new communication protocols, none of us should underestimate the complexities. Jon Postel was not stupid, but even he made mistakes. Recall that by 1973 they felt they had TCP 3 in a mostly final form, but then in August of 1977 Postel said they had made a terrible mistake by lumping too much together in TCP. He then spent a year pulling TCP and IP apart into the 2 separate protocols that we have now.

That example is well known, but if you read through the old RFCs, you realize there are some ideas that were explored and then abandoned. Among the really wild ideas:

"we are contemplating allowing sender and receiver to specify different byte sizes and consider bits as the basic unit "

This is from:

https://tools.ietf.org/html/rfc128

And if you try to create your own communication protocol, you will also go through a similar period of trial and error.


The 2nd "processes" just means ordinary Unix-style processes. The point is that on older hardware a process switch was cheaper, on account of the smaller amount of state that needed to be shuffled around and cheaper system calls. So there was no need to (as you might today) prefer green threads over OS threads, and/or OS threads over processes, on account of the overheads.

This is mentioned in part 1 near the bottom:

    When UNIX was still young ... you were supposed to fork a
    new instance of the process for each TCP connection

    I guess it made sense from performance point of view back
    then. All the stuff that makes process switching slow today
    haven't yet existed. ... In such an environment process
    switching was closer to what we call green threads today.


There is some background in the first blog post on the topic:

http://250bpm.com/blog:69

I guess it made sense from performance point of view back then. All the stuff that makes process switching slow today haven't yet existed. Machines had single processor, there was no virtual addressing or memory caches. In such an environment process switching was closer to what we call green threads today.

So he is referring to hardware changes that made processes relatively more expensive than they used to be. I would think that the old machines still incurred some hit from a state change in the MMU. But I am far from an expert in that area.

Also, this statement seems suspect, because multiple cores, properly used, can make processes CHEAPER: there is ZERO process-switching cost when the processes actually run in parallel. This is why I advocate an architecture based on heterogeneous threads or processes and message passing -- in contrast to process-per-connection, which implies homogeneous processes and little message passing.

The Go designers also advocate heterogeneous goroutines, toward the goal of program modularity rather than performance.

If you have heterogeneous threads/processes pinned to cores, that's pretty much the most efficient way of using your hardware. But writing this type of code is not so obvious without some planning, foresight, and working around language quirks.
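Go won't let you pin goroutines to cores, but the heterogeneous-stages-with-message-passing shape looks roughly like this sketch (the stages are made up):

  package main

  import (
      "fmt"
      "strings"
  )

  func main() {
      lines := make(chan string)
      upper := make(chan string)

      // Stage 1: a producer goroutine.
      go func() {
          for _, s := range []string{"foo", "bar"} {
              lines <- s
          }
          close(lines)
      }()

      // Stage 2: a different kind of goroutine transforms the data.
      go func() {
          for s := range lines {
              upper <- strings.ToUpper(s)
          }
          close(upper)
      }()

      // Stage 3: the main goroutine consumes the results.
      for s := range upper {
          fmt.Println(s)
      }
  }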


He is referring to the traditional chaining of programs through STDOUT and STDIN. For example, a "program" that counts the number of occurrences of foo in a file: cat file.txt | grep -o "foo" | wc -l


I blame C more than I would praise Erlang.

Java has a threads implementation that actually works, which is a lot more than you can say about C/pthreads, although lately they've been tuning up the C memory model.

It may not be easy to write multithreaded code in Java, but it is definitely possible. It's actually pretty easy in a lot of cases if you use an Executor. I think most of the carping you will hear about Java these days is because people think they are being clever using Fork/Join or work stealing or actors when in fact they are just screwing up.


And Java has async network IO. The single-threaded example from the article would be fairly simple to implement with Grizzly or Netty.


> I blame C more than I would praise Erlang.

I guess you will also blame ASM for using a linear execution paradigm


cf "It's time to embrace structured approaches to parallel, asynchronous, and concurrent programming." ~~ http://jnthn.net/papers/2015-yapcasia-concurrency.pdf#p=76


Like Brinch Hansen's Concurrent Pascal, monitors, and distributed model?

http://brinch-hansen.net/papers/

Or Eiffel's race-free SCOOP? Or Ada Ravenscar? Or ParaSail and Chapel custom made for it?

Plenty of such work with proven results. Just little uptake of anything different before Haskell, Go, and Rust.


Something more structured is probably going to look a bit like UML, Petri Nets, or RETE networks.


Indeed. See, for example, the Disruptor pattern for something that looks like a (cycle-free) Petri Net.


Could you expound a little further on why the Disruptor pattern looks like a cycle-free Petri Net?


Well, the Disruptor has a set of requests that are processed in parallel. Each request has a counter associated with it, and each request also has a step associated with it. A step can depend on an arbitrary number of earlier steps. This dependency-based setup of steps can easily be mapped to Petri form.

Ok, maybe it isn't that close, but both are kinds of state machines that allow an arbitrary number of dependencies for each state, and the mapping from a cycle-free Petri net to a Disruptor is trivial. Of course, the lack of cycles hobbles the expressive power of the nets.


Or something more abstract and declarative, like Haskell or SQL...


In lthread-cpp (a wrapper around lthread) it's even easier because of RAII. Example:

http://lthread-cpp.readthedocs.org/en/latest/listener.html#e...

TCPListener closes the socket once it's done and cleans up. It also allows you to pass instance methods as the coroutine, which has the advantage of accessing the instance context.



