Durable Coroutines for Go

kitd · on Oct 20, 2023

Impressive work. This reminds me of an experimental JVM that was around about 20 years ago from a group called Velare. They could do durable coroutines just like this, but by throwing a particular exception rather than using `yield`. They also could return a value and allow the coro to be resumed with a value.

edit: here it is https://dl.acm.org/doi/10.1145/949344.949362 It was a built-in bytecode interpreter with suspendable threads

It changed significantly how you think about cooperating processes. It was like watching 2 processes have a conversation with each other without the controlling code getting involved in any way. Also, if you save the coros after each call, you can get step-by-step replays which are very helpful in debugging.
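A rough sketch of the replay idea, not the Velare design: the `adder` type and `run` helper below are hypothetical names. `adder` is a hand-written stand-in for a coroutine that is resumed with a value and hands one back, and `run` saves a JSON snapshot after each call so the steps can be walked through later.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// adder is a hypothetical stand-in for a coroutine: its only state is Total.
type adder struct {
	Total int
}

// resume feeds a value in and gets the running total back out.
func (a *adder) resume(v int) int {
	a.Total += v
	return a.Total
}

// run resumes the adder once per input and snapshots its state after each call.
func run(inputs []int) ([]string, error) {
	var a adder
	var history []string
	for _, v := range inputs {
		a.resume(v)
		snap, err := json.Marshal(a)
		if err != nil {
			return nil, err
		}
		history = append(history, string(snap))
	}
	return history, nil
}

func main() {
	history, err := run([]int{1, 2, 3})
	if err != nil {
		fmt.Println("snapshot failed:", err)
		return
	}
	// Replaying is just walking the saved states in order.
	for step, snap := range history {
		fmt.Printf("step %d: %s\n", step+1, snap)
	}
}
```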

ncruces · on Oct 20, 2023

An open-source version of this I made for Google Summer of Code 2005: https://sourceforge.net/projects/jauvm/

skybrian · on Oct 20, 2023

The saved coroutine state is an implicitly-defined data structure and they don’t support any kind of migration when the code changes, so the durability will be quite limited.

chris6f · on Oct 20, 2023

We have a solution already, and we're exploring a few more.

The durable coroutine library is one part of a larger system we're releasing soon. See https://stealthrocket.tech/blog/fairy-tales-of-workflow-orch... for more information :)

achille-roussel · on Oct 20, 2023

You’re correct that code migration is still a to do, we’ll be exploring how to do that in the future.

As you can imagine there are a lot of challenges with it, but ideas are welcome!

skybrian · on Oct 20, 2023

It seems like generating some data structure definitions (similar to a protobuf) and checking them in along with the code would be useful? Then the compiler can tell you when they still match, or you made an incompatible change to the code.
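Something like this, as a minimal sketch: describe the state layout as text, check that description in, and compare at build or test time. `taskState`, `layout` and `checkedIn` are made-up names for illustration, and reflection stands in for whatever a real generator would emit.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// taskState is a hypothetical example of the state a coroutine might capture.
type taskState struct {
	I    int
	Name string
}

// layout returns a textual description of a struct's fields, in order.
func layout(v any) string {
	t := reflect.TypeOf(v)
	var b strings.Builder
	b.WriteString(t.Name())
	b.WriteString("{")
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		fmt.Fprintf(&b, "%s:%s;", f.Name, f.Type)
	}
	b.WriteString("}")
	return b.String()
}

// checkedIn is the description that would live in version control
// next to the code (an assumption of this sketch).
const checkedIn = "taskState{I:int;Name:string;}"

func main() {
	got := layout(taskState{})
	if got != checkedIn {
		fmt.Println("incompatible state layout:", got)
		return
	}
	fmt.Println("state layout matches")
}
```

Renaming, reordering or retyping a field changes the string, so the mismatch shows up before any old state is loaded.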

This seems like a situation where defining the data structure in two different ways might be good?

dlock17 · on Oct 20, 2023

This seems like a good use case for Cap'n Proto.

tedunangst · on Oct 20, 2023

This seems really brittle.

https://github.com/stealthrocket/coroutine/blob/main/getg_am...

deadfa11 · on Oct 20, 2023

Eh, this one seems a reasonable trade-off to me at this point. If you try to handle every potential issue upfront, you'll never release, and this seems entirely fixable down the road. Persistent coroutines is a pretty challenging area to explore (tried this with Lua in a previous venture), and this seems a pretty minor point in that overall complexity. I'm curious, what else have you come across that justifies the "really brittle" conclusion?

tedunangst · on Oct 20, 2023

You do what you have to do, but as a user, I like to know how things will break and what my options will be. I was curious how it worked, and took a peek at the first file that seemed it might reveal something interesting.

ilyt · on Oct 20, 2023

They just seem pretty pointless as a tool in a language that already has super-light green threads and good concurrency primitives.

pryz · on Oct 20, 2023

https://research.swtch.com/coro might be an interesting read.

ilyt · on Oct 21, 2023

Doesn't answer the question "why?"

In other languages they were used because they were far lighter than native threads, but that's not the case in Go.

achille-roussel · on Oct 20, 2023

Yes, we planned to update the coroc compiler to generate this value instead of having it hardcoded so it will be more future proof.

chris6f · on Oct 20, 2023

A source-to-source compiler, and a library that bundles runtime implementation details, was the path of least resistance. We'd love to integrate this with the Go compiler (`go build -durable`, vs. `coroc && go build -tags durable`), but the compiler is closed to extension and maintaining a fork of Go is not feasible for us at this time.

gsora · on Oct 20, 2023

sctb · on Oct 20, 2023

It's hardcoding an implementation detail of the Go runtime, which is subject to change (and if it did, this library might have non-obvious failure behaviour).
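One way to make that failure obvious instead of silent, sketched under assumptions: refuse to run on a Go release the hardcoded value was never checked against. The `testedVersions` list and the `supported` helper are hypothetical and not part of the library; only `runtime.Version` is a real API here.

```go
package main

import (
	"fmt"
	"runtime"
	"strings"
)

// testedVersions is a hypothetical list of Go releases the hardcoded
// runtime detail was verified against.
var testedVersions = []string{"go1.21", "go1.22"}

// supported reports whether version (in the format runtime.Version returns)
// is one of the tested releases or a patch release of one.
func supported(version string) bool {
	for _, v := range testedVersions {
		if version == v || strings.HasPrefix(version, v+".") {
			return true
		}
	}
	return false
}

func main() {
	if !supported(runtime.Version()) {
		fmt.Println("untested Go runtime, refusing to guess:", runtime.Version())
		return
	}
	fmt.Println("runtime detail was verified for", runtime.Version())
}
```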

hnav · on Oct 20, 2023

I feel like once serializing god-knows-what state across program invocations is a requirement, I'd much prefer writing this explicitly so I can at least have a chance of debugging it

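  package main

  import (
    "encoding/json"
    "fmt"
  )

  type Task struct {
    I int
  }

  // Next returns the next value and true, or 0 and false when done.
  func (t *Task) Next() (int, bool) {
    if t.I < 3 {
      v := t.I
      t.I++ // I++ is a statement in Go, so it can't sit in the return
      return v, true
    }
    return 0, false
  }

  func main() {
    var t Task
    t.Next()
    b, _ := json.Marshal(t) // cannot fail for a struct of ints
    fmt.Println(string(b))  // {"I":1}
  }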

chris6f · on Oct 20, 2023

It's a good point, but the entire program would have to be written this way (you can't use the standard library, or any other dependencies).

What if there were tools to inspect and debug the coroutine state? That's an area we're exploring now.

hnav · on Oct 20, 2023

But doesn't the program have to be written explicitly anyway? Like what happens if I open a file or network connection, yield, and then resume on another system?

chris6f · on Oct 20, 2023

It's probably better to let the user/application decide what to do in these cases, and for this reason we allow them to register type-specific (de)serialization routines.

In the case of network connections, the user could instead serialize connection details and then recreate the connection when deserializing the coroutine state. Same thing for files, where instead of serializing unstable low-level details like the file descriptor, the user can instead serialize higher level information (path, open flags, etc) and recreate the open file when deserializing the coroutine state.
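A minimal sketch of the file case, assuming a reader that wants to pick up where it left off. `fileRef`, `save`, `restore` and `demo` are hypothetical names for illustration and not the library's registration API; the point is only that the saved form holds a path, flags and an offset rather than a descriptor.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"os"
)

// fileRef is a hypothetical, stable description of an open file:
// enough to reopen it later, with no file descriptor inside.
type fileRef struct {
	Path   string `json:"path"`
	Flag   int    `json:"flag"`
	Offset int64  `json:"offset"`
}

// save records where the file lives and how far into it we are.
func save(f *os.File, flag int) ([]byte, error) {
	off, err := f.Seek(0, io.SeekCurrent)
	if err != nil {
		return nil, err
	}
	return json.Marshal(fileRef{Path: f.Name(), Flag: flag, Offset: off})
}

// restore reopens the file and seeks back to the saved offset.
func restore(data []byte) (*os.File, error) {
	var ref fileRef
	if err := json.Unmarshal(data, &ref); err != nil {
		return nil, err
	}
	f, err := os.OpenFile(ref.Path, ref.Flag, 0)
	if err != nil {
		return nil, err
	}
	if _, err := f.Seek(ref.Offset, io.SeekStart); err != nil {
		f.Close()
		return nil, err
	}
	return f, nil
}

// demo reads part of a temp file, saves, closes, restores, and reads the rest.
func demo() (string, error) {
	tmp, err := os.CreateTemp("", "coro-*.txt")
	if err != nil {
		return "", err
	}
	defer os.Remove(tmp.Name())
	if _, err := tmp.WriteString("hello world"); err != nil {
		return "", err
	}
	if err := tmp.Close(); err != nil {
		return "", err
	}

	f, err := os.Open(tmp.Name())
	if err != nil {
		return "", err
	}
	if _, err := io.ReadFull(f, make([]byte, 6)); err != nil { // consume "hello "
		f.Close()
		return "", err
	}
	data, err := save(f, os.O_RDONLY)
	f.Close() // the descriptor is gone; only data survives
	if err != nil {
		return "", err
	}

	g, err := restore(data)
	if err != nil {
		return "", err
	}
	defer g.Close()
	rest, err := io.ReadAll(g)
	return string(rest), err
}

func main() {
	rest, err := demo()
	if err != nil {
		fmt.Println("demo failed:", err)
		return
	}
	fmt.Println(rest)
}
```

This only works if the path still resolves to the same content on the machine that resumes, which is exactly the kind of decision that is better left to the application.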

zellyn · on Oct 20, 2023

A succinct explanation of how durable coroutines are different (in practice) from Temporal would be useful.

crabmusket · on Oct 20, 2023

Temporal seems like durable coroutines + distributed execution + some abstractions to help avoid impurity?

ctvo · on Oct 21, 2023

From a user perspective: how is this different than what Temporal provides?

born-jre · on Oct 20, 2023

nice, but i had a separate idea.

what if you build a wasm runtime that can save and restore memory along with execution state? sounds much more foolproof. or i might have misunderstood this idea :D.

achille-roussel · on Oct 20, 2023

We tried that actually, and it can work well but you make a different set of trade offs. For example, you can’t get the granularity that durable coroutines give you, you’re bound to snapshot and restore the entire state of the application.

WASM is also not as mature of a tech for server side systems, a lot is left to figure out so the type of applications that you can build with it remains limited.

I’d be excited to see support for something like this get built tho!

infogulch · on Oct 20, 2023

Why are channels insufficient for this use case?

lelandbatey · on Oct 20, 2023

A goroutine-and-channel implementation may not work as well for the "durable" goal, while this implementation is focused on being able to perfectly serialize the state of the Coroutines in order to resume them elsewhere (durable).

ramenmeal · on Oct 20, 2023

Honest question, why not use another technology to store the state between the two? Like a DB or queuing mechanism.

infogulch · on Oct 20, 2023

I guess the internal state of the coroutine won't be automatically preserved. Say you're emitting numbers from an rng. Theoretically with this you could restart the program while keeping the internal rng state so a restart would continue emitting new values from the same sequence; with channels you could preserve the emitted values, but you couldn't automatically save and restore the rng state so the values would diverge from a continuous run at some point.
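To make the rng example concrete, here is a small sketch with the save and restore done by hand, which is the part a durable coroutine would take care of for you. The `xorshift` type and the `save`, `load` and `sameSequence` helpers are illustrative names, not anything from the library.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// xorshift is a tiny PRNG whose whole state is one uint64 (must be non-zero).
type xorshift struct {
	state uint64
}

// next advances the generator and returns the new value.
func (x *xorshift) next() uint64 {
	x.state ^= x.state << 13
	x.state ^= x.state >> 7
	x.state ^= x.state << 17
	return x.state
}

// save serializes the generator state to 8 bytes.
func (x *xorshift) save() []byte {
	b := make([]byte, 8)
	binary.LittleEndian.PutUint64(b, x.state)
	return b
}

// load rebuilds a generator from bytes produced by save.
func load(b []byte) *xorshift {
	return &xorshift{state: binary.LittleEndian.Uint64(b)}
}

// sameSequence compares an uninterrupted run of four values with a run
// that is saved and restored after the second value.
func sameSequence(seed uint64) bool {
	full := &xorshift{state: seed}
	var want [4]uint64
	for i := range want {
		want[i] = full.next()
	}

	first := &xorshift{state: seed}
	var got [4]uint64
	got[0], got[1] = first.next(), first.next()
	resumed := load(first.save()) // pretend the program restarted here
	got[2], got[3] = resumed.next(), resumed.next()

	return got == want
}

func main() {
	fmt.Println(sameSequence(42))
}
```

If only the emitted values were stored, a restart would have to reseed and the third and fourth values would differ from the continuous run.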

chris6f · on Oct 20, 2023

The library provides a way to serialize coroutine state, and to later deserialize that state and resume a coroutine from its last yield point. Where you store this state (in a DB, in a queue, etc) is up to you!

varispeed · on Oct 20, 2023

Why not fork Go and implement this directly? (and also remove any telemetry Google might have installed while at it...)

pryz · on Oct 20, 2023

Forking Go means that you have to maintain a copy. That's a lot of work. Instead, the idea is to implement coroutines as a library and work toward an integration with the Go stdlib.

zelly · on Oct 21, 2023

This is the kind of thing you can only trust the compiler to do. And we already have goroutines.

rweir · on Oct 20, 2023

very cool - looking forward to seeing the rest.