Durable Coroutines for Go (github.com/stealthrocket)
69 points by greenSunglass on Oct 20, 2023 | 35 comments


Impressive work. This reminds me of an experimental JVM that was around about 20 years ago from a group called Velare. They could do durable coroutines just like this, but by throwing a particular exception rather than using `yield`. They also could return a value and allow the coro to be resumed with a value.

edit: here it is: https://dl.acm.org/doi/10.1145/949344.949362 (it was a built-in bytecode interpreter with suspendable threads)

It changed significantly how you think about cooperating processes. It was like watching 2 processes have a conversation with each other without the controlling code getting involved in any way. Also, if you save the coros after each call, you can get step-by-step replays which are very helpful in debugging.


An open-source version of this I made for Google Summer of Code 2005: https://sourceforge.net/projects/jauvm/


The saved coroutine state is an implicitly-defined data structure and they don’t support any kind of migration when the code changes, so the durability will be quite limited.


We have a solution already, and we're exploring a few more.

The durable coroutine library is one part of a larger system we're releasing soon. See https://stealthrocket.tech/blog/fairy-tales-of-workflow-orch... for more information :)


You’re correct that code migration is still a to do, we’ll be exploring how to do that in the future.

As you can imagine there are a lot of challenges with it, but ideas are welcome!


It seems like generating some data structure definitions (similar to a protobuf) and checking them in along with the code would be useful? Then the compiler can tell you when they still match, or you made an incompatible change to the code.

This seems like a situation where defining the data structure in two different ways might be good?


This seems like a good use case for Cap'n Proto.



Eh, this one seems a reasonable trade-off to me at this point. If you try to handle every potential issue upfront, you'll never release, and this seems entirely fixable down the road. Persistent coroutines are a pretty challenging area to explore (I tried this with Lua in a previous venture), and this seems a pretty minor point in that overall complexity. I'm curious, what else have you come across that justifies the "really brittle" conclusion?


You do what you have to do, but as a user, I like to know how things will break and what my options will be. I was curious how it worked, and took a peek at the first file that seemed it might reveal something interesting.


They just seem pretty pointless as a tool in a language that already has super-light green threads and good concurrency primitives.


https://research.swtch.com/coro might be an interesting read.


That doesn't answer the question "why?"

In other languages they were used because they were far lighter than native threads, but that's not the case in Go.


Yes, we planned to update the coroc compiler to generate this value instead of having it hardcoded so it will be more future proof.


A source-to-source compiler, and a library that bundles runtime implementation details, was the path of least resistance. We'd love to integrate this with the Go compiler (`go build -durable`, vs. `coroc && go build -tags durable`), but the compiler is closed and maintaining a fork of Go is not feasible for us at this time.


Why?


It's hardcoding an implementation detail of the Go runtime, which is subject to change (and if it did, this library might have non-obvious failure behaviour).


I feel like once serializing god-knows-what state across program invocations is a requirement, I'd much prefer writing this explicitly so I can at least have a chance of debugging it

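  // Task keeps all of its state in explicit fields, so it can be serialized.
  type Task struct {
    I int
  }

  // Next returns the next value and true, or 0 and false once exhausted.
  func (t *Task) Next() (int, bool) {
    if t.I < 3 {
      v := t.I
      t.I++
      return v, true
    }
    return 0, false
  }

  var t Task
  t.Next()
  json.Marshal(t) // needs import "encoding/json"; yields {"I":1}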


It's a good point, but the entire program would have to be written this way (you can't use the standard library, or any other dependencies).

What if there were tools to inspect and debug the coroutine state? That's an area we're exploring now.


But doesn't the program have to be written explicitly anyway? Like what happens if I open a file or network connection, yield, and then resume on another system?


It's probably better to let the user/application decide what to do in these cases, and for this reason we allow them to register type-specific (de)serialization routines.

In the case of network connections, the user could instead serialize connection details and then recreate the connection when deserializing the coroutine state. Same thing for files, where instead of serializing unstable low-level details like the file descriptor, the user can instead serialize higher level information (path, open flags, etc) and recreate the open file when deserializing the coroutine state.
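As a rough sketch of the file case, using only the standard library: `FileState`, `Snapshot`, `Reopen`, and `Demo` are hypothetical names for illustration, not the library's registration API. The point is only that path, flags, and offset are enough to recreate a read handle. Numeric open flags are OS-specific, so a real implementation resuming on a different platform would probably want a symbolic encoding instead.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// FileState is a hypothetical stable description of an open file:
// enough to reopen it, with no file descriptor involved.
type FileState struct {
	Path   string `json:"path"`
	Flags  int    `json:"flags"`
	Offset int64  `json:"offset"`
}

// Snapshot records f's name and current offset. The flags are passed in
// by the caller because *os.File does not expose them.
func Snapshot(f *os.File, flags int) (FileState, error) {
	off, err := f.Seek(0, io.SeekCurrent)
	if err != nil {
		return FileState{}, err
	}
	return FileState{Path: f.Name(), Flags: flags, Offset: off}, nil
}

// Reopen recreates the open file and seeks to the saved offset.
func Reopen(s FileState) (*os.File, error) {
	f, err := os.OpenFile(s.Path, s.Flags, 0)
	if err != nil {
		return nil, err
	}
	if _, err := f.Seek(s.Offset, io.SeekStart); err != nil {
		f.Close()
		return nil, err
	}
	return f, nil
}

// Demo reads part of a temp file, "suspends" by turning the handle into
// JSON, then "resumes" from that JSON and reads the rest.
func Demo() (string, error) {
	dir, err := os.MkdirTemp("", "durable")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "data.txt")
	if err := os.WriteFile(path, []byte("hello, durable world"), 0o600); err != nil {
		return "", err
	}
	f, err := os.Open(path)
	if err != nil {
		return "", err
	}
	head := make([]byte, 5)
	if _, err := io.ReadFull(f, head); err != nil {
		f.Close()
		return "", err
	}
	state, err := Snapshot(f, os.O_RDONLY)
	f.Close() // the descriptor is gone from here on
	if err != nil {
		return "", err
	}
	saved, err := json.Marshal(state)
	if err != nil {
		return "", err
	}
	var restored FileState
	if err := json.Unmarshal(saved, &restored); err != nil {
		return "", err
	}
	g, err := Reopen(restored)
	if err != nil {
		return "", err
	}
	defer g.Close()
	tail, err := io.ReadAll(g)
	if err != nil {
		return "", err
	}
	return string(head) + string(tail), nil
}

func main() {
	s, err := Demo()
	if err != nil {
		panic(err)
	}
	fmt.Println(s) // prints "hello, durable world"
}
```

Whether the file still exists, or still has the same contents, on the machine that resumes is exactly the kind of decision this leaves to the application.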


A succinct explanation of how durable coroutines are different (in practice) from Temporal would be useful.


Temporal seems like durable coroutines + distributed execution + some abstractions to help avoid impurity?


From a user perspective: how is this different than what Temporal provides?


nice, but i had a separate idea.

what if u build a wasm runtime that can save and restore memory and execution state, sounds much more foolproof. or i might have misunderstood this idea :D.


We tried that actually, and it can work well but you make a different set of trade offs. For example, you can’t get the granularity that durable coroutines give you, you’re bound to snapshot and restore the entire state of the application.

WASM is also not as mature of a tech for server side systems, a lot is left to figure out so the type of applications that you can build with it remains limited.

I’d be excited to see support for something like this get built tho!


Why are channels insufficient for this use case?


A goroutine-and-channel implementation may not work as well for the "durable" goal. This implementation is focused on being able to perfectly serialize the state of the coroutines in order to resume them elsewhere (that's the durable part).


Honest question, why not use another technology to store the state between the two? Like a DB or queuing mechanism.


I guess the internal state of the coroutine won't be automatically preserved. Say you're emitting numbers from an rng. Theoretically with this you could restart the program while keeping the internal rng state so a restart would continue emitting new values from the same sequence; with channels you could preserve the emitted values, but you couldn't automatically save and restore the rng state so the values would diverge from a continuous run at some point.
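Here is a small sketch of that rng scenario, with the state saved by hand rather than by the coroutine library. `Gen` is a hypothetical linear congruential generator (not `math/rand`), chosen because its entire state is one integer that `encoding/json` can round-trip.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Gen is a hypothetical generator whose whole internal state is one
// exported field, so it can be saved and restored exactly.
type Gen struct {
	State uint64
}

// Next advances the generator and returns the new value.
// The arithmetic wraps around on overflow, which is intended.
func (g *Gen) Next() uint64 {
	g.State = g.State*6364136223846793005 + 1442695040888963407
	return g.State
}

// Take returns the next n values from g.
func Take(g *Gen, n int) []uint64 {
	out := make([]uint64, n)
	for i := range out {
		out[i] = g.Next()
	}
	return out
}

// SaveAndRestore round-trips the generator through JSON, standing in
// for "stop the program, persist the state, start again later".
func SaveAndRestore(g *Gen) (*Gen, error) {
	data, err := json.Marshal(g)
	if err != nil {
		return nil, err
	}
	restored := new(Gen)
	if err := json.Unmarshal(data, restored); err != nil {
		return nil, err
	}
	return restored, nil
}

func main() {
	// One uninterrupted run of six values.
	want := Take(&Gen{State: 42}, 6)

	// Three values, a simulated restart, then three more.
	interrupted := &Gen{State: 42}
	got := Take(interrupted, 3)
	resumed, err := SaveAndRestore(interrupted)
	if err != nil {
		panic(err)
	}
	got = append(got, Take(resumed, 3)...)

	fmt.Println(fmt.Sprint(got) == fmt.Sprint(want)) // prints "true"
}
```

Persisting only the emitted values (what a channel consumer sees) would not be enough here: without `State`, the restarted generator has nothing to continue from.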


The library provides a way to serialize coroutine state, and to later deserialize that state and resume a coroutine from its last yield point. Where you store this state (in a DB, in a queue, etc) is up to you!


Why not fork Go and implement this directly? (and also removing any telemetry Google might have installed while at it...)


Forking Go means that you have to maintain a copy. That's a lot of work. Instead, the idea is to implement coroutines as a library and work toward an integration with the Go stdlib.


This is the kind of thing you can only trust the compiler to do. And we already have goroutines.


very cool - looking forward to seeing the rest.



