Impressive work. This reminds me of an experimental JVM from about 20 years ago, built by a group called Velare. They could do durable coroutines just like this, but by throwing a particular exception rather than using `yield`. They could also return a value and allow the coroutine to be resumed with a value.
It changed significantly how you think about cooperating processes. It was like watching 2 processes have a conversation with each other without the controlling code getting involved in any way. Also, if you save the coros after each call, you can get step-by-step replays which are very helpful in debugging.
The saved coroutine state is an implicitly-defined data structure, and there's no support for any kind of migration when the code changes, so the durability will be quite limited.
It seems like generating some data structure definitions (similar to a protobuf) and checking them in along with the code would be useful? Then the compiler can tell you when they still match, or you made an incompatible change to the code.
This seems like a situation where defining the data structure in two different ways might be good?
Eh, this one seems a reasonable trade-off to me at this point. If you try to handle every potential issue upfront, you'll never release, and this seems entirely fixable down the road. Persistent coroutines are a pretty challenging area to explore (I tried this with Lua in a previous venture), and this seems a pretty minor point in that overall complexity. I'm curious: what else have you come across that justifies the "really brittle" conclusion?
You do what you have to do, but as a user, I like to know how things will break and what my options will be. I was curious how it worked, and took a peek at the first file that seemed it might reveal something interesting.
A source-to-source compiler, and a library that bundles runtime implementation details, was the path of least resistance. We'd love to integrate this with the Go compiler (`go build -durable`, vs. `coroc && go build -tags durable`), but the compiler is closed and maintaining a fork of Go is not feasible for us at this time.
It's hardcoding an implementation detail of the Go runtime, which is subject to change (and if it did, this library might have non-obvious failure behaviour).
I feel like once serializing god-knows-what state across program invocations is a requirement, I'd much prefer writing this explicitly so I can at least have a chance of debugging it:
type Task struct {
	I int
}

func (t *Task) Next() (int, bool) {
	if t.I < 3 {
		t.I++
		return t.I, true
	}
	return 0, false
}

var t Task
t.Next()
json.Marshal(t)
But doesn't the program have to be written explicitly anyway? Like what happens if I open a file or network connection, yield, and then resume on another system?
It's probably better to let the user/application decide what to do in these cases, and for this reason we allow them to register type-specific (de)serialization routines.
In the case of network connections, the user could instead serialize connection details and then recreate the connection when deserializing the coroutine state. Same thing for files, where instead of serializing unstable low-level details like the file descriptor, the user can instead serialize higher level information (path, open flags, etc) and recreate the open file when deserializing the coroutine state.
What if you built a wasm runtime that can save and restore memory and execution state? Sounds much more foolproof.
Or I might have misunderstood the idea :D.
We tried that actually, and it can work well, but you make a different set of trade-offs. For example, you can't get the granularity that durable coroutines give you; you're bound to snapshotting and restoring the entire state of the application.
WASM is also not as mature a technology for server-side systems; a lot is left to figure out, so the types of applications you can build with it remain limited.
I’d be excited to see support for something like this get built tho!
A goroutine-and-channel implementation may not work as well for the "durable" goal: this implementation is focused on being able to perfectly serialize the state of the coroutines so they can be resumed elsewhere.
I guess the internal state of the coroutine won't be automatically preserved. Say you're emitting numbers from an rng. Theoretically with this you could restart the program while keeping the internal rng state so a restart would continue emitting new values from the same sequence; with channels you could preserve the emitted values, but you couldn't automatically save and restore the rng state so the values would diverge from a continuous run at some point.
The library provides a way to serialize coroutine state, and to later deserialize that state and resume a coroutine from its last yield point. Where you store this state (in a DB, in a queue, etc) is up to you!
Forking Go means that you have to maintain a copy. That's a lot of work. Instead, the idea is to implement coroutines as a library and work toward an integration with the Go stdlib.
edit: here it is https://dl.acm.org/doi/10.1145/949344.949362 — it was a built-in bytecode interpreter with suspendable threads.