Saying this is reinventing BASIC in YAML seems to miss the point; the programming model is essentially a finite state machine with optional data flowing between nodes, and AFAIK there isn't really a widespread language that targets this model.
And a restricted model is not there just so that non-programmers can use it. Restricting what you can do in the program means that there are more things you can do with the program. I haven't checked if GCP's Workflows supports everything below, but here are some things you can do in principle:
* You can visualize the entire program as a flowchart, and visualize the state of the program simply by pointing at a node in the flowchart. This is not possible with a general purpose language since there could be arbitrary levels of call frames.
* You can implement retry policies for each step entirely transparently, and possibly other things like authentication. Aspect-oriented programming is more practical when the programming model is restricted.
* You can schedule the steps onto different hosts, possibly in parallel.
* You can suspend and resume the workflow, since its entire state is just which step is being executed, plus a handful of variables (which presumably are always serializable); a sketch below illustrates this.
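To make the last point concrete, here is a minimal sketch of such a restricted engine (Go, all names invented): the whole execution state is one step name plus a map of serializable variables, so visualizing the program is just drawing the `steps` map as a flowchart, and suspending is just marshalling `State`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Step transforms the workflow variables and names the next step;
// returning "" means the workflow is finished.
type Step func(vars map[string]any) (next string)

// State is the *entire* runtime state: the current step plus the
// variables. Because it is plain serializable data, the workflow can
// be checkpointed here and resumed later, even on a different host.
type State struct {
	Current string         `json:"current"`
	Vars    map[string]any `json:"vars"`
}

func run(steps map[string]Step, s State) {
	for s.Current != "" {
		s.Current = steps[s.Current](s.Vars)
		// A real engine could persist a checkpoint at every step boundary:
		snapshot, _ := json.Marshal(s)
		fmt.Printf("checkpoint: %s\n", snapshot)
	}
}

func main() {
	steps := map[string]Step{
		"init": func(v map[string]any) string { v["n"] = 1.0; return "double" },
		"double": func(v map[string]any) string {
			v["n"] = v["n"].(float64) * 2
			if v["n"].(float64) < 8 {
				return "double" // a loop edge in the flowchart
			}
			return "" // done
		},
	}
	run(steps, State{Current: "init", Vars: map[string]any{}})
}
```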
Re the problem of extension: the idea seems to be that you put all "smartness" inside HTTP services that are written in real languages and only use this as a dumb glue language.
Look at my project, temporal.io. It does the same thing using general-purpose programming languages. Currently Java and Go are supported, with Python, Ruby, and C# under development. This is absolutely valid workflow code:
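For example, a workflow in the Go SDK looks roughly like the sketch below (the payment scenario and all names are invented for illustration):

```go
package sample

import (
	"time"

	"go.temporal.io/sdk/temporal"
	"go.temporal.io/sdk/workflow"
)

// ChargeCustomer is a hypothetical activity; activities are ordinary
// functions that do the real work (HTTP calls, DB writes, etc.).
func ChargeCustomer(orderID string) (string, error) {
	// ... call the payment service here ...
	return "receipt-123", nil
}

// PaymentWorkflow is plain Go code, yet the engine can replay it,
// retry its activities, and keep timers running across process restarts.
func PaymentWorkflow(ctx workflow.Context, orderID string) error {
	ctx = workflow.WithActivityOptions(ctx, workflow.ActivityOptions{
		StartToCloseTimeout: time.Minute,
		RetryPolicy: &temporal.RetryPolicy{
			InitialInterval: time.Second,
			MaximumAttempts: 5,
		},
	})

	var receipt string
	if err := workflow.ExecuteActivity(ctx, ChargeCustomer, orderID).Get(ctx, &receipt); err != nil {
		return err
	}

	// A durable timer: this "sleep" survives worker restarts because
	// the engine persists the workflow's state.
	return workflow.Sleep(ctx, 24*time.Hour)
}
```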
But what this does under the hood is still sending instructions to a workflow engine that does the actual work, as opposed to this code doing the actual work directly, right?
If that's the case, then what you could do is still restricted by the protocol of the workflow engine: building the instructions in code gives you some dynamism, but not a whole world of difference, and it complicates things that are easier done statically. It is definitely a valid approach, but it doesn't invalidate the approach of writing out the workflow definition statically, especially if the paradigm is "put all the smartness inside HTTP services and only use the workflow as glue".
It is a workflow engine, so the actual work happens inside the activities; from that point of view it is the same. The difference is that the orchestration code has the full power of a programming language, including threads and OO techniques. And all the tools like IDEs, debuggers, and unit testing frameworks work out of the box and don't need to be reinvented for a YAML-based reimplementation of Java.
There are a lot of advantages to using a general-purpose programming language for implementing workflows. An incomplete list, in no particular order:
* Strong typing, for languages that support it
* No need to learn a new programming language
* Practically unlimited complexity of the code
* IDE support which includes code analysis and refactoring
* Debuggers
* Reuse of existing libraries and data structures. For example, can a YAML-based definition support ordered maps or priority lists without any modification?
* Standard error handling. In Java, for example, exceptions are used.
* Easy to implement handling of asynchronous events
* Standard toolchains just work. For example Gradle for Java and modules for Go.
* Standard logging and context propagation can be supported
And so on. Any new language has to have a ton of tools, libraries, and frameworks to be useful, and using an existing language lets you benefit from the existing ecosystem out of the box.
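To make the tooling point concrete, here is a rough sketch of a plain `go test` for a workflow using the Go SDK's test environment, with the activity mocked out (the workflow and activity are invented for illustration):

```go
package sample

import (
	"testing"
	"time"

	"github.com/stretchr/testify/require"
	"go.temporal.io/sdk/testsuite"
	"go.temporal.io/sdk/workflow"
)

// A trivial activity and workflow, invented for this test.
func Greet(name string) (string, error) { return "hello " + name, nil }

func GreetWorkflow(ctx workflow.Context, name string) (string, error) {
	ctx = workflow.WithActivityOptions(ctx, workflow.ActivityOptions{
		StartToCloseTimeout: time.Minute,
	})
	var out string
	err := workflow.ExecuteActivity(ctx, Greet, name).Get(ctx, &out)
	return out, err
}

func TestGreetWorkflow(t *testing.T) {
	var ts testsuite.WorkflowTestSuite
	env := ts.NewTestWorkflowEnvironment()

	// Mock the activity: no network, no real side effects.
	env.RegisterActivity(Greet)
	env.OnActivity(Greet, "world").Return("hello mock", nil)

	env.ExecuteWorkflow(GreetWorkflow, "world")

	require.True(t, env.IsWorkflowCompleted())
	require.NoError(t, env.GetWorkflowError())
	var result string
	require.NoError(t, env.GetWorkflowResult(&result))
	require.Equal(t, "hello mock", result)
}
```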
Yes, the orchestration code can be developed and debugged as normal code. I agree that it can be a pain to write and debug YAML "code".
But what's usually more interesting when it comes to workflows is inspecting and debugging the workflows themselves, and you'd still need custom tooling for that, regardless of how the workflow is built.
The actual appeal of using YAML here is that the "code" is amenable to static analysis. YAML itself is irrelevant; what matters is the DSL embedded in it. For example, you can easily count how many steps there are, how many edges there are between steps, etc. If you build the workflow in code, in general you can only know these things after the orchestration code has executed.
I'm not sure how useful all these counts are. If the static analysis you describe were that important, we would write most of our software in YAML instead of C/C++/Java/Go/Python, etc. The Linux kernel in YAML would be really cool :).
In my experience, no developer has ever asked for this information, especially if the price is writing code in a Turing-complete YAML/XML/JSON-based language.
Well, that was a contrived example. Something more useful would be, e.g., enumerating all the HTTP endpoints being depended on, or calculating the maximum resource consumption.
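For a declarative definition, that kind of analysis is a straightforward walk over the parsed document. A rough sketch in Go, assuming a simplified step schema loosely modeled on GCP Workflows (not the exact schema):

```go
package main

import (
	"fmt"

	"gopkg.in/yaml.v3"
)

// Simplified step shape: each step is a named map entry with a call
// and (for HTTP calls) a url argument.
type step map[string]struct {
	Call string `yaml:"call"`
	Args struct {
		URL string `yaml:"url"`
	} `yaml:"args"`
}

const def = `
- getUser:
    call: http.get
    args:
      url: https://users.example.com/v1/user
- charge:
    call: http.post
    args:
      url: https://billing.example.com/v1/charge
`

func main() {
	var steps []step
	if err := yaml.Unmarshal([]byte(def), &steps); err != nil {
		panic(err)
	}
	// Static analysis, without executing anything: count the steps and
	// enumerate every HTTP endpoint the workflow depends on.
	fmt.Printf("%d steps\n", len(steps))
	for _, s := range steps {
		for name, body := range s {
			if body.Args.URL != "" {
				fmt.Printf("step %q depends on %s\n", name, body.Args.URL)
			}
		}
	}
}
```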
I think our disagreement really boils down to different approaches towards workflow configuration. Static configuration has its place in a system where all the "smartness" can easily fit somewhere else. This is often the case when you are building a workflow that glues many in-house components together, since those tend to have uniform behavior and are easy to extend. On the other hand, if you are working with many heterogeneous components that are clumsy to extend, a smarter, more dynamic workflow configuration API definitely makes more sense.
In my opinion, there are two classes of workflow definition languages: domain-specific (DSL) and general-purpose.
Configuration-based languages are awesome for domain-specific use cases. AWS CloudFormation and HashiCorp's Terraform configuration language are good examples of domain-specific workflow definition languages: each solves one specific problem, which allows them to be mostly declarative and omit most procedural complexity. Even in this case, I'm pretty sure the Pulumi folks would not be 100% in agreement.
The general-purpose workflow definition languages are procedural. And I believe that procedural code in YAML/XML/JSON is a bad idea. It looks ugly, doesn't add much value, and never matches any of the general-purpose languages in expressiveness and tooling. Such configuration languages work in limited situations, but developers quickly hit their boundaries in most real use cases and have to look for real solutions.
BTW Temporal and its predecessor Cadence are perfect platforms for supporting custom DSL workflow definitions. Many production use cases run custom DSLs on top of them.
This looks very similar to Amazon Simple Workflow, which has effectively been superseded by Step Functions; Step Functions uses static workflow definitions much like those of GCP Workflows.
It is similar to Amazon Simple Workflow, as both founders of the Cadence and Temporal projects worked on Simple Workflow. I was actually its tech lead.
I cannot comment on why AWS made certain decisions as I left Amazon soon after SWF launched.
My guess is that SWF was hard for novices to use. When developing Cadence and later Temporal, we fixed the majority of the rough edges SWF had, and it turned out that the core "workflow as code" model was something developers love. With the improved developer experience, adoption is going very strong.
Temporal is a fork of Cadence by the project's founders; you can think of it as the next generation of the project. The biggest differences are the gRPC protocol and TLS support.
That is good, but why code all that up in YAML? This looks to me like a programming language with great efforts made to obfuscate the fact.
We've had 70 years to figure out what good programming language layout looks like. I think there is almost a consensus that it doesn't look like the linked code-wearing-a-fake-moustache.
Are people supposed to code these state machines using some sort of real language then translate it to YAML?
Yeah, I agree that YAML is not the best language for embedding a DSL, but in general GCP seems to have a preference for using YAML in its configuration languages.
I can imagine that the developers chose YAML based on its popularity elsewhere (especially in the CI/CD space), not its merits.
(Oh, and a belated disclaimer: I work for GCP but not on this product. All opinions are my own, obviously.)
I think the issue here is that this is code, not configuration. YAML is manageable as a configuration language (though as soon as you put Jinja in it, it stops making sense), but this thing here isn't even close.