I find the discussion of monolith vs microservices to be very unhelpful. It is a discussion about how to split the code.
But what you really want to understand, when building a distributed system (i.e. a system that processes large amounts of data), is how to partition the data so that it can be processed in parallel.
Essentially, there are two options: vectorization (single instruction, multiple data) and pipelining (multiple instructions, single data). Each suits different scenarios, depending on the critical-path dependencies in the data processing.
Monoliths are easier to vectorize, microservices are easier to pipeline. But choosing between them before you understand the nature of the data processing you need is the wrong way to do it.
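To make the distinction concrete, here is a minimal toy sketch (all names invented for illustration) of the same trivial workload expressed both ways: one operation applied across all records at once, versus each record flowing through a chain of distinct stages.

```python
records = list(range(8))

# Vectorized (SIMD-like): the same operation applied to all records at once.
# Parallelism comes from the records being independent of each other.
def vectorized(records):
    return [r * 2 + 1 for r in records]  # one instruction, many data

# Pipelined (MISD-like): each record flows through a chain of distinct stages.
# Parallelism comes from different stages running concurrently on different records.
def stage_parse(r):
    return r * 2

def stage_enrich(r):
    return r + 1

def pipelined(records):
    out = []
    for r in records:           # a single datum at a time...
        r = stage_parse(r)      # ...through multiple instructions
        r = stage_enrich(r)
        out.append(r)
    return out

print(vectorized(records) == pipelined(records))  # same result, different parallelism axis
```

The results are identical; what differs is which axis you would scale out: the vectorized form wants the data sharded across identical workers, while the pipelined form wants the stages split across specialized workers.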
The "vectorizing" and "pipelining" here seem to work when describing changes/deployments made to the system, but that seems orthogonal to the data processed by the system.
If one part of your workload is suited to pipelining, and another part of your workload is suited to vectorizing, then that might be a reason to split the workload into different processes running on different clusters. Few but beefy nodes for the vectorized part. Many smaller nodes for the pipelined part.
> The "vectorizing" and "pipelining" here seem to work when describing changes/deployments made to the system
That's not what I mean.
But you're correct: what you want to do with the data informs your decision about what should be separate processes, and once you know that, you might decide to split (modularize) the processing code accordingly.
Doing it the other way around (i.e. designing the modules before you understand the data flow) is just going to cause more trouble.
Monolithic architecture implies the company is only willing to manage one production deployable. Someone solving a specific problem cannot introduce new processes / network boundaries even when warranted by the characteristics of their problem.