Sure, you can spin up multiple executors and have it distribute work amongst them…
Provided you figure out the 3 million configs necessary to do this in a reasonable way, and sort out dependency, networking and storage issues, only for the runtime to decide that actually, it’d quite like to run your entire workload on only one of your machines, or something equally frustrating.
Given how frustrating it is to operate, I’m shocked it has gained as much popularity as it has. I sort of wonder if that’s a bit by design from vested commercial interests: make it simultaneously exceedingly popular and so convoluted to run that you are basically obliged to pay for the likes of Databricks just to make your life not awful.
Of course, once you’ve done this, you’ve now bought into a whole suite of other issues, but that’s a different discussion…