depends on the current load. i've worked places where we would create nightly postgres dumps via pg_dumpall, then pipe them through pigz to compress. it's great if you run it when load is otherwise low and you want to squeeze every bit of performance out of the box during that quiet window.
this predates the maturation of pg_dump/pg_restore concurrency features :)
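something like this, if it helps picture it (the user, paths, and thread count are placeholders, not the actual job):

    # nightly full-cluster dump; pigz fans the compression out across cores
    # -p 8 = 8 compression threads, -9 = max compression, since the box is quiet anyway
    pg_dumpall -U postgres \
      | pigz -p 8 -9 \
      > /backups/nightly-$(date +%F).sql.gz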
Not to overstate it, but embedding the parallelism in the application leads to the logic "the application is where we know we can do it", whereas embedding the parallelism in a discrete lower layer and using pipes leads to "this is a generic UNIX model of how to process data".
The catch with "and pipe to <thing>" is that you then reduce to a serial bottleneck: everything funnels through one buffer as the consumer decodes the pipe input. I still do this, because it's often logically simple, and the serial->parallel deblocking cost on a pipe is low.
Which is where xargs and the prefork model come in: instead you segment/shard the work, and either have no re-unification burden at all, or it's a simple serialise over the outputs.
When I know I can shard, and I don't know how to tell the application to be parallel, this is my path out.
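A sketch of that path, assuming per-database sharding of the dump (the /backups layout and -P4 concurrency are illustrative, not a real setup):

    # list the databases, then fan out one pg_dump per database, four at a time
    psql -At -d postgres -c 'SELECT datname FROM pg_database WHERE NOT datistemplate' \
      | xargs -P4 -I{} sh -c 'pg_dump "$1" | pigz > "/backups/$1.sql.gz"' _ {}

Each shard lands in its own file, so the re-unification burden is nothing more than the filenames.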