I have to ask - with 80 clusters, do you set up a cluster per customer? Or do you put multiple customers on the same cluster?
Can you talk a bit more about how you've set up your workload queues in the Redshift WLM? I can spot at least three different workloads in your post:
1) loads into Redshift
2) transformations once data is in Redshift
3) customer ad-hoc queries
And then what node type do you use for your 80 clusters? I'm asking because the smaller dense-compute clusters offer more I/O, which helps a lot when you're loading data into the cluster.
We offer our customers the option of managing their own cluster or having us manage it for them. About 80% of our customers choose the first option. In this case, cluster sizing and queue configuration is up to them.
When we manage the cluster, we set up a new cluster for each customer. The node type we choose depends on the customer - for larger customers we'll typically use the DS type nodes (with hard disk storage). The slower I/O isn't a huge deal for us, since getting the data from our database to S3 is usually the bottleneck in the process, rather than copying from S3 into Redshift.