I think you are confusing deduplication and compression provided by the filesystem with deduplication and compression provided by Nix.
If you were just storing a static set of pregenerated NAR archives, you would not see any benefit from filesystem-provided compression or deduplication: the NARs are already individually compressed (typically xz or zstd), so their bytes look essentially random to the filesystem.
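If you wanted to sanity-check that, ZFS reports it directly (dataset name hypothetical):

    # compression ratio on a dataset holding pre-compressed .nar.xz blobs;
    # expect roughly 1.00x, since xz output doesn't compress further
    zfs get compressratio tank/narstore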
If you host a live Nix store (i.e. uncompressed files under /nix/store), then you could benefit from filesystem-provided compression and deduplication. Nix itself can also replace duplicate files with hard links. The downside is that you then have to generate the NAR archives on the fly when a client requests a derivation.
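Roughly, if you went the live-store route (port number arbitrary):

    # hard-link identical files within the live store
    nix-store --optimise

    # serve /nix/store as a binary cache over HTTP, generating NARs
    # on request; nix-serve is one existing implementation of this
    nix-serve --port 8080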
That might be worth it, especially since they get great hit rates from Fastly. But on the other hand, that means a lot more moving parts to maintain and monitor vs. simple blob storage.
Definitely. It's not hard to imagine that even if they end up rolling their own for this, it might make sense for it to be a bunch of off-the-shelf Synology NASes stuffed with 22TB drives in donated rackspace around the world, running MinIO. If the "hot" data is all still in S3/CDN/cache, then you're really just keeping the rest of it around on a just-in-case basis, and for that, the simpler the better.
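On each box, that could be as little as (path hypothetical, /volume1 being the usual Synology mount):

    # expose a local directory over an S3-compatible API
    minio server /volume1/nixcache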
Yeah, that's what I was thinking. Say three nodes in three different data centers, all kept in sync, each running enough drives of spinning rust to meet their needs and then some, plus the headroom that ZFS wants (i.e. the rule of thumb of staying below roughly 80% of usable capacity to avoid fragmentation), then exposing that via an S3-compatible API with MinIO plus a CDN of their choice.
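A sketch of the headroom and sync parts, assuming each site runs MinIO over its local pool as above (dataset names, sizes, aliases, and hostnames all hypothetical):

    # enforce the ~80% rule at the dataset level
    # (pool here is ~20T usable, so cap the dataset at 16T)
    zfs set quota=16T tank/nixcache

    # register the three sites and keep their buckets in sync
    # with MinIO site replication
    mc alias set site1 https://node1.example.org ACCESS_KEY SECRET_KEY
    mc alias set site2 https://node2.example.org ACCESS_KEY SECRET_KEY
    mc alias set site3 https://node3.example.org ACCESS_KEY SECRET_KEY
    mc admin replicate add site1 site2 site3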