Are you agreeing or disagreeing with ars' claim that XZ provides a better compression ratio than Zstd? My data shows that it's true in at least one common use case (distribution of open-source software source archives).
I've seen similar comparative ratios from files up to the multi-gigabyte range, for example VM images. In what cases have you seen XZ produce worse compression ratios than Zstd?
Generally speaking, the top end of xz very slightly beats the top end of zstd. However, xz typically takes several times as long to decompress, and in my experience it usually takes longer to compress as well.
Example with a large archive, representative of compiled-software distribution (e.g. package formats):
    /tmp$ time xz -T0 -9k usrbin.tar

    real    2m0.579s
    user    8m46.646s
    sys     0m2.104s

    /tmp$ time zstd -T0 -19 --long usrbin.tar

    real    1m47.242s
    user    6m34.845s
    sys     0m0.544s

    /tmp$ ls -l usrbin.tar*
    -rw-r--r-- 1 josh josh 998830080 Jul 23 23:55 usrbin.tar
    -rw-r--r-- 1 josh josh 189633464 Jul 23 23:55 usrbin.tar.xz
    -rw-r--r-- 1 josh josh 203107989 Jul 23 23:55 usrbin.tar.zst

    /tmp$ time xzcat usrbin.tar.xz >/dev/null

    real    0m9.410s
    user    0m9.339s
    sys     0m0.060s

    /tmp$ time zstdcat usrbin.tar.zst >/dev/null

    real    0m0.996s
    user    0m0.894s
    sys     0m0.065s
Comparable compression ratio (zstd's output is about 7% larger here), faster compression, and roughly 10x faster decompression.
And if you do need a compressed size smaller than what xz produces, you can get that at a cost in time:
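(A minimal sketch of what I mean, with real flags but no measured numbers; `-22` is zstd's ceiling and requires `--ultra`, and a wide `--long` window has to be repeated at decompression time.)

    # Hypothetical top-end zstd run; sizes and timings depend on the input.
    /tmp$ time zstd -T0 --ultra -22 --long=31 usrbin.tar
    # A window log above 27 must be acknowledged when decompressing:
    /tmp$ zstd -d --long=31 -c usrbin.tar.zst >/dev/null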
That seems fine -- it's a tradeoff between speed and compression ratio, which has existed ever since compression went beyond RLE.
Zstd competes against Snappy and LZ4 in the market of transmission-time compression. You use it for things like RPC sessions, where the data is being created on-the-fly, compressed for bandwidth savings, then decompressed+parsed on the other side. And in this domain, Zstd is pretty clearly the stand-out winner.
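A hedged sketch of that pattern (the host and directories are placeholder names): data is produced on one side, compressed in flight, and decompressed as it arrives on the other.

    # Stream a tree over ssh, compressing on the fly; backup-host,
    # /srv/data, and /srv/restore are hypothetical.
    $ tar -C /srv/data -cf - . | zstd -3 -T0 \
        | ssh backup-host 'zstd -d | tar -xf - -C /srv/restore'

A low level like `-3` keeps the compressor well ahead of typical link speeds, which is the whole point in this domain.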
When it comes to archival, wall-clock performance is less important. Doubling the compress/decompress time for a 5% improvement in compression ratio is an attractive option, and high-compression XZ is in many cases faster than high-compression Zstd while also delivering better ratios.
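If you want to sanity-check that tradeoff on your own data, a minimal sketch (the filename is hypothetical; `xz -9e` is the "extreme" preset, and zstd tops out at `--ultra -22`):

    # Top-end settings for both tools; sizes and timings depend on the input.
    $ time xz -T0 -9e -k archive.tar
    $ time zstd -T0 --ultra -22 --long archive.tar
    $ ls -l archive.tar archive.tar.xz archive.tar.zst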
---
EDIT, in response to the parent post adding numbers: I spot-tested zstd with `-22 --ultra` on files in my archive of source tarballs, and wasn't able to find cases where it outperformed `xz -9`.
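For reference, the shape of that spot-test, as a sketch (`~/archive` is a hypothetical path; it prints compressed byte counts per tarball):

    # Compare xz -9 vs zstd --ultra -22 output sizes for each tarball.
    for f in ~/archive/*.tar; do
      printf '%s xz=%s zstd=%s\n' "$(basename "$f")" \
        "$(xz -9 -c "$f" | wc -c)" \
        "$(zstd -q --ultra -22 -c "$f" | wc -c)"
    done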
I think you're missing the point about the tradeoffs people are actually willing to make: absolute compression ratio loses to ~80% of the compression ability when it comes with big gains in decompression speed. (That is, include round-trip CPU time if you want something to agree or disagree with; we're not talking about straight compression ratios.)
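One concrete way to measure the round trip rather than the ratio alone (a sketch; `data.tar` is a placeholder):

    # Time compress piped straight into decompress, per tool.
    $ time sh -c 'zstd -19 -c data.tar | zstd -d >/dev/null'
    $ time sh -c 'xz -9 -c data.tar | xz -d >/dev/null'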
Arch Linux is a case study: a large distributor of open-source software that switched its binary packages from xz to zstd, and they didn't do it for teh lulz [0].
Yup, and "how much better" turns out to be about 1%:
"zstd and xz trade blows in their compression ratio. Recompressing all packages to zstd with our options yields a total ~0.8% increase in package size on all of our packages combined, but the decompression time for all packages saw a ~1300% speedup."