Hacker News

> Redpanda's end-to-end latency in their 1 GB/s benchmark increased by a large amount once the brokers reached their data retention limit and started deleting segment files. Current benchmarks are based on empty-drive performance.

It seems really disingenuous to use empty-drive performance, since anyone who cares about performance cares about continuous use.



It's pretty ironic, considering they blame JVM garbage collection for bad latency but ignore their own disk garbage collection, which also seems to cause some pretty bad latency.


The disk thrashes because of fsyncs (Kafka doesn't perform any fsyncs). You can provision more disk space to mitigate this problem, and it looks like the test was set up this way to make Redpanda look worse.
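For a rough sense of why fsyncing on every write is so much more expensive than letting data land in the OS page cache, here is a minimal sketch (not a benchmark of Kafka or Redpanda; the helper name and parameters are made up for illustration):

```python
import os
import tempfile
import time

def timed_writes(n_writes: int, chunk: bytes, use_fsync: bool) -> float:
    """Return seconds taken for n_writes appends, optionally fsyncing each one."""
    fd, path = tempfile.mkstemp()
    try:
        start = time.perf_counter()
        for _ in range(n_writes):
            os.write(fd, chunk)
            if use_fsync:
                os.fsync(fd)  # force the data out of the page cache to the device
        return time.perf_counter() - start
    finally:
        os.close(fd)
        os.unlink(path)

chunk = b"x" * 4096
buffered = timed_writes(200, chunk, use_fsync=False)  # lands in page cache
synced = timed_writes(200, chunk, use_fsync=True)     # hits the device each time
print(f"buffered: {buffered:.4f}s  fsync-per-write: {synced:.4f}s")
```

On a typical NVMe drive the fsync-per-write path is orders of magnitude slower, because each fsync turns a cheap memory copy into a round trip to the device, which is also what exposes the drive's internal garbage collection to the write path.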


You have to provision your disk space accordingly: NVMe drives need some free space to maintain good performance. In this case, I think the Redpanda benchmarks had free disk space available, while in the benchmarks done by the Confluent guy the system was provisioned to use all of the disk space.

With the page cache it's OK, because the FTL layer of the drive will work with 32 MiB blocks, but in Redpanda's case the drive will struggle because the FTL mappings become more complex and GC has more work to do. If Kafka were doing fsyncs, the behaviour would be the same.

Overall, this looks like a smear campaign against Redpanda. The guy who wrote the article works for Confluent and published it on his own domain to look more neutral. The benchmarks are not fair, because one of the systems is doing fsyncs and the other is not; most of the differences could be explained by that fact alone.



