Hacker News

> Redpanda's end-to-end latency in their 1 GB/s benchmark increased by a large amount once the brokers reached their data retention limit and started deleting segment files. Current benchmarks are based on empty-drive performance.

It seems really disingenuous to use empty-drive performance, since anyone who cares about performance cares about continuous use.



It's pretty ironic, considering they blame JVM garbage collection for bad latency but ignore their own disk garbage collection, which also seems to cause some pretty bad latency.


The disk thrashes because of fsyncs (Kafka doesn't perform any fsyncs). You can provision more disk space to mitigate this problem, and it looks like the test was set up this way to make Redpanda look worse.
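For a rough sense of why fsyncing on every write is so much more expensive than letting data land in the OS page cache, here is a minimal sketch (not a benchmark of Kafka or Redpanda; the helper name and parameters are made up for illustration):

```python
import os
import tempfile
import time

def timed_writes(n_writes: int, chunk: bytes, use_fsync: bool) -> float:
    """Return seconds taken for n_writes appends, optionally fsyncing each one."""
    fd, path = tempfile.mkstemp()
    try:
        start = time.perf_counter()
        for _ in range(n_writes):
            os.write(fd, chunk)
            if use_fsync:
                os.fsync(fd)  # force the data out of the page cache to the device
        return time.perf_counter() - start
    finally:
        os.close(fd)
        os.unlink(path)

chunk = b"x" * 4096
buffered = timed_writes(200, chunk, use_fsync=False)  # lands in page cache
synced = timed_writes(200, chunk, use_fsync=True)     # hits the device each time
print(f"buffered: {buffered:.4f}s  fsync-per-write: {synced:.4f}s")
```

On a typical NVMe drive the fsync-per-write path is orders of magnitude slower, because each fsync turns a cheap memory copy into a round trip to the device, which is also what exposes the drive's internal garbage collection to the write path.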


You have to provision your disk space accordingly: NVMe drives need some free space to maintain good performance. In this case, I think the Redpanda benchmarks had free disk space available, while in the benchmarks done by the Confluent guy the system was provisioned to use all of the disk space.

With the page cache it's OK, because the FTL layer of the drive will work with 32 MiB blocks, but in Redpanda's case the drive will struggle because the FTL mappings become more complex and GC has more work to do. If Kafka were doing fsyncs, the behaviour would be the same.

Overall, this looks like a smear campaign against Redpanda. The guy who wrote the article works for Confluent and published it on his own domain to look more neutral. The benchmarks are not fair, because one of the systems is doing fsyncs and the other is not; most of the differences could be explained by that fact alone.



