TPC-C is much, much harder to do well on than that benchmark, even if you're using special hardware, and even if you're using "tricks" like intentionally misreading the TPC-C invariants to avoid coordination. It's not directly comparable to what you posted. TPC-C performs joins, reads, inserts, and updates, including ones that necessarily cross partition boundaries some percentage of the time, and it specifies invariants that must be maintained for a run to qualify as correct. The benchmark you posted (flexAsynch), in the tested configuration, does only reads or only writes, and both can be trivially executed without any coordination at all on as many nodes as you feel like. As such, it's more a stress test of the implementation's overhead than an indication of the actual performance you'll get from a normal workload on such a cluster.
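To make that concrete, here's a toy sketch of the difference. Everything in it is illustrative rather than taken from either benchmark's code, except the ~1% remote-warehouse rate and the 5-15 order lines, which come from the TPC-C spec:

    // Toy contrast of the two access patterns; illustrative only.
    #include <cstdio>
    #include <random>
    #include <set>

    constexpr int kPartitions = 8;

    int partition_of(int warehouse) { return warehouse % kPartitions; }

    // flexAsynch-style: one key, one partition, zero coordination needed,
    // so throughput scales with node count almost for free.
    std::set<int> kv_op(int key) { return {partition_of(key)}; }

    // TPC-C New-Order: 5-15 order lines, and (per the spec, when there is
    // more than one warehouse) ~1% of lines reference a remote warehouse,
    // forcing an atomic cross-partition commit.
    std::set<int> new_order(std::mt19937& rng, int home_wh) {
        std::uniform_int_distribution<int> n_lines(5, 15);
        std::uniform_int_distribution<int> any_wh(0, kPartitions * 100 - 1);
        std::bernoulli_distribution remote(0.01);
        std::set<int> touched = {partition_of(home_wh)};
        for (int i = 0, n = n_lines(rng); i < n; ++i)
            touched.insert(partition_of(remote(rng) ? any_wh(rng) : home_wh));
        return touched;
    }

    int main() {
        std::mt19937 rng(42);
        std::uniform_int_distribution<int> any_wh(0, kPartitions * 100 - 1);
        int multi = 0, total = 100000;
        for (int i = 0; i < total; ++i)
            multi += new_order(rng, any_wh(rng)).size() > 1;
        // Expect roughly 8-10%: every one of those needs 2PC or equivalent.
        std::printf("%.1f%% of New-Order txns span >1 partition\n",
                    100.0 * multi / total);
    }

Those cross-partition commits, plus maintaining the invariants, are exactly the costs a pure-read or pure-write KV benchmark never pays.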
And while I'd like to say that MySQL Cluster is nonetheless exhibiting very impressive performance, I can't really say that. They are using expensive networking and hardware that most people don't have available, micro-optimizing both the client and server sides, and using a low-level API specifically designed to do well on these sorts of benchmarks, yet they still lag far behind the per-core performance of state-of-the-art key/value stores in similar circumstances. For example, MICA can do 80 million key/value requests per second, of comparable size to the ones they listed, on a single commodity server with only regular NICs and 10GbE (and in fact can saturate the Ethernet link). Granted, MySQL Cluster is a full-fledged database and MICA just does key/value, but I can pretty much guarantee that MySQL Cluster's performance collapses on real requests, and in multi-master mode it's known to be inconsistent anyway.
If you really need hundreds of millions of k/v requests per second, you'll pay a lot less buying three servers and hiring all the researchers who wrote the MICA paper to manage your key/value store than you will buying MySQL Cluster :P Or, if you want a real database, you can play the same game with the many, many database systems that do much better than Oracle's cluster on TPC-C: the same person who wrote the MICA paper published a paper last year on Cicada, which can handle over 2 million serializable TPC-C transactions per second on a single 28-core server. Or you can try Calvin, which can do around 500,000 geo-replicated TPC-C transactions per second on a commodity 100-node EC2 cluster (2.5 compute units each, back in 2014) and can operate on top of any existing key-value store. The database world has advanced a lot in the past ten years, and people who really need performance have no shortage of options that aren't Oracle.
Great, thanks for the in-depth reply; I just saw it now. I agree that TPC-C is much harder than flexAsynch, which is just a KV benchmark.
Here's a different benchmark: NDB did around 8M ops/sec on 8 data nodes while sustaining 1.2M HDFS ops/sec in https://www.usenix.org/conference/fast17/technical-sessions/... .
That is not a KV workload. It is, in fact, a very mixed workload: every transaction is a cross-partition transaction, with lots of batched PK ops and lots of partition-pruned index scans. Transactions containing many ops. Full-table scans. Index scans. And get this: it's open-source. Research has of course advanced since NDB, but nothing open-source comes close to it for some (sometimes complex) OLTP workloads, anything subscriber-oriented. Not TPC-C workloads, though; those suck on NDB.
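If anyone is curious what "batched PK ops" look like in practice, here's a minimal sketch against the native C++ NDB API, following the classic ndbapi_simple pattern. The connect string, database name, table, and column names below are all made up for illustration, and error handling is mostly omitted:

    #include <NdbApi.hpp>
    #include <cstdio>

    int main() {
        ndb_init();
        // Hypothetical cluster, database, table and column names.
        Ndb_cluster_connection conn("mgmd-host:1186");
        if (conn.connect(4 /*retries*/, 5 /*delay s*/, 1 /*verbose*/) != 0 ||
            conn.wait_until_ready(30, 0) < 0)
            return 1;
        Ndb ndb(&conn, "metadb");
        ndb.init();

        const NdbDictionary::Dictionary* dict = ndb.getDictionary();
        const NdbDictionary::Table* tab = dict->getTable("inodes");

        // Define many PK reads up front; they are shipped to the data nodes
        // and answered in a single execute() round trip, not one per key.
        NdbTransaction* trans = ndb.startTransaction();
        NdbRecAttr* sizes[100];
        for (int id = 0; id < 100; ++id) {
            NdbOperation* op = trans->getNdbOperation(tab);
            op->readTuple(NdbOperation::LM_CommittedRead);
            op->equal("id", id);                 // primary key lookup
            sizes[id] = op->getValue("size", 0); // column to fetch
        }
        trans->execute(NdbTransaction::Commit);

        for (int id = 0; id < 100; ++id)
            std::printf("inode %d -> size %lld\n",
                        id, (long long)sizes[id]->int64_value());
        ndb.closeTransaction(trans);
        ndb_end(0);
    }

Amortizing the round trip over a whole batch like this is a big part of how a metadata-heavy workload gets to millions of ops/sec.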
I'd never heard of MICA; I'll read up on it. Calvin, though, is extremely limited. They tried to build CalvinFS on top of Calvin, but it didn't work: subtree ops didn't scale (at all), because the DB wasn't flexible enough. Features and maturity sometimes matter, and NDB has both.