Confirmed. Also, ClickBench is working from a batch-loaded data set, which is kind of antithetical to a real-world, real-time analytical database workload.
[Disclosure: I work at StarTree, and we're powered by Apache Pinot.]
We are currently evaluating different methodologies to benchmark more realistic situations for real-time analytics. Consider these for your own benchmarking / POCs, or for a future industry benchmark spec:
1. Some sort of "freshness" (data latency) measurement: the time from streaming ingestion through indexing until the data is ready for query. Is it consistent, or are there pauses in ingestion? (See the first sketch after this list.)
2. Some sort of "ingestion scaling" measurement: how many objects per second can you push before you choke I/O? What happens at different payload sizes? (This interacts with "freshness" above; i.e., you might [or might not] be able to throttle ingestion to improve freshness. Sketch after the list.)
3. Query concurrency/throughput: does your query capacity scale linearly or non-linearly? What happens at 10 QPS? 100 QPS? 1,000 QPS? 10,000 QPS? 100,000 QPS? (Or where does it top out? Sketch after the list.)
4. Data volume: are you querying against 1 TB? 10 TB? 100 TB? 1 PB? (More?) This interacts with query concurrency: driving 100 QPS against 1 PB is a totally different case than driving 100,000 QPS against 1 TB.
5. Data storage type: are you running against local NVMe, EBS, or S3 buckets? (Maybe even HDD?) Is it all uniform storage, or a tiered storage topology? If tiered, what's the % mix of the different storage types? Tiering is an increasing reality all vendors need to deal with; customers want to optimize their spend per use case. (Sketch after the list.)
6. Query complexity: before talking about simple "latencies," you have to understand what sort of queries you're running. These aren't simple atomic row CRUD operations as in Redis or ScyllaDB. How are you doing aggregates? Are you running queries against denormalized data in a single table, or are you doing single JOINs or complex multi-table JOINs? (Sketch after the list.)
7. Indexing: as pointed out by shadow28, indexes are vital for best performance. Which type of index was used? (Apache Pinot supports about a dozen different types; sketch after the list.)
And my personal favorite to throw into the mix:
8. Cluster resilience: great. All of the above worked on a fully-provisioned, stable cluster. Now knock out a node. Do it. See what happens. How long before the cluster rebalances and quiesces? What happens to your QPS and latencies during the rebalance, and then after quiescence? Measure that. Now knock out a second node. Maybe a third. How many nodes can you bring down before performance goes non-linear, or the cluster is rendered utterly unreliable?
I call this latter one the "Torpedo Test," and I've been preaching about it for years[1]. How many "torpedoes" can your cluster take before it sinks beneath the waves? It's not specific to real-time OLAP; you can use this kind of methodology to test the resilience of any distributed system. And you should probably do it before you hit production. (Sketch below.)
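For #1, a minimal freshness probe, assuming a Kafka topic named events feeding a Pinot table of the same name with a marker column; the names, ports, and schema are all placeholders for illustration:

```python
import json, time, uuid
import requests
from kafka import KafkaProducer  # pip install kafka-python

BROKER_SQL = "http://localhost:8099/query/sql"  # default Pinot broker port
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode(),
)

def measure_freshness(timeout_s=30.0):
    # Produce one uniquely tagged event, then poll until it is queryable.
    marker = str(uuid.uuid4())
    produced_at = time.time()
    producer.send("events", {"marker": marker, "ts": int(produced_at * 1000)})
    producer.flush()
    while time.time() - produced_at < timeout_s:
        resp = requests.post(BROKER_SQL, json={
            "sql": f"SELECT COUNT(*) FROM events WHERE marker = '{marker}'"
        })
        rows = resp.json().get("resultTable", {}).get("rows", [])
        if rows and rows[0][0] > 0:
            return time.time() - produced_at  # end-to-end freshness, seconds
        time.sleep(0.05)
    return None  # never became visible within the timeout

# Run it repeatedly: a consistent distribution matters as much as the median.
samples = [measure_freshness() for _ in range(100)]
print("worst observed freshness:", max(s for s in samples if s is not None))
```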
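For #2, a rough ingestion-scaling probe using the same hypothetical events topic. This only exercises the producer path, so you'd still watch broker- and server-side metrics for the real choke point:

```python
import os, time
from kafka import KafkaProducer

producer = KafkaProducer(bootstrap_servers="localhost:9092")

def push(payload_bytes: int, count: int = 50_000) -> float:
    body = os.urandom(payload_bytes)  # opaque payload of the target size
    start = time.time()
    for _ in range(count):
        producer.send("events", body)
    producer.flush()  # wait for everything to actually hit the broker
    return count / (time.time() - start)  # achieved objects/sec

for size in (128, 1_024, 16_384, 131_072):
    print(f"{size:>7} B payloads: {push(size):,.0f} objects/sec")
```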
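For #3, a crude step-load driver, again with placeholder endpoint and query; a real harness would pace more precisely and track errors too:

```python
import time
import requests
from concurrent.futures import ThreadPoolExecutor

BROKER_SQL = "http://localhost:8099/query/sql"
QUERY = {"sql": "SELECT COUNT(*) FROM events"}

def one_query() -> float:
    t0 = time.time()
    requests.post(BROKER_SQL, json=QUERY, timeout=10)
    return time.time() - t0

def step(qps: int, seconds: int = 30):
    # Hold a fixed request rate for a while, then report percentiles.
    with ThreadPoolExecutor(max_workers=min(qps, 512)) as pool:
        futures = []
        for _ in range(qps * seconds):
            futures.append(pool.submit(one_query))
            time.sleep(1.0 / qps)  # crude pacing
        latencies = sorted(f.result() for f in futures)
    p50 = latencies[len(latencies) // 2]
    p99 = latencies[int(len(latencies) * 0.99)]
    print(f"{qps:>7,} QPS: p50={p50*1000:.1f} ms  p99={p99*1000:.1f} ms")

for qps in (10, 100, 1_000):  # keep stepping up until it tops out
    step(qps)
```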
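For #5, the shape of a tiered layout, shown as a Python dict mirroring (to the best of my recollection; verify against the current Pinot docs) the tierConfigs section of a Pinot table config:

```python
# Illustrative tiered-storage layout; names, ages, and tags are made up.
tier_configs = {
    "tierConfigs": [
        {   # keep the most recent week on fast local storage
            "name": "hotTier",
            "segmentSelectorType": "time",
            "segmentAge": "7d",
            "storageType": "pinot_server",
            "serverTag": "hot_OFFLINE",
        },
        {   # older segments move to cheaper servers / storage
            "name": "coldTier",
            "segmentSelectorType": "time",
            "segmentAge": "30d",
            "storageType": "pinot_server",
            "serverTag": "cold_OFFLINE",
        },
    ]
}
# A benchmark should report this mix: "100% NVMe" and "10% NVMe / 90%
# cheap tier" are very different cost/performance points.
```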
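For #6, three rungs of query complexity worth reporting separately rather than as one blended latency number. Tables and columns are placeholders (and note JOINs in Pinot require the multi-stage query engine):

```python
queries = {
    "single-table aggregate": """
        SELECT country, COUNT(*) AS hits
        FROM events
        GROUP BY country
        ORDER BY hits DESC
        LIMIT 10
    """,
    "single JOIN": """
        SELECT u.plan, COUNT(*) AS hits
        FROM events e
        JOIN users u ON e.user_id = u.user_id
        GROUP BY u.plan
    """,
    "multi-table JOIN": """
        SELECT u.plan, a.region, COUNT(*) AS hits
        FROM events e
        JOIN users u ON e.user_id = u.user_id
        JOIN accounts a ON u.account_id = a.account_id
        GROUP BY u.plan, a.region
    """,
}
```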
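For #7, an illustrative slice of Pinot's tableIndexConfig; the column names are made up, and you should check the docs for the full menu of index types (inverted, sorted, range, star-tree, JSON, Bloom filter, text, geospatial, ...):

```python
table_index_config = {
    "tableIndexConfig": {
        "sortedColumn": ["user_id"],
        "invertedIndexColumns": ["country", "event_type"],
        "rangeIndexColumns": ["ts"],
        "bloomFilterColumns": ["user_id"],
        "starTreeIndexConfigs": [
            {
                "dimensionsSplitOrder": ["country", "event_type"],
                "functionColumnPairs": ["COUNT__*"],
            }
        ],
    }
}
# A benchmark that leaves all of this at defaults is not measuring what
# the engine can actually do.
```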
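And for #8, a bare-bones Torpedo Test: hold a steady background load, kill a server, and record the latency/error timeline through rebalance and recovery. The container name is a placeholder; on Kubernetes you'd delete a pod instead:

```python
import subprocess, threading, time
import requests

BROKER_SQL = "http://localhost:8099/query/sql"
QUERY = {"sql": "SELECT COUNT(*) FROM events"}
timeline = []  # (seconds_since_start, latency_seconds_or_None)

def load_loop(stop: threading.Event, t0: float):
    while not stop.is_set():
        start = time.time()
        try:
            requests.post(BROKER_SQL, json=QUERY, timeout=5)
            timeline.append((start - t0, time.time() - start))
        except requests.RequestException:
            timeline.append((start - t0, None))  # failed query
        time.sleep(0.1)  # ~10 QPS background load

stop = threading.Event()
t0 = time.time()
threading.Thread(target=load_loop, args=(stop, t0), daemon=True).start()

time.sleep(30)                                        # steady-state baseline
subprocess.run(["docker", "kill", "pinot-server-1"])  # torpedo away
time.sleep(300)                                       # watch rebalance + quiescence
stop.set()

failures = sum(1 for _, lat in timeline if lat is None)
print(f"{len(timeline)} queries, {failures} failures; plot the timeline to")
print("see latency during rebalance vs. after quiescence")
```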
Relatedly, it looks like they don't configure any indexes for Pinot in their benchmarks, even though indexing is one of Pinot's main selling points on the performance front - https://github.com/ClickHouse/ClickBench/issues/37.