Haven't, but it's worth noting that the hardware probably accounts for them edging out MapD: they're running on a 5-node Minsky cluster with NVLink, so, as arnon said, they're benefiting from roughly 9.5x faster transfer from disk than PCIe 3.0 would allow. That blog hasn't yet tested MapD's IBM Power version -- it would be interesting to see how it compares on that cluster.
- They've built their database on Postgres for query planning, but for any query that doesn't match what they've accelerated on GPU, they have no ability to fall back to running it on the CPU via Postgres.
https://youtu.be/oL0IIMQjFrs?t=3260
- Data is brought into GPU memory at table CREATE time, so the cost of transferring data from disk -> host RAM -> GPU RAM is not reflected in the query timings. That probably wouldn't work if you need to shuffle data in and out of GPU RAM across changing query workloads. https://youtu.be/oL0IIMQjFrs?t=1310
Note that it's at the top of the list probably because it's running on a cluster. It would be awesome to see such a comparison on some standard hardware, like a large AWS GPU instance (eg1.2xlarge).
Also note that the dataset is 600GB, so it won't fit on a single GPU -- not even close.
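To put a rough number on that: here's a back-of-envelope sizing sketch. The 16GB-per-card figure is my assumption (typical of P100/V100-class cards of that era), not something stated in the thread.

```python
import math

# Rough sizing: how many GPUs are needed just to hold the dataset in GPU memory.
dataset_gb = 600
gpu_ram_gb = 16  # assumed per-card memory (P100/V100-class); not from the thread

gpus_needed = math.ceil(dataset_gb / gpu_ram_gb)
print(gpus_needed)  # 38 cards just to hold the raw data, before any working space
```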
And the Postgres run was on 16GB of RAM and a rather slow SSD in a single-drive configuration. It would have been interesting to see results either in-memory or on a faster storage system.
The cost of GPUs doesn't make sense for the compute they offer.
According to the benchmark, the fastest 8-GPU node takes about 0.5 seconds. That node costs about $24/hour on AWS. The 21-node Spark cluster takes 6 seconds, but it only costs $4/hour.
An additional benefit of Spark is that it can be used for a much wider variety of operations than a GPU.
This cost disadvantage restricts GPU processing to niche use cases.
> According to the benchmark, the fastest 8-GPU node takes about 0.5 seconds. That node costs about $24/hour on AWS. The 21-node Spark cluster takes 6 seconds, but it only costs $4/hour.
Using your numbers, the GPU solution has half the cost per query for much better latency. How does that not make sense?
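A quick per-query cost calculation using the numbers quoted above (prices and timings are taken from the parent comment, not re-verified):

```python
# Back-of-envelope cost per query from the quoted benchmark numbers.
gpu_node_price_per_hr = 24.0    # 8-GPU AWS node, per the parent comment
gpu_query_seconds = 0.5

spark_cluster_price_per_hr = 4.0  # 21-node Spark cluster
spark_query_seconds = 6.0

gpu_cost = gpu_node_price_per_hr / 3600 * gpu_query_seconds
spark_cost = spark_cluster_price_per_hr / 3600 * spark_query_seconds

print(f"GPU:   ${gpu_cost:.5f} per query")    # ~$0.00333
print(f"Spark: ${spark_cost:.5f} per query")  # ~$0.00667
```

So per query executed, the GPU node comes out at about half the cost, while also being 12x faster.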
> This cost disadvantage restricts GPU processing to niche use cases.
> The cost of GPUs doesn't make sense for the compute they offer.
This assumes AWS pricing. If you build a farm of GPUs and buy in bulk, you get a much better cost basis. GPU farms are becoming more and more of a thing now and are definitely less 'niche'.
[1] https://tech.marksblogg.com/benchmarks.html
Edit: Unfortunately, each run is on different hardware, but it at least gives you an idea of what's possible.