
You can't use 10TFlops of compute on a GPU if you can't even feed it data quickly enough. The state of the art for getting data onto the GPU is Nvidia's NVLink, and you're capped at a theoretical max of 160GB/sec. Given how computationally trivial most analytics workloads are (ratios, reductions like sums, means, and variances, etc.), there's simply no way you're going to effectively max out the compute available on a GPU.
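To put rough numbers on that (a back-of-envelope sketch only, using the 160GB/sec link figure above and a one-add-per-element sum):

    # Back-of-envelope: how much compute a simple float32 sum reduction
    # actually needs when it's fed over a 160 GB/s link. Illustrative only.
    link_bw_bytes = 160e9        # assumed link bandwidth, bytes/sec
    bytes_per_elem = 4           # float32
    flops_per_elem = 1           # one add per element for a sum

    elems_per_sec = link_bw_bytes / bytes_per_elem   # ~40 billion elems/sec
    flops_needed = elems_per_sec * flops_per_elem

    print(f"{flops_needed / 1e12:.2f} TFLOP/s needed vs ~10 TFLOP/s available")

Even being generous about element width and ops per element, the transfer link, not the ALUs, is the bottleneck for this kind of workload.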

Searching sorted arrays is actually very common in these workloads. Why? Analytics workloads typically operate on timestamped data stored in sorted order, where you get perfect or near-perfect temporal and spatial locality. Thus even joins tend to be cheap.
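As a minimal sketch of why such joins are cheap (toy Python, assuming both sides are already sorted by timestamp with unique timestamps on each side): a merge join is a single sequential pass over each input.

    # Toy merge join over two tables already sorted by timestamp.
    # Real engines do this vectorized, but the access pattern is the same:
    # strictly sequential, so it's friendly to caches and prefetchers.
    def merge_join(left, right):
        """left/right: lists of (timestamp, payload) tuples, sorted, unique keys."""
        out, i, j = [], 0, 0
        while i < len(left) and j < len(right):
            if left[i][0] < right[j][0]:
                i += 1
            elif left[i][0] > right[j][0]:
                j += 1
            else:
                out.append((left[i][0], left[i][1], right[j][1]))
                i += 1
                j += 1
        return out

    print(merge_join([(1, "a"), (3, "b"), (5, "c")],
                     [(3, "x"), (4, "y"), (5, "z")]))
    # -> [(3, 'b', 'x'), (5, 'c', 'z')]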

With Skylake and AMD Epyc closing in on 300GB/sec of memory bandwidth, and with much lower cost per GB of RAM than GPU memory, the case for GPUs in this application seems dubious.

I will grant you that GPUs have a place in more complex operations like sorts and joins with table scans. They also blow past CPUs when it comes to expensive computations on a dataset (where prefetching can mask latencies nicely).




Yeah, it will definitely depend on the workload.

A good example of a dense sort + join GPU workload would be looking for "Cliques" of Twitter or Facebook users. A "clique" of 3 (strictly speaking, a directed triangle) would be three users, A, B, and C, where A follows B, B follows C, and C follows A.

You'd perform this analysis with two self-joins: join the follower-followee table onto itself twice to get three-hop paths, then keep only the paths that loop back to the starting user.
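A toy version of those two self-joins (Python/pandas, with a made-up follows table of (src, dst) pairs; it doesn't bother deduplicating the three rotations of each triangle):

    import pandas as pd

    # Hypothetical follower -> followee edge table; names are made up here.
    follows = pd.DataFrame([("A", "B"), ("B", "C"), ("C", "A"), ("A", "C")],
                           columns=["src", "dst"])

    # Relabel the columns so each self-join lines up on the shared node.
    ab = follows.rename(columns={"src": "a", "dst": "b"})
    bc = follows.rename(columns={"src": "b", "dst": "c"})
    ca = follows.rename(columns={"src": "c", "dst": "a2"})

    two_hop = ab.merge(bc, on="b")          # a follows b, b follows c
    three_hop = two_hop.merge(ca, on="c")   # ...and c follows someone

    # Keep only paths that close the loop: c follows the user we started from.
    triangles = three_hop[three_hop["a"] == three_hop["a2"]][["a", "b", "c"]]
    print(triangles)  # each directed triangle appears once per rotation

At real social-graph scale those intermediate join results get enormous, which is exactly the kind of sort/join-heavy work where a GPU can pay off.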

----------

So it really depends on your workload. But I can imagine that someone analyzing these kinds of tougher join operations would welcome GPUs to accelerate the task.



