> GPU databases are brilliant for cases where the working set can live entirely within the GPU's memory.
Probably true for current software. But there are numerous algorithms that let groups of nodes work together on database joins, even if the tables don't fit in one node.
Consider Table A (1000 rows) and Table B (1,000,000 rows). Let's say you want to compute A join B, but B doesn't fit in your memory (let's say you only have room for 5000 rows). Well, you can split Table B into 250 pieces of 4000 rows each.
Table A (1000 rows) + one piece of Table B (4000 rows) is 5000 rows, which fits in memory. :-)
You then compute A join B[0:4000], then A join B[4000:8000], and so on. Concatenating the 250 partial results gives the full A join B, since every row of B lands in exactly one piece. In fact, all 250 of these joins can be done in parallel.
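Here is a rough Python sketch of that idea, not how any real GPU database does it. The names `hash_join` and `chunked_join` are made up for illustration, rows are assumed to be `(key, value)` tuples joined on equal keys, and a thread pool stands in for the group of nodes. Table B is held in one list here only to keep the example short; in practice each piece would be read from disk or sent to a separate node.

```python
from concurrent.futures import ThreadPoolExecutor


def hash_join(a_rows, b_rows):
    """Inner-join two lists of (key, value) rows on key."""
    # Build a hash index on the small table (A).
    index = {}
    for key, a_val in a_rows:
        index.setdefault(key, []).append(a_val)
    # Probe the index with each row of the B piece.
    return [
        (key, a_val, b_val)
        for key, b_val in b_rows
        for a_val in index.get(key, ())
    ]


def chunked_join(a_rows, b_rows, chunk_size):
    """Join A against B one piece at a time and concatenate the results."""
    chunks = [
        b_rows[i:i + chunk_size] for i in range(0, len(b_rows), chunk_size)
    ]
    # Each piece is independent, so the joins can run in parallel.
    # Threads are a stand-in for separate nodes (an assumption of this sketch).
    with ThreadPoolExecutor() as pool:
        parts = pool.map(lambda chunk: hash_join(a_rows, chunk), chunks)
        return [row for part in parts for row in part]


if __name__ == "__main__":
    # Sizes from the example above: 1000 rows in A, 1,000,000 rows in B,
    # split into pieces of 4000 rows. The key distribution is invented.
    table_a = [(k, f"a{k}") for k in range(1000)]
    table_b = [(i % 2000, i) for i in range(1_000_000)]
    joined = chunked_join(table_a, table_b, 4000)
    print(len(joined))
```

Each worker only ever holds Table A plus one piece of Table B, which is the 5000-row budget from the example.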
----------
As such, it's theoretically possible to perform database joins on parallel systems, even if the tables don't fit into any particular node's RAM.