The abstract of OP's link mentions "Processing-Using-DRAM (PUD)" as exactly that, using off-the-shelf components. I do wonder how they achieve that. I guess by fiddling with the controller in ways that are not standard but get the job (processing data in memory) done.
Edit: Oh, and cpldcpu linked the ComputeDRAM paper, which explains how to do it with off-the-shelf parts.