Programming on Parallel Machines: GPU, Multicore, Clusters and More (ucdavis.edu)
165 points by rahiel on March 15, 2016 | 14 comments



Professor Matloff could add an FPGA+CPU section to this book.

A lot of interesting research can be done with FPGA+CPU in parallel computing.


I feel I can talk about this! My master's thesis was computation of a neural network using FPGA + CPU. The original SNN (spiking neural network) code was in C++; my thesis reimplemented it in OpenCL, using Altera's OpenCL-to-FPGA toolchain.

Essentially, I took the innermost loop (the one that computed whether a neuron would spike or not) and implemented it as a kernel in OpenCL.
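
For a rough idea of what that kind of per-neuron kernel looks like, here is a minimal OpenCL C sketch of a simple leaky-integrate-and-fire spike check. The names and the exact neuron model here are my own illustration, not the actual thesis code:

    /* One work-item per neuron: decide whether it spikes this timestep.
       Simple leaky-integrate-and-fire update; purely illustrative. */
    __kernel void spike_step(__global float *v,          /* membrane potentials */
                             __global const float *i_in, /* input currents */
                             __global uchar *spiked,     /* output spike flags */
                             const float leak,
                             const float threshold,
                             const float v_reset)
    {
        int n = get_global_id(0);
        float vn = v[n] * leak + i_in[n];   /* integrate with decay */
        if (vn >= threshold) {              /* fire and reset */
            spiked[n] = 1;
            vn = v_reset;
        } else {
            spiked[n] = 0;
        }
        v[n] = vn;
    }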

Step 1 was showing the speedup from single-threaded C++ to the OpenCL kernel: 6-10x on an i7-2600K running on all logical cores. Step 2 was the FPGA implementation. That meant pre-shipping data to the FPGA while the CPU calculated other things, starting computation on the FPGA, and receiving the responses back on the CPU. Performance was 75x compared to the single-threaded C++ code.

Important notes that I didn't expect: the bottleneck was memory-transfer bandwidth across PCI-E; power consumption was lower on the FPGA than on the CPU; development time was significantly shorter; and altering the design is simple when going OpenCL -> FPGA, compared to Verilog -> FPGA.
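
For the curious, the "pre-ship data while the CPU works" pattern boils down to non-blocking transfers plus events on the host side. A rough sketch using the standard OpenCL host API (the buffer, kernel, and helper names are made up for illustration, and kernel arguments are assumed to have been set already with clSetKernelArg):

    #include <CL/cl.h>

    extern void do_other_cpu_work(void);    /* placeholder for host-side work */

    /* Enqueue a non-blocking host->device transfer, keep the CPU busy,
       then launch the kernel only once the transfer event has completed. */
    void step(cl_command_queue queue, cl_kernel spike_kernel,
              cl_mem dev_buf, const float *host_data,
              size_t bytes, size_t global_size)
    {
        cl_event xfer_done;

        /* CL_FALSE = non-blocking: the call returns immediately and the
           copy proceeds over PCI-E in the background. */
        clEnqueueWriteBuffer(queue, dev_buf, CL_FALSE, 0, bytes,
                             host_data, 0, NULL, &xfer_done);

        do_other_cpu_work();                /* overlap: CPU computes while data moves */

        /* The kernel waits on the transfer event rather than on the host. */
        clEnqueueNDRangeKernel(queue, spike_kernel, 1, NULL,
                               &global_size, NULL, 1, &xfer_done, NULL);
        clFinish(queue);
        clReleaseEvent(xfer_done);
    }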


In the late nineties I had the rare opportunity to work with a very exotic “FPGA hypercomputer” (yes, the marketing makes me cringe) that basically consisted of an array of FPGAs that instantiated logic at the hardware level and dynamically readjusted it if required. It was a prototype built by the now-defunct Starbridge Systems and designed, amongst others, by Faggin of Intel microprocessor fame.


Interesting....

What kind of tool did you use for OpenCL to FPGA?



>Bottleneck was memory transfer bandwidth across PCI-E

This is why Nvidia is working on NVLink to replace PCI-E.


Is your thesis online where we can look at it?


I took this class IRL with Matloff when I was an undergrad. Highly recommended, great course from a great professor <3


This kind of book is always my favorite. When you cover the whole topic and devote a chapter to each subtopic, you give your readers a broad and firm grasp of what's going on in the field. Books like this make perfect introductions.

Sincerely, Thank you.


Thank you so much for this resource! Just the thing I've been studying lately. :)


If I read this book, would it be practical to build an Erlang VM targeting GPUs? The Erlang GPU work I've seen provides access via NIFs, but as I understand it, those are going to continue to hit the PCI-E bottleneck. I'm speculating about the feasibility of Erlang putting its "processes" onto the GPU cores, with the data staying on the GPU until it needed to do network, disk, or other OS-mediated access.

If the question is ignorant, I plead guilty.


GPUs are best under SIMD conditions: single instruction, multiple data. You're talking about running `eval` thousands of times. Each unit of execution is going to have different data, because each process is executing different code (especially when you consider different branches of a conditional statement).
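
A tiny illustration of the branch problem (hypothetical OpenCL C, not anything from the book): when work-items in the same warp/wavefront disagree on a branch, the hardware runs both paths one after the other with the non-taken lanes masked off, so per-process "different code" throws away the SIMD advantage:

    __kernel void divergent(__global const int *op, __global float *x) {
        int i = get_global_id(0);
        /* Lanes that disagree on op[i] execute both branches serially. */
        if (op[i] == 0)
            x[i] = x[i] * 2.0f;   /* these lanes run first */
        else
            x[i] = sqrt(x[i]);    /* then these */
    }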

So, it wouldn't work that well :-)


I confusingly used 'data' in two senses there... the second was in the sense of 'code is data'.


My only complaint is that the PDF is formatted for print and uses typographic ligatures. To me, HTML-first makes sense for anything intended for widespread digital sharing. Reflowable and screen-reader-friendly text are minimum design accommodations for a large number of people with physical or hardware limitations.

Or to put it another way: I think that over the long run such features are more important than the license, because there are non-technical workarounds for the license that don't require duplicated effort.



