
Good question.

    Bandwidth of dual channel DDR4-3600: 48 GB/s
    Bandwidth of PCIe 4 x16: 26 GB/s
    Bandwidth of 3090 GDDR6X memory: 935.8 GB/s
Since neural network evaluation is usually bandwidth-limited, pushing the data over PCIe from CPU to GPU may actually be slower than running the evaluation on the CPU alone for typical neural networks.

https://www.microway.com/knowledge-center-articles/performan...

https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_proces...
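
To put rough numbers on that comparison, here is a back-of-the-envelope sketch using the three bandwidth figures quoted above. It assumes the evaluation is purely bandwidth-bound and that every byte of the model has to be read once per pass; the 40 GB model size and the function name are made up for illustration and are not from any benchmark.

```python
# Bandwidth figures quoted above, in GB/s.
BANDWIDTH_GB_S = {
    "dual-channel DDR4-3600 (CPU)": 48.0,
    "PCIe 4.0 x16 (host to GPU)": 26.0,
    "RTX 3090 GDDR6X (on-card)": 935.8,
}


def seconds_per_pass(model_gb: float, bandwidth_gb_s: float) -> float:
    """Lower bound on the time to read the whole model once at this bandwidth.

    Assumes a purely bandwidth-bound workload; ignores latency and compute.
    """
    return model_gb / bandwidth_gb_s


MODEL_GB = 40.0  # hypothetical model size, chosen only for illustration

for name, bw in BANDWIDTH_GB_S.items():
    t = seconds_per_pass(MODEL_GB, bw)
    print(f"{name}: {t:.3f} s per pass, at most {1 / t:.2f} passes/s")
```

Under those assumptions the PCIe path comes out roughly 1.8x slower than the CPU reading straight from RAM (48 / 26), while data already resident in GPU memory would be about 36x faster to read than data pulled over PCIe (935.8 / 26).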



And that's without even taking into account latency of accessing main memory through PCIe, which would make matters even worse.


Ok, but at least it would run.


What's the point of running it on the GPU if doing so makes it slower than running it on the CPU? Just run it on the CPU at that point.


I once tried to start Firefox (back in the 2.5-3.0 days >:D) on a Celeron with 64MB RAM.

It worked perfectly fine, with the sole exception that the HDD LED was on solid the whole time, a single window took just over a literal half an hour to open, and loading a webpage took about 1-2 minutes.

But it worked.


It already does, on the CPU.



