
Hybridizer simply generates CUDA C++ code from C#, which is then compiled to PTX; it does the same for AVX, which you can then compile with Intel’s compiler or gcc. It’s not particularly good: you often need to debug the generated CUDA source code yourself, and it doesn’t always play well with the CUDA programming model, especially its more advanced features.

And again, it’s a commercial product developed by a 3rd party. While some people use it, I wouldn’t even put it down as a rounding error when accounting for why CUDA has the market share it has.



It is like everyone arguing about C++ for AAA studios, as if everyone were making Crysis and Fortnite clones, while forgetting the legions of people making money selling A games.

Or forgetting the days when games written in C were actually full of inline Assembly.

It is still CUDA, regardless of whether it goes through PTX or CUDA C++ as an implementation detail for the high-level code.


You aren’t seeing the forest for the trees.

The market for these secondary implementations is tiny, and that is coming from someone who worked at a company that had CUDA executed from a spreadsheet.

The C#/Java et al. support isn’t what made CUDA popular, nor what would make OneAPI succeed or fail.

CUDA became popular because of its architecture: by using an intermediate assembly (PTX) to allow backward and forward compatibility, it had excellent support across the entire NVIDIA GPU stack, which meant it could run on everything from bargain-bin laptops with the cheapest dGPU to HPC cards. It came with a large set of high-performance libraries, and yes, the C++ programming model is why it was adopted so well by the big players.
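To illustrate the PTX point, here is roughly what loading PTX through the CUDA driver API looks like (a sketch only: the kernel name and PTX string are placeholders and error handling is omitted). The driver JIT-compiles the PTX into native code for whatever GPU is actually installed, which is where the forward compatibility comes from:

  #include <cuda.h>  // CUDA driver API

  int main() {
      cuInit(0);
      CUdevice dev;
      cuDeviceGet(&dev, 0);
      CUcontext ctx;
      cuCtxCreate(&ctx, 0, dev);

      // PTX emitted ahead of time, e.g. by: nvcc -ptx kernel.cu -o kernel.ptx
      const char* ptx = /* contents of kernel.ptx */ "";

      // The driver JIT-compiles the PTX for whichever GPU is present,
      // even one that did not exist when the PTX was generated.
      CUmodule mod;
      cuModuleLoadData(&mod, ptx);

      CUfunction fn;
      cuModuleGetFunction(&fn, mod, "my_kernel");  // hypothetical kernel name

      // ... set up arguments and launch with cuLaunchKernel ...

      cuModuleUnload(mod);
      cuCtxDestroy(ctx);
      return 0;
  }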

And arguably even more important is that when ML and GPU compute exploded (and that wasn’t that long ago), NVIDIA was the top dog in town from a business perspective. CUDA could’ve been dog shit, but when AMD could barely launch a GPU that could compete with NVIDIA’s mid range for multiple generations, it wouldn’t have mattered.


>CUDA could’ve been dog shit but when AMD could barely launch a GPU that could compete with NVIDIA’s mid range for multiple generations it wouldn’t have mattered.

This is really the only point to be made. Intel could release open source GPU drivers and GPGPU frameworks for every language under the sun, personally hold workshops in every city, and even give every developer a back massage, and everyone would likely still use CUDA.

The performance gap is still so large.


Intel has one huge advantage though: OneAPI already supports their existing CPUs and GPUs (Gen9-12 graphics), and it’s already cross-platform, available on Linux, macOS and Windows. This was AMD’s biggest failure: no support for consumer graphics, no support for APUs (which means laptops are cut out of the equation), and Linux only, which limits your commercial deployment to the datacenter and a handful of “nerds”.
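To make that concrete, a minimal OneAPI/SYCL kernel looks roughly like this (a sketch, assuming the DPC++ toolchain, compiled with something like icpx -fsycl). The same source runs on an Intel CPU, an iGPU or a dGPU, whichever device the runtime finds:

  #include <sycl/sycl.hpp>
  #include <vector>
  #include <iostream>

  int main() {
      const size_t n = 1024;
      std::vector<float> x(n, 1.0f), y(n, 2.0f);

      // Picks whatever device is available: CPU, integrated GPU or discrete GPU.
      sycl::queue q{sycl::default_selector_v};
      std::cout << "Running on: "
                << q.get_device().get_info<sycl::info::device::name>() << "\n";

      {
          sycl::buffer<float> bx(x.data(), sycl::range<1>(n));
          sycl::buffer<float> by(y.data(), sycl::range<1>(n));

          q.submit([&](sycl::handler& h) {
              sycl::accessor ax(bx, h, sycl::read_only);
              sycl::accessor ay(by, h, sycl::read_write);
              h.parallel_for(sycl::range<1>(n), [=](sycl::id<1> i) {
                  ay[i] += 2.0f * ax[i];  // y = 2*x + y
              });
          });
      }  // buffers go out of scope here and copy results back to x/y

      std::cout << "y[0] = " << y[0] << "\n";  // expect 4
      return 0;
  }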

The vast majority of CUDA applications don’t need 100s of HPC cards to execute. Consumers want their favorite video or photo editor to work, they want to be able to apply filters to their Zoom calls, and students and researchers want to be able to develop and run POCs on their laptops. As long as Adobe and the like adopt OneAPI, and as long as Intel provides a backend for common ML frameworks like PyTorch and TF (which they already do), performance at that point won’t matter as much as you think.

Performance at this scale is a business question: if AMD had a decent ecosystem but lacked performance, they could’ve priced their cards accordingly and still captured some market share. Their problem was that they couldn’t actually release hardware in time, their shipments were tiny, and they didn’t have the software to back it up.

Intel, despite all the doom and gloom, still ships more chips than AMD and NVIDIA combined. If OneAPI is even remotely technically competent (and from my very limited experience with it, it is looking rather good), Intel can offer developers a huge addressable market overnight with a single framework.


I am not denying that C++ is very relevant for CUDA (since version 3.0); it is also why I never bothered to touch OpenCL.

And when Khronos woke up to that fact, alongside SPIR, it was already too late for anyone to care.

Regarding the trees, I guess my point is that regardless of how tiny they are, the developers behind those stacks would rather bet on CUDA and eventually collaborate with NVidia than go after the alternatives.

So the alternatives to CUDA aren't even able to significantly attract those devs to their platforms, given the tooling around CUDA to support their efforts.


Yep, CUDA running on literally anything and everything definitely helped its success. So many data scientists and ML engineers got into CUDA by playing with their gaming GPUs.


Which is exactly the advantage Intel has over AMD: they aren’t locked to Linux only and they support iGPUs. ROCm is essentially an extension of the Linux display driver stack at this point, and it barely supports any consumer hardware and, most importantly, APUs.

I would really like to find the people at AMD who are responsible for the ROCm roadmap and ask them WTF they were thinking...



