
> Correction: Nobody will be able to use the AMD hardware (outside of computer graphics) because everybody has been locked-in with CUDA on Nvidia.

NVIDIA open-sourced their CUDA implementation to the LLVM project 5 years ago, which is why clang can compile CUDA today, and why Intel and PGI have clang forks compiling CUDA to multi-threaded and vectorized x86-64 using OpenMP.

That you can't compile CUDA to AMD GPUs isn't NVIDIA's fault, it's AMD, for deciding to pursue OpenCL first, then HSA, and now HIP.



> Do you work for AMD

I do not. And I use NVidia hardware regularly for GPGPU. But I hate fanboyism.

> NVIDIA open-sourced their CUDA implementation to the LLVM project 5 years ago

Correction: Google developed an internal CUDA implementation for their own needs based on LLVM, which Nvidia afterwards barely supported for its own needs.

Nothing is "stable" nor "branded" in this work... Consequently, 99% of public open-source CUDA-using software still compiles ONLY with the proprietary CUDA toolchain, ONLY on NVidia hardware. And this is not going to change anytime soon.

> one from PGI, that compiles CUDA to multi-threaded x86-64 code using OpenMP.

The PGI compiler is proprietary and now the property of NVidia. It was previously proprietary and independent, but mainly used for its GPGPU capability through OpenACC. The OpenACC backend targets directly the nvidiaptx (proprietary) format. Nothing related to CUDA.

> Intel being the main vendor pushing for a parallel STL in the C++ standard

That's wrong again.

Most of the work done on the parallel STL and by the C++ committee originates from HPX and the STELLAR Group (http://stellar-group.org/libraries/hpx/).

They are pretty smart people and deserve at least respect and credit for what they have done.

More information from Hartmut Kaiser (a very nice guy, btw) here (https://www.youtube.com/watch?v=6Z3_qaFYF84).

They were the precursors of the idea of parallel "algorithms" in the STL, and the concept of "execution policy" you have in C++17 comes from them.

In defense of Intel (and to my knowledge), they provided the first OSS compiler implementation of it.


> But I hate fanboyism.

"The only excuse I can see to this attitude is greed" sounds pretty fanboyish to me. :-)

I've never understood why Microsoft, or Adobe, or Autodesk, or Synopsys, or Cadence or any other pure software company is allowed to charge as much as the market will bear for their products, often more per year than Nvidia's hardware, but when a company makes software that runs on dedicated hardware, it's called greed. I don't think it's an exaggeration when I say that, for many laptops with a Microsoft Office 365 license, you pay more over the lifetime of the laptop for the software license than for the hardware itself. And it's definitely true for most workstation software.

When you use Photoshop for your creative work, you lock your design IP to Adobe's Creative Suite. When you use CUDA to create your own compute IP, you lock yourself to Nvidia's hardware.

In both cases, you're going to pay an external party. In both cases, you decide that this money provides enough value to be worth paying for.


> Correction: Google developed an internal CUDA implementation for their own needs based on LLVM, which Nvidia afterwards barely supported for its own needs.

This is wildly inaccurate.

While Google did develop a PTX backend for LLVM, the student who worked on that as part of a GSoC was later hired by NVIDIA, and ended up contributing the current NVPTX backend that clang uses today. The PTX backend that Google contributed was removed some time later.

> Nothing is "stable" nor "branded" in this work.

This is false. The NV part of the backend name (NVPTX) literally brands this backend as NVIDIA's PTX backend, in strong contrast with the other PTX backend that LLVM used to have (it actually had both for a while).

> OpenACC backend targets directly the nvidiaptx (proprietary) format.

This is false. Source: I've used the PGI compiler on some Fortran code, and you can mix OpenACC with CUDA Fortran just fine and compile to x86-64 using OpenMP to target x86 CPUs. No NVIDIA hardware involved.

> That's wrong again.
>
> Most of the work done on the parallel STL and by the C++ committee originates from HPX and the STELLAR Group

This is also wildly inaccurate. The Parallel STL work actually originated with the GCC parallel STL, the Intel TBB, and NVIDIA Thrust libraries [0]. The author of Thrust was the editor of the Parallelism TS and is the chair of the Parallelism SG. The members of the STELLAR group that worked on HPX started collaborating more actively with ISO once they started working at NVIDIA after their PhDs. One of them chairs the C++ Library Evolution Working Group. The Concurrency working group is also chaired by NVIDIA (by the other NVIDIA author of the original Parallelism TS).

AMD is nowhere to be found in this type of work.

[0] http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n372...


> While Google did develop a PTX backend for LLVM, the student who worked on that as part of a GSoC was later hired by NVIDIA, and ended up contributing the current NVPTX backend that clang uses today.

You more or less reformulated what I said. It might become used one day behind a proprietary, rebranded blob of NVidia's, but the fact is that today close to nobody uses it for production in the wild, and it is not even officially supported.

> This is false. The NV part of the backend name (NVPTX) literally brands this backend as NVIDIA's PTX backend.

It does not mean it's stable or used. I do not know of a single major GPGPU software package in existence that ever used it in an official distribution. Like I said.

> CUDA Fortran just fine

CUDA Fortran, yes, you said it: CUDA Fortran. The rest is OpenACC.

> The Parallel STL work actually originated with the GCC parallel STL, the Intel TBB, and NVIDIA Thrust libraries

My apologies for that. I was unaware of that earlier work.

> AMD is nowhere to be found in this type of work.

I do not think I ever said anything about AMD.


> CUDA Fortran, yes, you said it: CUDA Fortran. The rest is OpenACC.

You can also mix C, OpenACC, and CUDA C, and compile to x86-64. So I'm really not sure about what point you are trying to make here.

You were claiming that OpenACC and CUDA only run on Nvidia's hardware, but I suppose you now agree that this isn't true.

I do agree that PGI is still Nvidia-owned, but there are other compilers that do what PGI does.


> You were claiming that OpenACC and CUDA only run on Nvidia's hardware, but I suppose you now agree that this isn't true.

I do not think I ever said that OpenACC runs only on NVidia hardware. However, I still affirm that CUDA runs only on NVidia hardware, yes. Anything else is based on a code converter, in the best case.


> That you can't compile CUDA to AMD GPUs isn't NVIDIA's fault, it's AMD, for deciding to pursue OpenCL first, then HSA, and now HIP.

Using a branded, patented, competing proprietary technology and copying its API for your own implementation is madness that will surely land you in front of a court.

It seems that even Google learned that the hard way (https://en.wikipedia.org/wiki/Google_v._Oracle_America)


How come? There are CUDA C++ and CUDA C toolchains available under an MIT license, large parts of which were contributed by NVIDIA.

How can they sue you for using something that they give you under a license that says "we allow you to do whatever you want with it"?


The MIT license doesn't have an express patent grant. If Nvidia has a patent on some technology used by the open-source code, they could sue you for patent infringement if you use it in a way that displeases them. What they can't do is sue you for copyright infringement.


Google v Oracle is still unsettled.

Most other legal precedent was that it was fine to clone an API.


> Most other legal precedent was that it was fine to clone an API.

CUDA is more than an API. It is a technology under copyright, and very likely patented too. Even the API itself contains multiple references to "CUDA" in function calls and variable names.


None of that protects it from being cloned under previous 9th Circuit precedent, except maybe patents, but I'm not aware of any patents that would protect against another CUDA implementation.


> Intel and PGI have clang forks compiling CUDA to multi-threaded and vectorized x86-64 using OpenMP.

Where are these forks?



For PGI, all PGI compilers can do this; just pick x86-64 as the target. There are also other forks online (just search for LLVM, CUDA, and x86 as keywords); some university groups have their forks on GitHub, where they compile CUDA to x86.



