> They make good hardware, but there's a lot of lock-in, and not a lot of transparency.
This sounds like you'd like NVIDIA to open-source all their software. I see this type of request a lot, but I don't see it happening.
NVIDIA's main competitive advantage over AMD and Intel is its software stack. AMD could release a GPGPU twice as powerful tomorrow for half the price and most current NVIDIA users wouldn't care, because what good is that if you can't program it? AMD's software offering is just poor; of course they open-source everything, they don't make any software worth buying.
ARM and Intel make great software (libraries like the Intel MKL and Intel SVML, compilers like icc and ifort, ...), and they don't open-source any of that either, for the same reasons as NVIDIA.
Intel and NVIDIA employ a lot of people to develop their software stacks. These people probably aren't cheap. AMD's strategy is to save a lot of money on software development, perhaps hoping that the open-source communities, or Intel and NVIDIA, will do it for free.
I also see these requests that Intel and NVIDIA should open-source everything together with the explanation that "I need this because I want to buy AMD stuff". That, right there, is the reason why they don't do it.
You want to know why NVIDIA has 99% of the cloud GPGPU hardware market and AMD 1%? If you think $10,000 for a V100 is expensive, do the math on what an AMD MI50 really costs: $5,000 for the hardware, and then a team of X engineers at >$100k each (how much do you think AI GPGPU engineers cost?) working for N years just to play catch-up on the part of the software stack that NVIDIA gives you with a V100 for free. That gets multiple millions of dollars more expensive very quickly.
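A back-of-the-envelope version of that math (team size, salary, and duration are invented stand-ins for X, >$100k, and N; none of these figures are real quotes):

```python
# Rough total cost of ownership for the two cards, per the argument above.
# All numbers are illustrative assumptions, not real prices.
v100_hw = 10_000      # $ for a V100, mature software stack included "for free"
mi50_hw = 5_000       # $ for an MI50, hardware only
engineers = 5         # X: team size (assumption)
salary = 150_000      # $ per engineer-year (assumption, ">$100k")
years = 3             # N: years of catch-up software work (assumption)

mi50_total = mi50_hw + engineers * salary * years
print(f"V100: ${v100_hw:,}  vs  MI50: ${mi50_total:,}")
```

Even with a small team and a short timeline, the "cheaper" card's true cost lands in the millions.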
> AMD could release a GPGPU twice as powerful tomorrow for half the price and most current NVIDIA users wouldn't care, because what good is that if you can't program it?
Correction: Nobody will be able to use the AMD hardware (outside of computer graphics) because everybody has been locked in to CUDA on Nvidia. They cannot change even if they want to: it is pure madness to rewrite an entire GPGPU software stack every two years just to change your hardware provider.
And I think it will remain like that until NVidia gets sued for antitrust.
> ARM and Intel make great software [..] doesn't open-source any of that either for the same reasons as NVIDIA.
That's propaganda and it's wrong.
Intel and ARM contribute a lot to OSS. Most of the software they release nowadays is open source. This includes compiler support, drivers, libraries, and entire dev environments: mkl-dnn, TBB, BLIS, ISPC, oneAPI, mbedTLS... ARM even has an entire foundation dedicated to contributing to OSS (https://www.linaro.org/).
Next to that, NVidia does close to nothing.
There is no justification for NVidia's attitude toward OSS. It reminds me of Microsoft in its darkest days.
The only excuse I can see for this attitude is greed.
I hope at least they do not contaminate Mellanox with their toxic policies. Mellanox was (up to now) an example of a successful open-source contributor/company with OpenFabrics (https://www.openfabrics.org/).
It would be tragic if that disappeared.
AMD doesn't even have GPGPU software for some of their cards. I have an RX 5700 XT and I can't use it for anything but gaming, because ROCm doesn't support Navi cards a whole year after their release.
It gets even worse. There was recently a regression in the 5.4, 5.5, and 5.6 kernels that hit me hard for a week or so on Manjaro last month. The system would just lock up or restart. I thought the graphics card had died when it happened once on Windows. It's working fine now, but these drivers have been out for 10 months.
Even worse, AMD has locked down the releases of some of their 'GPUOpen' software.
I don't necessarily have an opinion either way in this discussion, but I wanted to point out that Intel's latest MKL library does seem to be developed as an open-source project: https://github.com/oneapi-src/oneMKL
GPUs are immensely complex systems. Look at an API like Vulkan, plus its shading language, and tell me again it's simple. And that's a low-level interface.
Now add to that the enormous amount of software effort that goes into implementing efficient libraries like cuBLAS, cuDNN, etc. There's a reason other vendors have struggled to compete with NVidia.
Part of Nvidia's advantage comes from building the hardware and software side by side. No one was seriously tackling GPGPU until Nvidia created CUDA, and if you look at the rest of the graphics stack, Nvidia is the one driving the big innovations.
GPUs are sufficiently specialized in both interface and problem domain that GPU enhanced software is unlikely to appear without a large vendor driving development, and it would be tough for that vendor to fund application development if there is no lock in on the chips.
Which leads to the real question: what business model would enable GPU/AI software development without hardware lock-in? Game development has found a viable business model by charging game publishers.
Would you agree that your observations somewhat imply that a competitive free market is not a fit for all governable domains (and don't mistake governable for government there; we're still talking about the shepherding of innovation)?
Early tech investments are risky, but if your competition has tech 10 years more advanced than yours, there is probably no amount of money that would let you catch up, surpass them, and make enough profit to recover the investment, mainly because you can't buy time, your competitor won't stop innovating, they are making a profit and you aren't, and so on.
So to me the main realization here is that in tech, if one competitor ends up with tech that's 10 years more advanced than the competition, it is basically a divergence-type of phenomenon. It isn't worth it for the competition to even invest in trying to catch up, and you end up with a monopoly.
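A toy model of that divergence (all growth rates here are invented purely for illustration): even a challenger that improves faster every single year can stay behind a compounding 10-year lead for decades.

```python
# Toy compounding model: "capability" multiplies each year.
# 15%/yr for the incumbent, 20%/yr for the harder-investing challenger;
# the incumbent starts with 10 years of prior compounding. Rates are made up.
leader = 1.15 ** 10     # head start: 10 years of 15% annual improvement
challenger = 1.0
for year in range(20):
    leader *= 1.15      # incumbent keeps innovating off its profits
    challenger *= 1.20  # challenger invests harder and grows faster
print(challenger < leader)  # True: still behind after 20 years
```

The challenger's faster growth does eventually close the gap in this model, but only after decades of losses, which is exactly why nobody funds the attempt.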
This is a good callout: unlike in manufacturing, the supply chain for large software projects is almost universally vertically integrated. While it's possible to make a kit car that at least some people would buy, most of the big tech companies have reached the point of requiring hundreds of engineers working for years to compete.
Caveat that time has shown that monopolies tend to decay for various reasons; the tech world is littered with companies that grew too confident in their monopoly.
The problem with vertically integrated technology is that if a huge advancement appears at the lowest level of the stack, one that requires a re-implementation of the whole stack, a new startup building things from scratch can overthrow a large competitor, which would need to throw its stack away, or evolve it without breaking backward compatibility, etc.
Once you have put a lot of money into a product, it is very hard to start a new one from scratch and let the old one die.
I think you would need to take a fine-tooth comb to the definitions here. I could see a few different options emerging for non-Nvidia software, including:
- Cloud providers wishing to provide lower-CapEx solutions in exchange for increased OpEx and margin.
- Large Nvidia customers forming a foundation to shepherd Open implementations of common technology components
From a free market perspective both forms of transaction would be viable and incentivized, but neither option necessarily leads to an open implementation.
I have been saying similar things about GPUs for a very long time.
The GPU hardware is (comparatively) simple.
It is the software that sets GPU vendors apart. For gaming, that is drivers. For compute, that is CUDA.
On a relative scale, getting a decent GPU design may have a difficulty of 1, getting decent drivers to work well on all existing software is 10, and building the whole ecosystem around your drivers / CUDA + hardware is likely in the range of 50 to 100.
As far as I can tell, under Jensen's leadership, the chance of AMD or even Intel shaking up Nvidia's grasp on this domain is practically zero in the foreseeable future.
That is speaking as an AMD shareholder who really wants AMD to compete.
Letting AMD or Intel port everything that has been developed in CUDA themselves, as was done for PyTorch, is not sustainable and will always lag behind.
It can only help to create a monopoly in the long term.
As HIP continues to implement more of CUDA, I think we'll see more developers doing the port themselves once the barrier to porting is smaller. AMD has a lot of work to do, and I don't know whether they'll succeed or not, but IMO they have the right strategy.
> Correction: Nobody will be able to use the AMD hardware (outside of computer graphics) because everybody has been locked-in with CUDA on Nvidia.
NVIDIA open-sourced their CUDA implementation to the LLVM project 5 years ago, which is why clang can compile CUDA today, and why Intel and PGI have clang forks compiling CUDA to multi-threaded and vectorized x86-64 using OpenMP.
That you can't compile CUDA to AMD GPUs isn't NVIDIA's fault, it's AMD, for deciding to pursue OpenCL first, then HSA, and now HIP.
I do not. And I use NVidia hardware regularly for GPGPU. But I hate fanboyism.
> NVIDIA open-sourced their CUDA implementation to the LLVM project 5 years ago
Correction: Google developed an internal CUDA implementation based on LLVM for their own needs, which Nvidia afterwards barely supported, for its own needs.
Nothing in this work is "stable" or "branded"... Consequently, 99% of public open-source CUDA-using software still compiles ONLY with the proprietary CUDA toolchain, ONLY on NVidia hardware. And this is not going to change any time soon.
> one from PGI, that compile CUDA to multi-threaded x86-64 code using OpenMP.
The PGI compiler is proprietary and now the property of NVidia. It was previously proprietary and independent, but mainly used for its GPGPU capability through OpenACC. The OpenACC backend targets the (proprietary) NVIDIA PTX format directly. Nothing to do with CUDA.
> Intel being the main vendor pushing for a parallel STL in the C++ standard
That's wrong again.
Most of the work done on the parallel STL and by the C++ committee originates from the work of HPX and the STELLAR Group (http://stellar-group.org/libraries/hpx/).
They are pretty smart people and deserve at least respect and credit for what they have done.
"The only excuse I can see to this attitude is greed" sounds pretty fanboyish to me. :-)
I've never understood why Microsoft, or Adobe, or Autodesk, or Synopsys, or Cadence or any other pure software company is allowed to charge as much as the market will bear for their products, often more per year than Nvidia's hardware, but when a company makes software that runs on dedicated hardware, it's called greed. I don't think it's an exaggeration when I say that, for many laptops with a Microsoft Office 365 license, you pay more over the lifetime of the laptop for the software license than for the hardware itself. And it's definitely true for most workstation software.
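A quick sanity check on the Office 365 claim, with rough assumed prices (not official list prices):

```python
# Lifetime license cost vs hardware cost for a typical business laptop.
# Both prices are rough assumptions for illustration only.
laptop_price = 600      # $ one-off for a modest business laptop (assumption)
office_per_year = 150   # $ per seat per year for a business plan (assumption)
lifetime_years = 5      # typical laptop replacement cycle (assumption)

license_total = office_per_year * lifetime_years
print(license_total > laptop_price)  # True: the subscription outcosts the laptop
```

Under these assumptions the software costs $750 over the machine's life against $600 of hardware, and the gap only widens for pricier workstation software.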
When you use Photoshop for your creative work, you lock your design IP to Adobe's Creative Suite. When you use CUDA to create your own compute IP, you lock yourself to Nvidia's hardware.
In both cases, you're going to pay an external party. In both cases, you decide that this money provides enough value to be worth paying for.
> Correction: Google developed an internal CUDA implementation based on LLVM for their own needs, which Nvidia afterwards barely supported, for its own needs.
This is wildly inaccurate.
While Google did develop a PTX backend for LLVM, the student who worked on it as part of a GSoC was later hired by NVIDIA, and ended up contributing the current NVPTX backend that clang uses today. The PTX backend that Google contributed was removed some time later.
> Nothing is "stable" nor "branded" in this work.
This is false. The NV part of the backend name (NVPTX) literally brands this backend as NVIDIA's PTX backend, in strong contrast with the other PTX backend that LLVM used to have (it actually had both for a while).
> OpenACC backend targets directly the nvidiaptx (proprietary) format.
This is false. Source: I've used the PGI compiler on some Fortran code, and you can mix OpenACC with CUDA Fortran just fine, and compile to x86-64 using OpenMP to target just x86 CPUs. No NVIDIA hardware involved.
> That's wrong again.
>
> Most of the work done for the parallel STL and by the C++ committee originate from work from HPX and the STELLAR Group
This is also wildly inaccurate. The parallel STL work actually originated with the GCC parallel STL, the Intel TBB, and NVIDIA Thrust libraries [0]. The author of Thrust was the editor of the Parallelism TS and is the chair of the Parallelism SG. The members of the STELLAR Group who worked on HPX started collaborating more actively with ISO once they started working at NVIDIA after their PhDs. One of them chairs the C++ Library Evolution Working Group. The Concurrency working group is also chaired by NVIDIA (by the other NVIDIA author of the original Parallelism TS).
> While Google did develop a PTX backend for LLVM, the student who worked on it as part of a GSoC was later hired by NVIDIA, and ended up contributing the current NVPTX backend that clang uses today.
You more or less reformulated what I said. It might become used one day behind a proprietary, rebranded NVidia blob, but the fact is that today close to nobody uses it in production in the wild, and it is not even officially supported.
> This is false. The NV part of the backend name (NVPTX) literally brands this backend as NVIDIAs PTX backend.
It does not mean it's stable or used. I do not know of a single major GPGPU software package in existence that has ever used it in an official distribution. Like I said.
> CUDA Fortran just fine
CUDA Fortran, yes, you said it: CUDA Fortran. The rest is OpenACC.
> The Parallel STL work actually originated with the GCC parallel STL, the Intel TBB, and NVIDIA Thrust libraries
My apologies for that. I was unaware of this prior work.
> AMD is nowhere to be found in this type of work.
> You were claiming that OpenACC and CUDA only runs on nvidia's hardware, yet I suppose you now agree that this isn't true I guess.
I do not think I ever said that OpenACC runs only on NVidia hardware. However, I still affirm that CUDA runs only on NVidia hardware, yes. Anything else is based on code converters, at best.
> That you can't compile CUDA to AMD GPUs isn't NVIDIA's fault, it's AMD, for deciding to pursue OpenCL first, then HSA, and now HIP.
Using a branded, patented, proprietary technology from a competitor and copying its API for your own implementation is madness that will surely land you in front of a court.
The MIT license doesn't have an express patent grant. If Nvidia has a patent on some technology used by the open-source code, they could sue you for patent infringement if you use it in a way that displeases them. What they can't do is sue you for copyright infringement.
> Most other legal precedent was that it was fine to clone an API.
CUDA is more than an API. It is a technology under copyright and very likely patented too. Even the API itself contains multiple references to "CUDA" in function calls and variable names.
None of that protects it from being cloned under previous 9th Circuit precedent, except maybe patents, but I'm not aware of any patents that would protect against another CUDA implementation.
As for PGI, all PGI compilers can do this; just pick x86-64 as the target. There are also other forks online (just search for LLVM, CUDA, and x86 as keywords); some university groups have their forks on GitHub, where they compile CUDA to x.
People who are into RISC-V and other side projects/open stacks obviously have not worked on mission-critical problems.
When you have a jet engine hoisted up on a test rig and something fails in your DSP library, you don't hesitate to call Matlab engineering support for help within the next 30 minutes. Try that with some Python library. People give Matlab a lot of flak for being closed source, but there is a reason they exist: not for building a stupid toy project, but for real things where big $$$ is on the line. Python is also used in production everywhere, but if your application is a niche one, using a PyVISA library you git-cloned to connect to some DSP hardware is not very "production" ready. You need solid deps.
Don't get me wrong: open-source software runs in prod all the time (PostgreSQL, Linux, etc.). The smaller the application domain (specific DSP libraries or analysis stacks for wind turbines and such), the lower the availability of high-quality open-source software (and support).
My point is that reality hits you hard when it is anything where a lot of $$$ or people's time depend on it. Don't blame their engineers for using closed source tools.
It would suffice for NVIDIA to open-source enough specifications and perhaps some subset of core software to enable others to build high quality open source (or even proprietary) software that targets NVIDIA's architecture. They can't hire every programmer in the world; if other programmers can build high-performance software that takes advantage of their platform, that increases the value of their hardware.
Your comparison to Intel isn't valid: most software that runs on Intel processors isn't built with icc, and customers have a choice: they can use icc, gcc, clang, or a number of other compilers. The NVIDIA world isn't equivalent.
Anyone is free to target PTX and do their own compiler on top.
In fact, given that it has been there since version 3, there are compilers available for almost all major programming languages, including managed ones.
OpenCL, meanwhile, is a C world, where almost no one cares about the C++ extensions and even fewer vendors care about SPIR-V.
Also the community doesn't seem to be bothered that for a long time, the only SYCL implementation was a commercial one from CodePlay, trying to extend their compilers outside the console market.
> the community doesn't seem to be bothered that for a long time, the only SYCL implementation was a commercial one
Bothered has nothing to do with it. Implementing low level toolchains generally seems to require both a gargantuan effort and an incredible depth of knowledge. If it didn't, I think tooling and languages in general would be significantly better across the board.
What am I supposed to do, implement a SYCL compiler on my own? Forget it - I'll just keep writing GLSL compute shaders or OpenCL kernels until someone with lots of resources is able to foot the initial bill for a fully functional and open source implementation.
This is wrong - triSYCL is roughly the same age as ComputeCpp, and hipSYCL is only slightly younger. There has been a lot of academic interest in SYCL, but as with any new technology (especially niche technologies) it's always going to take time to get people on board.
Also, from a quick look at your profile, you seem to have quite a lot of comments criticizing or commenting on CodePlay. Do you have some sort of relationship or animosity with them?
I wish all the luck to CodePlay, the more success the better for them.
They are well appreciated among game developers, given their background.
My problem is how Khronos sells its APIs, leaves everyone alone to create their own patched SDKs, and then acts surprised that commercial APIs end up winning the hearts of the majority.
The situation has hardly changed since I did my thesis with OpenGL in late 90's, porting a particles visualization engine from NeXTSTEP to Windows.
Nothing that compares with CUDA, Metal, DirectX, LibGNMX, NVN tooling.
Hence my reference to CodePlay, as for a very long time their SDK was the only productive way to use SYCL.
Khronos likes to oversell the eco-system, and usually the issues and disparities across OEMs tend to be "forgotten" on their marketing materials.
This has literally been a back and forth argument since a 100 point post on slashdot was a groundbreaking event. I don't see it changing any time soon - honestly if anything on tech forums this argument frequently overshadows just how well NVIDIA is doing.
>> NVIDIA's main competitive advantage over AMD and Intel is its software stack. AMD could release a GPGPU twice as powerful tomorrow for half the price and most current NVIDIA users wouldn't care, because what good is that if you can't program it?
I always wonder why it is so hard for AMD to develop a true competitor to CUDA for AMD hardware. Rather than trying to solve GPGPU programming through open standards like OpenCL, just copy the concept of CUDA wholesale. They could still build it on top of LLVM etc. and release the whole thing as open source, but they would be free of design-by-committee frameworks like OpenCL, so they could focus on GPU programming and nothing else, and only on the platforms where the majority of the demand is. There is not much wrong with OpenCL; it's just not nearly as good/capable/easy-to-use as CUDA if all you are interested in is GPGPU programming.
AMD is a big company with a lot of revenue, especially recently, so why would it be so hard to have a team working full-time on creating a direct CUDA knock-off ASAP?
1. AMD has struggled in the past and even today on being profitable with their GPUs. Makes it difficult to entice an army of knowledgeable devs without consistent cash flow. Granted, the tide is turning with their profitable CPU business and equity has shot up.
2. More importantly I think that, being the underdog, AMD has to have a cheaper, open solution to compete. Why would a customer choose to go with AMD’s nascent and proprietary stack over Nvidia’s well established and nearly ubiquitous proprietary stack?
To be clear, I don’t think the problems are insurmountable. AMD won a couple HPC deals recently which should afford them the opportunity to build up their software and invest in a competitive hardware solution.