> They make good hardware, but there's a lot of lock-in, and not a lot of transparency.
This sounds like you'd like NVIDIA to open-source all their software. I see this type of request a lot, but I don't see it happening.
NVIDIA's main competitive advantage over AMD and Intel is its software stack. AMD could release a GPGPU twice as powerful tomorrow for half the price and most current NVIDIA users wouldn't care, because what good is that if you can't program it? AMD's software offering is just poor; of course they open-source everything, they don't make any software worth buying.
ARM and Intel make great software (libraries like the Intel MKL and Intel SVML, compilers like icc and ifort, ...), and they don't open-source any of that either, for the same reasons as NVIDIA.
Intel and NVIDIA employ a lot of people to develop their software stacks. These people probably aren't cheap. AMD's strategy is to save a lot of money on software development, perhaps hoping that the open-source communities, or Intel and NVIDIA, will do it for free.
I also see these requests that Intel and NVIDIA should open-source everything together with the explanation that "I need this because I want to buy AMD stuff". That, right there, is the reason why they don't do it.
You want to know why NVIDIA has 99% of the cloud GPGPU hardware market and AMD 1%? If you think $10,000 for a V100 is expensive, do the math on what an AMD MI50 really costs: $5,000 for the hardware, and then a team of X engineers at >$100k each (how much do you think AI GPGPU engineers cost?) working for N years just to play catch-up on the part of the software stack that NVIDIA gives you with a V100 for free. That gets multiple millions of dollars more expensive very quickly.
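A back-of-the-envelope version of that math (team size, salary, and duration are invented stand-ins for X, >$100k, and N; none of these figures are real quotes):

```python
# Rough total cost of ownership for the two cards, per the argument above.
# All numbers are illustrative assumptions, not real prices.
v100_hw = 10_000      # $ for a V100, mature software stack included "for free"
mi50_hw = 5_000       # $ for an MI50, hardware only
engineers = 5         # X: team size (assumption)
salary = 150_000      # $ per engineer-year (assumption, ">$100k")
years = 3             # N: years of catch-up software work (assumption)

mi50_total = mi50_hw + engineers * salary * years
print(f"V100: ${v100_hw:,}  vs  MI50: ${mi50_total:,}")
```

Even with a small team and a short timeline, the "cheaper" card's true cost lands in the millions.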
> AMD could release a GPGPU twice as powerful tomorrow for half the price and most current NVIDIA users wouldn't care, because what good is that if you can't program it?
Correction: Nobody will be able to use the AMD hardware (outside of computer graphics) because everybody has been locked in to CUDA on Nvidia. They cannot change even if they want to: it is pure madness to rewrite an entire GPGPU software stack every two years just to change your hardware provider.
And I think it will remain like that until NVidia gets sued for antitrust.
> ARM and Intel make great software [..] doesn't open-source any of that either for the same reasons as NVIDIA.
That's propaganda and it's wrong.
Intel and ARM contribute a lot to OSS. Most of the software they release nowadays is open source. This includes compiler support, drivers, libraries, and entire dev environments: mkl-dnn, TBB, BLIS, ISPC, oneAPI, mbedTLS... ARM even has an entire foundation dedicated to contributing to OSS (https://www.linaro.org/).
Next to that, NVidia does close to nothing.
There is no justification for NVidia's attitude toward OSS. It reminds me of Microsoft in its darkest days.
The only excuse I can see for this attitude is greed.
I hope at least they do not contaminate Mellanox with their toxic policies. Mellanox was (up to now) an example of a successful open-source contributor/company with OpenFabrics (https://www.openfabrics.org/).
It would be tragic if that disappeared.
AMD doesn't even have GPGPU software for some of their cards. I have an RX 5700 XT and I can't use it for anything but gaming, because ROCm doesn't support Navi cards a whole year after their release.
It gets even worse. There was recently a regression in the 5.4, 5.5, and 5.6 kernels that hit me hard for a week or so on Manjaro last month. The system would just lock up or restart. I thought the graphics card had died when it happened once on Windows. It's working fine now, but these drivers have been out for 10 months.
Even worse, AMD has locked down the releases of some of their 'GPUOpen' software.
I don't necessarily have an opinion either way in this discussion, but I wanted to point out that Intel's latest MKL library does seem to be developed as an open-source project: https://github.com/oneapi-src/oneMKL
GPUs are immensely complex systems. Look at an API like Vulkan, plus its shading language, and tell me again it's simple. And that's a low-level interface.
Now add to that the enormous amount of software effort that goes into implementing efficient libraries like cuBLAS, cuDNN, etc. There's a reason other vendors have struggled to compete with NVidia.
Part of Nvidia's advantage comes from building the hardware and software side by side. No one was seriously tackling GPGPU until Nvidia created CUDA, and if you look at the rest of the graphics stack, Nvidia is the one driving the big innovations.
GPUs are sufficiently specialized in both interface and problem domain that GPU enhanced software is unlikely to appear without a large vendor driving development, and it would be tough for that vendor to fund application development if there is no lock in on the chips.
Which leads to the real question: what business model would enable GPU/AI software development without hardware lock-in? Game development has found a viable business model by charging game publishers.
Would you agree that your observations somewhat imply that a competitive free market is not a fit for all governable domains (and don't mistake governable for government there; we're still talking about the shepherding of innovation)?
Early tech investments are risky, but if your competition has tech 10 years more advanced than yours, there is probably no amount of money that would let you catch up, surpass them, and make enough profit to recover the investment, mainly because you can't buy time, your competitor won't stop innovating, they are making a profit and you aren't, and so on.
So to me the main realization here is that in tech, if one competitor ends up with tech that's 10 years more advanced than the competition, it is basically a divergence-type of phenomenon. It isn't worth it for the competition to even invest in trying to catch up, and you end up with a monopoly.
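A toy model of that divergence (all growth rates here are invented purely for illustration): even a challenger that improves faster every single year can stay behind a compounding 10-year lead for decades.

```python
# Toy compounding model: "capability" multiplies each year.
# 15%/yr for the incumbent, 20%/yr for the harder-investing challenger;
# the incumbent starts with 10 years of prior compounding. Rates are made up.
leader = 1.15 ** 10     # head start: 10 years of 15% annual improvement
challenger = 1.0
for year in range(20):
    leader *= 1.15      # incumbent keeps innovating off its profits
    challenger *= 1.20  # challenger invests harder and grows faster
print(challenger < leader)  # True: still behind after 20 years
```

The challenger's faster growth does eventually close the gap in this model, but only after decades of losses, which is exactly why nobody funds the attempt.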
This is a good callout: unlike in manufacturing, the supply chain for large software projects is almost universally vertically integrated. While it's possible to make a kit car that at least some people would buy, most of the big tech companies have reached the point of requiring hundreds of engineers working for years to compete.
Caveat that time has shown that monopolies tend to decay for various reasons; the tech world is littered with companies that grew too confident in their monopoly.
The problem with vertically integrated technology is that if a huge advancement appears at the lowest level of the stack, one that requires a re-implementation of the whole stack, a new startup building things from scratch can overthrow a large competitor, which would need to throw its stack away, or evolve it without breaking backward compatibility, etc.
Once you have put a lot of money into a product, it is very hard to start a new one from scratch and let the old one die.
I think you would need to take a fine-tooth comb to the definitions here. I could see a few different options emerging for non-Nvidia software, including:
- Cloud providers wishing to provide lower-CapEx solutions in exchange for increased OpEx and margin.
- Large Nvidia customers forming a foundation to shepherd Open implementations of common technology components
From a free market perspective both forms of transaction would be viable and incentivized, but neither option necessarily leads to an open implementation.
I have been saying similar things about GPUs for a very long time.
The GPU hardware is (comparatively) simple.
It is the software that sets GPU vendors apart. For gaming, that is drivers. For compute, that is CUDA.
On a relative scale, getting a decent GPU design may have a difficulty of 1, getting decent drivers to work well on all existing software is 10, and building the whole ecosystem around your drivers / CUDA + hardware is likely in the range of 50 to 100.
As far as I can tell, under Jensen's leadership, the chance of AMD or even Intel shaking up Nvidia's grasp on this domain is practically zero in the foreseeable future.
That is speaking as an AMD shareholder who really wants AMD to compete.
Letting AMD or Intel port everything that has been developed in CUDA themselves, as was done for PyTorch, is not sustainable and will always lag behind.
It can only help to create a monopoly in the long term.
As HIP continues to implement more of CUDA, I think we'll see more developers doing the port themselves once the barrier to porting is smaller. AMD has a lot of work to do, and I don't know whether they'll succeed or not, but IMO they have the right strategy.
> Correction: Nobody will be able to use the AMD hardware (outside of computer graphics) because everybody has been locked-in with CUDA on Nvidia.
NVIDIA open-sourced their CUDA implementation to the LLVM project 5 years ago, which is why clang can compile CUDA today, and why Intel and PGI have clang forks compiling CUDA to multi-threaded and vectorized x86-64 using OpenMP.
That you can't compile CUDA to AMD GPUs isn't NVIDIA's fault, it's AMD, for deciding to pursue OpenCL first, then HSA, and now HIP.
I do not. And I use NVidia hardware regularly for GPGPU. But I hate fanboyism.
> NVIDIA open-sourced their CUDA implementation to the LLVM project 5 years ago
Correction: Google developed an internal CUDA implementation based on LLVM for their own needs, which Nvidia afterwards barely supported, for its own needs.
Nothing in this work is "stable" or "branded"... Consequently, 99% of public open-source CUDA-using software still compiles ONLY with the proprietary CUDA toolchain, ONLY on NVidia hardware. And this is not going to change any time soon.
> one from PGI, that compile CUDA to multi-threaded x86-64 code using OpenMP.
The PGI compiler is proprietary and now the property of NVidia. It was previously proprietary and independent, but mainly used for its GPGPU capability through OpenACC. The OpenACC backend targets the (proprietary) NVIDIA PTX format directly. Nothing to do with CUDA.
> Intel being the main vendor pushing for a parallel STL in the C++ standard
That's wrong again.
Most of the work done on the parallel STL and by the C++ committee originates from the work of HPX and the STELLAR Group (http://stellar-group.org/libraries/hpx/).
They are pretty smart people and deserve at least respect and credit for what they have done.
"The only excuse I can see to this attitude is greed" sounds pretty fanboyish to me. :-)
I've never understood why Microsoft, or Adobe, or Autodesk, or Synopsys, or Cadence or any other pure software company is allowed to charge as much as the market will bear for their products, often more per year than Nvidia's hardware, but when a company makes software that runs on dedicated hardware, it's called greed. I don't think it's an exaggeration when I say that, for many laptops with a Microsoft Office 365 license, you pay more over the lifetime of the laptop for the software license than for the hardware itself. And it's definitely true for most workstation software.
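A quick sanity check on the Office 365 claim, with rough assumed prices (not official list prices):

```python
# Lifetime license cost vs hardware cost for a typical business laptop.
# Both prices are rough assumptions for illustration only.
laptop_price = 600      # $ one-off for a modest business laptop (assumption)
office_per_year = 150   # $ per seat per year for a business plan (assumption)
lifetime_years = 5      # typical laptop replacement cycle (assumption)

license_total = office_per_year * lifetime_years
print(license_total > laptop_price)  # True: the subscription outcosts the laptop
```

Under these assumptions the software costs $750 over the machine's life against $600 of hardware, and the gap only widens for pricier workstation software.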
When you use Photoshop for your creative work, you lock your design IP to Adobe's Creative Suite. When you use CUDA to create your own compute IP, you lock yourself to Nvidia's hardware.
In both cases, you're going to pay an external party. In both cases, you decide that this money provides enough value to be worth paying for.
> Correction: Google developed an internal CUDA implementation based on LLVM for their own needs, which Nvidia afterwards barely supported, for its own needs.
This is wildly inaccurate.
While Google did develop a PTX backend for LLVM, the student who worked on it as part of a GSoC was later hired by NVIDIA, and ended up contributing the current NVPTX backend that clang uses today. The PTX backend that Google contributed was removed some time later.
> Nothing is "stable" nor "branded" in this work.
This is false. The NV part of the backend name (NVPTX) literally brands this backend as NVIDIA's PTX backend, in strong contrast with the other PTX backend that LLVM used to have (it actually had both for a while).
> OpenACC backend targets directly the nvidiaptx (proprietary) format.
This is false. Source: I've used the PGI compiler on some Fortran code, and you can mix OpenACC with CUDA Fortran just fine, and compile to x86-64 using OpenMP to target just x86 CPUs. No NVIDIA hardware involved.
> That's wrong again.
>
> Most of the work done for the parallel STL and by the C++ committee originate from work from HPX and the STELLAR Group
This is also wildly inaccurate. The parallel STL work actually originated with the GCC parallel STL, the Intel TBB, and NVIDIA Thrust libraries [0]. The author of Thrust was the editor of the Parallelism TS and is the chair of the Parallelism SG. The members of the STELLAR Group who worked on HPX started collaborating more actively with ISO once they started working at NVIDIA after their PhDs. One of them chairs the C++ Library Evolution Working Group. The Concurrency working group is also chaired by NVIDIA (by the other NVIDIA author of the original Parallelism TS).
> While Google did develop a PTX backend for LLVM, the student who worked on it as part of a GSoC was later hired by NVIDIA, and ended up contributing the current NVPTX backend that clang uses today.
You more or less reformulated what I said. It might become used one day behind a proprietary, rebranded NVidia blob, but the fact is that today close to nobody uses it in production in the wild, and it is not even officially supported.
> This is false. The NV part of the backend name (NVPTX) literally brands this backend as NVIDIAs PTX backend.
It does not mean it's stable or used. I do not know of a single major GPGPU software package in existence that has ever used it in an official distribution. Like I said.
> CUDA Fortran just fine
CUDA Fortran, yes, you said it: CUDA Fortran. The rest is OpenACC.
> The Parallel STL work actually originated with the GCC parallel STL, the Intel TBB, and NVIDIA Thrust libraries
My apologies for that. I was unaware of this prior work.
> AMD is nowhere to be found in this type of work.
> You were claiming that OpenACC and CUDA only runs on nvidia's hardware, yet I suppose you now agree that this isn't true I guess.
I do not think I ever said that OpenACC runs only on NVidia hardware. However, I still affirm that CUDA runs only on NVidia hardware, yes. Anything else is based on code converters, at best.
> That you can't compile CUDA to AMD GPUs isn't NVIDIA's fault, it's AMD, for deciding to pursue OpenCL first, then HSA, and now HIP.
Using a branded, patented, proprietary technology from a competitor and copying its API for your own implementation is madness that will surely land you in front of a court.
The MIT license doesn't have an express patent grant. If Nvidia has a patent on some technology used by the open-source code, they could sue you for patent infringement if you use it in a way that displeases them. What they can't do is sue you for copyright infringement.
> Most other legal precedent was that it was fine to clone an API.
CUDA is more than an API. It is a technology under copyright and very likely patented too. Even the API itself contains multiple references to "CUDA" in function calls and variable names.
None of that protects it from being cloned under previous 9th Circuit precedent, except maybe patents, but I'm not aware of any patents that would protect against another CUDA implementation.
As for PGI, all PGI compilers can do this; just pick x86-64 as the target. There are also other forks online (just search for LLVM, CUDA, and x86 as keywords); some university groups have their forks on GitHub, where they compile CUDA to x.
People who are into RISC-V and other side projects/open stacks obviously have not worked on mission-critical problems.
When you have a jet engine hoisted up on a test rig and something fails in your DSP library, you don't hesitate to call Matlab engineering support for help within the next 30 minutes. Try that with some Python library. People give Matlab a lot of flak for being closed source, but there is a reason they exist: not for building a stupid toy project, but for real things where big $$$ is on the line. Python is also used in production everywhere, but if your application is a niche one, using a PyVISA library you git-cloned to connect to some DSP hardware is not very "production" ready. You need solid deps.
Don't get me wrong: open-source software runs in prod all the time (PostgreSQL, Linux, etc.). The smaller the application domain (specific DSP libraries or analysis stacks for wind turbines and such), the lower the availability of high-quality open-source software (and support).
My point is that reality hits you hard when it is anything where a lot of $$$ or people's time depend on it. Don't blame their engineers for using closed source tools.
It would suffice for NVIDIA to open-source enough specifications and perhaps some subset of core software to enable others to build high quality open source (or even proprietary) software that targets NVIDIA's architecture. They can't hire every programmer in the world; if other programmers can build high-performance software that takes advantage of their platform, that increases the value of their hardware.
Your comparison to Intel isn't valid: most software that runs on Intel processors isn't built with icc, and customers have a choice: they can use icc, gcc, clang, or a number of other compilers. The NVIDIA world isn't equivalent.
Anyone is free to target PTX and do their own compiler on top.
In fact, given that it has been there since version 3, there are compilers available for almost all major programming languages, including managed ones.
OpenCL, meanwhile, is a C world, where almost no one cares about the C++ extensions and even fewer vendors care about SPIR-V.
Also the community doesn't seem to be bothered that for a long time, the only SYCL implementation was a commercial one from CodePlay, trying to extend their compilers outside the console market.
> the community doesn't seem to be bothered that for a long time, the only SYCL implementation was a commercial one
Bothered has nothing to do with it. Implementing low level toolchains generally seems to require both a gargantuan effort and an incredible depth of knowledge. If it didn't, I think tooling and languages in general would be significantly better across the board.
What am I supposed to do, implement a SYCL compiler on my own? Forget it - I'll just keep writing GLSL compute shaders or OpenCL kernels until someone with lots of resources is able to foot the initial bill for a fully functional and open source implementation.
This is wrong - triSYCL is roughly the same age as ComputeCpp, and hipSYCL is only slightly younger. There has been a lot of academic interest in SYCL, but as with any new technology (especially niche technologies) it's always going to take time to get people on board.
Also, from a quick look at your profile, you seem to have quite a lot of comments criticizing or commenting on CodePlay. Do you have some sort of relationship or animosity with them?
I wish all the luck to CodePlay, the more success the better for them.
They are well appreciated among game developers, given their background.
My problem is how Khronos sells its APIs, leaves everyone alone to create their own patched SDKs, and then acts surprised that commercial APIs end up winning the hearts of the majority.
The situation has hardly changed since I did my thesis with OpenGL in late 90's, porting a particles visualization engine from NeXTSTEP to Windows.
Nothing that compares with CUDA, Metal, DirectX, LibGNMX, NVN tooling.
Hence my reference to CodePlay, as for a very long time their SDK was the only productive way to use SYCL.
Khronos likes to oversell the eco-system, and usually the issues and disparities across OEMs tend to be "forgotten" on their marketing materials.
This has literally been a back and forth argument since a 100 point post on slashdot was a groundbreaking event. I don't see it changing any time soon - honestly if anything on tech forums this argument frequently overshadows just how well NVIDIA is doing.
>> NVIDIA's main competitive advantage over AMD and Intel is its software stack. AMD could release a GPGPU twice as powerful tomorrow for half the price and most current NVIDIA users wouldn't care, because what good is that if you can't program it?
I always wonder why it is so hard for AMD to develop a true competitor to CUDA for AMD hardware. Rather than trying to solve GPGPU programming through open standards like OpenCL, just copy the concept of CUDA wholesale. They could still build it on top of LLVM etc. and release the whole thing as open source, but they would be free of design-by-committee frameworks like OpenCL, so they could focus on GPU programming and nothing else, and only on the platforms where the majority of the demand is. There is not much wrong with OpenCL; it's just not nearly as good/capable/easy-to-use as CUDA if all you are interested in is GPGPU programming.
AMD is a big company with a lot of revenue, especially recently, so why would it be so hard to have a team working full-time on creating a direct CUDA knock-off ASAP?
1. AMD has struggled in the past and even today on being profitable with their GPUs. Makes it difficult to entice an army of knowledgeable devs without consistent cash flow. Granted, the tide is turning with their profitable CPU business and equity has shot up.
2. More importantly I think that, being the underdog, AMD has to have a cheaper, open solution to compete. Why would a customer choose to go with AMD’s nascent and proprietary stack over Nvidia’s well established and nearly ubiquitous proprietary stack?
To be clear, I don’t think the problems are insurmountable. AMD won a couple HPC deals recently which should afford them the opportunity to build up their software and invest in a competitive hardware solution.