
We have thought about supporting the tensor cores on CUDA devices as well; we could probably reuse the same abstractions we'd need for the AMX support. Unfortunately we mainly focus on CUDA support, because in most cases people are using CUDA for GPU compute.
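For context, tensor cores are usually reached through CUDA's warp-level WMMA API (mma.h). A minimal sketch of a single 16x16x16 tile multiply-accumulate might look roughly like this (kernel and pointer names are made up for illustration; it assumes one warp of 32 threads handles the whole tile on a tensor-core-capable GPU):

    #include <cuda_fp16.h>
    #include <mma.h>
    using namespace nvcuda;

    // One warp computes a single 16x16 output tile: C = A * B
    // (A and B are fp16, the accumulator is fp32, leading dimensions are 16).
    __global__ void wmma_tile_gemm(const half *a, const half *b, float *c) {
        wmma::fragment<wmma::matrix_a, 16, 16, 16, half, wmma::row_major> a_frag;
        wmma::fragment<wmma::matrix_b, 16, 16, 16, half, wmma::col_major> b_frag;
        wmma::fragment<wmma::accumulator, 16, 16, 16, float> c_frag;

        wmma::fill_fragment(c_frag, 0.0f);
        wmma::load_matrix_sync(a_frag, a, 16);
        wmma::load_matrix_sync(b_frag, b, 16);
        wmma::mma_sync(c_frag, a_frag, b_frag, c_frag);  // executes on the tensor cores
        wmma::store_matrix_sync(c, c_frag, 16, wmma::mem_row_major);
    }

    // Launched with a single warp, e.g. wmma_tile_gemm<<<1, 32>>>(d_a, d_b, d_c);

The abstraction being shared is basically "multiply-accumulate one tile of a matrix", which is also how AMX-style matrix units are programmed, just with different tile shapes and element types.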



CUDA is well-trodden ground; you aren’t going to do much better there (if at all) than cuBLAS and cuDNN. But I get what you’re saying, gotta pick one’s battles.
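To make the "hard to beat" point concrete, the baseline everyone gets measured against is a couple of library calls. A rough sketch of a single-precision GEMM through cuBLAS (function and variable names here are illustrative; the device buffers are assumed to be allocated and filled already):

    #include <cublas_v2.h>

    // C = A * B for n x n matrices already resident on the GPU.
    // cuBLAS uses column-major storage, so the leading dimension is n throughout.
    void gemm_with_cublas(const float *d_A, const float *d_B, float *d_C, int n) {
        cublasHandle_t handle;
        cublasCreate(&handle);

        const float alpha = 1.0f, beta = 0.0f;
        cublasSgemm(handle, CUBLAS_OP_N, CUBLAS_OP_N,
                    n, n, n,
                    &alpha, d_A, n,
                    d_B, n,
                    &beta, d_C, n);

        cublasDestroy(handle);
    }

Matching the performance of that one cublasSgemm call with hand-written kernels is already hard, which is the argument for spending effort elsewhere.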


My understanding is that it's less about competing with cuBLAS and cuDNN directly and more about offering the features they expose in a better, more idiomatic way - there's a reason it's less fun and more tedious to write C++ AMP code.


Why would anyone write C++ AMP code, though, when AMP is deprecated and alternatives like Triton exist?



