
We have thought about supporting the tensor cores on CUDA devices as well; we could probably reuse the same abstractions we'd need for the AMX support. Unfortunately we mainly focus on CUDA support, because in most cases people are using CUDA for GPU compute.
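For context, tensor cores are usually reached through CUDA's warp-level WMMA API (mma.h). A minimal sketch of a single 16x16x16 tile multiply-accumulate might look roughly like this (kernel and pointer names are made up for illustration; it assumes one warp of 32 threads handles the whole tile on a tensor-core-capable GPU):

    #include <cuda_fp16.h>
    #include <mma.h>
    using namespace nvcuda;

    // One warp computes a single 16x16 output tile: C = A * B
    // (A and B are fp16, the accumulator is fp32, leading dimensions are 16).
    __global__ void wmma_tile_gemm(const half *a, const half *b, float *c) {
        wmma::fragment<wmma::matrix_a, 16, 16, 16, half, wmma::row_major> a_frag;
        wmma::fragment<wmma::matrix_b, 16, 16, 16, half, wmma::col_major> b_frag;
        wmma::fragment<wmma::accumulator, 16, 16, 16, float> c_frag;

        wmma::fill_fragment(c_frag, 0.0f);
        wmma::load_matrix_sync(a_frag, a, 16);
        wmma::load_matrix_sync(b_frag, b, 16);
        wmma::mma_sync(c_frag, a_frag, b_frag, c_frag);  // executes on the tensor cores
        wmma::store_matrix_sync(c, c_frag, 16, wmma::mem_row_major);
    }

    // Launched with a single warp, e.g. wmma_tile_gemm<<<1, 32>>>(d_a, d_b, d_c);

The abstraction being shared is basically "multiply-accumulate one tile of a matrix", which is also how AMX-style matrix units are programmed, just with different tile shapes and element types.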



CUDA is well-trodden ground; you aren’t going to do much better there (if at all) than cuBLAS and cuDNN. But I get what you’re saying, gotta pick one’s battles.
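To make the "hard to beat" point concrete, the baseline everyone gets measured against is a couple of library calls. A rough sketch of a single-precision GEMM through cuBLAS (function and variable names here are illustrative; the device buffers are assumed to be allocated and filled already):

    #include <cublas_v2.h>

    // C = A * B for n x n matrices already resident on the GPU.
    // cuBLAS uses column-major storage, so the leading dimension is n throughout.
    void gemm_with_cublas(const float *d_A, const float *d_B, float *d_C, int n) {
        cublasHandle_t handle;
        cublasCreate(&handle);

        const float alpha = 1.0f, beta = 0.0f;
        cublasSgemm(handle, CUBLAS_OP_N, CUBLAS_OP_N,
                    n, n, n,
                    &alpha, d_A, n,
                    d_B, n,
                    &beta, d_C, n);

        cublasDestroy(handle);
    }

Matching the performance of that one cublasSgemm call with hand-written kernels is already hard, which is the argument for spending effort elsewhere.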


My understanding is that it's less about competing with cuBLAS and cuDNN directly and more about offering the features they expose in a better, more idiomatic way - there's a reason it's less fun and more tedious to write C++ AMP code.


Why would anyone write C++ AMP code, though, when AMP is deprecated and alternatives like Triton exist?



