find0x90's comments

Much of the hype around DeepSeek is due to their extraordinarily low training and inference costs. They achieved this by optimizing their training code, apparently using PTX in addition to CUDA. PTX is kind of an intermediate assembly language for NVIDIA GPUs and people are eager to see how it was used.


Yes, there's some in the csrc/kernels directory. Search for 'asm' to find uses of it.
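If you'd rather script the search than grep by hand, here is a rough sketch. It assumes you have the repo checked out locally and that the PTX shows up as inline `asm(...)` or `asm volatile(...)` statements in the CUDA/C++ sources. The function name `find_inline_asm` and the list of file suffixes are my own choices, not anything from the repo.

```python
import re
from pathlib import Path

# Matches the usual inline-assembly spellings: asm(...), asm volatile(...),
# __asm__(...), __asm__ __volatile__(...). Identifiers such as `asm_count`
# are not matched because of the word-boundary checks.
ASM_PATTERN = re.compile(
    r"\b(?:asm|__asm__)\b\s*(?:volatile|__volatile__)?\s*\("
)


def find_inline_asm(root, suffixes=(".cu", ".cuh", ".h", ".hpp", ".cpp")):
    """Return (path, line_number, line) for every line under `root`
    that looks like an inline asm statement.

    `suffixes` is an assumed list of source extensions; adjust as needed.
    """
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if ASM_PATTERN.search(line):
                hits.append((str(path), lineno, line.strip()))
    return hits
```

Calling `find_inline_asm("csrc/kernels")` from the repo root should list each candidate line. It is a line-based heuristic, so it will also flag matches inside comments, and it only tells you where to look, not what the PTX does.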


I don't see any use of PTX; it might be in one of the other repos they plan to release.


Right, I think the PTX use is a bigger deal than the coverage it's getting suggests. It creates an opening for other vendors to get a foot in the door with PTX-to-LLVM-IR translation for existing CUDA kernels.

