
Their atomic operations used to be extremely costly from a utilization perspective: all other threads in a warp would be stalled while the thread performing the atomic ran alone. Is that still the case?



Fixed as of Maxwell, to the best of my knowledge. But even before that, I found atomics more efficient for reduction operations in global memory than any other method (using fixed-point math in places where a deterministic sum was required).
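For anyone wondering what the fixed-point trick looks like, here is a minimal sketch, not the code I actually used. Integer addition is associative, so the total does not depend on the order in which the atomics land, whereas a float atomicAdd can round differently from run to run. The names (fixed_point_sum, deterministic_sum, FIXED_SCALE) and the choice of 32 fractional bits are assumptions for illustration; the scale has to be picked so the running total stays inside 64 bits. Error checking is omitted for brevity.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Assumed scale: 32 fractional bits (2^32). Inputs must be small enough
// that the scaled running total fits in a signed 64-bit integer.
#define FIXED_SCALE 4294967296.0

// Each thread converts one value to fixed point and adds it atomically
// to a single accumulator in global memory.
__global__ void fixed_point_sum(const float* in, int n, unsigned long long* acc) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        // Round to the nearest fixed-point value.
        long long q = __double2ll_rn((double)in[i] * FIXED_SCALE);
        // atomicAdd has an unsigned 64-bit overload; two's-complement
        // wraparound makes it behave correctly for negative values too.
        atomicAdd(acc, (unsigned long long)q);
    }
}

// Host helper: copies the input to the device, runs the kernel, and
// converts the fixed-point total back to a double.
double deterministic_sum(const std::vector<float>& h) {
    int n = (int)h.size();
    if (n == 0) return 0.0;  // avoid launching a zero-block grid

    float* d_in = nullptr;
    unsigned long long* d_acc = nullptr;
    cudaMalloc(&d_in, n * sizeof(float));
    cudaMalloc(&d_acc, sizeof(unsigned long long));
    cudaMemcpy(d_in, h.data(), n * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemset(d_acc, 0, sizeof(unsigned long long));

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    fixed_point_sum<<<blocks, threads>>>(d_in, n, d_acc);

    unsigned long long raw = 0;
    cudaMemcpy(&raw, d_acc, sizeof(raw), cudaMemcpyDeviceToHost);  // waits for the kernel
    cudaFree(d_in);
    cudaFree(d_acc);

    // Reinterpret as signed, then undo the scaling.
    return (double)(long long)raw / FIXED_SCALE;
}

int main() {
    std::vector<float> data(1 << 20);
    for (size_t i = 0; i < data.size(); ++i)
        data[i] = (float)((i % 7) - 3) * 0.125f;

    double a = deterministic_sum(data);
    double b = deterministic_sum(data);
    printf("run 1: %.10f\nrun 2: %.10f\nidentical: %s\n", a, b, a == b ? "yes" : "no");
    return 0;
}
```

The tradeoff is range versus precision: more fractional bits means less quantization error per element but less headroom before the accumulator overflows. A single global accumulator is the simplest form; reducing within a block first and issuing one atomic per block would cut contention without changing the result.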



