I think in theory that's what OpenCL is for, but most people building a dedicated cluster are buying Nvidia hardware anyway, and if you're already on that hardware, you might as well go native and get the best performance and library support.
Stuck in pure C, with printf-style debugging and graphical debuggers that never properly handled everything.
CUDA: a polyglot GPGPU development environment, graphical debuggers that allow single-stepping and conditional breakpoints in GPGPU code, and interoperability with graphics APIs.
OTOY just moved the rendering code in Octane Render from Vulkan to CUDA (via OptiX 7).