My perspective is that OpenCL is indeed in that bad a shape, though it does have defenders. Both AMD (ROCm) and Intel (oneAPI) have ways to run workloads originally written for CUDA, but they're nowhere near as polished as CUDA.
I believe an open stack can and will emerge, but it will take time and effort at all levels of the stack. It's possible to do pretty amazing things with Vulkan compute shaders, but the programming model is different from CUDA's (it's not single-source), and the tooling support is not quite there.
In time, I am hopeful that WebGPU will gather more momentum, and be officially supported even in places where Vulkan requires janky adapter layers. But in its current form, it's very immature and far from being usable for real workloads.
oneAPI is in a rather good state considering it’s barely a release candidate now. I’d put my money on Blender supporting Intel GPUs sooner than AMD ones with Cycles X, unless AMD adopts oneAPI.
“Works” and actually works are different things.
ROCm isn’t in a state that I would call actually working atm. Considering just how broken their CUDA-to-HIP stuff is, I’m not going to hold my breath.
SYCL can target HIP directly without going through CUDA first, but I agree that it's far from perfect. IMO though, it's as usable as OpenCL by now.