The first GPU models were basically stateless (via pixel and vertex shaders with texture input and outputs), but this was incredibly inefficient for many GPGPU tasks, so compute shaders and CUDA have ways to load from and store to arrays. The memory model is a bit funky, but I’m not sure how going back to functional is viable for GPU programming.