The 1 TB/s figure is the bandwidth between the GPU and its own memory, and that link is the bottleneck.
You don’t need anywhere near that much bandwidth to load inputs and write out results, only for the random access that happens during compute.
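A rough back-of-the-envelope sketch of why that is, in Python. Only the 1 TB/s number comes from the comment; the host link speed (roughly PCIe 4.0 x16), the working-set size, and the number of passes over the data are made-up illustrative assumptions, and repeated full passes are just a stand-in for heavy memory traffic during compute.

```python
# Illustrative figures only: the 1 TB/s is from the comment,
# every other value is an assumed example.
GPU_MEM_BW = 1e12     # bytes/s, GPU <-> GPU memory
HOST_LINK_BW = 32e9   # bytes/s, assumed host link (roughly PCIe 4.0 x16)


def transfer_seconds(num_bytes, bandwidth):
    """Time to move num_bytes at the given bandwidth (bytes/s)."""
    return num_bytes / bandwidth


working_set = 8e9  # assumed: 8 GB of data resident in GPU memory
passes = 1000      # assumed: how many times compute touches that data

# Loading the inputs once over the slow host link.
load_once = transfer_seconds(working_set, HOST_LINK_BW)

# Memory traffic generated during compute, served by GPU memory.
compute_traffic = transfer_seconds(working_set * passes, GPU_MEM_BW)

# The same compute traffic if it had to cross the host link instead.
compute_over_host_link = transfer_seconds(working_set * passes, HOST_LINK_BW)

print(f"load inputs once:           {load_once:.2f} s")
print(f"compute traffic on GPU mem: {compute_traffic:.2f} s")
print(f"compute traffic on host:    {compute_over_host_link:.2f} s")
```

With these assumed numbers the one-time load is a small fraction of the total, while the memory traffic during compute dominates even at 1 TB/s.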