But as you observe, we are stuck in a local optimum where GPUs are optimized for throughput and CPUs for latency-sensitive work.