Indeed. Pretty much all silicon today is made on 12" (300 mm) wafers, which are diced into chip-sized pieces; each chip is tested, and the ones that fail are thrown away.
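A rough way to see why dicing and discarding is the norm is the simple Poisson yield model, where the chance of a defect-free region falls off exponentially with its area. The sketch below is only illustrative: the defect density and the areas are assumed numbers, not figures from any fab or vendor, and `poisson_yield` is a made-up helper name.

```python
import math

def poisson_yield(area_cm2: float, defects_per_cm2: float) -> float:
    """Probability that a region of the given area contains zero defects,
    under the simple Poisson yield model Y = exp(-area * defect_density)."""
    return math.exp(-area_cm2 * defects_per_cm2)

# Assumed, illustrative numbers (not vendor data).
DEFECT_DENSITY = 0.1   # defects per cm^2
DIE_AREA = 1.0         # cm^2, a smallish chip
WAFER_AREA = 700.0     # cm^2, roughly a 300 mm wafer (pi * 15^2 is about 707)

die_yield = poisson_yield(DIE_AREA, DEFECT_DENSITY)
wafer_yield = poisson_yield(WAFER_AREA, DEFECT_DENSITY)

print(f"single die defect-free:  {die_yield:.3f}")
print(f"whole wafer defect-free: {wafer_yield:.3e}")
```

Under these assumptions most small dies come out clean, while a completely defect-free wafer essentially never happens, so a wafer-sized part only makes sense if it can tolerate defects.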
Cerebras uses the entire 12" wafer as a single device and builds in redundancy, so that at current defect rates a large fraction of the wafers are usable. This allows a huge level of parallelism and a large amount of on-wafer RAM, and it removes the need to move data on and off the wafer. As a result the available memory bandwidth is insane, which matters because inference is mostly bandwidth-limited.
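To put "bandwidth limited" in numbers, here is a back-of-the-envelope sketch. It assumes single-stream decoding where every weight is read once per generated token, so the ceiling on tokens per second is just memory bandwidth divided by model size. The model size and both bandwidth figures are hypothetical placeholders picked to show the scaling, not measured or published specs, and `max_tokens_per_second` is a name invented for this example.

```python
def max_tokens_per_second(weight_bytes: float, bandwidth_bytes_per_s: float) -> float:
    """Upper bound on decode rate if each token requires reading all weights once.
    Ignores compute time, KV-cache traffic, and batching."""
    return bandwidth_bytes_per_s / weight_bytes

# Hypothetical numbers, for illustration only.
MODEL_BYTES = 8e9 * 2    # 8B parameters at 2 bytes each (16-bit weights)
OFF_CHIP_BW = 3e12       # 3 TB/s, assumed off-chip memory bandwidth
ON_WAFER_BW = 1e15       # 1 PB/s, assumed aggregate on-wafer bandwidth

off_chip_rate = max_tokens_per_second(MODEL_BYTES, OFF_CHIP_BW)
on_wafer_rate = max_tokens_per_second(MODEL_BYTES, ON_WAFER_BW)

print(f"off-chip bound: {off_chip_rate:,.0f} tokens/s")
print(f"on-wafer bound: {on_wafer_rate:,.0f} tokens/s")
print(f"ratio:          {on_wafer_rate / off_chip_rate:,.0f}x")
```

The point is only that the bound scales linearly with bandwidth: whatever the real figures are, keeping the weights in memory on the wafer raises the ceiling by the same factor as the bandwidth gain.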