The key features of the M3 Ultra are up to 512GB of shared GPU/CPU RAM and ultra-fast LAN over peripheral cabling.
Once an NVIDIA card caches a model in its VRAM, it no longer pays the cost of copying data over the bus.
Yet as many people have noticed, who cares if the M3 Ultra takes four times as long when the faster alternative simply won't fit the larger models? YMMV =3
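
To put rough numbers on the "won't fit" point, here is a back-of-the-envelope sketch. The function names, the 1.2x overhead factor (standing in for KV cache and runtime buffers), and the example model sizes are my own assumptions for illustration, not measurements.

```python
def weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (10^9 bytes)."""
    # params_billion * 1e9 weights * (bits / 8) bytes each, divided by 1e9
    return params_billion * bits_per_weight / 8


def fits(params_billion: float, bits_per_weight: float,
         memory_gb: float, overhead: float = 1.2) -> bool:
    """True if weights times an assumed overhead factor fit in memory_gb."""
    return weights_gb(params_billion, bits_per_weight) * overhead <= memory_gb


if __name__ == "__main__":
    # Hypothetical comparison: a 24 GB card versus 512 GB of shared memory.
    for name, params, bits in [("70B @ 4-bit", 70, 4), ("405B @ 8-bit", 405, 8)]:
        print(name, f"{weights_gb(params, bits):.0f} GB of weights,",
              "24 GB:", fits(params, bits, 24),
              "512 GB:", fits(params, bits, 512))
```

Real headroom depends on context length, quantization format, and how much of the shared RAM the OS lets the GPU use, so treat this as a first filter only.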