Nvidia's SODIMM compute-module interface can already prove this concept out. I have two 7 W ARM-based Turing RK1s arriving soon, each with PCIe 3.0 x4 (~4 GB/s), and the Turing Pi 2 cluster board fits four of them in an ITX form factor. I'm expecting over 3 Gbps per watt at a total cost of $820.
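For anyone checking the maths, here's a rough sketch of where that per-watt number comes from. The ~12 W of board overhead is my assumption; the per-module power and PCIe bandwidth are from the RK1 specs above:

    # Back-of-the-envelope Gbps-per-watt for a fully populated Turing Pi 2.
    modules = 4
    pcie_gbps_per_module = 4 * 8          # PCIe 3.0 x4: 4 GB/s -> 32 Gbps
    module_watts = 7                      # RK1 figure quoted above
    board_overhead_watts = 12             # assumed: board, networking, fans

    total_gbps = modules * pcie_gbps_per_module                   # 128 Gbps
    total_watts = modules * module_watts + board_overhead_watts   # 40 W

    print(f"{total_gbps / total_watts:.1f} Gbps/W")               # ~3.2 Gbps/W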
PCIe lanes are the bottleneck so far: even my $90 2 TB SSDs are rated at 7 GB/s on PCIe 4.0 x4, so I don't think SBCs are the optimal solution yet. It looks like Ampere's Altra line can do 128 lanes of PCIe 4.0 at 40 W, so a 1U blade with 100G networking could be interesting. I've seen plenty of bugs and missing optimisations on ARM though, even in a homelab, so this kind of solution might not be ready for datacenters yet.
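Same back-of-the-envelope arithmetic for the Altra option, assuming ~2 GB/s per PCIe 4.0 lane and taking the 40 W figure at face value (real blade power with RAM and a NIC would be higher):

    # Per-watt I/O for a hypothetical Altra blade, SoC TDP only.
    altra_lanes = 128
    gbps_per_lane = 2 * 8                 # PCIe 4.0: ~2 GB/s -> ~16 Gbps/lane
    altra_watts = 40                      # optimistic: ignores RAM, NIC, platform

    altra_gbps = altra_lanes * gbps_per_lane          # ~2048 Gbps
    print(f"{altra_gbps / altra_watts:.0f} Gbps/W")   # ~51 Gbps/W

That's an order of magnitude more I/O per watt than the RK1 cluster, which is why the blade idea is tempting despite the ARM software rough edges.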