> I'd vote to undo these setbacks by moving to local data processing, where a large number of cores each have 1/N of the total memory, shared by M memory busses. Memory controllers would manage shuffling data to where it's needed so that the memory appears as 1 contiguous address space to any process.

Cheap RAM is DDR. Fast RAM would be on-die, but that would be very expensive, or maybe now on-package (though some tech still needs to be developed there). But apart from decoupling access latencies, I don't really see the point of having N busses (from each local core to its local memory), especially if you need a very large number of cores. More memory channels seem good enough. The bandwidth is already hard to saturate on a well-designed SoC like the M1 Pro and above; improving latency would probably yield more benefit than increasing bandwidth further.

> In other words, this would look identical to the desktop CPUs we have today, just with a large number of cores (over 256) and a memory bandwidth many hundreds or thousands of times faster than what we have now if it uses content-addressable memory with copy-on-write internally. The speed difference is like comparing BitTorrent to FTP, and why GPUs run orders of magnitude faster than CPUs (unfortunately limited to their narrow use cases).

"content-addressable memory with copy-on-write internally" are you describing what caches already kind of do, in a way (esp. if I mix that with: "memory appears as 1 contiguous address space to any process")? The good news would then be: we already have them :)

What remains, if I understand you correctly, seems to be: more cores. The other good news here is that this is already in progress. Where six years ago you would have gotten 6 to 8 cores on an enthusiast platform, you would now probably choose 12 to 16 cores on just a basic one (and even more on a modern enthusiast one).

There was a pause in recent years, but it was basically Intel having process difficulties and being caught up by the rest of the industry, including some competitors designing with power consumption in mind. And given what a high-perf CPU dissipates today, power consumption has become key to unlocking raw performance anyway.
