I feel that single-threaded processing power stalled at two major events in history:
* The arrival of video cards around 1997 (focus shifted from general computation to digital signal processing)
* The arrival of the iPhone around 2007 (focus shifted from performance to power consumption)
I'd vote to undo these setbacks by moving to local data processing, where a large number of cores each hold 1/N of the total memory, shared across M memory buses. Memory controllers would handle shuffling data to where it's needed so that the memory appears as one contiguous address space to any process.
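As a toy picture of the translation such a memory controller might perform, here is a minimal sketch in Go, assuming simple round-robin cache-line interleaving across banks (all names and sizes here are mine, purely illustrative):

```go
package main

import "fmt"

// Hypothetical parameters: N banks of local memory, striped in 64-byte
// lines so software sees one flat, contiguous address space while
// consecutive lines land on different buses.
const (
	numBanks = 256
	lineSize = 64
)

// bankFor maps a flat address to the bank that owns it and the offset
// within that bank's local memory.
func bankFor(addr uint64) (bank, offset uint64) {
	line := addr / lineSize
	bank = line % numBanks
	offset = (line/numBanks)*lineSize + addr%lineSize
	return bank, offset
}

func main() {
	for _, a := range []uint64{0, 64, 128, 64 * 256} {
		b, off := bankFor(a)
		fmt.Printf("addr %6d -> bank %3d, offset %d\n", a, b, off)
	}
}
```

Consecutive lines hit different banks, which is how striping turns many slow local buses into one wide aggregate pipe.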
In other words, this would look identical to the desktop CPUs we have today, just with a large number of cores (over 256) and memory bandwidth hundreds or thousands of times higher than what we have now, if it uses content-addressable memory with copy-on-write internally. The speed difference is like comparing BitTorrent to FTP, and it's the same reason GPUs run orders of magnitude faster than CPUs (unfortunately limited to their narrow use cases).
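One way to read "content-addressable memory with copy-on-write internally" is as a deduplicating block store: identical content is stored once and shared, and mutation produces a new block instead of touching a shared one. A toy sketch of that idea in Go (every name here is hypothetical, not a real memory-controller API):

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// A toy content-addressable store: blocks are keyed by their hash, so
// identical data is stored once and shared. Writes never mutate a
// shared block in place; they produce a new block (copy-on-write).
type Store struct {
	blocks map[[32]byte][]byte
}

func NewStore() *Store { return &Store{blocks: make(map[[32]byte][]byte)} }

// Put returns the content address of data, storing it only if unseen.
func (s *Store) Put(data []byte) [32]byte {
	key := sha256.Sum256(data)
	if _, ok := s.blocks[key]; !ok {
		s.blocks[key] = append([]byte(nil), data...)
	}
	return key
}

// Write copies an existing block, applies a change, and stores the
// result under its new address; the original stays intact for sharers.
func (s *Store) Write(key [32]byte, off int, b byte) [32]byte {
	copied := append([]byte(nil), s.blocks[key]...)
	copied[off] = b
	return s.Put(copied)
}

func main() {
	s := NewStore()
	a := s.Put([]byte("hello"))
	b := s.Put([]byte("hello")) // deduplicated: same address as a
	c := s.Write(a, 0, 'j')     // COW: new block "jello", new address
	fmt.Println(a == b, a == c, len(s.blocks)) // true false 2
}
```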
This would let us get back to traditional programming in the language of our choice (perhaps something like Erlang, Go or Octave/MATLAB) rather than shaders.
Apple appears to be trying to do this with the M1, using ideas loosely borrowed from transputers. But since their platform is proprietary, they won't approach anything close to the general computing power their transistor count could deliver for at least a decade, maybe never.
So there's an opportunity here for someone to reintroduce multicore CPUs and scalable transputers composed of them. Then we could write whatever OpenGL/Vulkan/Metal/TensorFlow libraries we wanted over that, since they are trivial with the right architecture.
This would also allow us to drop the async and parallel keywords from our languages and just use higher-order methods that are self-parallelizing. Processing big data would "just work", since Amdahl's law only bounds the serial fraction of a workload, and these operations have almost none.
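A minimal sketch of what such a self-parallelizing higher-order method could look like, in Go (one of the languages named above); ParallelMap is my name for it, not an existing API:

```go
package main

import (
	"fmt"
	"sync"
)

// ParallelMap is a self-parallelizing higher-order method: callers pass
// a pure function, and the fan-out happens internally with no async or
// parallel annotations at the call site.
func ParallelMap[T, U any](in []T, f func(T) U) []U {
	out := make([]U, len(in))
	var wg sync.WaitGroup
	for i, v := range in {
		wg.Add(1)
		go func(i int, v T) {
			defer wg.Done()
			out[i] = f(v) // each element computed independently
		}(i, v)
	}
	wg.Wait()
	return out
}

func main() {
	squares := ParallelMap([]int{1, 2, 3, 4}, func(x int) int { return x * x })
	fmt.Println(squares) // [1 4 9 16]
}
```

The call site stays ordinary sequential code; the parallelism (naively one goroutine per element here, where a real version would chunk the work) lives entirely inside the library.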
The advantages are so numerous that I struggle to understand why things would stay the way they are, other than the Intel/Nvidia hegemony. And I've felt this way since 1997, back when people thought I was crazy for projecting to the endgame as with any other engineering challenge.
> I'd vote to undo these setbacks by moving to local data processing, where a large number of cores each hold 1/N of the total memory, shared across M memory buses. Memory controllers would handle shuffling data to where it's needed so that the memory appears as one contiguous address space to any process.
Cheap RAM is DDR. Fast RAM would be on-die, but that would be very expensive, or maybe now on-package (though that still needs some tech to be developed). But apart from decoupling access latencies, I don't really see the point of having N buses (from each core to its local memory), especially if you need a very large number of cores. More memory channels seem good enough. The bandwidth is already hard to saturate on a well-designed SoC like the M1 Pro and above; improvements to latency would probably yield more benefit than trying to increase bandwidth further.
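For a rough sense of scale (my numbers, not the parent's): a single 64-bit DDR5-6400 channel moves 6400 MT/s × 8 bytes ≈ 51.2 GB/s, so a typical dual-channel desktop tops out near 100 GB/s, while the M1 Pro's much wider LPDDR5 interface is rated at roughly 200 GB/s.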
> In other words, this would look identical to the desktop CPUs we have today, just with a large number of cores (over 256) and memory bandwidth hundreds or thousands of times higher than what we have now, if it uses content-addressable memory with copy-on-write internally. The speed difference is like comparing BitTorrent to FTP, and it's the same reason GPUs run orders of magnitude faster than CPUs (unfortunately limited to their narrow use cases).
"content-addressable memory with copy-on-write internally" are you describing what caches already kind of do, in a way (esp. if I mix that with: "memory appears as 1 contiguous address space to any process")? The good news would then be: we already have them :)
What remains, if I fully understand what you mean, seems to be: more cores. The other good news here is that it is in progress. Six years ago you would have gotten 6 to 8 cores on an enthusiast platform; today you would probably choose 12 to 16 on a basic one (and even more on a modern enthusiast one).
There has been a pause in recent years, but it was basically Intel having process difficulties and being caught up by the rest of the industry, including some players with power consumption also in mind. And given what a high-performance CPU dissipates today, power consumption has become key to unlocking raw performance anyway.
The shift to a focus on power consumption was already happening without the iPhone anyway, even on the desktop: CPUs were already getting into nuclear-reactor territory in how much heat they produce per unit area.