Interesting that it is not able to outperform the Zen3 CPUs. I had expected it to do somewhat better, especially given that it's a 5nm processor, and with all the hype around ARM processors.
I don't know how well it will hold up to its x86 competitors like this, especially once they launch their 5nm CPUs next year.
>I mean that's kind of expected if you compare a low-power CPU with fewer cores against an unlimited-cooling desktop monster with much more cores.
Are we looking at the same charts here? For the Cinebench multithreaded results, the AMD 4xxx-series CPUs are Zen 2 parts with 15/35W TDPs, hardly the "unlimited-cooling desktop monsters" you described.
From the article: "While AMD’s Zen3 still holds the leads in several workloads, we need to remind ourselves that this comes at a great cost in power consumption in the +49W range while the Apple M1 here is using 7-8W total device active power."
Looking through the benchmarks, the Zen 2 parts generally seem to have lower performance than the M1. The Cinebench multithreaded benchmark is one exception, which isn't that surprising because the 4800U has more cores than the M1 has high-performance cores. The M1 wins the single-threaded Cinebench benchmark.
The Zen2 4800HS also outperformed the M1 in the SPECint2017 multi-threaded results.
The M1's float results are weirdly good relative to the int results, though. Not sure why Apple seems to have prioritized that so much in this category of CPU.
Taking a loop and adding a bunch of `x|0` can also often boost performance by hinting that integers are fine (in fact, the JIT is free to do this anyway if it detects that it can).
The most recent spec is also adding BigInt. Additionally, integer typed arrays have existed since the 1.0 release in 2011 (I believe they were even seeing work as early as 2006 or so with canvas3D).
It's a higher TDP part (I think - it's 35W) and has more high performance cores, so it's not surprising that it would win some of the multicore benchmarks.
In single-core performance, yes, but as the next page of the article shows, in multithreaded performance it's more comparable to the 4900HS, AMD's mobile CPU.
I guess we're talking at cross purposes. I was just making the point that the 4900HS isn't really a competitor to the M1 because it's in a different TDP class. It looks like Apple wanted to stick with one chip for their first generation products, but they could presumably at least throw in a few extra cores if they had another 10-15W to play with.
Well no, but then again there is the 4800U at half the TDP that gets close in single-core performance and wins in multicore performance, so there's pretty much no way they could've beaten it at another power budget.
Indeed, if they were to add a few more cores to their M1, then AMD could have also thrown a few more cores into their 4800, and it would have been a wash.
Have you seen the benchmarks past the first page? The M1 at 7-8W power draw beats or just trails a desktop-class $799 Ryzen 9 5950X at +49W consumption in single-threaded performance. What did you expect?
The 5950X's CPU cores at 5GHz consume around 20W, not +49W. And power scaling is extremely non-linear, such that at 4.2GHz it's already down to half the power consumption, at 10W/core.
The 5950X's uncore consumes a significant amount of power, but penalizing it for that seems more than a little unreasonable. The M1 gets power wins from avoiding the need for external IO for the GPU or DRAM, but those aren't strictly speaking advantages either. I, for one, will gladly pay 20W of power to have expandable RAM and PCIe slots in a device the size of the Mac Mini, much less anything larger. In a laptop, of course, that doesn't make as much sense, but in a laptop the Ryzen's uncore also isn't 20W (see the also-excellent power efficiency of the 4800U and 4900HS).
That doesn't say too much. There is a single-thread performance ceiling that all CPUs built on currently available lithography just bump against and can't overcome. The Ryzen probably marks that ceiling for now, and the M1 comes impressively close to it, especially considering its wattage.
But you cannot extrapolate these numbers (to multi-core performance, to more cores, or to a possible M2 with a larger TDP envelope), nor can you even directly compare them. The Ryzen 9 5950X makes an entirely different trade-off with regard to number of cores, supported memory, etc., which allows for more cores, more memory, more everything... and that comes at a cost in terms of die space as well as power consumption. If AMD had designed this CPU to be much more constrained in those dimensions, and thus much more similar to what the M1 offers, they would surely have been able to drive down power consumption considerably - in fact, their smaller parts, the 4800U and 4900HS, which were also benchmarked and which offer really good multithreaded performance for their power envelope (even better than the M1), clearly demonstrate this.
What I read out of these benchmark numbers is: the ISA does matter far less than most people seem to assume. ARM is no magic sauce in terms of performance at all - instead, it's "magic legal sauce", because it allows anyone (here: Apple; over there: Amazon) to construct their own high-end CPUs with industry-leading performance, which the x86 instruction set cannot do due to its licensing constraints.
Both ISAs, x86_64 and ARM, apparently allow well-funded companies with the necessary top talent to build CPUs that max out whatever performance you can get out of the currently available lithography processes and out of the current state of the art in CPU design.
> What I read out of these benchmark numbers is: the ISA does matter far less than most people seem to assume.
This was my conclusion too. Does this mean there isn't much possibility of desktop PCs moving to ARM anytime soon? Perhaps laptops might move to ARM processors, but even that seems iffy if AMD can come up with more efficient processors (and Intel too, with its Lakefield hybrid CPU).
Yeah, as someone whose next laptop won't be a Mac again, this was a good ad for what AMD has achieved lately. My Lenovo P1 has an Intel Xeon of some kind, and while I'm otherwise very happy with the laptop, the CPU runs hot, uses way too much power, and constantly throttles.
Interesting that your takeaway from all this is "oh, it can't beat some of the top x86 chips in existence—it can only meet them on even footing. Guess it'll be falling behind next year."
This is Apple's first non-mobile chip ever. You think this is the best they can do, ever?
They increased IPC only around 5% with A14. The remaining performance increase was from clockspeeds (gained without increasing power due to 5nm).
Short, wide architectures are historically harder to frequency scale (and given how power vs clocks tapers off at the end of that scale, it's not a bad thing IMO).
4nm isn't shipping until 2022 (and isn't a full node). TSMC says that the 5 to 3nm change will be identical to the 7 to 5nm change (+15% performance or -30% power consumption).
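Taking TSMC's stated per-full-node figures at face value, two full-node steps (e.g. N7 to N5 to N3) compound like this (simple arithmetic on the numbers above, not a prediction):

```javascript
// TSMC's stated gains per full node: +15% performance OR -30% power.
const perfGainPerNode = 1.15;
const powerPerNode = 0.70;

// Compounding two full-node steps:
console.log((perfGainPerNode ** 2).toFixed(2)); // 1.32, i.e. ~+32% perf
console.log((powerPerNode ** 2).toFixed(2));    // 0.49, i.e. ~-51% power
```

In practice you pick a point somewhere between the two extremes rather than getting both at once.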
Any changes next year will have to come through pure architecture changes or bigger chips. I'm betting on more modest 5-10% improvements on the low-end and larger 10-20% improvements on a larger chip with a bunch of cache tweaks and higher TDP.
Intel 10nm+ "SuperFin" will probably be fixing the major problems, improving performance, and slightly decreasing sizes for a final architecture much closer to TSMC N7.
I'm thinking that AMD ships their mobile chips with N6 instead of N7 for the density and mild power savings (it's supposedly a minor change and the mobile design is a separate chip anyway). Late next year we should be seeing Zen 4 on 5nm. That should be an interesting situation and will help resolve any questions of process vs architecture.
I agree that most of the gains were due to the node shrink. However, being able to stick to these tick-tock gains for the last several years is impressive. They could have hit a wall in architecture and been bailed out by the node shrink, but I doubt they would have switched away from Intel if that were the case.
I'd expect NVIDIA to join the ARM CPU race, too. And they have experience with the tooling for lots and lots of cores from CUDA. So I'd expect to have 5x to 10x the M1's performance available for desktops in 1-2 years. In fact, AMD's MI100 accelerator already has roughly 10x the FLOPS on 64bit.
To quote from Ars Technica's review of the M1 by Jim Salter [0]:
> Although it's extremely difficult to get accurate Apples-to-non-Apples benchmarks on this new architecture, I feel confident in saying that this truly is a world-leading design—you can get faster raw CPU performance, but only on power-is-no-object desktop or server CPUs. Similarly, you can beat the M1's GPU with high-end Nvidia or Radeon desktop cards—but only at a massive disparity in power, physical size, and heat.
...So, given that, and assuming that Apple will attempt to compete with them, I think it likely that they will, at the very least, be able to match them on even footing, when freed from the constraints of size, heat, and power that are relevant to notebook chips.
Agreed. Yeah, I should have thought about the Switch and written things more clearly.
I meant that NVIDIA will start producing ARM CPUs optimized for peak data-center performance, similar to how they now have CUDA accelerator cards for data centers, which are starting to diverge from desktop GPUs.
In the past, NVIDIA's ARM division mostly focussed on mobile SoCs. Now that Graviton and M1 are here, I'd expect NVIDIA to also produce high-wattage ARM CPUs.