16-inch MBP 2x slower than M1 MacBook Air in a real-world Rust compile (twitter.com/rikarends)
368 points by DarmokJalad1701 on Nov 17, 2020 | 513 comments


Is anyone else thinking, what the f*ck? Are we in a new era of computing? It certainly feels that way when looking at these desktop class ARM chips, where performance doubled every year or so, just like back in the 80s and 90s.


Normally this wouldn't feel so groundbreaking, but the stars are in alignment: years of work and investment paying off (AMD, Apple, ARM, Nvidia, Amazon), new process nodes (TSMC), and new tech (ray tracing, DLSS, machine learning) are all hitting at the same time.

And that's the big stuff! There are also the steady, ongoing incremental improvements in battery technology, SSDs, and RAM.

You also have an incumbent (Intel) who's lost their way. If they were as scrappy as they were 15 years ago this wouldn't feel like such a banner year.

Yeah, I think we are in a new era of computing.


Not to forget that, apparently, some of the people who drove that scrappy team behind Intel's resurgence have been working on the M1.

This actually reminded me of how I got skeptical comments from people when I told them that my small and scrappy Pentium-M (P-M -> M1, HAH!) based laptop was almost as fast as their desktop P4 monsters in compiling code.


The P4 (and Netburst in general) was a true low-point for Intel. Intel is struggling with fab issues right now, but things were really bad from about '02 to '06, with the P4 not completely going away until '08.


I remember people using socket adapters to put Pentium M chips on their desktop. I also have very bad memories of my P4 Prescott around the same era. What an absolute trash pile of an architecture.


It wasn't actually that bad... Everyone likes to remember it as this massively power hungry beast with terrible performance. It was power hungry compared to what came later, but it did also provide some pretty good performance at the time - especially if you were the kind of tweaky nutter who liked to overclock.

Northwood was a relative bargain when you cranked the bus clocks up well beyond what it said on the box.


Ray Tracing

I'm not someone into the inner workings of chips much, but is "ray tracing" a new term used for something in microprocessors now? Or is this the same graphics "ray tracing" we were doing back in the '80s on Amigas and Atari STs?


We're crossing the threshold this year where real-time ray tracing in hardware isn't just some theoretical concept, it's actually useful and available in affordable consumer hardware (NVIDIA and AMD GPUs, as well as the PS5 and new Xbox all have it).

Yes, the NVIDIA 2xxx RTX series had it two years ago, but this is the year where it's actually viable and not so gimmicky.


It's also shipping in consoles this generation, which is going to drive a lot more games to actually implement it. When it's only being used by 5% of the PC userbase, maybe you don't bother doing that work. If cheaper GPUs can push that up to say 20% of PCs next year, you still might not.

But when every PS5 and XSX has raytracing hardware, suddenly it makes sense. That's going to be helpful for getting it supported in PC titles sooner.


We've been able to do realtime ray tracing in software since forever though, and Shadertoy and the like have been full of hardware-accelerated demos; that's not so interesting.

I remember playing with a number of demos on my intel core 2 duo macbook (not pro) a decade ago.


They weren't doing that in hardware real time accelerated at 4k resolution in AAA games. This year they are. That's a big leap from your core 2 duo demos.


The new GPUs have hardware specifically designed to do the type of math that raytracing does, such as ray/volume intersections, faster or more efficiently than generic shader hardware. Sufficient quantity (FLOPS) can become a quality of its own.
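To make that concrete, here's a minimal sketch (in Rust; the types and values are just illustrative) of the kind of primitive the RT hardware accelerates: a ray vs. axis-aligned bounding box test, which BVH traversal runs an enormous number of times per frame.

```rust
// Minimal ray vs. axis-aligned bounding box test (slab method) -- the kind of
// primitive that BVH traversal hammers on and that RT hardware accelerates.
struct Ray {
    origin: [f32; 3],
    inv_dir: [f32; 3], // precomputed 1.0 / direction, component-wise
}

struct Aabb {
    min: [f32; 3],
    max: [f32; 3],
}

fn ray_hits_aabb(ray: &Ray, b: &Aabb) -> bool {
    let mut t_min = f32::NEG_INFINITY;
    let mut t_max = f32::INFINITY;
    for axis in 0..3 {
        // Intersect the ray with the two planes bounding this axis ("slab").
        let t0 = (b.min[axis] - ray.origin[axis]) * ray.inv_dir[axis];
        let t1 = (b.max[axis] - ray.origin[axis]) * ray.inv_dir[axis];
        t_min = t_min.max(t0.min(t1));
        t_max = t_max.min(t0.max(t1));
    }
    // The ray hits the box if the entry point is not past the exit point.
    t_max >= t_min.max(0.0)
}

fn main() {
    // Ray from (-5,-5,-5) along direction (1,1,1); inv_dir is (1,1,1).
    let ray = Ray { origin: [-5.0; 3], inv_dir: [1.0; 3] };
    let b = Aabb { min: [-1.0; 3], max: [1.0; 3] };
    println!("hit = {}", ray_hits_aabb(&ray, &b));
}
```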


Realtime ray-tracing is a mix of hardware acceleration for ray intersections and de-noising. The de-noising approaches are certainly novel and not really analogous to previous methods, imho, though the accelerated bounding box hierarchies for accelerating intersections certainly appear in previous iterations.


I think this is the key answer: de-noising. Using various neural techniques finally allowed acceptable images to be produced from sometimes <1 ray/pixel, where images produced at such densities were previously nearly unusable.

This is still not perfect, and many ray tracing techniques rely on accumulation over time, which keeps certain scenes from working well (I imagine ray tracing a small particle cloud, or fast-moving objects, would be worst-case scenarios).


The new ray tracing is being done in real time rather than being baked into the scenes.


I did not read it as a list of technologies used in chip fabrication, but as a list of “killer apps” driving the need for more performance.


It's the same graphical ray tracing, although with many modern improvements like path tracing for randomization of the ray bounces to efficiently approximate and converge on the correct lighting, and ML-powered denoising and upscaling algorithms that take low-resolution fast rendering and transform it to higher quality and detail.
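As a rough illustration of why low sample counts are noisy and why the denoisers matter, here's a toy Monte Carlo sketch in Rust (the "lighting" function is made up purely to show the averaging; it is not how a real renderer works):

```rust
// Toy Monte Carlo illustration: a path tracer estimates each pixel as the mean
// of random samples, so noise falls off roughly as 1/sqrt(samples). At 1 sample
// per pixel the estimate is very noisy, which is what the ML denoisers clean up.
fn xorshift(state: &mut u64) -> f64 {
    // Tiny PRNG so the sketch needs no external crates.
    *state ^= *state << 13;
    *state ^= *state >> 7;
    *state ^= *state << 17;
    (*state >> 11) as f64 / (1u64 << 53) as f64
}

// Stand-in for "trace one random light path": a function of a random number
// whose true mean is the pixel's correct brightness (here, 0.5).
fn one_random_path(rng: &mut u64) -> f64 {
    let u = xorshift(rng);
    u * u * 1.5 // arbitrary nonlinear "lighting"; expected value = 0.5
}

fn estimate_pixel(samples: u32, rng: &mut u64) -> f64 {
    let mut sum = 0.0;
    for _ in 0..samples {
        sum += one_random_path(rng);
    }
    sum / samples as f64
}

fn main() {
    let mut rng = 0x1234_5678_9abc_def0u64;
    for &spp in &[1u32, 4, 64, 4096] {
        println!("{:5} spp -> estimate {:.3}", spp, estimate_pixel(spp, &mut rng));
    }
}
```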


The latter. Systems have become fast enough recently to do realtime ray tracing.


It's a common enough workload that dedicated hardware for it has emerged, and software written to utilize that hardware is becoming more commonplace.


Intel in 2020 is IBM in 1980.


So in 2060 it will still be a huge company with $20B in revenue?


Yeah. When I'm guessing how long something will last, I guess it's about halfway through its life unless I have really good information to the contrary. Intel was founded in 1968, so let's say it will shut down in 2072 (give or take...many...years). And Fortune 500 companies don't seem to totally shrink back to startup head count and revenue even when they go bankrupt—they might have significant revenue but even more expenses. So it seems pretty reasonable to guess Intel will be around with (at least) $20B of revenue in 2060, in spite of their rough patch today.


> When I'm guessing how long something will last, I guess it's about halfway through its life

That is called Lindy effect: https://en.wikipedia.org/wiki/Lindy_effect


>>So in 2060 it will still be a huge company with $20B in revenue?

That's great for shareholders.

Meanwhile, keep in mind that IBM exited the consumer computing device market. If that's also in store for Intel, then it's a bit pointless from the PoV of those in the market for consumer computing devices.


Past performance does not indicate future performance, but there are plenty of reasons to think Intel might be around for some time to come.


That'd be quite a decline, last year Intel reported $72B in revenue...


As soon as they switch to a services business.


No, Intel is back to 2002 with its P4. Give them a year or two and they'll be back with a new CPU. Don't forget that they have deep enough R&D pockets.


That's what I thought 3 years ago. It hasn't happened.


That's overlooking much of TSMC's and Intel's current fab issues.


And yet everything still feels slower after every year lol.


Note that the comparison is also compiling to ARM vs compiling to Intel, so it's partly to do with differences in the compilation process between the two architectures.

https://twitter.com/rikarends/status/1328762958118346753


I'm also curious to learn where the bottleneck was in both tests.

I've read stuff about how the M1-packing MacBook Air shipped with an SSD whose burst write speed was far higher than the one shipped in older MacBooks. The bottleneck on build jobs tends to lie in disk access, especially with projects composed of a significant number of small files whose build also outputs a bunch of small files.

This is one of the reasons behind doing builds on RAM drives.

If that's the case then these weird speedups might not be due to magic properties of Apple's M1 processor but due to the fact that the processor doesn't idle as much while waiting for all those reads and writes to finish.

If anyone has any data on this, please do share.
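For anyone who wants to poke at this themselves, a crude small-file IO probe is easy to write. This Rust sketch (the scratch directory, file count, and sizes are arbitrary, and it's nowhere near a rigorous benchmark given the page cache and filesystem effects) just times writing and reading back a few thousand object-file-sized files:

```rust
// Crude probe of small-file IO (the pattern a build tends to generate):
// write and read back a few thousand small files and time it.
use std::fs;
use std::io::Write;
use std::time::Instant;

fn main() -> std::io::Result<()> {
    let dir = std::env::temp_dir().join("smallfile_probe"); // scratch location
    fs::create_dir_all(&dir)?;
    let payload = vec![0u8; 16 * 1024]; // ~16 KiB, roughly object-file-sized

    let start = Instant::now();
    for i in 0..2000 {
        let mut f = fs::File::create(dir.join(format!("obj_{}.o", i)))?;
        f.write_all(&payload)?;
    }
    println!("wrote 2000 x 16 KiB files in {:?}", start.elapsed());

    let start = Instant::now();
    let mut total = 0usize;
    for i in 0..2000 {
        total += fs::read(dir.join(format!("obj_{}.o", i)))?.len();
    }
    println!("read back {} bytes in {:?}", total, start.elapsed());

    fs::remove_dir_all(&dir)?;
    Ok(())
}
```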


The SSDs are faster, but only by ~20%: https://techcrunch.com/wp-content/uploads/2020/11/BlackMagic...

From the TechCrunch review, which is pretty breathless but also contains a lot of good data: https://techcrunch.com/2020/11/17/yeah-apples-m1-macbook-pro...


Sequential performance is only one facet of overall SSD performance. Some older Apple SSDs weren't the greatest at random reads and writes - it's possible these new ones are quite a bit faster in that department, FWIW.


Can’t you just instruct the compiler to compile for Intel on ARM and vice versa? We do this all the time, build ARM artifacts on an x86 CPU with GCC.


Yes, but that isn't what was done in this case.


Well that pretty much settles the matter, doesn't it.


This should be higher up. A fairer comparison would be compiling for the same target.

There could be all manner of processor optimisations that are taking time to process. Like for like would be much more indicative.


You can get better optimizations with CISC. If anything, compiling to a RISC ISA and expecting the same optimizations is actually tougher.


So though it's a new age in some ways, I don't think it will benefit many people. It will just increase the churn. Witnessing all the cries of "16GB of RAM isn't enough!" makes it quite apparent that software behaves like a gas rather than a liquid, and will expand to consume whatever hardware is available to developers.

I think that for consumer products, which these are targeting, software responsiveness, usability, and battery lifetime are by far the most important metrics.

These chips help with battery life, and can help with software responsiveness if there is developer focus on it.

But what will probably happen is that development teams will buy the fastest computers they can, develop software that performs only somewhat adequately on that beefy hardware, and then ship it to customers who are on weaker hardware.

This effect is especially pronounced for web software, where it's easier to create unresponsive interactions due to the many layers of software, especially since developers' network connections usually have 10x-100x lower latency than users'.


I do not agree. Look at the AnandTech Speedometer 2.0 metric. https://www.anandtech.com/show/16252/mac-mini-apple-m1-teste...

The 8GB Macbook Air at $899 educational is faster, and will feel faster, than any laptop that anyone has owned or thought of owning at that price point.

Millions of buyers who need laptops for *-at-home activities will sing its performance praises on Sheets and Salesforce.

The "I need 16GB crowd" of content creators need more RAM and GPU but they are a tiny fraction of the market for laptops.


I guess my point is that any such gains are temporary and will soon disappear with the next release of software, or with the next website redesign.

MS Office apps, for example, are horrifically unresponsive on Macs. Switching the ribbon to a new view has 700-1000ms of lag on my 2.4 GHz i5. Maybe an M1 brings it to 350ms. Once MS developers start developing on an M1 laptop, the developers will change code, and it will slow down, and until it gets slower than it currently is, the code will not be optimized.

This is what I mean about software being like a gas rather than a liquid. Any new CPU performance will be consumed by developers because their threshold for performance optimization changes with each new performance improvement.


I run MS Office local apps on my 2018 MBP 15" 6-core. They are slow. I agree the M1 will never make them feel better. Neither will the M2 or M3. They will always be slow.

If they were ever going to be fast, they would already be fast. They are a software problem unto their own.


Searching on the Costco website has a 2-3 second delay between each entry for me.

I think there are two routes to making software faster for users: 1) intense education of developers and rewarding them properly for keeping software responsive, and 2) only letting them develop on 5-10 year old hardware. I'm not really sure 1) would work with many teams, but I'm pretty sure 2) would.


MS Office apps have always been (possibly intentionally?) horrifically bad on Mac. Not a great benchmark IMO.

The average Joe user just uses a browser and something like Spotify. Even most word processing by college students is in Google Docs now - very few people I knew bought MS Office for their Macs when I was in college 5 years ago, even with a $99 student license through the school.


It's a perfect benchmark because even 20 years later performance is only getting worse, rather than better.

Developers will use all available resources until there is pressure to be more efficient. This is not a critique of developers; this is the nature of software. Unless critical development time is spent making sure that software is responsive, it will only ever have barely acceptable performance.

Which is why new, faster CPUs have very little effect on users. Any performance gains will be gobbled up by new software frameworks that promise better use of developer time, but which may come at an absolutely tremendous cost of UI responsiveness.

Spotify, Slack, Office on Mac, hyper complex JavaScript web frameworks... all will continue to take more and more CPU cycles that are available.


"What Andy giveth, Bill taketh away".

https://en.wikipedia.org/wiki/Andy_and_Bill%27s_law


Definitely agree with you. Current software was developed for a 2x slower spec, and tested and tuned against a 2x slower spec. The speed gain will disappear once the development platform changes to the M1; instead there will be more functionality and more animation in the interface, as has happened in past years.


The CPU gains on ARM have been increasing consistently year over year for the past decade. People have posted benchmarks of the A11/ A12/ A13 versus Intel for a while so this has been pretty obvious. It's just surfacing because suddenly we have a desktop CPU with desktop software like compilers and other things where it's more obvious outside of benchmark tools.

Apple is just moving the Mac onto their existing ARM trajectory, which has surpassed Intel. Once they've migrated all their lines to ARM, the performance gains will look more like they have on the iPhone/iPad over the past few years: mostly 20-30%/year.


IMO, this has little to do with it being ARM. 30 years ago ARM had a significant microarchitectural advantage in performance per watt, but in this era of 10 billion transistor chips, that advantage has disappeared. x86_64 rationalized the x86 architecture, and decode is such a small fraction of the power budget that it really doesn't matter anyways.

What does matter, IMO:

- assembling a killer team

- 5nm process

- high speed, low latency DRAM

- big-little


I'm no expert, but the only big architectural differences are a massively larger decoder and a reorder buffer that's several times as large as in x86 designs.

If these are actually the reasons for the performance difference, and it's difficult to do these on x86 because of the instruction set, it seems to this amateur that ARM64 really does have an advantage over x86.


Don't forget ARM's more relaxed memory model vs. x86's TSO.


One of the reasons Rosetta 2 works so well is Apple silicon sticks to the more restricted x86 memory model.


Does it? Apple's documentation seems to disagree [1]:

"A weak memory ordering model, like the one in Apple silicon, gives the processor more flexibility to reorder memory instructions and improve performance, but doesn’t add implicit memory barriers."

[1] https://developer.apple.com/documentation/apple_silicon/addr...
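A minimal Rust sketch of where this shows up in practice: the classic message-passing pattern relies on ordering between two stores. x86's TSO preserves that order at the hardware level for free, while on ARM's weaker model the Release/Acquire pair is what inserts the needed barrier. (Portable code should use Release/Acquire either way, since the compiler may also reorder relaxed operations.)

```rust
// Message-passing pattern that leans on ordering between two stores.
use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};
use std::thread;

static DATA: AtomicU64 = AtomicU64::new(0);
static READY: AtomicBool = AtomicBool::new(false);

fn main() {
    let producer = thread::spawn(|| {
        DATA.store(42, Ordering::Relaxed);
        // Release: everything written before this store is visible to a
        // thread that observes READY == true with an Acquire load.
        READY.store(true, Ordering::Release);
    });

    let consumer = thread::spawn(|| {
        // Spin until the flag is observed; Acquire pairs with the Release above.
        while !READY.load(Ordering::Acquire) {
            std::hint::spin_loop();
        }
        // Guaranteed to see 42 on both x86 and ARM.
        assert_eq!(DATA.load(Ordering::Relaxed), 42);
    });

    producer.join().unwrap();
    consumer.join().unwrap();
    println!("ok");
}
```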


It's switchable at runtime. Apple silicon can enable total store ordering on a per-thread basis while emulating x86_64, then turn it back off for maximum performance in native code.

Here's a kernel extension someone built to manipulate this feature: https://github.com/saagarjha/TSOEnabler


Couldn't Intel just come out with a new set of reduced-complexity instructions that run on a per-process basis based on some bit being flipped on context switches? Then legacy apps would run fine, but the new stuff would work too. This seems not that hard to address.


As I understand it, the challenge to making wider x86 chips is the mere existence of some instructions. Adding new instructions can't help with that. But I'm just repeating what I heard elsewhere:

> Other contemporary designs such as AMD’s Zen(1 through 3) and Intel’s µarch’s, x86 CPUs today still only feature a 4-wide decoder designs (Intel is 1+4) that is seemingly limited from going wider at this point in time due to the ISA’s inherent variable instruction length nature, making designing decoders that are able to deal with aspect of the architecture more difficult compared to the ARM ISA’s fixed-length instructions.

https://www.anandtech.com/show/16226/apple-silicon-m1-a14-de...


I find that odd. Don't they have some sort of icache? Intel could decode into a fixed-width alternative instruction set inside the icache, then use a wider decode when actually executing.


Yes, they have a cache for decoded operations. It'll hold a certain number, but it's sort of inefficient because the fixed-width decoded instructions are a lot larger than the variable-length ones, so it doesn't hold too many. Because it doesn't help on code with large footprints and little time spent in inner loops, you don't necessarily want the number of ops you can get from it to be much more than the width of the rest of the system if you want a balanced design.


The ISA differences between ARM and x86 do not account for the difference in performance; there are multiple factors here (process, SSD, memory bandwidth, cache, thermal reservoir, etc.).

While this is wonderful for ARM in the near term, we just moved from walled ISAs to a plurality of ISAs; compute just became a bulk commodity in a way that it could not under an x86 duopoly.

Anyone can now take off the shelf RISC-V designs that are currently at > 7.1 coremarks/mhz and get them fabbed on Glofo or TSMC. If you need integrator help, you can use the design services of SiFive.


There’s not a shred of evidence RISC-V can approach the levels of performance discussed in this thread. There’s a lot of “big implementations can potentially do X” hand waving in RISC-V land, and not much real silicon to show for it.


Yes, a combination of things around Apple's A series CPUs.

ARM has been improving much faster than Intel.

Apple has been executing ARM much better than anyone else.

Apple's auxiliary processors and integration have been top notch.

TSMC has been crushing Intel in getting to 5nm.


Chalking it up to ARM doesn't cut it... other companies make ARM chips too, including Qualcomm. Most Android phones run ARM, and given they outnumber iPhones heavily you'd expect massive improvements. But this is better than putting Qualcomm's best chip into the Mac.


Qualcomm spend their time and energy on gaining monopoly lock-ins via standards committees, not building better chips.


Qualcomm seems to be stagnating just like Intel at the moment; it will be a few years until high-performance ARM chips come to non-Apple devices, unless Microsoft can strong-arm them for their Surface line.

I'm guessing the high-performance ARM chips for non-Apple devices will be coming from Nvidia or Samsung in the future.


Yes, ARM is one piece of many. (Commented about this above)

Though I suspect that if Qualcomm were able to source a TSMC 5nm chip, it would be more competitive with Apple than Intel is at this point. Apple has a lot of other things going for it where Qualcomm lags (the Secure Enclave, graphics performance, audio and photo processing, the neural engine, etc.)


Debatable. Qualcomm operates under a strict transistor budget because their chips lack a dedicated customer willing to pay what it costs to develop an ultra-wide CPU like this. Apple knows they're going to sell 100+ million of whatever core they make so they're able to more easily amortize and justify the costs of development.

Intel gets no such benefit of the doubt. I have no idea what on earth is going on over there.


Agreed.

What I was trying to get at is that the ARM designs plus the TSMC fabs are a big part of Apple's success here. The pieces are out there where someone else could put together an ARM based package that's more competitive with Apple. In retrospect, maybe it's more likely to see something like this from Nvidia than Qualcomm.

Even then, it's hard to say how competitive that CPU would be. Just based on Microsoft's Surface with its half-assed Qualcomm CPU, it seems feasible though.


I think another shock here is that a lot of people discounted the capabilities of ARM CPUs as well.

Back when the iPad Pro with the A10X came out, Apple claimed it was faster than half of all laptops sold, and people in the PC space were yammering on and on about how numbers don't show how much better x86 CPUs are at 'desktop stuff' and that ARM CPUs can't equal x86, even with the same thermal envelope, and shouldn't ever be compared. Ironically, many are now stating that the reason why they are so good is because of ARM, which isn't true either lol.


>people in the PC space were yammering on and on about how numbers don't show how much better x86 CPUs are at 'desktop stuff' and that ARM CPUs can't equal x86, even with the same thermal envelope, and shouldn't ever be compared.

It needs 30W at 4 cores at 3.2GHz. Ryzen needs around 5W per core, but it's on a worse process. The entire system does use less power than an x86 system, but that has nothing to do with the processor. It's more about how the SoC is arranged and that the RAM is (almost) on the same package. It means they can get away with higher bandwidth and lower power consumption for the entire system.

The idea that it's all about the processor is completely wrong. Yet all we have heard is how fanboys cry it's going to be 3x faster than desktop CPUs because of misleading TDP numbers.


As far as we know, all four large cores at max plus the 4 small is ~20W. Whole chip max power use is 30W including GPU and the ML processor. Ryzen also blows a lot of power on things other than cores but AMD is absolutely the closest to this however. The hard thing for them is that the Big/Little arch is a huge advantage for battery use at idle. I would say the game being played here is that Apple is betting on this to scale all the way up for fast burst but they know that the real advantage is that their cores can also scale much much lower than anything out there. Less about magical performance gains and more about remarkable power use paired with much better power management lessons learned by making smartphones. Qualcomm could do this too if they actually cared about it.


I was until I tried doing the test myself. It takes 82 seconds to compile on my i5-4200M.

I'm not sure this test is deserving of the breathless headline and commentary, especially since the original tweeter later follows up with:

> Extra info: The M1 macbooks (air/pro) can't drive 2 external screens, and the air throttles a bit after 3+ minutes sustained compute (20-30%)

https://twitter.com/rikarends/status/1328753176552632321

I'll be more interested if the M1 can compile something 2x quicker than the i9 when the compile time on the M1 exceeds 30-60 minutes rather than being less than a minute.

EDIT: to be clear, I expect the M1 to feel faster than the i9 for the vast majority of users, however the headline is "in a real world Rust compile", implying that this is a more valid test than synthetic benchmarks. I take issue with that, as I don't really consider something that compiles in less than 2 minutes on 6 year old hardware to be a much more useful test than the benchmarks.

We already know the A-series of chips performs incredibly in short workloads. We have no information yet on how it performs under sustained workloads.


>> We already know the A-series of chips performs incredibly in short workloads. We have no information yet on how it performs under sustained workloads.

What makes you think that given sufficient cooling, it will not perform exactly the same as the M1 in the MBA but sustained? It’s not like the ARM architecture changes anything in the thermodynamics of cooling cpus compared to an x86 chip, right?

I’d wager that under load an i9 with passive cooling wouldn’t even last 30 seconds without throttling below even its base clock, if it doesn’t just shut down to prevent frying itself


It's in the same ballpark of power efficiency as AMD's x86 chips. It's slightly more efficient because it's on 5nm vs AMD's 7nm, but if you scale it up to desktop frequencies it's going to consume about the same amount of power as Ryzen CPUs.


Yes but the point is that Apple doesn't need to scale the M1 up to desktop frequencies, because it already is faster than x86 in single-threaded workloads, at lower clocks and significantly lower power. To scale up the multithreaded performance they just have to increase the core count and scale the cooling system accordingly, ie: exactly like you would have to for x86. A decent desktop cooler can dissipate enough heat to run CPU's with 100+ Watt TDP's, while the M1 in the current Mac Mini sits around ~20W estimated if you discount the RAM.

So again, what would make anyone think that an M1 with decent cooling would not be able to maintain the current ST performance indefinitely, or that a hypothetical 8+8 or even 16+16 core M1X or M2 with a TDP of 100W and a top-notch cooling solution would be impossible?


Multicore scaling on M1 doesn't appear to be as efficient as that on Ryzen - for example, the 4200U is able to beat the M1 in multicore tasks at quite a similar power draw, but gets soundly beaten in single core. Ryzen's big advantage over previous multicore implementations was infinity fabric - so clearly there's more to scaling than just the actual compute cores.

Don't forget that scaling up is also not just about frequencies, there are also packaging considerations - the CPU dies have to actually be able to dissipate the heat generated, and the package itself has to be able to do so as well. I'd expect that this is something AMD and especially Intel have a leg up over Apple with - although, considering they've already tread that ground it makes Apple's job a bit easier too.


What makes you think that given sufficient cooling, it will perform exactly the same as the M1 in the MBA but sustained?


It's an Apple chip that will only ever be in Apple computers.

Asking about how it would do in a computer with sufficient cooling is about as relevant as asking how it would do in a computer with a usable keyboard or OS.



It's a shame we have a headline proudly announcing that the M1 is 2x faster, when the reality is it's about 8% faster when doing a similar real world test for longer.

What's also hugely impressive is that under better cooling conditions, it's also 25% faster.

None of these numbers capture headlines like 2x sadly, but that's still massively impressive.


For those without context, WebKit apparently takes 30+ minutes to compile on Intel, as of 2018:

https://blogs.gnome.org/mcatanzaro/2018/02/17/on-compiling-w...


I think this just shows how little improvement there has been over the last years. However, getting twice the performance with a fanless design looks interesting, even if there is throttling after a while. For what I do, I don't regularly wait for compiles that take ages, but having a silent computer without losing performance is a net win.

Curiosity got the best of me too and I ran the test on my late 2013 MBP, 2.3 GHz i7, 16 GB ram. Compilation took 44 seconds and the fans didn't even spin up (with 23 ºC ambient temperature).

A little further down the thread [0] he gives the actual numbers, which are around 20s on the M1, which puts the i7 [1] at around 40s.

I'm not sure how much of this is Rust specific, but for my own projects I haven't noticed a big difference between my mbp, an old i7-3930k and a newer i5-8500. The MBP is somewhat slower, but it only has 4 cores while the others have 6.

[0] https://twitter.com/rikarends/status/1328706132752347138?s=2...

[1] A tweet corrects the MBP CPU as being an i7 and not an i9.


This is the Apple equivalent of a Tesla showing up and out competing much more expensive cars on the performance. It feels like a huge disruption and probably Apple’s chance to gain significant PC market share.


> the Apple equivalent of a Tesla showing up and out competing much more expensive cars on the performance

Which was Tesla's equivalent of Jobs walking on stage with the iPhone, itself an homage to his iMac G3 turnaround, in turn a recapitulation of his promotion of the Apple II.


The issue with this is that Apple has the same problem that they have had for years - they don't really offer mainstream products.

Even their most inexpensive products are on the higher end of the pricing spectrum, and they could be 5x the performance but it still wouldn't matter.

There's a reason why Chromebooks are so popular and it sure isn't anything to do with their performance.


The difference being that Apple can actually manufacture the numbers needed to make a difference.


I’m guessing the performance improvements derive from integrating the memory onto the same chip (instead of using external memory), not from ARM (although power savings come from ARM). So we will probably see a new era of laptop SoCs, but that also means coupling RAM with CPU (or maybe you can mix and match the on-chip RAM with external RAM?).


I’m guessing the performance improvements derive from integrating the memory onto the same chip

Nope, LPDDR4x-4266 is LPDDR4x-4266. Apple, Intel, and AMD all have access to the same RAM. The Firestorm core is the real advantage.


A single core being able to fully saturate LPDDR4X bandwidth seems pretty advantageous.

https://www.anandtech.com/show/16252/mac-mini-apple-m1-teste...

“One aspect we’ve never really had the opportunity to test is exactly how good Apple’s cores are in terms of memory bandwidth. Inside of the M1, the results are ground-breaking: A single Firestorm achieves memory reads up to around 58GB/s, with memory writes coming in at 33-36GB/s. Most importantly, memory copies land in at 60 to 62GB/s depending if you’re using scalar or vector instructions. The fact that a single Firestorm core can almost saturate the memory controllers is astounding and something we’ve never seen in a design before.”
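For a rough sanity check on your own machine (nothing like AnandTech's methodology, just a ballpark), a single-threaded copy loop over a buffer much larger than the caches gives a usable number:

```rust
// Crude single-core memory-copy bandwidth probe: copy a buffer well past any
// cache a few times and report GB/s.
use std::time::Instant;

fn main() {
    const BYTES: usize = 256 * 1024 * 1024; // 256 MiB, far bigger than the caches
    let src = vec![1u8; BYTES];
    let mut dst = vec![0u8; BYTES];

    let reps = 8;
    let start = Instant::now();
    for _ in 0..reps {
        dst.copy_from_slice(&src); // compiles down to a memcpy
    }
    let secs = start.elapsed().as_secs_f64();
    let gb = (BYTES as f64 * reps as f64) / 1e9;
    println!("copy bandwidth ~ {:.1} GB/s (first byte: {})", gb / secs, dst[0]);
}
```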


You do get a power benefit from keeping the RAM in-socket rather than having to go out over more wires to reach the RAM.


Isn't it attached via a wider bus?


The memory bandwidth you get on the M1 is around what you would expect from a dual channel desktop with good RAM. So basically twice as fast as competing laptops.


Laptops have the same or faster RAM than desktops now.


Ah, well I'm happy to be mistaken.


Yes. Exactly. I haven't been this exited about a chip in a really long time. Decade-long, probably.

And I'm really hopping Intel manages to get its act together. Competition is great for everyone.


I'd make a bet on Intel not turning around without new leadership at least willing to fling a gravitationally-significant pile of money at TSMC in the short term and significantly improve their pipeline in the medium to long term. They're so far behind now they're basically betting on everyone else screwing up. It's not just their position, though, it's their rate of change... their chip releases aren't getting faster as fast as AMD's or ARM's are, not by a long way.


They need a Satya to replace their Ballmer. Immediately.


well don't exit! stick around man.


No matter how much I spend on keyboards, I still can't type.

But thanks, I will :)



I thought this for a few seconds and then realized these sorts of benchmarks are lacking obvious controls (i.e. compiling for different architectures as noted by others).

If you want a more realistic sense of how the M1 performs relative to x86 peers in raw, equivalent workloads, there are some cinebench numbers appearing out there:

https://hexus.net/tech/news/cpu/146878-apple-m1-cinebench-r2...

I am way more interested in non-accelerated, deeply out-of-order instruction processing capabilities if we are talking about a "new era of computing". Being able to compile to ARM faster than x86_64 and having super fast HW codecs for processing special unicorn byte streams are not very compelling arguments in this context.

Show me an ARM chip scaled up to the same power+price budget as an Epyc 7742, throw them both at a 24 hour SAP benchmark, and then I will start paying attention if the numbers get close.


People are celebrating a kickass electric bicycle doubling the range of all the existing ones, and you demand to be shown it scaled up to the same power+price budget as an open pit mining truck, hauling dirt in three shifts. It might or might not happen, but the electric bicycle is still impressive.


Well, it's more like people celebrating an electric scooter that has twice the range of the electric bike.

Not saying it isn't a novel form of transport or equally useful in most cases, but... we're comparing a very stripped-down SoC to systems that have vastly more complexity for several different reasons - not least the ability to support more modular CPU/memory/GPU configurations.

Rework the existing x86 cores that we have into similar configurations and we'll likely see pretty substantial efficiency gains there too.


Couldn't agree with you more. For the first time in who knows how long, I can confidently say that we made a leap into a new generation of computer power. It's a huge step up from the small marginal improvements we've been seeing over the past few years.


Completely! Worth the wait given it took about a decade for Arm chips to suck less ;)


It certainly feels that way! Also, imagine if they keep pushing it like the A series.


I suspect HP/Lenovo/Microsoft are squirming in their chairs right now.


I'm not so sure. Some people simply need Windows and now you can't Bootcamp on ARM Macs so people will take the best Windows laptop instead.


Who knows, they might continue to offer an Intel MBP for years to come, or even add Windows ARM support if it commands enough demand. The other manufacturers have no moat here.


Windows on ARM is not real Windows. It's not a substitute. Apple might command its developers like a herd, moving them to whatever fancy new thing they make, but Windows is much more fragmented, with 20+ years of backwards compatibility and tons of unmaintained software that won't be magically ported.


Current Windows on ARM seems promising with its x86 emulation so far, but it still has a lot of room for improvement. Combined with Windows 10 S sandboxing old Windows apps, I can see Microsoft catching up again in a few years once we have more widely available high-power ARM processors.

In IT though, I don't expect any changes over to ARM hardware for another decade. I know where I work we have plenty of legacy cruft which would probably run into some weird edge cases if emulated on ARM.


Qualcomm chips are not that far behind Apple's, and Microsoft has had Windows on ARM ready for a while now. That performance advantage won't be there for long.

I think the bigger question is what does this spell for x86-64?


Uh, unless there's some new Qualcomm chip I haven't heard of, the Qualcomm chips are all being utterly crushed by Apple's offerings in geek bench and specperf.


Guess we'll see (maybe) in December (the Snapdragon 875 will be announced December 1st at Qualcomm's Digital Summit) - or a bit later, but the 875, now also on 5nm, supposedly will be quite a bump. Leaked (supposed) benchmarks show 30%+ improvement, which would be A14 territory.

I also have my doubts, but would be great for the market. (New Exynos 2100 supposedly also being up there.)


A 30% improvement to the 865+ starts getting close to A13 single-threaded performance, on 7nm from a year ago.

So Qualcomm won't be quite so far behind Apple, but it's still pretty significant.


Sure, more interesting than just pure CPU performance will be improvements to GPU, image processing, ML and others though - at least in my eyes.

If we'll just look at geekbench's single core bench, I'm sure Apple will still lead. (And overall likely still produce the better chips.)


Qualcomm chips don't have on-chip DRAM at the highest clock rate possible. If they did the same things Apple is doing in that way, the performance advantage of M1 shrinks. There's no real "magic" in the M1 or Apple, it's a reality distortion field that is hitting everyone right now. It's entirely possible to outperform the M1, there just hasn't been a need to do that until now - no major OS would have benefitted from an ARM on steroids before now so there wasn't even a need for it.


[flagged]


If you continue to degrade the quality of this site even further, we will have to ban your account. We've asked you about this more than once already.

https://news.ycombinator.com/newsguidelines.html


Where have you asked me more than once? Please point me to that. If so, then I'm sorry, but I don't remember being told anything like that before this comment.


It was here: https://news.ycombinator.com/item?id=24475231

Sorry I said "more than once" — I didn't notice those two comments were in the same thread.


It's pretty amazing to me that you consider that comment "mean". How do you ever interact with the world if such a benign comment is considered "mean"? I'm serious, I just can't understand how that comment is "mean". Please try to explain it, because it makes no sense to me. Is any criticism at all considered "mean" now? Can anyone point out flaws in someone else's comment or post without being called "mean"?


> I think the bigger question is what does this spell for x86-64?

Not much?

People are acting like the M1 destroys x86, but as AnandTech showed in the recent benchmark, the M1 is trading blows with Zen 3 in single-thread performance while having a much larger core and a process advantage (5nm vs. 7nm), thus actually being more expensive to produce.


Plus I would like to remind people that there were points where PowerPC Macs were extremely impressive, and that didn't magically reorder the world because existing software and OS ecosystems mean a hell of a lot more than just straight performance.


The only Zen 3 chips are desktop class, so Apple’s first generation ultra book oriented processor is trading blows with the best line of desktop processors. That’s a pretty big deal!


Note that apple has a process advantage here. AMD's 5nm chips will probably see a decent perf boost.


Zen 3 mobile chips are probably coming in the next months.

Single core performance does not differ much between mobile and desktop CPUs these days.


My concern is not with performance but with the seeming energy-consumption advantage. If ARM chips perform just as well as (or, as we've just seen, better than) x86 chips while sipping much less power, why would anyone want x86 in mobile devices?

Are there any workloads that require or perform better on x86?


This is clear only when you compare against an obsolete Intel part which is known to be a throttling power hog.

The difference might be way smaller or non-existent once we see some comparisons with the latest AMD offerings (especially once we also have CPUs on a comparable process).


Yeah, Anandtech's dive revealed some interesting benchmark results - such as the Zen 2 4200U part being quite close in both performance and power consumption (a little higher) to the M1 in heavily multithreaded tasks, but getting stomped in both aspects in single-threaded ones.

The M1 isn't a tiny power-sipping mobile part - it sucks power down just like AMD's 15W TDP CPUs do. The efficiency gains Apple are getting here are likely to be the result of several factors, only one of which are the CPU cores themselves.


Do we have numbers on how big the M1 is in square mm terms?


The price of the 5950X is nearly the same as the Mac Mini. Estimates on actual cost of the M1 is around $75-100.


They could have used 5600X with virtually the same single core performance as well.

You also can't compare manufacturing cost and retail cost. But given AMD cores are both smaller and manufactured on older process, it's probably safe to assume AMD ones are cheaper to manufacture.


I was waiting for AMD Zen3 to build a new desktop. But now that these tests are out, I am very, very tempted to go with Apple again. The Mac Mini numbers look comparable enough and it will be cheaper than an equivalent Windows machine (never thought I'd see this timeline).


If you can, wait. We don't know what sustained loads are going to look like, or even medium term reliability.


As they should be. Windows laptops have been embarrassingly stagnant for over a decade; they just copy Apple's innovations. This isn't an innovation that they will be able to release in their next product refresh. They're going to have to invest significant amounts of capital.


Lenovo and Microsoft had started to release ARM devices way before Apple.


Turns out software and integration matter


Yeah? And how are they looking today?


I don't know but the fact that Windows on ARM is completely locked down and only lets you install applications from the Microsoft Store made me abandon Windows as a platform for good.


It hasn't been locked down for a while now.


Satya Nadella just tweeted about their new Pluton processor. I have a feeling it won't get the same hype as the M1 but they are clearly already trying to defend.


FYI - Pluton is a security chip, not a general purpose CPU.


Ah. I haven’t been following. I thought the timing was comical but I guess less so in that case.


THIS. I saw multiple headlines calling it a processor, which may be somewhat true, but it's not a general-purpose processor and certainly is not a CPU. Yes, Apple's T2 chip has an A8 in it, but they don't call it a processor.


Pluton is just playing catch-up with Apple's Secure Enclave.


I agree, and posted something similar on another M1 thread, though it got downvoted and I'm not sure why :D


A lot of people are having a hard time grasping or accepting this.

“But this benchmark has these issues”, “but this is hardware accelerated, so it's not a fair comparison”, etc. I think when enough benchmarks and real world usage have been run, it will sink in.

What did people expect? They've been killing it with 5W fan-less chips for years. Have you seen how confident Apple is in those videos?


It's fair to be careful with something new. If it's that great, it will quickly be apparent.


I wouldn't classify the backlash as carefulness.


It would be normal if Moore's law hadn't dropped off in recent years.


I love everything about the move to Apple Silicon with the exception of the decision to put memory on-die or in-package (not sure how it is configured). They call it 'Unified Memory'. It makes a lot of sense but I don't know if they are going to be able to pack enough memory in there.

A lot of folks are fixated on CPU performance lately (which is rad) but I think that there is a tendency to ignore memory. I have 32gb of RAM on my Macbook Pro and finally feel like it has enough. You can't get an M1 configuration right now larger than 16GB which is a table-stakes baseline dev requirement today.


One, this is repeating the iPhone vs Android comparisons. iPhones with 4GB RAM feel faster and get more work done than Androids running Qualcomm ARM with double that amount of RAM. The faster IO becomes the cheaper paging becomes, and macOS and iOS have a lot of work done to handle paging well.

Two, this is the entry level processor, made for the Air, which is what we get for students, non-technical family members and spare machines. Let’s see what the “pro” version of this is, the M1X or whatever. We already know this chip isn’t going to go as is into the 16 inch MacBook Pro, the iMac Pro or the Mac Pro. I’d like to see what comes in those boxes.


I get what you're saying, I'm also looking forward to the even higher performing machines with 12 or 16 core CPUs (8 or 12 performance cores + the 4 efficiency cores?), a 32GB RAM option, 4 Thunderbolt lanes, and more powerful GPUs. Wondering exactly how far Apple can push it, if this is what they can do in essentially a ~20W TDP design.

On the other hand it's quite funny that the title of this article is "16-inch MBP with i9 2x slower than M1 MacBook Air in a real world Rust compile" and the comments are still saying "yeah but this is entry level not pro".

Apparently Pros are more concerned about slotting into the right market segment than getting their work done quickly :)


I may be wrong, but the ecosystem does not really change here right? I mean, memory management should be roughly the same between x86_64 and arm regarding the amount of ram used, so I guess 16gb of ram under old macbooks is the same as 16gb under the new ones


All else being equal, yes, but the memory is faster, closer to the chip, has less wiring to go through, and because of vertical integration they can pass by reference instead of copying values internally on the hardware. The last one is big - because all the parts of the SoC trust each other and work together, they can share memory without having to copy data over the bus. That, coupled with superfast SSDs, means that such comparisons are Apples to oranges, excuse the pun.

16GB of memory on-die shared by all the components of an SoC is not the same as 16GB made available to separate system components, each of which will attempt to jealously manage their own copy of all data.


I'm not a hardware person, but I do software for a living. Your comment makes things much clearer.

You're saying that the effective difference in having the shared memory is that you get more data passed by reference and not by value at the lower levels?

If that's true, then you get extra throughput by only moving references around instead of shuffling whole blocks of data, and you also gain better resource usage by having the same chunks of allocated memory being shared rather than duplicated across components?
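A software-level analogy only (the actual SoC zero-copy is a hardware/driver matter), but roughly, yes: it's the difference between handing out cheap handles to one buffer versus giving each consumer its own copy. A Rust sketch of the two styles:

```rust
// Software-level analogy: sharing one buffer by reference vs. handing each
// consumer its own copy. The second approach costs both the copy time and
// an extra full-size allocation.
use std::sync::Arc;

fn main() {
    let frame = vec![0u8; 64 * 1024 * 1024]; // pretend this is a 64 MiB texture

    // "Unified" style: every consumer gets a cheap reference-counted handle
    // to the same allocation; no bytes move.
    let shared = Arc::new(frame);
    let for_gpu = Arc::clone(&shared);
    let for_encoder = Arc::clone(&shared);
    println!("shared: {} handles, one 64 MiB buffer", Arc::strong_count(&shared));

    // "Separate pools" style: each consumer receives its own copy,
    // so the same data now exists twice (or more) in memory.
    let gpu_copy: Vec<u8> = (*for_gpu).clone();
    println!("copied: {} + {} bytes resident", for_encoder.len(), gpu_copy.len());
}
```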


That’s how I understand it, yes. I’m not into hardware either, going by the engineering side of the event. In the announcement, there are some parts shot at the lab/studio where the engineers explain the chip. Ignore the marketing people with their unlabelled graphs; the engineers explain it well.

But yes, they’re basically saying because this is “unified memory”, there’s no copying. No RAM copies between systems on the SoC, no copies between RAM and VRAM, etc. because the chips are working together, they put stuff on the RAM in formats they can all understand, and just work off that.


> because of vertical integration they can pass by reference instead of copying values internally on the hardware.

got any links about that?


Going by the engineering explanations in the announcement video. See the segments shot in the “lab” set. They’re actually pretty proud of this and are explaining the optimisations quite candidly.


interesting, thanks


"Controlling the ecosystem" and "integration" and such are just wishful-thinking rationalizations. Chrome and Electron will use however much RAM they use; Apple can't magically reduce it. If you need 32GB you need 32GB.


Slack (probably the most popular Electron app) has confirmed they are going native on Apple Silicon: https://twitter.com/SlackEng/status/1326237727667314688?s=20


This is still Electron, no?


My guess is that it's an ARM build of Electron - unless they've been working to bring the iOS version over? That would be a huge win.

Even if this is Electron, I suspect this still great news for anyone that needs Slack. The Rosetta 2 performance of Electron would likely be a dog and Slack is a very high profile app with a lot of visibility.


Yeah, that’s partly true. Applications that allocate 1000GB will need to get what they ask for. No getting around bad applications. The benefits are more in terms of lower level systems communicating by sharing memory instead of sharing memory by communicating, which is always faster and needs less memory, but needs full trust and integration.


> You can't get an M1 configuration right now larger than 16GB which is a table-stakes baseline dev requirement today.

Everyone on my team has been using 15" MacBook Pros with 16GB RAM for the past 3 years. I suspect most developers run with 16GB of RAM just fine.

I'm not arguing "16GB is fine for all developers everywhere!", but it's absolutely not a hard requirement. I suspect for a lot of us, the difference in performance between 16GB and 32GB is trivial.

Regardless, the thing which is kind of stunning about this chip is that they are getting this kind of performance out of what is basically their MacBook Air CPU. Follow on CPUs—which will almost certainly support 32GB RAM—will likely be even faster.


> Regardless, the thing which is kind of stunning about this chip is that they are getting this kind of performance out of what is basically their MacBook Air CPU.

Or to put it a different way: this is the slowest Apple Silicon system that will ever exist.


For laptops or desktops, likely. But even if the next Apple Watch chip ends up faster (which I doubt), their smart speakers and headphones can probably make do with a slower CPU for the next few years.


Is there a name for this trait of bringing unnecessary precision to a discussion, I wonder?

I mean, contextually it’s obvious that the previous poster meant this is the slowest Apple Silicon that will ever exist in a relevant and comparable use case - i.e. a laptop or desktop. And the clarification that yes, slower Apple Silicon may exist for other use cases didn’t really add value to the discussion.

And I’m not even being snide to you - I’m genuinely interested whether there’s a term for it, because I encounter it a lot - in life, and in work. ‘Nitpicking’ and ‘splitting hairs’ don’t quite fit, I think?


I don't have a name for it, but I agree that it should have a name. It's a fascinating behavior. I nitpick all the time, though I don't actually post the nitpicks unless I really believe it's relevant. Usually I find such comments to be non-productive, as you mention.

And yet, even though I often believe nitpicks to be unnecessary parts of any discussion, I also believe there is a certain value to the kind of thinking that leads one to be nitpicky. A good programmer is often nitpicky, because if they aren't they'll write buggy code. The same for scientists, for whom nitpicking is almost the whole point of the job.

It's just an odd duality where nitpicking is good for certain kinds of work, but fails to be useful in discussions.


It sounds like the word you’re looking for is ‘pedantic’.


Maybe "overparticular"


Everything I have seen from Apple talks about Apple Silicon as the family of processors intended for the Mac, with M1 as the first member of that family.

I know other people have retroactively applied the term “Apple Silicon” to other Apple-designed processors, but I don’t think I’ve seen anything from Apple that does this. Have you?


I think if you have a very specific role where your workload is constant it makes sense. I am an independent contractor and work across a lot of different projects. Some of my client projects require running a full Rails/Worker/DB/Elasticsearch/Redis stack. Then add in my dev tools, browser windows, music, Slack, etc... it adds up. If I want to run a migration for one client in a stack like that and then want to switch gears to a different project to continue making progress elsewhere I can do that without shutting things down. Running a VM for instance ... I can boot a VM with a dedicated 8GB of ram for itself without compromising the rest of my experience.

That is why I think 16GB is table stakes. It is the absolute minimum anyone in this field should demand in their systems.

Honestly the cost of more RAM is pretty much negligible. If I am buying laptops for a handful of my engineers I am surely going to spend $200x5 or whatever the cost is once to give them all an immediate boost. Cost/benefit is strong for this.


All of this is doable in 16GB, I do it everyday with a 3.5GB Windows 10 VM running and swap disabled. There are many options as well such as closing apps and running in the cloud.


Update: Re-reading your above comment I realized I mis-read your post and though you were suggesting 32GB was table-stakes... which isn't quite right. Likewise much of below is based on that original mis-read.

I'm not convinced that going from 16GB to 32GB is going to be a huge instant performance boost for a lot of developers. If I was given the choice right now between getting one of these machines with 16GB and getting an Intel with 32GB, I'd probably go with the M1 with 16GB. Everything I've seen around them suggests the trade-offs are worth it.

Obviously we have more choices than that though. For most of us, the best choice is just waiting 6-12 months to get the 32GB version of the M? series CPU.


I've seen others suggest that 32GB is table-stakes in their rush to pooh-pooh the M1.

I, personally, am a developer who has gone from 16GB to 32GB just this past summer, and seen no noticeable performance gains—just a bit less worry about closing my dev work down in the evening when I want to spin up a more resource-intensive game.


I agree with this. I don't think I could argue it's table stakes, but having 32GB and being able to run 3 external monitors, Docker, Slack, Factorio, Xcode + Simulator, Photoshop, and everything else I want without -ever- thinking about resource management is really nice. Everything is ALWAYS available and ready for me to switch to.


At some point it is easier to have something sitting in a rack somewhere. That way you don't have to cook your ultrabook to run that stuff.


People have been saying this kind of thing for years, but so far it doesn't really math out.

Having a CPU "in the cloud" is usually more expensive and slower than just using cycles on the CPU which is on your lap. The economics of this hasn't changed much over the past 10 years and I doubt it's going to change any time soon. Ultimately local computers will always have excess capacity because of the normal bursty nature of general purpose computing. It makes more sense to just upscale that local CPU than to rent a secondary CPU which imposes a bunch of network overhead.

There are definitely exceptions for things which require particularly large CPU/GPU loads or particularly long jobs, but most developers will be running local for a long time to come. CPUs like this just make it even more difficult for cloud compute to make economic sense.


As someone who is using a CPU in the cloud for leisure activities, this is spot on. Unless you rent what basically amounts to a desktop, you're not going to get a GPU and high-performance cores from most cloud providers. They will instead give you the bread-and-butter efficient medium-performance cores with a decent amount of RAM and a lot of network performance, but inevitable latency. The price tag is pretty hefty. After a few months you could just buy a desktop/laptop system that fits your needs much better.


Larry Ellison proposed a thin client that was basically a dumb computer with a monitor and nic that connected to a powerful server in the mid 1990s.

For a while we had a web browser which was kinda like a dumb client connected to a powerful server. Big tech figured out they could push processing back to the client by pushing JavaScript frameworks and save money. Maybe if arm brings down data center costs by reducing power consumption we will go back to the server.


What kind of development are you doing?

16GB is OK for my needs at home running Linux, but on the odd occasion I wish I had more.

At work I find 32GB is barely enough.


I would turn that around. What kind of development are you doing where you feel 32GB is "Barely enough"?

Right now I primarily work on a very complex react based app. I've also done Java, Ruby, Elixir, and Python development and my primary machine has never had 32GB.

More RAM is definitely better, but when I hear phrases like "32GB is barely enough", I have to wonder what in the hell people are working on. Even running K8s with multiple VMs at my previous job I didn't run into any kind of hard stops with 16GB of RAM.


One data point: when I was consulting a year ago, I had to run two fully virtualized desktops just to use the client's awful VPNs and enterprise software. Those VMs, plus a regular developer workload, made my 16GB laptop unusable. Upgrading to 32GB fixed it completely.


Desktops can use less memory than many folks think. I have a Windows 10 VM in 3.5 GB running a VPN, Firefox, a Java DB app, and ssh/git. For single use, the memory could be decreased even further.

I think the art of reducing the memory footprint has been lost. Whenever I configure a VM for example, I disable/remove all the unused services and telemetry as the first step. This approaches an XP memory footprint.


That's not what this discussion is about. 16GB is definitely limiting if you run VMs, but 32GB should be plenty. If you need more, then either you are running very specialized applications, which means your own preferences are out of touch with the average developer, or you are wasting all of the RAM on random crap.


If you do machine learning or simulations with big datasets and lots of parameters it does become an issue, but I will admit I could just as easily run these things on a server. I don’t think I’ve ever maxed out 32GB doing anything normal.


> What kind of development are you doing?

Sounds like folks never want to close an app. It could be a productivity booster if you want to spend the money and electricity, but is rarely a requirement.


Keep in mind that so far Apple has only started offering M1 on what are essentially entry-level computers. I think it's likely there will be a 32GB Unified Memory version for the 16" MBP (which maybe will become available on the 13" or Mac Mini too).

I think M1 would not be able to achieve the performance and efficiency improvements if the RAM were not integrated, so they'll stick with Unified Memory for the time being. I don't think this will be as tenable for the Mac Pro (and maybe not even iMac Pro), but those are probably much further from Apple Silicon than anything else, so we'll see what happens.


I agree with you completely. I am looking forward to the next offering and hope that they have a plan for more memory.

In the meantime I wonder if they are going to do dual (or more) socket configurations. I was just thinking to myself imagine a Mac Pro with 8 of these M1 chips in it all cooled by one big liquid heat block. That thing would rip.


> hope that they have a plan for more memory.

I can't imagine them not doing it. If they were satisfied that 16GB was sufficient, I would've expected them to also refresh the 16" MBP with M1. I think the fact that they didn't is a good indicator that something about M1 isn't ready for the big boy league, and my guess is RAM will factor into that.

My guess is that the second generation (M2?) will improve performance with little efficiency gain and will include up to 32GB "Unified Memory". And then binning will be used to produce the 16GB and 8GB variants.

> I wonder if they are going to do dual (or more) socket configurations.

Whoa, that's something I hadn't thought of! I wonder if M1 is amenable to that kind of configuration. That would be pretty neat!


(This should be a reply to an older comment of yours and I realise it's probably bad form to be posting it here, but I couldn't find any other way to contact you)

A few weeks ago you made a comment (https://news.ycombinator.com/item?id=24653498) where you mentioned a PL Discord server. Could I get an invite? I can be reached through aa.santos at campus.fct.unl.pt if you'd rather not discuss it in public/if you'd like to verify my identity.

Sorry to everyone else in the thread for being off-topic.


Hello! Yes, HN's lack of notifications really poses a problem in situations like this. Sigh.

However, the answer to your question is fortunately a simple one: the Discord server mentioned is run by the /r/ProgrammingLanguages community over on Reddit [0]! If you go to that page (might need to be on a desktop browser because ugh) and look in the sidebar/do a search for "Discord server", you'll find a stable invite link.

Alternatively, I can just provide you with the current link [1] and note that it may not work forever (for anybody who finds this comment in the future).

[0] https://www.reddit.com/r/programminglanguages/

[1] https://discord.gg/yqWzmkV


Seems to me, off chip RAM becomes a new sort of cache.

If Apple sizes the on chip RAM large enough for most tasks to fly, bigger system RAM can get paged in and performance overall would be great, until a user requires concurrent performance exceeding on chip RAM.


Hmm that's a good point!

The thing I worry about is that the whole appeal of the Mac Pro is upgradeability — you can replace components over time. So integrated RAM would be problematic since that's a component people definitely like to upgrade.

But with your idea... I dunno, if they could pull that off that would be super cool!


Yeah I think so too. If they can execute from off chip Ram, perhaps with a wait state or whatever it takes, for a ton of use cases no one will even notice.

It will all just effectively be large RAM.

Doing that coupled with a fast SSD, and people could be doing seriously large data work on relatively modest machines in terms of size and such.

A very simple division could be compute bound code ends up being on chip RAM, I/O bound code of any kind ends up in big RAM, off chip.

Doing just that would rock hard.


Entry level? Did you see their prices?


I mean, as far as Apple computers go... yeah, these are absolutely entry level. And $700 (new M1 Mac Mini starting price) is really pretty reasonable even compared to other options, honestly.


> memory on-die or in-package (not sure how it is configured). They call it 'Unified Memory'

It’s in the package; RAM on the die is called “cache”.

“Unified memory” has nothing to do with packaging. It’s the default for how computer memory has worked since, well, the 1950s: all the parts talk to the same pool of memory (and you can DMA data for any device).

That’s why the term of art for, say, GPUs having their own memory, is called NUMA (“Non-Uniform Memory Access”): unified is the default. *

M1 is a remarkable chip and Apple doesn’t claim that UMA is some invention: they just used the technical term, just as they say their chips use doped silicon gates. It has become unusual these days and worth their mentioning, but it’s simply ignorance on the reporters' part that elevated it to seem like some brand name.

* https://en.wikipedia.org/wiki/Non-uniform_memory_access


In a NUMA system all the memory is in the same address space, even if it's faster for a core to talk to some places than others. Traditionally GPUs work in a completely different address space and don't use virtual memory. Yes, you can DMA to it, but if you DMA a pointer it will break on the other end.


According to anandtech the memory throughput is off the charts at 68.25GB/s [1]. That's twice as fast as high speed ddr4 memory (DDR4-4000 at 32GB/s).

In other words: they trounced the competition and took memory to the next level, because they can. If anything, memory control is their biggest advantage. Scaling the amount of RAM won't be an issue. Increasing the bandwidth perhaps, but it'll still be way quicker than what Intel or AMD offer. This seems like something their next-gen M2 could tackle as relatively low-hanging fruit.

[1] https://www.anandtech.com/show/16252/mac-mini-apple-m1-teste...


It's twice the speed of one module of high-speed DDR4 memory. Mainstream consumer PC platforms all support dual-channel memory. Dual-channel DDR4-4266 would provide the same theoretical bandwidth as the M1's 128-bit wide collection of LPDDR4X-4266.
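
(If I'm doing the math right, it's just transfer rate times bus width: 4266 MT/s × 128 bits / 8 ≈ 68.3 GB/s for the M1's LPDDR4X, the same for dual-channel DDR4-4266, and 51.2 GB/s for a more typical dual-channel DDR4-3200 setup.)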

Intel's LPDDR support has been lagging far behind what mobile SoCs support (largely because of Intel's 10nm troubles), but their recently-launched Tiger Lake mobile processors do support LPDDR4X-4266 (and LPDDR5-5400, supposedly).


Just to be pedantic DDR4-4266 is non-standard and so won't be found in any mainstream OEM's laptops. LPDDR4X-4266, soldered to the board instead of socketed and with a lower voltage, is indeed an official thing though.


Right, JEDEC standards for DDR4 only go up to 3200. But 4266 is within the range of overclocking on desktop systems and only a few percent faster than the fastest SODIMMs on the market, so it's at least somewhat useful as a point of comparison.

LPDDR memories are developed with more of a focus on per-pin bandwidth than standard DDR because mobile devices are more constrained on pin count and power. But Apple's now shipping an LPDDR interface that's just as wide as the standard for desktops, and reaping the benefits of the extra bandwidth.


FWIW the latest-gen consoles supposedly have memory bandwidth in the hundreds of GB/s. The PS5 supposedly reaches 448GB/s, and the XBX 336 to 560 depending on the memory segment.


It's the year of the system-on-chip. It shows great improvements, with the trade-off that no expansion is possible once the chip is made.


None of these systems seem to have on-die memory. Not sure about the XBS, but from the official teardown the PS5 doesn't even use on-package memory: https://www.gamereactor.eu/media/87/_3278703.png

And Apple certainly didn't wait for SoC to use soldered memory.


Tiger Lake appears to have the same bandwidth at the chip level but AnandTech is using inconsistent methodology so it's not clear.


Isn't that just because all the memory is on the CPU?


Yes, but as other threads commented, this is not particularly new. It's similar to how games consoles have been designed.

It's a natural evolution, especially when MacBook Airs and the ilk are not really user upgradeable in the Intel form anyway. It's much harder for regular PCs to make this leap because one party doesn't have as much control.


> They call it 'Unified Memory'. It makes a lot of sense but I don't know if they are going to be able to pack enough memory in there.

Unified memory and on-die (or in-package) memory are different things, and while the latter simplifies the former, they're mostly orthogonal.

Unified memory means the physical and logical memory space is directly accessible by both CPU and GPU, at the same time, in their entirety: https://en.wikipedia.org/wiki/Heterogeneous_System_Architect...


Putting the memory into the SoC seems to be a lot of how they're getting this amazing performance: the memory bus is now twice as wide as anything else.

But yes, it means that Apple is going to have to either really start jamming more and more into there, or develop a two-tiered approach to memory where slower "external" RAM can supplement the faster "internal" RAM.


The memory bus is now the same width as every other laptop.


My guess is that they are releasing the M1 right now because it's the smallest SoC they're going to make and its yields are just barely enough to be viable. Once they get better yields on the 5nm process, they will start making the larger, more yield-sensitive SoCs with more RAM in them.


The RAM is not on the same chip/die as the M1; it's another chip that's put in the same package. Increasing RAM will not affect Apple's yields at all, since the RAM dies are not made by Apple and are not 5nm.


Oh, I see. It's still nevertheless likely that stronger chip packages will be bundled with more RAM.


I don't think that makes any sense, considering that the RAM is not on the same die as the processor. It's on the module, yes, but they're not making it on the same process.

I realize this image is a schematic representation rather than an actual photograph, but here it is.

https://www.apple.com/v/mac/m1/a/images/overview/chip__fffqz...


The point being made is that the processors that they will package with more memory are going to exist on larger dies. When you increase die size you decrease yield so you need to have a mature process.


> The point being made is that the processors that they will package with more memory are going to exist on larger dies.

Are they? I don't think this is how AMD does things—all their desktop and Threadripper processors are constructed out of 8-core chiplets. The higher-core count processors just use more chiplets per package, not necessarily larger dies. If Apple's already putting multiple chiplets on one package (core + RAM), I wonder if they'll use the same approach to scaling.


Why does that follow? 4 cores and 32GB makes fine sense.


But why would you offer your i3 with 32GB when you know you are going to make i5 and i7 processors soon? Apple could offer 32GB here but choose to not offer every configuration at every level.


It's actually 8-core: 4 big (Firestorm) and 4 small (Icestorm)


> this image is a schematic representation rather than an actual photograph

If so, it's an incredibly shitty schematic. 0/10, would flunk any draftsman who turned this in for a class.


I wonder if the CPU can support mixed RAM.

E.g. have the on-die 8GB of "fast" ram, and then support 2 external DIMMs or something for "overflow", file caching, etc.


That is a great idea. I think the software is going to be the hard part. You would need some kind of heuristic or software to manage moving memory between those two locations. That is just my initial thought; I could be totally wrong.


Every modern general purpose computer already has multiple layers of memory. This would just be an additional layer. The virtual memory subsystem in the OS will handle this. At the end of the day it's just caches all the way down. A workstation with 16GiB of "on-chip" memory would be like a huge L4 cache for, say, 512GiB of "standard" DDR4.

I really like OSTEP's chapters on virtual memory if you're interested in reading more[1].

[1] http://pages.cs.wisc.edu/~remzi/OSTEP/vm-complete.pdf


Yet is it a requirement in a memory hierarchy to copy the lower level into the upper one? Like, does all the stuff in your L1 cache have to also be in your L2 cache? E.g. if you had 32GB of external DDR, would it only add 16GB on top of the packaged 16GB?


It's definitely not, and in fact swap on a modern OS doesn't work that way. Something can be only in physical RAM, only in swap, or (when "clean"/unmodified vs the copy in swap) in both (allowing it to be dropped from physical RAM quickly). So one obvious approach would be to simply use the external RAM as swap.

Relatedly, I've heard of using Intel Optane as "slow RAM" for cold pages, and I think the idea there is also that it'd be in one or the other but not both. (Optane can be thought of as very expensive/fast flash or very slow/cheap RAM.)


It's not a requirement, and different systems in the past have made different choices.

https://en.wikipedia.org/wiki/Cache_inclusion_policy


Thanks for the link, downloaded to read this evening :thumbsup:


Pretty sure this is already a thing for NUMA systems, e.g. an Intel system with a pair of N-core processors. Each processor gets "its own" half of memory. Memory which belongs to the other processor is slower to access.

See also Linux cpusets (cset command), which can be used to control which NUMA nodes a process has access to.


Luckily that's a problem that academics have been working on since the 1960s. https://en.wikipedia.org/wiki/Non-uniform_memory_access

You'd be surprised how many modern OSes are at least partially NUMA-aware. Even Java already has a NUMA-aware allocator.
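
To make it concrete, here's a minimal Linux-only sketch using libnuma (assumes the libnuma dev package is installed; the node choice and size are arbitrary) that pins an allocation to a specific node, which is the kind of primitive a NUMA-aware allocator is built on:

    /* build: cc numa_demo.c -lnuma   (Linux only) */
    #include <stdio.h>
    #include <numa.h>

    int main(void) {
        if (numa_available() < 0) {
            fprintf(stderr, "no NUMA support on this system\n");
            return 1;
        }
        int node = numa_max_node();          /* highest-numbered node */
        size_t size = 64UL * 1024 * 1024;    /* 64 MiB */

        /* ask the kernel to back this allocation with pages on that node */
        void *buf = numa_alloc_onnode(size, node);
        if (buf == NULL) {
            fprintf(stderr, "allocation on node %d failed\n", node);
            return 1;
        }

        printf("allocated %zu bytes on NUMA node %d\n", size, node);
        numa_free(buf, size);
        return 0;
    }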


macOS already has capabilities for compressing memory used by applications before finally giving up and paging it out. It seems plausible they could extend the code that supports that feature to push memory to external RAM (or allocate it there in the first place if utilization isn't expected to be hot), and only pull it back once utilization indicates there is a benefit.


The same logic they already use for L1/L2 caching, and/or VM could pretty easily be adapted.


Maybe the memory controller could handle this, spilling onto the slower memory, much like swapping.


Perhaps this will encourage programmers to write true native applications again instead of wrapping web scripting languages in a browser and calling it a day.


I'd say the opposite, this increases the speed of such apps to closer to what is now native speed.


I'm referring to the second point about memory that OP made. You know, having to allocate 700MB of memory in order to run an electron chat application.


This is more of a failing of the runtime that Electron uses and how it (ab)uses that runtime. Browsers were never meant to be run once per application. Sciter JS was supposed to be an Electron replacement but it didn't pan out. With some luck c-smile will stay motivated enough to finish it and once it gains traction there will be another attempt to opensource it.


It's seems wasteful, but RAM is so cheap, who cares?

You can get 16GB for $52 on Amazon. That 700MB is equivalent to $4.64 one-time payment.

There are more important things to worry about, seems to me.


Not everyone lives in the first world. Not everyone has a new computer.

It is this sort of hubris that likely explains my feeling that personal computing has regressed in many ways for the average individual over the recent years. I'm not talking about the hacker who can run surf+i3 on their cyberdeck, I'm talking about the person with an 8-year old computer bought on sale or a 4-year old smartphone.


RAM is only cheap because people don't waste it. If every application was written with Electron or every executable was a Java program (including CLI commands) you would cry and beg for more memory efficiency.


Unless they are performance-bound, this is pretty unlikely. The productivity benefits are hard to ignore.


Why? Unless the browser is proportionately slower, they'll still benefit.


It's certainly in-package rather than on the same chip. The process required to make DRAM is different from what you use to make the core's logic. The DRAM chips might literally be stacked on top of the CPU die, though. Apple does things that way in the iPhone IIRC.


Anandtech is reporting 62GB/sec memcpy, and the way-faster Geekbench results are the memory-bound ones (e.g., Gaussian blur is 4x faster!)

So I think the speedups like these are largely due to memory architecture.
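
For anyone curious where numbers like that come from, a crude memcpy bandwidth test is only a few lines of C. This is just a sketch, not AnandTech's methodology; the buffer size and iteration count are arbitrary:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <time.h>

    int main(void) {
        size_t size = 1UL << 30;   /* 1 GiB: big enough that caches don't help */
        int iters = 8;
        char *src = malloc(size), *dst = malloc(size);
        if (!src || !dst) { perror("malloc"); return 1; }
        memset(src, 1, size);      /* touch the pages so they're actually mapped */
        memset(dst, 2, size);

        struct timespec t0, t1;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (int i = 0; i < iters; i++)
            memcpy(dst, src, size);
        clock_gettime(CLOCK_MONOTONIC, &t1);

        double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
        /* reports bytes copied per second; actual bus traffic (read + write) is ~double */
        printf("%.1f GB/s\n", (double)size * iters / secs / 1e9);
        return 0;
    }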


It's not due to the on-package or the unified memory. The M1 is using industry-standard 128-bit LPDDR4X. On-package still means regular DRAM chips. What seems to be the advantage is that the Firestorm core has a much wider pipeline and can keep more loads and stores in flight at a time. The cache is also able to deliver low latency along with those bandwidth numbers. Intel or AMD could achieve similar performance in memory-bound workloads if they designed their CPU logic for that workload.


Not to mention the e-waste and the inability for users to upgrade their own devices.

I wonder what the performance cost was of having standard memory modules. I suspect it wasn't significant and this is more of a move to prevent upgrade and increase consumption and waste.

This is another reason I really don't ever want to own another Apple device. They want more and more control over the system and they keep moving to policies that reduce the ability of regular people to repair. The performance benefit doesn't really seem worth it if I can't run any other operating systems except macOS on it.


Standard memory modules which were soldered onto the mainboard?

Given how Macbooks have had soldered memory packages for ... 8 years now, I don't think moving the memory onto the SoC was to lock out upgrade potential. It doesn't make upgrading any less possible than "completely impossible", and probably (slightly) reduces the overall cost/complexity of the board, slightly reducing the material cost/impact.

FWIW, in the future I imagine most processors will look like the M1, with additional memory available over a serial bus like OMI, used in POWER10. The "unified memory" will effectively serve as a giant cache for the CPU/GPU with slower peripheral memory used as a backing store.


Even if you could boot other OSs, you forget that there are no drivers written for them. Windows and Linux both would be unusable for potentially years after launch.


It would certainly help if Apple would release specs, or even just liberally-licensed XNU driver sources, but Linux has certainly gotten drivers without manufacturer help before.


It's not magic memory, it's LPDDR4X-4266 or LPDDR5-5500. These work just fine without being in-package, so it's purely a cost cutting decision.


That's a pretty significant step up from what they were previously shipping, right? The 16" MBP page says it has "2666MHz DDR4". Are any other major (laptop or desktop) manufacturers using LPDDR4X-4266 or LPDDR5-5500?


I agree completely since for large projects 16GB is barely enough to run Bazel. I know that's kind of sad but often Bazel wants to load large graphs into memory. Combined with the fact that the graphics are stealing part of main memory, I don't think I'm going to be happy with 16GB. I ordered one anyway, but I don't expect to be too psyched about the reality of it when it gets here.


Sounds like the build needs to be optimized.


I'm afraid I lost the source, but this morning I was reading about some fairly in-depth Xcode benchmarks. The dev was saying that there was almost no hit in performance when it hit swap, and speculated that Apple might be getting ready to move past the concept of RAM altogether in a few years. Sounds a little bonkers to me, but the bus to the SSD is no joke.


Next-gen consoles and Nvidia RTX IO are kind of doing that with graphics textures. Textures are memory-mapped but stored on NVMe storage. When the GPU reads a texture, it first looks in RAM; if it's not found there, the controller talks to the NVMe controller to load it from storage.

If you think about it, Gen4 PCIe NVMe can reach 5GB/s doing 16K random reads. That bandwidth is getting close to DDR2/3 territory. And new storage tech like 3D XPoint will have RAM-like access latency to improve small-I/O performance.

You will always still need RAM, but you can be more efficient about how you use it.
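
The plain-POSIX version of that idea is just mmap: map the file and pages get pulled from storage the first time you touch them. A generic sketch (not how RTX IO or the console APIs actually work):

    #include <stdio.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <sys/mman.h>
    #include <sys/stat.h>

    int main(int argc, char **argv) {
        if (argc < 2) { fprintf(stderr, "usage: %s <file>\n", argv[0]); return 1; }
        int fd = open(argv[1], O_RDONLY);
        if (fd < 0) { perror("open"); return 1; }

        struct stat st;
        if (fstat(fd, &st) < 0) { perror("fstat"); return 1; }

        /* nothing is read yet; the kernel fills pages lazily on first access */
        unsigned char *data = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        if (data == MAP_FAILED) { perror("mmap"); return 1; }

        unsigned long sum = 0;
        for (off_t i = 0; i < st.st_size; i += 4096)  /* one byte per page */
            sum += data[i];                           /* each miss faults a page in from storage */

        printf("touched %lld bytes, checksum %lu\n", (long long)st.st_size, sum);
        munmap(data, st.st_size);
        close(fd);
        return 0;
    }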


here's the source of that comment: https://twitter.com/panzer/status/1328708737482121217?s=21

I'm still a little unclear on whether he means "concept of RAM" in a marketing sense, discrete RAM, or a model closer to L3 cache. regardless, pure speculation


Isn't fast on-die/on-package (not sure either) memory access a key factor for Apple's gigantic performance leap, though? With memory pre-soldered on small notebooks for years now, it's not much of a difference for consumers anyway once more than 16GB becomes available.


I'm not sure that the soldering does much to improve memory access.

The anand tech M1 article measures memory latency at more than 90ns [1], which is almost twice what I see for AMD and Intel benchmarks, at 50ns and 70ns [2]. If these are not comparable measurements, I'd love a correction!

It seems to me that there's significant room to improve the memory subsystem for Apple Silicon to reach parity with desktop RAM performance.

[1] https://www.anandtech.com/show/16252/mac-mini-apple-m1-teste...

[2] https://www.techcenturion.com/improve-zen-2-gaming-performan...


What people seem to forget is that SoC is the future, and an upgrade should be as simple as swapping one chip. This also comes with a lot of benefits in locality and performance.


There's no way the SOC will be upgradable.


No - but for desktop maybe that's the future? I could imagine AMD and Intel offering multiple tiers of SoCs instead of CPUs.


No. That's a stupid idea for desktop. Not everyone needs 64GB of RAM, but some of us do - and that would be an expensive and awful upgrade to have to throw out your CPU with 32GB RAM on-chip just to put in the same CPU with 64GB on-chip. It makes absolutely no sense.


I'd tolerate that if it runs usefully faster every day. Could save money overall.

There are several parts in a workstation that are already monolithic packages. Most GPU+GRAM cards, for example.


So you're talking about taking a $300 CPU and turning it into a $500 or $800 chip depending on how much RAM is on the SoC. But the fact remains that the vast majority of the RAM sits idle while the CPU processes relatively small chunks of it at a time, which is what cache is for. If anything I'd rather see more cache RAM on the CPU itself than having to pay for CPU + RAM on the same SoC. The performance gain from combining CPU + RAM on the same "SoC" is not great enough to warrant doing that as an approach for all computers - Apple is doing this mostly because it saves them money. There's a lot of pushback from power users about expandability not even being an option anymore. But 99% of Apple's customer base doesn't have any need for expandability, power-user features, or the fastest silicon available, because they only use it for Facebook, FaceTime, or a few other basic applications. And that's fine. I won't be buying one though.


It's a radical decision yes, but also so very Apple-like. And I wonder how much of this wild architecture change is also behind the improved performance here.


Maybe ARM binaries are smaller than x86_64?

ARM has some sophistication in its ISA, THUMB mode instructions for example, that might be in play here.


Thumb isn't a factor. Thumb encodings only exist in the 32-bit ARM architecture; 64-bit AArch64 code uses a fixed-width 32-bit instruction encoding.


What uses most of your ram when you need 16GB+?


Chrome. Slack. VSCode.

The proliferation of desktop software that are just more Chrome processes in disguise has been punishing on memory usage.

My work machine used to fully lockup on a daily basis with 16 GB of RAM. 32 GB seems to be the sweet spot right now.


> Chrome. Slack. VSCode.

VMs, especially since you need one to run docker on OSX, IntelliJ, Firefox, ...


People claim if you switch off to Safari the memory footprint goes down. Of course, if you're an FE dev, that comes with more ... challenging ... dev tools. YMMV.


I run Slack as a tab in Safari because it avoids the pain of running the electron app, and reduces my memory footprint.


You can never have enough ram!


Haha! I somehow became sated with 16GB on my laptop. Now any of my workloads which require more would simultaneously want higher cpu, so those loads get pushed to a different machine entirely. So I'm curious what is the use case for 16GB+ with a laptop cpu.


For some people the main advantage of a laptop is the possibility to move to spots where there's no or very bad connectivity. In those cases it becomes really useful to be able to run all your workloads on it, even if much slower.


For some projects, linking non-stripped binaries can take huge amounts of virtual memory. At least on ELF/DWARF platforms with the Gnu linker.

Not sure if that applies here, since I assume the Mac will be using the LLVM linker, and Mach-O != ELF.


Virtual memory, or real memory? You're never going to run out of virtual memory on a 64-bit platform.


Virtual memory. I mentioned it in this context because running out of physical memory means the machine starts swapping, which often results in a very undesirable slowdown.
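
You can see the difference directly: malloc hands out address space, and physical pages only get committed when touched. A quick sketch (assumes a 64-bit system; note that ru_maxrss is kilobytes on Linux and bytes on macOS):

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/resource.h>

    static long max_rss(void) {
        struct rusage ru;
        getrusage(RUSAGE_SELF, &ru);
        return ru.ru_maxrss;
    }

    int main(void) {
        size_t size = 4UL * 1024 * 1024 * 1024;   /* 4 GiB of virtual address space */
        char *buf = malloc(size);
        if (!buf) { perror("malloc"); return 1; }
        printf("after malloc (untouched): max RSS = %ld\n", max_rss());

        memset(buf, 1, size / 4);                 /* commit 1 GiB of physical pages */
        printf("after touching 1 GiB:     max RSS = %ld\n", max_rss());

        free(buf);
        return 0;
    }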


Anything built with Electronjs.


I actually feel like 16GB is pretty much the sweet spot. I've built a desktop PC recently and didn't bother with 32GB and haven't had any troubles. The only time I've run into limits is when I was doing things like running games and big IDEs at the same time which seems like a waste in any case regardless of how much RAM is available.


For me on Windows, multiple virtual desktops, browsers (Chrome for work, FF for personal), IDEs, and VMs (Ubuntu running in WSL, plus any Docker containers) made 16GB unusable.

And that's before I decided to boot up a game alongside all of the existing applications.

For me the £50 is well worth not having to care about pruning applications constantly!


I recently got introduced to The Great Suspender here, and Chrome with separate accounts doesn't use many resources because tabs get backgrounded.

Yes, you would still have to juggle if you really don't want to close your IDE, or judge how much RAM that Docker container really needs. But also understand that OSes and software allocate a huge portion of all available memory no matter what you have in the machine. You might think you need 32GB for snappy performance, and it seems rational that all your processes absolutely need sequential memory blocks, but it isn't really true.

I honestly think you would just be smarter about your use of resources again.

With these benchmarks I am starting to lean more towards 32GB itself being the compromise: it means you don't have to budget resources, which is a luxury, but at the expense of these other benchmarks? And in the worst case we just have to wait a year or two before 32GB is offered in the M-series package.


I wonder if they have improved their memory compression with this new release as well. You would have to compare apples to apples for various applications to see the effect.


I would agree if it wasn't doing double duty as memory for the GPU. Under those circumstances, even 16GB can start to get a bit tight on some tasks.


Really depends on what you’re doing. If you’re working with large amounts of data, being able to put it in memory is super helpful.


Is the new architecture better or worse with memory usage than x86 macs?


Memory usage shouldn't change that much for a similar task. It's just that paging stuff in and out is much more performant when the system is under pressure. Right now with the least amount of M1 native software available, most of it will be allocating memory natively without garbage collection.

Apple's strategy has been to avoid generational/tracing GC which reaps big benefits in terms of memory usage. It'll be interesting to see more feedback from devs running a broader range of software. It's likely people will hit apps on the long tail that use GC, and similar schemes, which will cause them to complain about memory issues. Running old apps under Rosetta 2 will be another source of complaints.

These machines are optimized for the mass market. Although they can outperform many existing machines with 8 and 16 GB of RAM, there is huge opportunity and demand for 32GB+.


The real competition is no longer Intel. It's AMD. Or if you're just buying Apple because you like them, then there is no competition; you buy what they offer.

I like fanless systems so I'll be watching Apple closely.


I can't help but chuckle whenever I read these comments on HN about how doomed Intel is.

Intel just does this. Every now and then, they get so far ahead that the rest of the market just totally disintegrates, which allows them to screw around and juice up their margins while failing to actually innovate. Their brand is so strong it takes years for it to erode, even when they do suck, and when they have actual competition they've got plenty of cushion to keep selling old designs while they catch up.


> Intel just does this. Every now and then, they get so far ahead that the rest of the market just totally disintegrates, (...)

Past performance does not guarantee future results.

Keep in mind you're trying to pin magical properties on a brand, and meanwhile people and technologies come and go.


The reality of semiconductor tech is that most people are not paying enough attention to understand when a company is doing well and when it's doing poorly. AMD stock was trading in the single digits well after the company was firing on all cylinders. Intel has encountered major bumps in the road since the early days of 10nm, yet research analysts and technologists alike were praising Intel all over the place until recently.

I wouldn't call it a KO until Intel 7nm chips come out. At that time, we'll see if this is a comeback story or just another IBM.


> Keep in mind you're trying to pin magical properties on a brand, and meanwhile people and technologies come and go.

Sure, things change! But Intel is huge, and it's got a track record of repeatedly weathering setbacks and missteps only to come back with market dominance.

Maybe it's really going to be all about Apple and AMD while Intel plays catch-up for generations to come - I just feel like it's a bit premature to come to that conclusion.


14nm was delayed, 10nm was massively delayed, 7nm is delayed again. That's a fairly long run of underperformance.

Sure, they sit on enough money that they can recover if they do it right. But there currently aren't any signals that they're getting things right.


Delays aren't necessarily underperformance. Let me offer an example.

If Boeing had delayed the rollout of the MAX 8, or even simply reduced the production rate, it may have been able to identify and rectify the MCAS failure mode, thus preventing suspension of the MAX aircraft. In retrospect, they could have delivered more aircraft prior to the pandemic and avoided many of the order cancellations that it brought.

Sometimes, it's better to go slowly and get things right than to forge ahead at full steam. We won't know if these 7nm delays are good or bad for Intel, until 7nm actually rolls out.


It is an underperformance relative to their previously published roadmaps. There may be other reasons than just mismanagement for those delays, but they're still delays.


Lots of executives have been fired and Intel has a new CEO. That is certainly a sign that things will... be different.


True, but on the other hand....CEOs don't design the chips.


Intel has had problems with architectural cul-de-sacs before, certainly. But they had other architectures on the back burner they could go back to when things didn't work, and they still had their process lead to keep them in the game. Recently, though, Intel's architecture work has been fine but their process has been the thing having trouble. Maybe they'll be able to get 7nm working properly first try despite the problems at 10nm, but I'm not sure given the attrition in their process team after the 14nm death march.


I don't know, this feels different. Intel's competition is bloodthirsty and attacking from all sides. Intel has nothing in the pipeline and just keeps dropping the ball (don't forget Intel's lack of penetration in the mobile market).

This feels more like Microsoft getting blindsided by Google, Apple and Amazon amongst others. Intel isn't going anywhere any time soon, but their reign as king of the mountain may very well be over.


>> I can't help but chuckle whenever I read these comments on HN about how doomed Intel is.

I didn't say anything like that. Simply that if you want to compare performance to the best that is currently not Intel.

There is a lot of hype around this new apple chip and people claiming it's a performance king, but I seriously doubt that.

Even for mobile I'd like to see it compared to AMD APUs.


The new M1 MacBook Pro relies on its fan to prevent throttling, so I don't know about their plans to go fanless yet.


MBA is fanless, 12” has been fanless for years (until it was discontinued)


The Air in the article is fanless.


Right, and I think some benchmarks I've seen this morning indicate that it throttles during more intensive tasks. The Pro benchmarks better, the only difference being the fan.


But only after 10+ minutes of full load according to tests. And only after 30 minutes do you get noticeable performance drops (around 20% or so).

For developers, for instance, I really doubt you're going to commonly have 30 minutes of full load in your normal workflow.


All CPUs throttle, all of the time. It's been years since anyone shipped a high-performance processor without a closed-loop dynamic thermal control system.


Having a throttling mechanism != the throttling mechanism is engaged

With enough cooling, you can operate a CPU at full tilt and never engage the throttle.


If your CPU always runs at its steady-state temperature, that means it sucks and is leaving performance on the table. A CPU that can run at a steady 3 GHz (or whatever) should be capable of 5+ GHz momentarily given the right initial conditions.


Thermals are not the only factor that limits clock speeds. For instance, gate switching times are also a factor.

Although if you're saying the M1's performance "sucks", I can't wait to see the next iteration.


You're choosing the argument first and then trying to justify it post hoc. That's the downside of having these conversations in thread format; it's easy to disagree, and tough to acknowledge that the other guy can be right about some things and wrong about others.

Sure, the critical path setup and hold time limit clock speeds, but that's not the reason for throttling a chip that can turbo at a higher clock. Even if it were, certain operations with a shorter critical path could run at faster clock even when hot.

If thermals weren't the dominant factor, you wouldn't need better cooling to overclock.

My perspective (correct me if wrong):

Hot semiconductors can damage themselves, and this becomes more important as the lithography shrinks. Binning is designed to identify which silicon can be pushed harder and which is not quite up to the task.

I agree with the other guy that if your CPU always runs at its steady-state temperature that means it is leaving performance on the table.


According to one reviewer it takes 8 minutes before it starts to throttle.


After 8 minutes


Huh? Apple is quite the opposite of fanless... They have literally the most devoted fanbase among computer vendors.


They mean fans as in the spinning plasticky thing that goes vroom


That’s ridiculous, people are not made out of plastic


Cooling fans.


I think Apple have screwed themselves over a bit here by sticking ARM-based Apple silicon in their lower-end/entry level devices first.

I mean what's the point in spending £5k on a fully tricked out 16-inch MBP, as I'd been considering, when an entry level Macbook Air or Mac Mini is going to run rings around it?

The reason I'm not going to buy one of these lower end Macs (the Mini would be the best fit) is that I can't stick enough memory in one, and the Air obviously doesn't really have any ports.

So the upshot is I'm not going to be spending any money with Apple anytime soon.

OTOH, if they'd started at the high end, I'd be looking at spending £5k on a tricked out laptop with absolutely unbelievable performance and as much memory and storage as I want/need, and would be entirely happy to do so because I'd feel like I was getting decent value for money rather than being taken for a mug.


> Apple have screwed themselves over a bit here by sticking ARM-based Apple silicon in their lower-end/entry level devices first

This is the Innovator's Dilemma [1] Apple built its success on avoiding.

> if they'd started at the high end

They'd have to R&D through the M1 to something more advanced. It would go to market later to be bought in smaller volumes by pickier customers.

Usually, this is a good strategy. Scaling is expensive. Starting small at the highest unit volumes subsidises scaling. But Apple is uniquely unconstrained here. Starting with the most technically forgiving makes sense.

You may not buy an Apple product now. But you will wonder "what will Apple's high end product be" when weighing a competitor's offerings.

[1] https://en.wikipedia.org/wiki/The_Innovator's_Dilemma


>> I think Apple have screwed themselves over a bit here by sticking ARM-based Apple silicon in their lower-end/entry level devices first.

I’m pretty sure Apple already has an 8+8 core version with 32GB+ RAM in their lab that runs at higher clocks and blows the doors off the performance of these M1 chips, and they are simply going for maximum shock effect by releasing this ‘low-end’ chip first then tighten the screws to Intel and AMD even further when they release the MBP and iMac with an M1X chip or whatever they will call it.


Well, just to provide a positive opinion, isn't it nice that a company puts baseline good hardware in their consumer grade product that more people can afford, and gives the option for add on better features? (I mean, with the acknowledgement that baseline = relatively expensive with Apple)

Rather than dumbing down / intentionally hobbling a product so they can sell it for cheaper or segment the market and extract maximum profits.


I think it could be a capabilities issue. An ARM equipped Mac is a non-starter for me because my workstation is driven with a dock that has 2x 4K displays connected to it, ethernet, USB hub, etc.

I'm not willing to part ways with my 2nd external display and I'm sure a lot of professionals with my setup would also consider that a deal breaker.

They probably wanted to get something public so developers could start cranking out compatible apps ASAP so when the bulk start buying this hardware for production use, everything is fully baked.


You mean "the current M1-equipped Macs are a non-starter", then. I have no doubt that either the first or second generation of Apple's high-end chips will support 2 external displays.


The M1 Mac Mini also supports two displays, one of them just needs to be through HDMI.


So the M1 computers all support 2 displays—1 internal and 1 external for the laptops, both external for the Mini.


No they didn't.

There is simply too much complex, professional software that will take time to be ported to ARM versus the relative straightforward needs of entry-level users e.g. Go, Photoshop, Docker.

And they need a large install base to push developers to invest the necessary resources.


Adobe has a native M1 Photoshop in beta now and plans to port Premiere after the first of the year. You may be overestimating the effort to “port” to M1. For most apps, it is just a new compile target and a bunch of regression testing.


Adobe is a top tier developer.

They get access to pre-release hardware, on-site Apple engineers and rapid fixes whenever something doesn't work.

Very different from all of the other third party developers.


It's a beta, not a release version.

And it wouldn't be possible to run an M1 beta if there were no M1 products on the market.


But, in theory at any rate, Rosetta2 will allow you to run all that software on Apple silicon.


We already know that Rosetta2 doesn't support everything.


Do you have any links? Other than virtualization, I hadn't heard about any limitations.

Edit: Looks like kernel extensions aren't supported either.


Certain advanced instruction sets are also unsupported by Rosetta 2.


Post-1996 Apple has never shied away from cannibalizing their own products. https://hbr.org/2016/07/the-best-companies-arent-afraid-to-r...


That's because we might be just seeing the low-end version of the M1 and we might see the high-end version of these chips next year or so. :)


Apple’s ‘low end’ segment is the lion’s share of both revenue and profit.

A decade ago, the Intel Mac Pro also came out after the rest of the product line. For a while you could only get a Power Mac G5.


This is a fair point, but it's certainly a very peculiar - and somewhat offputting - dynamic where the bottom of the range outperforms the top end.


These machines are Apple’s volume in Macs. The MacBook Air, in particular. And today, Apple gets to tout their best-seller is dramatically faster and has dramatically better battery life.

Makes marketing sense to me.


It also gives them a nice profit win since they're not paying Intel anything anymore on their most popular Macs.


I don’t think it will make a huge difference for their profits unless the (modest for the mini, significant for the 13” MBP, nonexistent for the MBA) price cuts significantly move more volume. Apple is notorious for sticking to consistent profit margins and stuffing in as much value as they can to meet their target price points. I would expect that their margins on the MBA are ~20-25%, and ~30% on the rest of the Mac line, just as it’s been since at least the original iMac.


Yes, but that a temporary situation and likely to be resolved in 6-12 months.

Some customers will still choose the older models due to various concerns: some will hold off out of fear of incompatible software, others because they need to run Boot Camp or x86 VMs, and others will need extra RAM or ports.

Others will have no such concerns and will embrace the new.


The first MacBook Pro immediately outperformed the Power Mac G5.


Putting M1 into low-end devices before Christmas gets the device into the hands of users, solving a chicken-and-egg problem by motivating developers to roll out software that runs natively on M1.

The people who care most about using specific applications that are designed for x86, are the same people who buy the upper-end MBP13 and the MBP16. It makes sense to flesh out the software ecosystem and snag a free iteration on M chips before moving those devices to Apple Silicon.


I don't think the M1 chip is a great choice for the 16" MacBook Pro. It's designed for lower power devices. It may do well on certain workloads like compilation, but may even regress (in terms of performance, not performance-per-watt) on other workloads.

Their future iterations would be much better suited to a higher power device.

It does create this weird short-term demand planning issue, but plenty of corporate customers are buying Intel Macs in large numbers right now (this past quarter was huge) because they want to avoid the bumpy initial years of the transition and stay on Intel until the app ecosystem is stable.


> I don't think the M1 chip is a great choice for the 16" MacBook Pro. It's designed for lower power devices. It may do well on certain workloads like compilation, but may even regress (in terms of performance, not performance-per-watt) on other workloads.

What workloads do you think won't run better in some manner (faster, lower power consumption, etc.)? It's a general purpose CPU. Apple's own benchmarks talked about a broad range of use cases and the public experience and benchmarks are demonstrating this.

There are obvious performance considerations. With a fan, these M1 CPUs have a higher thermal range and sustained performance. This is the thing that's going to be important in the equivalent of the 16" MacBook Pro. They should have the cooling and battery capacity already in the current form factor. The question is do they have the M1 with many more cores, a new variant of the M1, or do they have some more exotic configuration? Only time will tell.

Corporates buying Macs are going to have to decide if their work can be done on these new models. There is no option but to test it. It'll suit some dev environments, but others (e.g. docker-heavy web shops) will have to stick to Intel for now. There are practicalities like needing to replace broken machines and upgrade from slower 2016/2017 models in many places that means it'd be silly to do a wholesale conversion. iOS and Mac dev shops will have much better flexibility in upgrading, but they are in the minority.

Bigger picture, the new M1 models are great for getting solid machines in the hands of the masses without being revolutionary. Devs can get to work on migrating software without the launch running on like the Mac Pro update did (that was a faux pas from Apple that they seem to have recognized.) It leaves open the possibility that next year we may see a complete form factor update across all laptop lines at Apple. It's to Apple's advantage that they delay that because it's high cost (retooling manufacturing) and high risk (the market doesn't like the product change).


I agree with the bulk of what you're saying -- especially about practical considerations involving the software ecosystem and advantages of moving slowly, but I do believe there are some workloads where I'd prefer a hotter machine with more (and not shared) memory.

I think one of the advantages of M1 is its single-thread access to lots of RAM. That advantage kind of starts to fall off when your workload is heavily multi-threaded, which is often the case for buyers of larger machines with more compute cores.

I also believe that the advantages of low power consumption (or equivalently, thermal efficiency) fall off a little bit when you have a larger thermal envelope, because with a larger device (A) you can fit better cooling, and (B) bursts of compute take a longer time to bring the device to throttle temperatures.


Apple's own benchmarks of the M1 are not compared against the 16" MacBook Pro. They aren't yet offering Apple Silicon on the 16" because it isn't clearly better than the current Intel version, and would have elicited comparisons that aren't as glowing.

I'm thinking specifically games, CAD, and video editing. Even Final Cut Pro workloads (running natively) seem to be faster on a 16" MacBook Pro than on an M1 13" MacBook Pro based on the initial reviews on YouTube today. Sure, an M1 machine could do it consuming less power, but who cares? People buy a 16" because they want speed.

I think they will need redesigned high-performance cores for the 16" and the higher-end 13" [or 14"]. Simply using more of them probably won't cut it.

And MacBooks are not just used in dev environments. They're used in education, finance, media, government, and many other sectors - and some of them do want to be the last to switch. If the performance gains aren't dazzling, they can't be convinced to switch sooner. And if they stay on Intel, they can even be convinced to move back to Windows.


I don't think it matters. They get good margins on all their hardware. Some people will still need the pro hardware or can just switch, and then switch again when the 16" has ARM.


> Some people will still need the pro hardware or can just switch, and then switch again when the 16" has ARM.

I don't know: at pro level prices I'm not sure how many people will switch and then switch again. That's a lot of money and a lot of depreciation on the flip. Granted, I'd been about to spend £5k on a laptop, which is a lot, but I'd expected it to last me 5 years or more. I'm not about to spend that money on a machine that seems to have been substantially rendered obsolete before it's even left the factory.


I have to fight the urge myself, but that's a weaker strategy in these times. I've always invested in upgrades to lengthen the life of my laptops (I'm typing this on a 2015 rMBP 13.) It's probably better to spend £2500 now and £2500 in a few years with how things are going to change with ARM.


I think the strategy is simple:

Go hard and go low, to make CERTAIN that this transition is for the best.

If this were the high end, some folks could have said: "Yeah, sure, it costs the same as an i9, but you fork over $$$$$$$; go low and you get less but still pay $$$$".

With this, instead, the case rests itself!


IMO, the Mac Mini is the most peculiar of the recent M1 purchases, as normally the trade-off of the stationary form factor means much better performance; but here the M1 Mini has similar performance to the M1 MacBook Pro.


> can't stick enough memory in one

Sorely disappointed that the Mini is limited to 16GB. That alone makes it feel obsolete, because 16GB is the new 8GB with as much desktop virtualization I find myself doing.


I don't believe the Air's M1 has any virtualization support at all.


It does. The developer kits, however, did not.


Maybe they tried and the yield isn't there. There could be more memory and cores in the design right now and they are just having to disable some of it, for example.


But you are not going to buy a Lenovo either. You are intrigued: what will their high-end offering be like? It may be worth the wait.


Ports, memory, SSD and of course x86-compatibility still speak for the 16" laptop.


16GB is enough memory


I disagree and this is why I will wait for future MacBook Pros that support more memory.

I do think it is okay for something like a MacBook Air.


"16GB ought to be enough for anybody."

If you think it's enough, it's probably not.


I literally have an application I cannot compile in 16 GB.


What basis makes this comment apply to _everyone_ ?


I am a jerk. Trolling aside, I'm always a bit skeptical about the application, and roll my eyes (same as when people freak out about the processor on their smartphone). If you're sure it will make a difference though... you'd be like the one person out of ten that upgrades and sees a substantial improvement. Guess I can't help myself, and will accept my downvotes.


> 16G ought to be enough for anyone


I think he's joking as in the early Microsoft/Bill Gates statement about memory.


... just a reminder, that quote is apocryphal. Lies don't just travel faster than the truth, they've also got alarming staying power.

https://www.computerworld.com/article/2534312/the--640k--quo...


I hope so. I'm sitting at 68% memory utilization on a 16GB MBP _without_ any VMs running.

Every time a manufacturer limits a new model in 2020 to a max of 16GB, I wonder if they really understand what high-end work laptops or desktops are actually being used for.


Yep. 14.61GB out of 16GB here, on my 2015 MBP. No VMs running.


I'm at the point where I need a new Macbook Pro and I can't help thinking I want the last generation of the x86_64 architecture, not the first generation of something new. Those have never paid off for me.

At work, Intel would clearly be better. We do a growing amount of Docker work destined for Intel machines. But at home it's fuzzier, since I've been playing with k3s on a cluster of Pi clones. It's going to come down to games, I think. Although I haven't had much time for them lately.


If you require Docker, the Apple Silicon machines are not an option, and won't be for at least a little while.

https://www.docker.com/blog/apple-silicon-m1-chips-and-docke...


Interesting, although it seems like they kind of buried the lede. Surely the Go toolchain and Electron should already work fine under Rosetta? Getting some new Mac Minis for CI should only take a week or two.

What's unclear from this is if the hypervisor even supports everything they need or if they're waiting on Apple for more features, and how much work it'll require on their end to support the new hypervisor. Since Docker for Mac is closed source I think we're just waiting on the company for it as well. I wonder if we're looking at a month, 6 months, or multiple years?


Didn’t they mention Docker specifically in the Apple Event? Obviously it won’t work properly for 6 months at least...


To be fair, a developer-worthy M1 machine is still going to be 'a little while'. But once that machine is out the previous model quickly ceases to be an option, so I have to sort it out based on speculation.


Sure, but that really seems like it will be a matter of weeks. But on the other hand I don't think VirtualBox is going to happen at all.


Is it just me or are the current offering of M1 chips not developer machines anyway?

They max at 16GB RAM. That's a huge limiting factor by itself. If they can work out Docker before 32GB memory MBP, then it seems fine.


It depends on your work. Since many developers are using cloud services for serious testing, an awful lot of people are just fine with 8GB for Docker + VSCode + Firefox/Chrome + Slack. There are people who need massive in-memory models or huge compilations but that’s far from universal.


I would like to know how many 8GB, 16GB, and 32GB laptops Apple has sold; I suspect the 32GB market is not as big as people seem to think. I use Docker, virtual machines, and a lot of Chrome, and have plenty of headroom with 16GB.


> It's going to come down to games, I think. Although I haven't had much time for them lately.

I've moved to consoles for games. Rarely I play a game on my Mac anymore. When you sit behind your desk all day for work. Playing games in that same environment and posture gets tiring. The console brings the games to the living room TV in a much more confortable setting.


Or you can build a pretty decent upgradeable game machine for a fraction of developer MBP :).


We've got a Windows 10 box at every TV. Aside from games, it's just so much faster and ad-free compared to using a Smart TV/Roku/Apple TV/etc. They're also easier to operate with a Logitech K400r keyboard and touchpad. Add an XBox controller and a decent GPU and it's just like using a console but with cheaper games.

You don't need to build an expensive gaming PC either. My 4 year old i5/GTX1070/16gb/SSD still plays all the games I want it to. I can even play at 4K. I expect this $900 machine to last me at least a couple more years.

If you just want to watch YouTube/Netflix/etc on your couch and do some very light gaming though, check out what you can get for ~$200 - https://www.amazon.com/gp/product/B07B8VX5HZ


I've done this a decent amount myself, but one thing I've been disappointed by is mod support.

Since PCs aren't locked down platforms, you can mod games whether they added support originally or not - But on Consoles, only a rare few games support it.

There are ways to jailbreak consoles and add external mod support, but the process is so esoteric and user-hostile that most console games will have few if any mods written.


Totally agree. Largely quitting PC gaming has been a productivity boon for me. I think somehow the activity being at the same machine I use for work drains my work energy, and for whatever reason the console experience is sufficiently different that it doesn't feel like the same activity.


Huh, interesting. I feel very uncomfortable sitting on the couch and find it promotes an unhealthy, hunched-over posture compared to an ergonomic desk chair.


Also, with things like Paperspace and the upcoming game streaming services, it really feels like it won't matter all that much for the times you want to play games on the Mac.

Also remember that with the M1 Macs you'll be getting some access to all the games released for iOS of which many are not the IAP types and are worth playing.


Upcoming gaming streaming services? Upcoming in a sense like net positive energy fusion is upcoming? :p


I don’t know what your work setup is, but if you like having dual screens you should either pick one of the Intel models up now or be ready to wait a year or two for support


Or get an ultrawide display!


Same here! If you don’t have a working Mac, buy a refurbished Mac, save money, and wait a couple cycles for the next Mac.

I plan on waiting a little bit to let the chips and MacOS’s arm ecosystem mature.


I'm unhappy with my MacBookPro16,1. I think it's slow, the keyboard is worse than that of my 2013 machine, the Touch Bar is useless and only produces heat, the fans are spinning 24/7 when an external monitor is connected, and a lot of the time the machine hangs.

My next machine won't be a MacBook Pro. 3000€ for a machine that can't handle my load is simply not worth it, especially since machines with more power that run Linux/Windows cost only around 2500€.


I was watching a LinusTechTips video recently and he made some comment about heat pipes having an orientation that increases thermal transfer.

I’ve been wondering ever since if my laptop stand is making my heat situation worse or better.


Is it fair to compare performance for different compile targets?


I'm tempted to say it's not.

From experience compiling C++, a build can easily take 10 times longer depending on the optimization flags that are enabled. The bulk of the time is spent in deep optimizations that may be architecture- and CPU-specific.

Wouldn't be surprised if compiling on a different architecture is multiple times faster because the compiler is not as optimized or doesn't have the same default flags.
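As a rough sanity check (a sketch, assuming the same makepad workspace used in the benchmark), you can compare a debug and a release build of the same crate to see how much of the wall-clock time goes to the optimizer:

    cargo clean && time cargo build -p makepad              # debug profile, light optimization
    cargo clean && time cargo build -p makepad --release    # opt-level 3, far more backend work

If the release build takes several times longer, most of the time is in LLVM's optimization and code generation rather than parsing and type-checking, and that is exactly the part most sensitive to flags and target architecture.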


It's fair in the real world, since that's what people will mainly be doing. But probably unfair from a benchmark perspective.


If the compiled application runs faster or similarly fast I think it's fair.

The point is to develop and run binaries. If x86 with SSE4 takes a lot more CPU time to compile binaries with acceptable performance compared to ARM, it's a win for the ARM architecture.


Great question.

I would assume there's some skew, but then again it's still a valid comparison even if the results were not to be generalized.


To me it really sounds like an apples-to-oranges comparison.

Any test should compare same inputs to same outputs.


Here's another benchmark, this time XCode, that I screencapped from Dave Lee's YouTube review: https://twitter.com/john_lam/status/1328754454930231299

The 3950X Hackintosh performed on par with the MBA.


> The 3950X Hackintosh performed on par with the MBA.

That should probably be put the other way around. It gives the impression that the MBA was the incumbent.


SSD speed is important in this kind of operation. The SSD and controller in the Mac are very fast.


The 3950X is an expensive desktop CPU with a 105W TDP. The M1 is very impressive.


My Ryzen 3900 is also so much faster than my (former dev machine) iMacPro with a Xeon and fast SSD (Rust compiles, TypeScript builds).

Tech moves on.

But the main thing for me is Linux, as it feels much more responsive than current OS X versions on the iMacPro.

(The reason for dropping the iMac Pro was the lack of support for machine learning work, though, not performance.)


There are some data-points comparing the M1 to a 3950X

https://twitter.com/john_lam/status/1328754454930231299


It's missing KingOfCoders' point that these comparisons are all within the Apple ecosystem. MacBooks infamously have cooling problems and macOS is precisely balanced for battery life on Apple hardware, not AMD Hackintoshes with no speed limits. What happens to these benchmarks if you jettison the Apple overhead entirely?


> MacBooks infamously have cooling problems and macOS is precisely balanced for battery life on Apple hardware

Mac Pros are not "precisely balanced for battery life", that's just nonsense.


What do you use for AI learning?


Currently some 2080s. Probably 3090s next year.


I mean software-wise?


Is this cross compilation to x86 or a native compile to Arm? That could be playing an important role if it's easier to optimize and emit code for one architecture versus the other.


Maybe, but architecture-specific stuff is a relatively modest part of compiling. And in any case, if you're going through compile-test-run cycles, the fact that it might be easier/quicker on ARM is hardly a ding against it.
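One way to isolate how much the code-generation target itself matters (a sketch; the aarch64-apple-darwin target needed a nightly toolchain at the time) is to time the same crate cross-compiled to both architectures on one machine:

    rustup target add x86_64-apple-darwin aarch64-apple-darwin
    cargo clean && time cargo build -p makepad --release --target aarch64-apple-darwin
    cargo clean && time cargo build -p makepad --release --target x86_64-apple-darwin

If the two runs finish in roughly the same time, the choice of target isn't what explains the gap between the machines.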


I don't know about Rust, but when compiling some Swift + ObjC code on a DTK, the target architecture made very little difference.


I was wondering that also. Based on Twitter comments it seems that he was compiling to ARM, so there's a bit of comparing apples and pears there.


Depends how you're looking at it:

if what you care about is your edit-compile-run cycle, what matters is how long it takes to compile for the architecture where you'll actually run the app, which for interactive builds (where speed matters most) is usually your local machine.


Anyone else worried that while these performance improvements are really good, that this is further locking people into Apple's ecosystem? No self-repairs, no bootcamp support, etc


This is a real concern.

The new hardware is cutting edge cool. But very proprietary, so not for me.

I'm curious about it. It's the first time in a long time that custom silicon outperforms the volume x86 manufacturers. For a long time custom chips (SPARC/PA-RISC/Alpha) were the fastest, until commodity x86 chips took over by being faster and cheaper, and now those chips are extinct. I think it helped that there were multiple vendors of x86 CPUs.

But competition is good for us. As long as there is competition we'll get better performance for less power usage/cost.

Apple Silicon is not riskless. Apple needs to continue to execute on design and hope that their manufacturer can get the yields they need. It's on the newest chip-making process, so a hiccup in the supply could cause problems (think of the hard drive supply drying up 10 years ago because of earthquakes). Nvidia is buying ARM now, so who knows what might happen if agreements need to be renegotiated. It can give them an advantage, but they have to keep executing.


Unless Apple buys TSMC and every other leading foundry, then yeah, that's a distinct possibility. Otherwise no; there are many reasons to keep using a slower, outdated architecture, whether for its open nature or for support.

I feel bad for intel if anything. While they're not out, it doesn't bode well for their future.


I just compiled it on an i9-10900 inside WSL, and it took 15.80s to compile. https://imgur.com/5FtPJe5


Mac I/O at its finest... it's probably also faster on Linux.


Dell XPS 15, i7-9750H, Ubuntu 20.04, Rust v1.50.0-nightly

cargo clean && cargo build -p makepad --release

   Compiling makepad v0.1.0 (/home/becker/trash/makepad/makepad)
    Finished release [optimized] target(s) in 23.64s


I'm on a macbook pro 2019 16" and I don't want to post my results here...


Nearly universal praise for the M1 versus its predecessor, but I don't see how this really changes Apple's position the way he implies. Their machines are still far more expensive than Windows machines and can't use high-powered GPUs from AMD/Nvidia. The people who choose PC still have the same incentives to do so. For most users the faster performance will probably not be very noticeable. The reduced power consumption is the biggest draw in my eyes.


Far more expensive than Windows laptops? I haven’t looked in a year or so, but last I checked, to get a Windows laptop with a good screen and build quality, you would also need to spend close to (or likely more) than 1,000 USD.


Yeah, that's the reality. It's true that Apple hardware is expensive, and you can certainly buy cheaper, lesser products than Apple sells. But if you want to buy something of the same quality as an Apple product, you're paying Apple-like prices.

There are cheaper Android phones than the cheapest iPhone, but if you want iPhone-like specs and vendor support and build quality, you're paying $600-$1000 just like Apple charges. You can buy $200 Windows laptops or Chromebooks but if you want the MacBook-like performance and vendor support and build quality, you're paying $1000-$2000. Compare the excellent Galaxy S to an iPhone, or the excellent Dell XPS (or X1 Carbon or Surface etc) to a Macbook and you'll find similar excellent performance and similar excellent build quality for a similarly expensive price.

The reality is that Apple only competes in the higher end of the markets. There are cheaper options but anything that's directly comparable to an Apple product is going to be priced similarly to an Apple product.


> iPhone-like specs

iPhone-like performance at least - an iPhone will feel at least as snappy and performant as a flagship Android, but typically with 1/3 of the RAM (flagship androids have up to 12GB of RAM).


> Their machines are still far more expensive than Windows

Previous criticism was that you could get an equivalently powerful Windows device for cheaper from other manufacturers - This likely tips the scale and means that these Macbooks are very competitively priced considering their performance.

Then on top of that, you get battery life and build quality that runs circles around the closest competitor - so it does represent a really compelling offer on paper.


As far as price competitiveness, the MacBook Air is now so far ahead that the “high” price is almost irrelevant.

There is no such thing as a fanless PC laptop with anything close to the same performance or battery life. It literally doesn’t exist at any price.

I mean, people are going to be buying this thing on the education store for $899. This isn’t a $2000 machine we are talking about here. This is college student territory.

“I don’t see how this changes Apple’s position,” I’m having a good laugh at that!


Given the build quality, screen, performance, and battery life, Apple isn't really premium pricing anymore.

The Dell XPS 13 starts at $999 and the Lenovo X1 Carbon starts at $949 - equivalent pricing - and the new Air outperforms both of those.


HP x360 13 with Ryzen - $799, and you have a 2-in-1 that you can use at work and play recent games at home. Install Docker and run Linux if you like that option.


That's a 1080p display and only $200 difference... still very comparable value propositions IMO.


> Their machines are still far more expensive than Windows

Which $1000 Windows laptop has this performance?

> and can't use high-powered GPUs like AMD/Nvidia.

Not relevant to all markets (particularly ultra bookish laptops, which never have discrete GPUs, but more broadly many people just don't need one).

I don't think it changes anything much immediately, but it's extremely bad news for Intel in the long run.


Do people actually care about GPUs in their laptops? It seems like any semi-serious gamer would have strong incentives to have a desktop (modular, the prestige of building your own rig). And anyone using GPUs for compute is just going to remote into some server.


The difference is that you can buy something specced to your needs. Most users and even most developers won't be bottlenecked by even an i5. Getting a quad-core Windows/Intel laptop with a 1080p screen, 16GB and upgrade slots for RAM or disk, plus all the output ports for an external monitor is easily under $1000 even for a name brand. My $800 Acer does fine and has an onboard ethernet port because my most resource-intensive application that I use daily is Zoom. And it has an entry-level Nvidia card.


Can we not drag the standard trolling points into another thread? Anyone who’s done the numbers for real knows that “far more expensive” hasn’t been accurate since the 2000s and, well, the AMD GPUs in many Macs would suggest that you’re not very interested in making an accurate comparison.

Most people have multiple factors in their buying decisions. This means that Apple is able to avoid giving a negative on performance for people who value battery life and heat, and the competition will, as it always has, benefit everyone by punishing Intel for the mismanagement which has left everyone getting less for their money.


>Nearly universal praise for the M1 versus it's predecessor but I don't see how this really changes Apple's position the way he implies. Their machines are still far more expensive than Windows

It was never about being cheaper or the best bang for the buck.

Mac was always playing at the expensive end of the market, for people who want/appreciate (most of) what they get (macOS, the hardware/software integration, the ecosystem, the design choices, the better components at various levels - screen, trackpad, the sturdy unibody construction, the sound, the battery life, etc), including some compromises (e.g. lighter and more battery over more powerful graphics cards, simplified product line vs endless configurations and decision fatigue, etc), plus the ability to run commercial apps like FCPX, Adobe Suite, MS Office, and (for those few that care) a UNIX underneath.

Now the Mac has all that, plus a very fast CPU; it can customize the silicon for the OS even further, build extra coprocessors and SoC goodies, get crazy battery life at great performance, and even lower cost or higher margins.


Maybe I'm a hater, but I absolutely hate the Mac trackpads. Physical left- and right-click are so much more comfortable for me. That being said, 90% of PC manufacturers are imitating Apple on that note. I also think things like unibody construction and the general design aesthetic are nice, but it's a luxury, and it's not really that valuable a differentiator when they all look the same. By contrast I have a 6-year-old white plastic Chromebook that my kids have dropped a dozen times; the chassis has multiple cracks held together with tape and it still runs fine. Cost $140 brand new in 2014. Software compatibility is slipping away as a differentiator as well. Web design in particular has shifted away from Adobe and Sketch to products like Figma that are web-based. Ditto for office software like G Suite/O365. The value of Mac-only software is shrinking.


> Their machines are still far more expensive than Windows

Are they? I just checked now for US prices:

13‑inch MacBook Pro (Intel Core i5) 32GB = $2059

13-inch Dell XPS (Intel Core i7) 32GB = $2099

These are both the cheapest options available for my desired configuration (32GB RAM and 13-inch screen), and MacBook has far superior build quality, trackpad and OS.

Probably for different specs MacBooks are comparatively more expensive (if you need powerful CPU or GPU) but for a "developer" use (where my main constraint is RAM, i.e. the number of apps I can have open at once) Apple isn't actually that expensive.


The XPS line is the high-end of Dell, their most expensive one. So basically you're showing that the Macbooks are on par with the most expensive PCs.

That's correct, but that's also the point. They're expensive machines.


OP's point is that if you want to get the same quality, it will cost you the same with PCs.

_Of course_ you can find a cheap PC that costs less, but that's not the discussion.

The discussion is with regards to the same build quality.


You're comparing an i5 to an i7. Processor speed is also pretty important for dev use.

I run a 15" Macbook Pro and a 15" Dell XPS. The Mac is definitely not "far superior".


The existing M1 devices only come with 8 or 16GB of RAM, so it can only reasonably be compared to similarly-configured Windows laptops.


The performance per dollar has improved (for some applications, anyway) while it hasn’t for Windows. Perhaps that will drive some transition from Windows but even if it doesn’t, it will incentivize existing Mac users to upgrade. Note also that applications will probably become more bloated and sluggish as happens every time hardware improves, which will make those applications even worse on Windows and older Macs, thus incentivizing adoption of newer macs.


I agree this doesn't really change Apple's positioning and won't immediately impact the desktop market. That said, I think it's a big kick in the ass for Microsoft and certainly Intel, and hopefully it forces them to compete. It's certainly making me take a closer look at Raspberry Pi and other ARM compute-module boards for my homelab. Apple is going to drive ARM software and compiler support as well as broad adoption among developers. Exciting times.


Uh, can you point out a Windows laptop in the same spec range as the MacBook Air/Pro for a significantly cheaper price?


For Apple, it's not about attracting new customers but about reducing costs (on the software side and also in licensing costs for the CPU) to increase profits.


Profits are a factor but Apple has always thought more long term: Intel has been preventing them from shipping things on-time (sometimes year+ delays) and they’re limited to what Intel implements in many areas. It’s not as bad but similar to the way that the Android watch market dried up when Qualcomm chose to keep shipping 2014 chips which ruled out most of what designers might want to build.

This gives Apple complete control of their product direction and especially the ability to build unique features which aren’t easy to match - Dell’s design is limited to the combinations which Intel offers unless they pony up a large amount of R&D and get Microsoft on-board but Apple can customize their integrated chips for the exact thermal/size/power characteristics they need.

That’s a big commitment but it’s something they’ve been very successful at in the mobile space so I wouldn’t bet against them.


They can use a high-powered GPU -- the M1.


I see no numbers or even just the command that was run exactly. So maybe wait until someone runs the benchmark in a reproducible way?
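For a reproducible number, a benchmarking tool like hyperfine with a prepare step that clears the build cache would take the guesswork out of it (a sketch, reusing the cargo invocation quoted elsewhere in the thread):

    hyperfine --prepare 'cargo clean' 'cargo build -p makepad --release'

That reports a mean and standard deviation over several runs instead of a single eyeballed duration.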


I'm glad to see some real-world individuals testing this. Will be interesting to see a "post-mortem" on why the M1 has done so well, and where the limits will be in the future (ie: how far can you get with DRAM-on-die?)


In the M1 processor, DRAM is on-package, but not on die. (Even large eDRAMs sharing a die with logic don't currently reach to gigabytes.) How far can DRAM-on-package go? NVidia just released a GPU with 96 GB of HBM on package (80 GB enabled, 16 GB disabled for yield reasons), with 30x the bandwidth (!!!) of the M1 processor. This wouldn't be a low power or low cost solution, but for something like a future generation Mac Pro, it gives a good intuition for how far you can go with today's technology.


Perhaps it's something like the following: say, for instance, it costs Intel $50 for a piece of silicon and they sell it to Apple for $300. In Apple's own designs, they could incorporate nearly any cost of silicon without affecting the final sales price, because the silicon cost is fairly decoupled from what they charge?


I'm going to go with a combination of the 5nm feature size, simplified instruction set, an architecture that's low-power by default so it doesn't need to be throttled for heat/power, and the use of many types of specialized cores to better match workloads with cpu power. It's a modern architecture in the way that x86 just isn't as a 40-year-old architecture that has to maintain backwards compatibility.


Someone on that thread was saying that the M1 may be using on-chip DRAM as L2/L3 cache. Not sure how true that is.


Every layer in the storage hierarchy is "just a cache" in front of the layer below it, so the naming is somewhat arbitrary. L1/L2/L3 caches are transparent from the perspective of the processor - it only sees a change in access times to main memory.

Any advancement in the storage hierarchy means reducing latency at one level or increasing the amount of memory at that level.

Most caches would use SRAM instead of DRAM, but the technology used doesn't define its role. Microcontrollers often have only embedded SRAM, serving as main memory.


Don’t most processors use on-chip SRAM for their cache, which is faster than DRAM?


Yes. The best-known exception is IBM's recent(-ish) POWER processors, which use large eDRAM as a last-level cache. As far as I know, there's no widespread use of eDRAM on processors smaller than 14 nm.


Lol. The guy just measured that the LLVM compiler targeting arm64 is faster than the same compiler targeting x86_64. Fair comparison, yeah.


Is it not though?

Compiling shit and running tests is what you do when you're working on a software project in Rust. If suddenly your workflow is twice as fast, it's noticeable and this is the only thing that tweet claimed imo.


The title is wrong. Tweet Author issued a correction - he tested an i7, not an i9:

"Ok twitter sucks for corrections BUT: The chip i tested is the: Intel 9750H and this is an i7 NOT an i9. However the single core compute is about the same, and i'm not maxing out on cores. So the ballpark speed increase is quite similar."

https://twitter.com/rikarends/status/1328796685280088065


This was pretty sparse on details. Can anyone verify they were compiling for the same target? (IE at least one of them was cross-compiling).


Yeah. Aren't Rust compiles IO heavy? Like really IO heavy?


That is a pretty good point. IOPS may have a hand in this as well. That said the benchmarks on the SSD seem similar.
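A crude way to check whether a build is I/O-bound or CPU-bound (a rough heuristic, not a profile) is to compare CPU time against wall-clock time:

    cargo clean && /usr/bin/time cargo build -p makepad --release

If user + sys divided by the number of busy cores comes out close to the real (wall-clock) time, the build is compute-bound and SSD speed is mostly irrelevant; a much larger wall-clock time points at storage or memory instead.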


Here is The Verge's test with 30 minute Cinebench loop.

It seems that from now on the only difference between Air and Pro is that Air throttles down to about 70 % in sustained mode while Pro's cooling keeps up.

And Apple's last fanless laptop, the lovely 2017 MacBook 12, has a very worthy successor. I now wonder if/how the throttling changes when plugged into a 4K display, as this caused the MB12 to run out of thermal headroom in tens of seconds.

(Also, even thermally throttled M1 destroys my current 4C/8T 4 GHz Haswell i7 desktop, which is beyond powerful for everything I need. M1 = 7700 pts, M1 throttled = 5300 pts, Haswell i7 = 4600, and Macbook 12 = 1400).

https://twitter.com/keff85/status/1328740865926459400


Does anyone know what the state of Linux on Mac is since the introduction of the T2 chip and now this? Do people still run Linux on Mac and how has the experience been?


Here’s a good overview compiling the work to get Linux running on these macs:

https://gist.github.com/gbrow004/096f845c8fe8d03ef9009fbb87b...


These machines will not run Linux on bare metal. Linux will be hypervisor/ VM only on top of MacOS


Since T2, there isn't a way for Linux to access the internal SSD AFAIK. You have to install it on an external drive and boot from that.


Apparently Linux 5.4 and newer can.

Most other peripherals are barely working, though.


Can't run Linux, can't run Windows 10 either - so it's not for me.


Is it just for Rust compiles, or has anyone tested other languages like Java, C/C++, etc? I ask because I don't have access to an M1 MacBook.


TechCrunch give figures for compiling WebKit on the 13" Pro in their review: https://techcrunch.com/2020/11/17/yeah-apples-m1-macbook-pro... - the gains are not as drastic, but it is still faster than the 16" and as fast as a Mac Pro, while using hardly any battery.

Very impressive stuff. I only got a new 16" last year, but a lot of my time is spent compiling; I might find it hard to resist upgrading next year when they announce more Pro models.


This feels fishy somehow; we need a more thorough in-depth look at exactly what is being compiled in both instances (I've seen projects that build half the object files on different architectures). It's likely not a 1:1 comparison on actual amount of code being compiled and linked.

It will be awesome if accurate, but I think we need more data.


The results check out - code compilation is very sensitive to cache and the M1 has about 4 times more L1 cache per core than Intel CPUs.

The i7-9750H has 32KB of data cache and 32KB of instruction cache plus 256KB of L2 cache per core.

The M1 has 192KB of instruction cache and 64KB of data cache per core, as well as a 12MB L2 cache shared by the high-performance cores.

This alone is enough to explain significantly better code compilation performance.
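On macOS you can read the cache sizes the kernel reports directly (a sketch; the exact sysctl keys, and whether they describe the performance or efficiency cluster, differ between Intel and Apple Silicon machines):

    sysctl hw.l1icachesize hw.l1dcachesize hw.l2cachesize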


There's been XCode benches on some of the reviews (see: https://youtu.be/XQ6vX6nmboU?t=186), seems like the M1 beats even the iMac Pro, and is as fast as a Ryzen R9 3950X hackintosh (although I suspect that they're both reaching the point where compilation is entirely I/O bottlenecked)


Someone compiled Webkit (written in C++) on the M1 devices

https://twitter.com/TedMielczarek/status/1328740104630886405


Now can we get Nvidia drivers back for macOS? Give me the ability to run an Nvidia eGPU natively in macOS and I'll keep buying Apple for the foreseeable future! Sure, my remote box with a "proper" GPU is fine, but macOS will always be the best medium between Linux and something that's actually usable day to day. I love Linux, but god could I not use it as a daily driver.


Wow. The M1 MacBook Air is now the thinnest, fastest, coolest-to-the-touch, and cheaper (!) laptop for most people and most applications.


Apple laptops have finally regained a hardware edge.

About a decade ago I overcame my Apple skepticism and moved from Windows to Mac because the MacBook Air simply had better hardware that I couldn’t find for the same price in the Windows world. Apple laptops lost their hardware edge many years ago and are only now winning it back with Apple Silicon.


> ...I run compiles for 20 seconds max now.

Not really enough time for throttling to kick in. Still, fantastic performance even if just in bursts.

https://twitter.com/rikarends/status/1328703780167315456


If your workload never requires more than 20 seconds of compiling, then throttling will never be an issue. I suspect for most modern developers, short, frequent compiling is a more common use-case than 10m+ builds. For video editors and other pros, it's different, but since Apple's CPU offloads a lot of the video & audio processing to secondary processors, I don't think that will be an issue either.


Here is The Verge's test with 30 minute Cinebench loop.

It seems that from now on the only difference between Air and Pro is that Air throttles down to about 70 % in sustained mode while Pro's cooling keeps up.

(Also, even thermally throttled M1 destroys my current 4C/8T 4 GHz Haswell desktop, which is beyond powerful for everything I need.)

https://twitter.com/keff85/status/1328740865926459400


He ran it to the point of throttling and it slowed down by less than the gains - about 25% slower than peak speed. He also brought up the very good point that if you compile (and do other tasks) fast enough, you don't actually need to throttle, so in the real world the point can be moot.


Does anyone know if there is a video encode benchmark available yet? These can be tricky to perform because you need to take into account the quality and size of the output. I assume x264/x265 are not optimized (yet) for M1. I wonder if M1 is designed for that workload at all (encode, not decode).


I mean, I haven't heard anything one way or another, but I would be willing to bet that Apple, who a) knows that their computers are heavily used by the video industry, and b) has been making iPhone chips with hardware-accelerated encoding and decoding for several years now—the chips upon which the M1 is based—has included that capability in the M1.


I'm just about to buy an X1 Carbon on Black Friday. I really don't want a MacBook Air, but the screen and the M1's speed/battery life seem too good.

Maybe I should just wait. Ryzens will hopefully be shipping more steadily next year and Intel should be cutting prices. 2021 is going to be a good year for hardware.


...so at what point do I try and sell off my brand-new $3k laptop before its value plummets?


So are we going to see this chip in MBPs at some point, or is there a reason that's a bad idea?

Furthermore, is there any reason to think there's a less throttle-happy version that might be available for the Pros?


Apple has already launched the Macbook Pro 13" M1 which does come with a fan for probably less throttling. The 16" version wasn't updated so I'm thinking they're planning something special for those.


The 16" MBP will presumably get a bigger version of the same thing; this one is 10W vs the 45W TDP Intel chips that those currently use, so they could just double it (or more).


They're already selling a 13" MBP with this chip and a fan to reduce throttling


I wonder how this is possible. I suspect that the i9 was designed for high-TDP usage, and thermal throttling just kills performance. That would mean that the MBP offering an i9 was mostly marketing.


My i9 16" MBP power throttles long before it thermal throttles. FWIW I find the thermals actually quite good, I've never had issues with over-temp but I have daily issues with it hitting the 100W limit and getting throttled down to 1GHz until I unplug the 3rd monitor.

https://twitter.com/thesquashSH/status/1291053844592558084


Do you think we will see devices with multiple M1 chips? I’m not familiar with the barriers to that sort of computing model


No, they will make larger chips. Multi-chip SMP is going away over time.


This is a super bad benchmark for many reasons: the target architecture is different, and the underlying cache, storage device, bus width and possible optimizations are all different. Compile time in such a situation means next to nothing without isolating all these factors so that what's left is attributable to the M1 vs Intel difference. That would be a lot of work, though, and would not make for such a catchy and clickable title.


This is really awesome to see, but also kind of scary and depressing when you realize that the company leading the way in processors is Apple: anti-competitive, developer-hostile, and anti-consumer (except when it benefits them).

Just look at how their new laptops have underwhelming and ludicrously overpriced memory configurations ($0.78125/gb vs $0.12598/gb for a WD_Black m.2 on Amazon), with no ability to extend them yourself.


Is this compile running via Rosetta2? Insane if it is and this claim is true.


There's a native ARM Rust compiler that supports Apple's silicon, so while they don't say, my assumption was that this was native rather than emulated.


Is this on Rosetta 2 or a natively compiled Rust toolchain?


Pretty sure it is native.

https://forge.rust-lang.org/infra/other-installation-methods...

There is a nightly build for aarch64-apple-darwin
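To check whether an installed compiler is a native arm64 build rather than an x86_64 one running under Rosetta 2 (a sketch, assuming rustup manages the toolchain), look at the host triple it reports:

    rustup toolchain install nightly
    rustc +nightly -vV | grep host

A host of aarch64-apple-darwin means the compiler itself runs natively; x86_64-apple-darwin would mean the whole toolchain is being translated, which is a very different benchmark.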


Or the rust compiler sucks and is only using one core.


> M1

Now imagine a Beo... Raspberry Pi with this!


... and he returned it.


...and got the new Mac mini for the extra display, not because of how it performs.


Can I get a refund on my 16" MBP


A Twitter post is pretty irrelevant as a performance review. We don't have any numbers (is it 1s vs 2s?), especially given that this "project" is pretty small (about 150 kloc).


The M1 pushes users further toward the mobile-phone mentality - no more universal computing or data interoperability. Welcome to massive vendor lock-in. I doubt it will run any modern games.


WoW has been around forever, so perhaps it’s not “modern”, but this seems like an early vote of confidence.

https://appleinsider.com/articles/20/11/17/blizzard-updates-...


The next laptop I purchase for home use will be a linux laptop. Hopefully linux on a laptop has gotten a lot better than my last couple experiences with it. Apple products have just taken such a steep turn for the worse over the years, I can't justify paying that amount of money for such poor quality.


Sadly Linux on laptops isn’t that much better than it used to be. I’m using both.

However if you dislike Apple, Linux is a perfectly good choice unless you are a developer who needs to write code for Apple products.


As someone who loves the trackpad on MBPs, is there any linux laptop with a comparable one?



