ARMv9.0 is very similar to ARMv8.5 (9.0 supersets 8.5 with SVE2, TME, TLA, and CCA), so it's not a massive deal. SME implies v8.7 which is basically identical to v9.2 except for those couple extensions previously mentioned.
I wonder if there is licensing at play though. Apple may have gotten a really great licensing deal on ARMv8 that they wouldn't be offered for ARMv9.
From what I’ve read previously, Apple has a special licensing deal already as they were part of founding Arm, although I don’t know if there are any details on exactly how that works.
I believe it's an architectural license, which lets them design their own cores based on the ARM instruction set. I think a few other companies may have this license, but it's not disclosed.
Apple cofounded ARM for use in the Newton product line; they released new Newton products from 1993-97 and discontinued them in 1998. They then used ARM again for the iPod, released in 2001.
They didn't design the iPod ARM SoC though, nor the ones in iPhones for quite some time, and the microcontrollers in Macs for power management and such were not ARM. (I mean, some of them might've been, but the one I'm thinking of was SH or something.)
> Apple didn't use ARM for something like a decade or two after that.
They used the ARM610 in the Newton in 1993 (ARM was founded in late 1990), then there was an 8-year gap to the iPod in 2001 (ARM7TDMI cores, which are ARM designs). Their first in-house ARM design (I believe) was the iPhone 4 in 2010.
They definitely didn't "architect/design ARM" for nearly a couple of decades after founding ARM, yeah, but they did use them.
Yes. Watching the misinformation unfold, spread itself, and feed back into its own loop is both interesting and tiring. It took months and some hard work to tamp down the idea that Unified Memory is SRAM and something special. But this ARM deal? 5 years and counting.
There are two points:
Apple has an architectural license, which somehow became a "special deal" as if they are the only one doing it. They are not; even Ampere Computing has one.
Apple was one of the founders of ARM, and somehow this gave people the impression that they have a "special deal".
And had hajile not stepped up and pointed out that ARMv9 is a superset of ARMv8, etc., I would have had to spend some time looking up the details of that superset just to stamp out this type of nonsense. (Each ARMv8+ and v9 release has way too many features and optional extensions; I don't even remember which is which.)
And this is not the first time YouTuber Vadim Yuryev gave something out that is completely wrong.
Does anyone have insight into why Arm CPU vendors seem so hesitant about implementing SVE2? ~They seem~ Apple seems to have no issue with SSVE2 or SME.
Edit: Only Apple has implemented SSVE and SME I think.
What is the measurable benefit to implementing 128b SVE2? Like, ARM has CPUs that implement that, and it's not even disabled on some chips. So there must be benchmarks somewhere showing how worthwhile it is.
And implementing 256b SVE has different issues depending on how you do it. 4x256b vector ALUs are more power hungry than generally useful. 2x256b is only beneficial over 4x128b if you're limited by decode width, which isn't an issue now that A32/T32 support has been dropped. 3x256b would probably imply 3x128b which would regress existing NEON code. And little cores don't really want to double the transistors spent on vector code, but you can't have a different vector length than the big cores...
I'd say that the theoretical ability to gang units together would be appealing.
If you have four 128-bit packed SIMD, you must execute 4 different instructions at once or the others go to waste. With SVE, you could (in theory) use all 4 as a single, very wide vector for common operations if there weren't a lot of instructions competing for execution ports. You could even dynamically allocate them based on expected vector size or amount of vector instructions coming down the pipeline.
Additionally, adding two 2048-bit vectors using NEON (128-bit packed SIMD) would require 16 add instructions while SVE would require just one. That's a massive code size reduction which matters for I-cache and the frontend throughput.
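To make that concrete, here is a minimal vector-length-agnostic sketch (assuming a compiler with `<arm_sve.h>` targeting SVE; keep in mind the hardware, not the programmer, picks the actual vector length, so the "16 adds vs. one" ratio only holds on a 2048-bit implementation):

```cpp
// Sketch: the same loop works for any hardware vector length from 128 to 2048
// bits, while a NEON version is pinned to fixed 128-bit (4 x float) chunks.
#include <arm_sve.h>
#include <cstddef>

void AddF32(float* dst, const float* a, const float* b, size_t n) {
  for (size_t i = 0; i < n; i += svcntw()) {    // svcntw() = 32-bit lanes per vector
    svbool_t pg = svwhilelt_b32_u64(i, n);      // predicate also handles the tail
    svfloat32_t va = svld1_f32(pg, a + i);
    svfloat32_t vb = svld1_f32(pg, b + i);
    svst1_f32(pg, dst + i, svadd_f32_x(pg, va, vb));
  }
}
```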
I don't see how this would work out beneficially. Let's say your hardware can join 4x128b units as a virtual 512-bit SVE SIMD unit. This means you have to advertise VL as 512bit for reasons of consistency. Yes, you will save some entries in the reorder buffer if you encounter a single SVE instruction, but if the code contains independent SVE streams, you will be stalled. Moreover, not all operations will utilize all 512 register bits, so your occupancy might suffer. The only scenario I see this feature working out is if you are decode or reorder buffer limited. Neither is a problem for modern high-performance ARM cores. With x86, it might be a different story. From what I understand, AVX512 instructions can be quite large.
Modern out-of-order cores are already good at superscalar execution, so why not let them do their job? 4x128b units give you much more flexibility and better execution granularity.
On x86 at least, the cost of OoO is astonishing - more pJ per instruction dispatch than the operation itself. Amortizing that over more operations is the whole point of SIMD. I have not yet seen such data for Arm.
That aside, see the "cmp" sibling thread for a major (4x penalty) downside to 4x128.
Yes, OoO is expensive — after all, that is the cost of performance. Very wide SIMD is great for energy efficiency if that is what your compute patterns require (there is a good reason why GPUs are in-order very wide SMT SIMD processors). Is this the best choice for a general-purpose CPU? That I am not so sure about. A CPU needs to be able to run all kinds of code. A single wide SIMD unit is great for some problems, but it won't deliver good performance if you need more flexibility.
Could you point me to the "cmp" thread you mentioned? I don't know where to look for it.
> I agree with you we do not only want "very wide SIMD", and it seems to me that 2x512-bit (Intel) or 4x256 (AMD) are actually a good middle ground.
I'd already classify this as "very wide". And the story is far from being that simple. Intel's 512-bit implementation is very area- and power-hungry, so much so that Intel is dropping the 512-bit SIMD altogether. AMD has 4x add units, but only two are capable of multiplication. So if your code mostly does FP addition, you get good performance. If your workflows are more complex, not so much.
The thing is that on many real-world SIMD workloads, Apple's 4x128bit either matches or outperforms either Intel's or AMD's implementation. And that on a core that runs at a lower clock and has less L1D bandwidth. Flexibility and symmetric ALU capabilities seem to be king here.
Ah, that is what you meant. Thank you for linking the post! My comment would be that this is not about 128b or 256b SIMD per se but about implementation details.
There is nothing stopping ARM from designing a core with more mask write ports. Apparently, they felt this was not worth the cost. Other vendors might feel differently. I'd say this is similar to AMD shipping only two FMA units instead of four. Other vendors might feel differently.
For very wide, I'm thinking of Semidynamics' 2048-bit HW, which with LMUL=8 gives 2048-byte vectors, or the NEC vector machines.
AFAIK it has not been publicly disclosed why Intel did not get AVX-512 into their e-cores, and I heard surprise and anger over this decision. AMD's version of them (Zen4c) are a proof that it is achievable.
I am personally happy with the performance of AMD Genoa e.g. for Gemma.cpp; f32 multipliers are not a bottleneck.
> The thing is that on many real-world SIMD workloads, Apple's 4x128bit either matches or outperforms either Intel's or AMD's implementation
Perhaps, though on VQSort it was more like 50% the performance. And if so, it's more likely due to the astonishingly anemic memory BW on current x86 servers. Bolting on more cores for ever more imbalanced systems does not sound like progress to me, except for poorly optimized, branch-heavy code.
> Perhaps, though on VQSort it was more like 50% the performance.
I looked at the paper and my interpretation is that the performance delta between M1 (Neon) and the Xeon (AVX2) can be fully explained by the difference in clock (3.7 vs 3.3 GHz) and the difference in L1D bandwidth (48 bytes/cycle vs. 128 bytes/cycle). I don't see any evidence here that narrow SIMD is less efficient.
The AVX-512 version is much faster, but that is because it has hardware features (most importantly, compact) that are central to the algorithm. On AVX2 and Neon these are emulated with slower sequences.
Note that compact/compress are not actually the key enablers: also with AVX-512 we use table lookups for u64 keys, because this allows us to actually partition a vector and write it to both the left and right sides, as opposed to compressing twice and writing those individually.
Isn't the L1d bandwidth tied to the SIMD width, i.e., it would be unachievable on Skylake if also only using 128-bit vectors there?
> Note that compact/compress are not actually the key enablers: also with AVX-512 we use table lookups for u64 keys, because this allows us to actually partition a vector and write it to both the left and right sides, as opposed to compressing twice and writing those individually.
That is interesting! So do I understand you correctly that the 512b vectors allow you to implement the algorithm more efficiently? That would indeed be a nice argument for longer SIMD
> Isn't the L1d bandwidth tied to the SIMD width, i.e., it would be unachievable on Skylake if also only using 128-bit vectors there?
It's a hardware detail. Intel does tie it to SIMD width, but it doesn't have to be the case. For example, Apple has 4x128b units but can only load up to 48 bytes (I am not sure about the granularity of the loads) per cycle.
Right, longer vectors let us write more elements at a time.
I agree that the number of L1 load ports (or issue width) is also a parameter: that times the SIMD width gives us the bandwidth. It will be interesting to see what AMD Zen5 brings to the table here.
If you do streaming-type operations on long arrays, yes. If your data sizes are small, however, four smaller units might be more flexible. As a naive example, let's take the popular SIMD acceleration of hash tables. Since the key is likely to be found close to its optimal location, long SIMD will waste compute. With small SIMD however you could do multiple lookups in parallel courtesy of OoO.
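As a rough sketch of that hash-table case (a hypothetical Swiss-table-style layout, not any particular library): one 128-bit NEON compare covers a whole 16-slot control group, and independent probes for different keys can overlap in an out-of-order core.

```cpp
// Sketch: 16 one-byte control tags per group; compare all of them at once.
#include <arm_neon.h>
#include <cstdint>

// Returns a 64-bit value with a nonzero nibble for every slot whose tag matches.
inline uint64_t GroupMatch(const uint8_t* group, uint8_t tag) {
  uint8x16_t eq = vceqq_u8(vld1q_u8(group), vdupq_n_u8(tag));   // 0xFF where equal
  // Classic NEON "movemask" substitute: shift-narrow to get 4 bits per lane.
  uint8x8_t nibbles = vshrn_n_u16(vreinterpretq_u16_u8(eq), 4);
  return vget_lane_u64(vreinterpret_u64_u8(nibbles), 0);
}
```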
This is why I like the ARM/Apple design with "regular SIMD" and "streaming SIMD". The regular SIMD is latency-optimized and offers versatile functionality for more flexible data swizzling, while the streaming SIMD uses long vectors and is optimized for throughput.
You can't do 2048 bits of addition in one SVE instruction; not portably, at least (and definitely not on any existing hardware). While the maximum SVE register size is 2048 bits, the minimum is 128 bits, and the hardware chooses the supported register size, not the programmer. For portable SVE, your code needs to work for all of those widths, not just the smallest or largest. (of related note is RISC-V RVV, which allows you to group up to 8 registers together, allowing a minimum portable operation width of 128×8 = 1024 bits in a single instruction (and up to 65536×8 = 64KB for hypothetical crazy hardware with max VLEN), but SVE/SVE2 don't have any equivalent)
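For what the RVV grouping looks like in practice, a rough sketch (using the v1.0 intrinsics spelling with the `__riscv_` prefix; older toolchains omit it): with LMUL=8, every load/add/store below operates on a group of 8 registers, i.e. at least 8x128 = 1024 bits per instruction on a VLEN=128 part.

```cpp
// Sketch, assuming a toolchain that ships <riscv_vector.h> for RVV 1.0.
#include <riscv_vector.h>
#include <cstddef>

void AddF32(float* dst, const float* a, const float* b, size_t n) {
  for (size_t i = 0; i < n;) {
    size_t vl = __riscv_vsetvl_e32m8(n - i);     // lanes this iteration, LMUL=8
    vfloat32m8_t va = __riscv_vle32_v_f32m8(a + i, vl);
    vfloat32m8_t vb = __riscv_vle32_v_f32m8(b + i, vl);
    __riscv_vse32_v_f32m8(dst + i, __riscv_vfadd_vv_f32m8(va, vb, vl), vl);
    i += vl;
  }
}
```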
A for() loop does the same thing at the cost of like 3 instructions. 4x128b has the flexibility that you don't need 512b wide operations on the same data to keep the ALUs fed. If you have 512b wide operations being split to 4x128b instructions, great, otherwise the massive OoOE window of modern chips can decode the next few loop iterations to keep the ALUs fed, or even pull instructions from a completely different kernel.
> What is the measurable benefit to implementing 128b SVE2
Probably not much; SVE2 has some nicer instructions, but NEON is already quite solid.
> And implementing 256b SVE has different issues depending on how you do it
For in-order and not-very-aggressively out-of-order cores, having a larger vector length can be very useful to still get a lot of throughput out of your design. It also helps hide memory latency.
For aggressively out-of-order cores it should, for the most part, just be about decode, and somewhat about hiding memory latency.
> 2x256b is only beneficial over 4x128b if you're limited by decode width [...] 3x256b would probably imply 3x128b which would regress existing NEON code.
I agree, that's why I don't get why people are "excited" for Zen5 to have 512b execution units, instead of 256b ones. At best there won't be a performance improvement for avx/avx2 code, at worst a regression.
Anyone interested in getting such numbers could run github.com/google/gemma.cpp on Arm hardware with hwy::DisableTargets(HWY_ALL_NEON) or HWY_ALL_SVE to compare the two :) I'd be curious to see the result.
Calling hwy::DispatchedTarget indicates which target is actually being used.
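In sketch form (based on the calls named above; exact headers and macro names may differ slightly between Highway versions):

```cpp
// Sketch: mask out the NEON targets, run the workload, then report what ran.
#include <cstdio>
#include "hwy/highway.h"
#include "hwy/targets.h"

int main() {
  hwy::DisableTargets(HWY_ALL_NEON);   // or HWY_ALL_SVE for the opposite experiment
  // ... run the gemma.cpp workload here ...
  std::printf("dispatched: %s\n", hwy::TargetName(hwy::DispatchedTarget()));
  return 0;
}
```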
What is the percentage gain of using masked instructions on any benchmark/task of your choice? It can be negative on weird kernels that do lots of vector cmp since even ARM decided the cost of more than one write port in the predicate register file wasn't worth it, or if the masking adds lots of unnecessary and possibly false dependencies on the destination registers.
> This is only true if we ignore more complex instructions and focus on things like adding two vectors.
ARM implemented a CPU that had 2x256b SVE and 4x128b NEON. Literally the only benchmarks that benefitted from SVE were because they were limited by the 5-wide decode in NEON.
It's great you bring up cmp, helps to understand why 4x128 is not necessarily as good as 1x512. Quicksort, hardly a 'weird kernel', does comparisons followed by compaction. Because comparisons return a predicate, which have only a single write port, we can only do 128 bits of comparisons per cycle. Ouch.
However, masking can still help our VQSort [1], for example when writing the rightmost partition right to left without stomping on subsequent elements, or in a sorting network, only updating every second element.
I think it's somewhat unfair to ask for real world examples when there really aren't many people writing optimized SVE code right now. Probably because there are hardly any devices with the extension.
I think the transition from AVX2 to AVX512 is comparable in that it provided not only larger vectors, but also a much nicer ISA. There were certainly a few projects that benefited significantly from that move. simdjson is probably the most famous example [0].
>I think it's somewhat unfair to ask for real world examples when there really aren't many people writing optimized SVE code right now. Probably because there are hardly any devices with the extension.
Ironically, on the RISC-V side, RVV 1.0 hardware is readily available and cheap. BananaPI BPI-F3 (spacemiT K1) is RVA22+RVV, as well as some C908-based MCUs.
CPUs with SVE have been generally available for two years now. SME and AVX-512 got benchmarks written showing them off before the CPUs were even available. Seems fair to me.
simdjson specifically benefitted from Intel's hardware decision to implement a 512b permute from 2x 512b registers with a throughput of 1/cycle. That's area-expensive, which is (probably) why ARM has historically skimped on tbl performance, only changing as of the Cortex-X4.
Anyway simdjson is an argument for 256b/512b vector permute, not 128b SVE.
Having written a lot of NEON and investigated SVE... I disagree that SVE is a nicer ISA. The set of what's 2-operand destructive, what instructions have maskable forms vs. needing movprfx that's only fused on A64FX, and dealing with the intrinsics issues that come from sizeless types are all unneeded headaches. Plus I prefer NEON's variable shift to SVE's variable shifts.
Fair point about movprfx, I understand they were short on encoding space. This can be mitigated by using *_x versions of intrinsics where masks are not used.
The sizeless headache is anyway there if you want to support RISC-V V, which we do.
One other data point in favor of SVE: its backend in Highway is only 6KLOC vs NEON's 10K, with a similar ratio of #if (indicating less fragmentation, more orthogonal).
It’s been a while since I looked, but I remember SVE2 being much more usable than SVE. A64FX was SVE IIRC. I think SVE did not do a great job of fully replacing NEON.
AVX512 is all around a nice addition as JIT-based runtimes like .NET (8+) can use it for most common operations: text search, zeroing, copying, floating point conversion, more efficient forms of V256 idioms with AVX512VL (select-like patterns replaced with vpternlog).
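Outside of .NET, the vpternlog idiom mentioned here looks roughly like this with C++ intrinsics (a sketch, assuming AVX-512F+VL; 0xCA is the classic bitwise-select truth table):

```cpp
// Bitwise select: each result bit comes from if_set where mask is 1, else from
// if_clear. Pre-AVX-512 this is and/andnot/or (or a blend); vpternlog is one op.
#include <immintrin.h>

inline __m256i BitSelect(__m256i mask, __m256i if_set, __m256i if_clear) {
  // 0xCA encodes (A & B) | (~A & C) with A=mask, B=if_set, C=if_clear.
  return _mm256_ternarylogic_epi32(mask, if_set, if_clear, 0xCA);
}
```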
SVE2 is an extension on top of SVE which some stuff already implements. The issue is more likely to be the politics of moving to ARMv9 than anything else.
As to SVE though, I'd guess variable execution time makes the implementation require a bit of work. Normally, multi-cycle operations take a fixed number of cycles. Your scheduler knows that MUL takes N cycles and plans accordingly.
SVE seems like it should require N-M cycles depending on what is passed. That must be determined and scheduled around. This would affect the OoO parts of the core all the way from ordering through to the end of the pipeline.
That's definitely bordering on new uarch territory and if that is the case, it would take 4-5 years from start to finish to implement. This would explain why all the ARMv8 guys never got around to it. ARMv9 makes it mandatory, but that was released in 2021 or so which means non-ARM implementors probably have a ways to go.
SVE doesn't need variable-execution-time instructions, outside of perhaps masked load/store, but those are already non-constant. Everything else is just traditional instructions (given that, from the perspective of the hardware, it has a fixed vector size), with a blend.
My guess is that Apple is simply not interested in some of the ARMv9 features. They are not eager to implement SVE, and the secure virtualization features are probably not that relevant to them.
I do find it amusing that journalists never go beyond Twitter for discussion on this because this was all being confirmed on Mastodon days before any of the posts in the article
SME and streaming SVE. In fact I was going to include “…and nobody seems to have good evidence of ARMv9 support” but I figured my comment was enough as it was ;)
Thing is, normal people don’t really like interacting with the kind of person that would have jumped over to Mastodon. Zeal is insufferable most of the time.
Well, unfortunately, the kind of person who is an expert on SME is on Mastodon. So if you're writing an article on it, you should probably go to them instead of tech influencers who recycle content on Twitter.
I’m fairly certain that staying on (or joining) Twitter shows just about the same amount of “zeal” as leaving for Mastodon/BlueSky/Threads. Or put another way, I find people on Twitter to be insufferable.
It’s not a secret that a large portion of actually technical people (not tech influencers) left Twitter for Mastodon. So the people on Twitter may be more polished turds but are turds nonetheless.
Full disclosure: I still use my Twitter for customer support because that’s all it’s good for at this point IMHO. I also don’t regularly read mastodon but it’s where the people I care about are and when I post (rarely) that’s where I do it.
ARM really could've come up with better numbering / identification. I suppose it's ARM, emphasis on v, then 9, to differentiate it from ARM9, such as ARM9E-S?
This is more like supporting AVX512 than a whole separate architecture. If you have to target both old and new devices from one binary, you do a runtime feature check and call the corresponding code.
That is certainly more code, but not double. You only need it for the parts of the code that are both (a) bottlenecks worth optimizing and (b) actually benefit by using the new instructions.
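A sketch of that pattern on macOS (the exact sysctl key is an assumption on my part; Apple exposes a family of hw.optional.arm.FEAT_* keys for this kind of check, and the kernel names below are hypothetical):

```cpp
// Sketch: one binary, two code paths, chosen once at runtime.
#include <sys/sysctl.h>
#include <cstddef>
#include <cstdint>

static bool HasFeature(const char* name) {      // e.g. "hw.optional.arm.FEAT_SME" (assumed key)
  uint32_t value = 0;
  size_t len = sizeof(value);
  return sysctlbyname(name, &value, &len, nullptr, 0) == 0 && value != 0;
}

void MatVec(float* out, const float* m, const float* v, int n) {
  if (HasFeature("hw.optional.arm.FEAT_SME")) {
    // MatVecSME(out, m, v, n);    // hypothetical SME/streaming-SVE kernel
  } else {
    // MatVecNeon(out, m, v, n);   // hypothetical baseline NEON kernel, runs everywhere
  }
  (void)out; (void)m; (void)v; (void)n;  // placeholders until the kernels exist
}
```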
In games, perhaps. In basically nothing else though. And I have 0 games on my M1, yet my apps folder is 23 GB. Docker, Edge, and SketchUp all 2+ GB, despite not having almost any UI to speak of.
(Edit to remove iMovie from the list, as it has GB's of "Transitions" and "Titles" that I really should just delete)
All three examples you gave have substantial UI and other bundled assets. For example, the Docker Desktop app is about 2GB on my computer, yet included assets make up at least 1.2GB, and a further 600MB is a bundle containing the UI, which itself is about 100MB of binaries.
If you actually open those bundles (as they're called on macos) and take a look inside, you'll see that they don't even contain all of their assets, anyways, often linking to frameworks contained in ~/Library
This is a very layperson explanation, btw, but I assure you that "in modern desktop software, code is a tiny bit of the total size of the application" is a very true statement.
There are cases where the app icon is larger than the compiled code in some apps and if you include a couple images for things like “here is how you give my app access to record the screen” that can also account for a large part of your app bundle.
Yes, game assets really take it to another level but it’s been my experience that even apps without a lot of UI still have their images making up a lot of the app bundle size.
ARMv9 is just ARMv8.5 with 4 extra extensions. It's not a complete overhaul like the ARMv7 to ARMv8 change was.
It's more comparable to x86 chips with AVX-512 and chips without AVX-512. 99% of your code is the same, but the compiler will generate SSE, AVX, and AVX-512 variants and choose the correct one based on the CPU.
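That dispatch can even be compiler-generated; a rough sketch with GCC/Clang's target_clones attribute (the function here is just an illustration):

```cpp
// The compiler emits one clone per listed target plus a resolver that picks the
// best supported clone at load time, so one binary covers SSE-only through AVX-512 CPUs.
#include <cstddef>

__attribute__((target_clones("default", "avx2", "avx512f")))
void Scale(float* x, float s, size_t n) {
  for (size_t i = 0; i < n; ++i) x[i] *= s;   // auto-vectorized differently per clone
}
```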
Are there any extensions that ARMv9 is required to have? I'm looking through the reference manuals and those 4 extra extensions are all marked as "OPTIONAL" for ARMv9.
OpenStep defaulted to four architectures: hppa, sparc, i386 and m68k. I often built stuff on an HP for production use on Intel and 68000 boxes, and I think they also had unreleased m88k support at the same time, so internally they might have had five-way binaries.
I do also believe it wasn't ever technically "statically linked", but the dynamic libraries were separately distributed as part of every app (and so I'd think the semantic slippage acceptable given the context). This has tons of advantages, but prevented Swift from being used in Apple libraries.
Right; I was using loose and casual language there; I meant "bundling", not literally statically-linked.
-----
I said "statically-linked" because the first thing that came to mind was go-lang's morbidly obese executable binary sizes, and mentally walked to iOS app sizes.
Windows binaries often seem to have excess statically linked libraries.. even though they are called DLLs which is supposed to mean dynamic. They might be loading it dynamically but they still seem to have decided to include their own private copy.
I've even seen windows binaries have multiple different versions of the same DLL inside them, and it's a well known DLL that is duplicated multiple places elsewhere.
All OSes/Apps do this but maybe a lot of Mac apps do it a little less. (I don't even have any real statistical idea how common this is with windows apps either)
From having worked on Windows, OSX, and Linux desktop software over the years there's a few factors at play off the top of my head:
- Windows DLLs don't usually have strong versioning baked into the filename. On OSX or Linux, there's usually the full version number baked in (libfoo.so.3.32.0) with symlinks stripping off version components. (libfoo.so, libfoo.so.3, libfoo.so.3.32) would all be symlinks to libfoo.so.3.32.0 and you can link against whichever major/minor/patch version you depend on. If your Windows app depends on a specific version it's going to be opening DLLs and querying them to find out what they are.
- Native OSX software (not Electron) seems to depend much less on piles of external libraries because the OSX standard library is very rich and has a solid history of not breaking APIs and ABI across OS versions. While eg CoreAudio is guaranteed to be installed on an OSX install and be either compatible or discoverably-incompatible, the version of DirectSound you're going to have access to on Windows is more of a crapshoot.
- Windows apps (except for the .Net runtime sometimes) are often designed for longevity. A couple of months ago I installed some software that was released in 1999 on my Windows 11 machine and it just worked. Bundling up those DLLs is part of why they work.
- Linux apps can rely on downstream packaging to install the necessary shared libraries on demand, generally speaking. Linux desktop apps distributed as RPMs or DEBs can "just" declare which libraries they need and get them delivered during install.
On Windows isn't it possible to have the OS deal with the DLL version issue by using side-by-side assemblies? I believe in practice that's only ever used by DLLs provided by the OS, but I thought it was possible to apply the mechanism to other DLLs as well.
Maybe? I haven’t really done a deep dive into that. You’d still have to bundle them along with the installer though since there isn’t a good way to request 3rd-party DLLs (heck, there isn’t even a good way to request a specific version of MSVCRT…)
Since the move to Apple Silicon you are realistically never more than 12-18 months away from a new chip generation in a MacBook. An M1 is still plenty good for the vast majority of workloads, especially if it's an M1 Pro/Max/Ultra.
Actually probably the best thing to do is wait until the M4 machines launch then bag a good deal on a clearance M3.
That’s actually a nice side effect of all the rumors pages. The rumors of future products keep me from buying the current products. I keep on using my previous products while saving money and the planet, and being excited about what the future holds.
On the contrary, I think that the reliable update cadence in modern electronics means that people should generally all but ignore future product roadmaps.
When you actually need to get a new device, just get whatever the up-to-date thing is.
OK, ok, I suppose that it's reasonable to check the rumor sites to see if you should delay by a month or two. But not any longer than that.
It's much harder with PCs, where you can get, for instance, new ThinkPads with anything from 11th-gen Core i all the way to new Core Ultras. And, now, ARMs as well...
I was in line to buy a 128GB M3 Max; now that I know the M4 exists and has already shipped in the iPad, which tells me the whole M4 pipeline has already started and what the perf numbers are, I will absolutely be waiting. I survived yesterday without it, I can survive tomorrow. And now I can budget in the AMD Epyc bridge that covers that span.
I think Apple has been pretty good about hitting the right cadence with processor perf increases. They are making up for lost Intel time. The M6 is going to make us lose our minds. Apple is going to bring back "this is a munition" ads.
Both the M3 and the EPYC will be useful for far longer than the time it takes Apple to get the M4 into their next-gen laptops. Computers last a lot longer than they used to. I have a 10 year old Mac Mini that’s still comfortable to use, and, while an M3 Mac is a beast, it’s not so much faster than an M2 (or an i7) as to create a qualitative change in my workflows. What is possible now was already possible last year. It’s just faster now. I get a higher return on investment with better keyboards and screens.
> The rumors of future products keep me of buying the current products.
For myself, I like to think of it as applied procrastination. I could buy that new thing I want today.. but something better will come along in time, so I can afford to put it off a while longer yet..
> The rumors of future products keep me of buying the current products.
Spot on!
Back in the nineties, Intel managed to push competing RISC architectures (UltraSparc, MIPS, DEC Alpha, PowerPC) out of the market using nothing but promises that Itanium was going to blow them all out of the water.
And apparently Apple is okay with procrastinating and cannibalizing current sales of M1, 2, 3 if it helps prevent some Snapdragon (or Ampere) sales.
>And apparently Apple is okay with procrastinating and cannibalizing current sales of M1, 2, 3 if it helps prevent some Snapdragon (or Ampere) sales.
Sales of what?
I actually can't think of a single competing product. Admittedly I don't keep up with laptop news, but still, I haven't heard of anything yet that can meaningfully compete with the M1 from four years ago.
Microsoft just announced some lackluster arm laptops that they claim can compete with M-series chips. The question is what windows programs are gonna run on them...
Some people have been running Windows 11 for Arm on a VM in Apple Silicon. It has an automatic transcoder that translates most x86 code at start. It seems to run many apps well. Microsoft claims these new machines have a better transcoder. This might work.
For me at least, the best possible outcome of this is that Windows handheld gaming devices become more power-efficient. That might be an advantage over Linux-based handhelds for a while, unless Valve decide that Proton needs to also be an architecture emulator. The chip efficiency wins must surely be tempting in this form factor.
> The rumors of future products keep me of buying the current products.
You may have heard of the 5-minute rule - "Will doing this take me less than 5 minutes? If the answer is yes, do it now." An adaption of that to reduce impulse purchases is - "Do I really need this product right now? If the answer is no, don't buy it."
And on the flip side I am generally hesitant to buy first-release Apple hardware. Over the 20 years I've been buying Apple kit I've generally found it to be exceptionally robust but newly released hardware has had enough bugs (either hardware or OS) that I just sit back and let other users find the issues first. But I do simultaneously have the same issue: if WWDC is coming up within a month or two I'm not going to be buying any hardware because there's a good chance that something new will be released or the hardware I was going to buy is going to get a refresh or a price drop.
I do this technique too, and it's a great time for it. The OLED screen on the new iPad signals that Apple devices are moving to a better panel. If you've been waiting for the right time to move off an Intel Mac and onto a SoC Mac, it's now. Pick up a refurbished M2 MacBook. They're in the sweet spot for support, power, and cost.
The next one will probably have an OLED screen; so if you wait til then, your refurb M1/2/3 will be on Apple's short list of devices they don't want to support. (And you might have panel FOMO.) Or you'll have to pay the premium price for the latest model.
These machines are great. I still use my 2015 rMBP as a secondary. It's a little slow now but a couple years ago I was still running Solidworks (in Bootcamp) on it with minimal issues.
My wife is still using her 2012 MBP. We maxed out the RAM and gave it an SSD in 2016. She uses it for video editing and music production. The thing looks like new. Completely ridiculous. Only downside: no OSX updates since I don’t know when.
You might find OpenCore Legacy Patcher[1] worth a look. In many cases, it allows later-than-supported macOS versions to be installed on older Macs.
As a data point, I still use a 2013 Mac Pro as my primary desktop, and I've been using Sonoma on it for several months, have been able to install all Sonoma patches over-the-air on release without incident, and have only experienced a single, trivial problem: the right side of the menu bar occasionally appears shaded red, in a way that doesn't affect usability; switching applications immediately resolves the problem (the problem appears to be correlated with video playback).
Video encoder/decoder support and performance improved by an order of magnitude in the M series; I am surprised that didn't sway you.
Not just that: high-res stuff or modern codecs like AV1 or H.265 are probably not supported at all on a 2012 device that's gone without updates for so long?
Even if support were possible it would be software encoding, and even a short clip can take hours to render?
I would happily use an older device for a lot of dev work, especially if it's not frontend or UI; usually I can use any laptop just as a terminal. But for UI or video editing I wouldn't be able to.
I can't help but reply every time this thread comes up. I'd still probably be using my 2010 if it wasn't for a series of mechanical failures. Paid to replace the keyboard once (85 screws, didn't need to do that to myself), but the third battery crapping out, the trackpad not clicking (probably due to the swollen battery) and the MagSafe connector getting loose and glitchy was the end of it. Though I did just boot it up because my phone is somehow still supposed to sync music from it.
If the battery is swollen, get rid of it as soon as possible. Swollen battery == ticking time bomb, and I'm not joking about the bomb part. These things can, do and will explode randomly.
Overall, my 2020 M1 MBP is infinitely better than the 2015 MBP I had before, it's not even close. Battery life, thermal output, speed, noise, neural engine (for ML workloads). It's an utter workhorse that just marches on, no matter what I throw at it. I haven't even considered upgrading to another more current Mx version because this one just.. works. Best laptop I ever owned.
I just want to echo this experience and sentiment. I absolutely adore my 2020 13” M1 mbp, for all the reasons you list. I do ML workloads and Linux builds and I’m starting to think they forgot to put fans in mine because I’ve never heard them! Despite the annoying limitation of 1 external screen, it’s up there with my 2007 13” mb (rest in peace) as being the best laptop I’ve ever owned.
I recently upgraded from a 2019 Intel Mac to a similarly-specced M3 Mac, and it really is night and day. My battery life is more than doubled - I can run IntelliJ and multiple Docker containers on battery for more than my whole work day, when before it would barely last a couple hours with that load and be slow while doing so. The fan hardly ever runs while on my Intel Mac it would run constantly.
I can definitely say there's a downside. I sometimes take the bus home, but it can get chilly at night. Previously, I would fire up a little python script that saturates all the cores, to warm my lap. My old Intel was plenty warm to keep me from getting too uncomfortable. I can't even feel my M2 through my pants, and sticking it into my shirt makes me look like an idiot.
Battery life is insanely better. If you have not used one of the M series laptops it cannot be overstated how much better the battery life is. It is worth it for battery alone.
But beyond that they are also incredibly fast and run cool. In the MacBook Air there is no fan and on the Pros they barely ever spin up in an audible way.
The fans literally never come on for my personal M2 MBP 14" or on my work 16" M1 (it helps that the heavy lifting of running stuff and compiling happens on a dev server)
During work from home during Covid I was still using an Intel MBP and video conferences invariably caused the fans to kick up to the point where using noise cancelling headphones and not the built in speakers was necessary for sanity.
I went from the last Intel i9 16" MBP to an M3 Pro in the last month at work.
I think it's saving me an hour a day; the fan has never come on, the laptop has never felt warm, and the battery life is just mind-blowing.
I run docker & compilers all day. The i9 would run the fan 75% of the time and had to throttle down any time it was on battery power and it was lucky to last 3 hours on battery.
The way to exit that loop is to convince yourself that the next one will bring a truly lasting difference. Which is why I'm still waiting for GDDR7 GPUs with my 4GB RX 480.
It entirely depends how long you keep your devices. I try to keep my iPhones until release year + 6, so I would need the price of a previous version to be reduced by more than 1/6th on a new version release, which is usually not the case.
Similar to cars, most depreciation happens in the first year.
So owning a device for 6 years between age 1 and 7 will generally have a lower cost than owning a device between age 0 and 6.
For Apple products it’s generally feasible to effectively buy first hand devices aged 1+ because they’re still available for sale (at least in some retailers) after a new edition is released.
That’s a good strategy with most things that aren’t prone to manufacturing variability. For cars, launch versions have a lot of initial manufacturing defects that need to be worked out.
Maybe it uses the camera for gesture recognition so you can air-write each letter one at a time? Air-quotes will be fun... air-tabs, not so much. "Space, but <widens arms> BIGGER!"
If anything, they'll probably use the "studio" branding (or more likely just have it under the iPad line, since they have desktop chips in them now anyways)
I just bought a second hand M2 Air in perfect condition and it feels faster than my M1 Max in a really beautiful body for travel. I’m not certain it matters that much anymore to be honest. What are you using it for?
So if you can 'limp' along towards the autumn/winter/Christmas, then it's probably worth the wait to get the M4 (or pickup an M3 when the price presumably drops to clear inventory).
I just bought a refurbished 16in M3 pro, no regrets at all. There's always a new one around the corner, it's really just about whether your setup achieves what you need it to.
Look at real world differences between M2 and M3, it's not a massive jump at all.
I do cross platform app development and the machine is excellent for that. Glad to have it now rather than waiting months for a slightly better system
The most incredible thing about the new iPads is that even with the crazy fast M4 chip MS Teams manages to crawl to a halt. Clearly it takes all the engineering skills of the largest and most valuable software company in the world to make text entry go at about 1 fps on a chip as powerful as the M4.
Every character you type results in some sort of hit to their telemetry server. It will include the actual letter you typed, if you or your org are not configured to be in EU. With their EU configuration option (pulled from server every launch) it will only report the fact that you typed _something_.
Now if that's not fun enough, their telemetry also covers mouse movements. Go ahead and watch your CPU as you spin your mouse in circles around the Teams window.
For extra fun, block their telemetry server and watch Teams bloat in RAM, to as much as your system has, as it keeps every action you take in local memory as it waits for the ability to talk to that telemetry server again.
If you're going to block their telemetry it's best to fake an accept via some mitm proxy and send back a 200 code.
I do not know exactly how much this applies to iPad version, compared to their desktop apps. Mobile offers both more and less data possibilities. It's a different context.
So they’re running a keystroke logger and masquerading it as “telemetry”? That should be outlawed. It’s not a drafts feature, it’s not an online word processor, it’s just a straight up keystroke logger.
I’m in the EU and should have the EU configuration.
The problem mainly occurs when I mention someone in a reply to a thread. Once I type @<name> the text input just slows down so much I can type much faster than it can render the text.
It's still going through the same telemetry action, it just omits the actual character you typed. And yes, it is character by character. That collection (eg, the timestamp you hit the character, channel/person it was to, etc) is the inefficiency causing your typing to slow.
If it was a straight text box that they polled contents of _occasionally_ or after you hit 'send', it would be a much better user experience.
Haha what exactly does it protect me from? If they see that I typed something to someone, and see the chat history of me having sent a particular message at a particular time, it doesn't take a genius to put things together.
As much as I dislike overbearing telemetry, in the context of an M4 or even an N95 the computer should be more than capable of logging a few kb of telemetry about inputs per second without it even being noticeable in the performance statistics. The problem remains that every single thing in the app is implemented god-awfully slow, and it would still be regardless of whether it's also recording telemetry on the input data or not.
...do you have any source verifying this? Setting aside how insane of an issue it'd be network-wise/privacy-wise/etc, this is like a day one "debounce the call" fix.
I'm not even saying I doubt you, I'm just curious how you ascertained this exact behavior.
Mainly this was just myself getting irritated at MS Teams and trying to figure out what it was doing. It was a couple years ago and my current company doesn't use Teams, thankfully, so I can't really see if it's still valid.
From what I remember..
There are files on the disk that get updated/overwritten with pulls from the server every time it launches. Somewhere in AppData I think. A few of these are config files (with lots of interesting looking settings, including beta features).
One of the config entries specifies a telemetry endpoint (which you _could_ figure out with a network tracing tool, but there are a ton of MS telemetry endpoints your machine is probably talking to; best to just grab the one explicitly being used from the config like this). I forget the full name of the setting but the name pretty clearly indicates it's for telemetry, and the file is clearly a config file. If you can't find it just by browsing the structure, try a multi-file search tool and look for 'telemetry' or URL/hostnames.
You can't really change the value on disk and make it just take effect from there, since it gets downloaded from the server and overwritten before Teams loads. There might be some tricks you can do locally to persist the change but nothing seemed to work for me. You could override response from server via mitmproxy but that requires finding where it comes across the wire at launch time and then building a script/config to replace it.
Anyway, you can block that telemetry endpoint from a firewall and see your memory bloat. Or you can intercept that endpoint in any mitm proxy. I went with this [mitmproxy](https://mitmproxy.org/). From there you can capture the content it sends to the endpoint, or even change the response the server sends (Teams just seems to expect a 200 code back).
The telemetry data itself is some kind of streaming event format. I think I even found documentation on the structure on some Microsoft website, so it's likely a reused format.
It's pretty straightforward.
I couldn't spend too much time on it and now it's not something I even use, but some cool things you might want to try if you dive deeper into this:
- Overwrite the config file as it returns from the server, to turn on EU data protection, change various functionality you're not supposed to, or flip some feature flags.
- Figure out if there's a feature flag or even other overwrite to fully disable the metrics so they aren't even collected, from anywhere in the app.
- Intercept telemetry, return an 'OK' response and drop the data from telemetry, or maybe document what they collect more definitively if you think there's interest somewhere. This keeps your privacy but doesn't really do anything for performance.
- Interfere with the data before actually returning it, maybe try playing with event contents and channel/user indicators. Microsoft probably won't like this if they notice, but it's unlikely they'll even notice.
No clue why they're even doing this and not just sampling after the fact. There's no way they are gleaning anything useful that they couldn't more efficiently (and anonymously) capture.
Ah, that's right, Chrom* based things look at the system store by default and it's the Firefox based things that don't (without configuration at least). Thanks.
Edit: and that reminds me I should probably run this test on new Teams, where it now uses the built in WebView2
Is the telemetry really that big of a load? Having a persistent connection and sending data over it is pretty standard for games going back decades, even AOL instant messenger had this feature for typing.
Asana used to sometimes have textareas that would take a full three seconds to display each key press. On then-current MacBook pros. You know, something that had nearly zero latency on first-gen single-core Pentium chips. Hell it may still do that, I never saw them fix it, I just finally got to stop using it.
Never underestimate the ability of shitware vendors to make supercomputers feel slower than an 8086. These days it usually involves JavaScript, HTML, and CSS.
MS Teams is by far the worst piece of software I’ve ever used. It is ungodly slow and it just gets slower the more you use it.
I believe it is actually hitting the server to update the online/away status light for every single message in a conversation. If you turn off all the status update stuff in the settings then the software speeds up dramatically. Another thing you can do is find the folder where it caches everything and just trash the entire thing. Somehow, they’ve managed to make caching slow everything down rather than provide a speed up.
I use it exclusively as a progressive web app on Linux and it's not particularly slow, but it is buggy as all hell. Easily the worst rich text editing experience of any chat client I've ever used. Teams is without a doubt the worst piece of software I'm required to use on a daily basis
Why is that not at least done asynchronously? I thought part of the whole narrative of shipping these new terrible pieces of software as standalone Google Chrome instances was that it makes it easier to spawn async JS workers for background tasks and whatnot?
Async is still difficult. There is no getting around data synchronization issues: either you spend a lot of time on design or you will get constant problems with things like not holding a mutex when you should, mutex deadlocks, holding a mutex too long, or locking/unlocking too often.
I haven't done async JS, but I've done enough async elsewhere to know that language cannot work around bad design.
You still need to design a CRDT that solves your particular problem, you don't just say the magic word "CRDT" and the problem is gone. And the performance will depend on how good the design is.
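As a toy illustration of "the design drives the result" (a grow-only counter, about the simplest CRDT there is; anything like collaborative text or presence state needs far more careful design):

```cpp
// Sketch of a G-counter CRDT: merge is commutative, associative and idempotent,
// so replicas can sync in any order. Real chat/presence state is much harder.
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

struct GCounter {
  std::vector<uint64_t> counts;                 // one slot per replica
  explicit GCounter(size_t replicas) : counts(replicas, 0) {}
  void Increment(size_t replica) { ++counts[replica]; }
  uint64_t Value() const {
    uint64_t sum = 0;
    for (uint64_t c : counts) sum += c;
    return sum;
  }
  void Merge(const GCounter& other) {           // element-wise max
    for (size_t i = 0; i < counts.size(); ++i)
      counts[i] = std::max(counts[i], other.counts[i]);
  }
};
```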
I think that’s what it’s doing. If you have a conversation with a person spanning hundreds of messages (over many weeks) it’ll be updating the status light next to their name on every single message in the history. The more messages in the history, the more workers you get!
LOL, you wanna try it on my ~2 generations old Core i5 corporate laptop. Sometimes, the first steps of drawing the calendar view are roughly the same speed as me drawing it in Paint.
Maybe someone should normalise giving developers crappy laptops to develop on.
(Has anyone done a deep dive into Teams to explain what on earth is going on? I mean, if VSCode can be fast despite its underlying architecture, surely something could be done about Teams?)
>> Maybe someone should normalise giving developers crappy laptops to develop on.
Then the developers will complain the hardware is unusable to do their job, even though it would have been a supercomputer back in the day. Then you say "No, it's the software, please fix it."
Yup, my work machine has an older i7 (2021-era maybe?) with 32GB RAM and between Teams, Slack, "new" Outlook, Jira, WSL, driving a 4K display off the piddly integrated GPU, and a VPN that involves every packet doing a transatlantic roundtrip whenever I want to connect to an internal service, everything is just dog slow. And the fan noise—my god the fan noise.
Some days it makes me extra motivated to make the code I write fast and efficient; other days I want to give up entirely.
Do companies using Teams have a choice of using something else or are their C*Os and IT departments married with Microsoft? If the latter, they'll use whatever Microsoft throws at them, even if it doesn't work.
Weird, a statically built ELF that supports TLS1.3 + HTTP1.1 is like ~30kb, all you need is Emacs as a UI and you have Teams 2 at 1.0e-5% resource usage.
Try Pidgin with the excellent ms teams plugin: https://github.com/EionRobb/purple-teams - less than 100mb ram usage and notifications that still work after an hour. Only for (video) calls you need to open teams..
Teams on the desktop has improved with their "v2" client. It's not the world's fastest piece of software, but I find it to not be embarrassingly slow now (on a reasonably specced machine).
One has to hope that the same performance lens will now be turned on the mobile apps.
It seems that the M4 was overhyped. Almost all of the performance improvements, in Geekbench for example, come from new instructions that most apps won't use, and even if they do they might end up using the faster GPU/NPU for those tasks.
No, per-clock performance improvements between M3 and M4 range from 0% to 20%, this is ignoring the two subtests that benefit from SME. That Twitter post is moot. GB results show high variation, it is easy enough to cherry pick pairs of results that show any point you might want. You have to compare result distributions. There were some users on anandtech forums who did it and the results are very clear.
Makes one wonder whether the Apple miracle has mostly been, first, the transition to ARM and having access to TSMC's highest-end nodes before anything else even comes into the picture. But I'm glad new competition is coming from the Qualcomm X Elite and Huawei with their Kirin and Ascend chips. Hopefully the ARMs race will be more interesting to follow than the x64 race between Intel and AMD.
Oryon was designed to compete with M1 then the clockspeeds were ramped up to compete with M2. M3 clearly beat it out and M4 has only furthered that lead.
Oryon will still probably beat x86 designs massively in performance per watt which is pretty much the most important metric for most people anyway (as most people use laptops).
EDIT: your username `dragonelite` is quite interesting. You joined 2019, but the coincidence is fascinating.
Bitcode did not allow recompilation to take advantage of new instructions. They dropped bitcode because they never actually managed to do anything with it other than the armv7k to arm64_32 recompilation, and that required specifically designing arm64_32 around what was possible with bitcode.
Updating apps to use new vector instructions is far more complicated than upgrading to a new compiler version and having it magically get faster.
SME is very specialized, right now no compiler (that I know of) is really able to take general-purpose code and output optimized SME. So for these instructions at least, bitcode wouldn’t be of any benefit.
It’s a decent, but not revolutionary improvement. Yes, most of the gains outside of SME are coming from clock increases not IPC. I don’t know if I would call it overhyped, more like misunderstood.