Arm has been enabling server/data-center class SoCs for a while now (eg Amazon Graviton et al). This is only going to pick up further (eg Apple Private Cloud Compute).
Also, there's nothing fundamentally stopping chiplet pick-up in traditional embedded domains. It's probably quite likely.
They have been "enabling" them but not designed the best of them(°), and I'm not sure how serious they are about the top end because their results are rather half-assed compared to Apple, AMD and Intel. As is, their bread and butter and main focus is still mobile and embedded chips.
(°) The best of them also seem to use barely any ARM standards except for the ISA itself
Arm's definitely trying to push into the laptop, tablet, desktop, and server markets. The fastest system on the TOP500 was Arm-based for several years, and most of the big clouds either have home-grown Arm servers (like Graviton) or will soon.
Arm doesn't only do ISA. It essentially wrote the standards for the AMBA/AXI/ACE/CHI interconnect space. Standardizing chip-to-chip interconnects is very much in Arm's interests. It is a double-edged sword, though, since chiplets will likely enable fine-grained modularity, allowing IP from other vendors to be stitched around Arm (eg a RISC-V IOMMU instead of Arm SMMUv3, etc).
Surely that's an incredibly broad categorisation?
Learning Rust, like any other language, is a strategic investment that pays off with experience. Companies that are willing to invest benefit accordingly.
Evidently, several companies that care about memory safety and programmer productivity have invested and benefited from Rust.
Finally: this is subjective of course, but the borrow checker isn't something that necessarily needs fighting 'for a month or two'. There are just so many excellent resources available now that learning how to deal with it is quite tractable.
I think words matter and it's important that we all agree on the meaning if we want to convey ideas. My approach may be pedantic, but it also saves me from wasting my time reading an article that may be full of idiosyncrasies. It's okay if other people take a stab at parsing the article and/or derive value from it, though.
Although not a hard and fast rule, it is commonplace to use the term cluster for CPUs that share a common cache at some level (typically L2). This is quite prevalent in Arm designs, for example big.LITTLE compositions where the big CPU cluster shares one L2 cache, the LITTLE cluster shares another, and system software sets up the cacheability and shareability domains (Arm parlance for the silos within which cache maintenance operations can propagate) to reflect that topology.
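To get a concrete feel for this, Linux exposes the cache-sharing topology through sysfs. A minimal sketch (assuming the usual cacheinfo layout under /sys/devices/system/cpu, which most arm64 and x86 kernels provide) that prints which CPUs share each of cpu0's caches — CPUs that show up together at L2 typically line up with the cluster boundaries described above:

```c
#include <stdio.h>
#include <string.h>

int main(void) {
    char path[128], level[16], shared[256];

    for (int idx = 0; idx < 8; idx++) {   /* cpu0's cache "index" entries */
        FILE *f;

        snprintf(path, sizeof path,
                 "/sys/devices/system/cpu/cpu0/cache/index%d/level", idx);
        f = fopen(path, "r");
        if (!f)
            break;                        /* no more cache levels described */
        if (!fgets(level, sizeof level, f))
            level[0] = '\0';
        fclose(f);
        level[strcspn(level, "\n")] = '\0';

        snprintf(path, sizeof path,
                 "/sys/devices/system/cpu/cpu0/cache/index%d/shared_cpu_list",
                 idx);
        f = fopen(path, "r");
        if (!f)
            continue;
        if (!fgets(shared, sizeof shared, f))
            shared[0] = '\0';
        fclose(f);
        shared[strcspn(shared, "\n")] = '\0';

        /* CPUs listed together at a given level share that cache. */
        printf("index%d (L%s) shared by CPUs: %s\n", idx, level, shared);
    }
    return 0;
}
```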
That was an absolute requirement in order for AArch64 to even be considered in the datacenter space, where it is now a very compelling alternative to x86_64.
What I mean specifically is standardised support for firmware, hypervisor and operating system kernel interfaces for things like system bootstrap, power-perf control etc. Think ACPI, EFI, CPU capability discovery, DVFS, Idle management etc.
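As a small illustration of the "CPU capability discovery" piece from the OS side: on AArch64 Linux, userspace doesn't poke the ID registers directly; the kernel abstracts them and advertises features through the standard auxv/HWCAP mechanism. A minimal sketch, assuming an AArch64 Linux target with the usual kernel headers:

```c
#include <stdio.h>
#include <sys/auxv.h>     /* getauxval(), AT_HWCAP */
#include <asm/hwcap.h>    /* HWCAP_* feature bits on AArch64 Linux */

int main(void) {
    unsigned long caps = getauxval(AT_HWCAP);

    printf("AdvSIMD: %s\n", (caps & HWCAP_ASIMD)   ? "yes" : "no");
    printf("AES:     %s\n", (caps & HWCAP_AES)     ? "yes" : "no");
    printf("Atomics: %s\n", (caps & HWCAP_ATOMICS) ? "yes" : "no");
#ifdef HWCAP_SVE
    printf("SVE:     %s\n", (caps & HWCAP_SVE)     ? "yes" : "no");
#endif
    return 0;
}
```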
Being unable to boot Linux on modern AArch64 server class hardware is actually increasingly rare thanks to the standardisation.
Your comments are more applicable to the general Arm embedded systems scene where fragmentation is understandably rife. It was the price Arm had to pay to keep its royalty model in flight - "Pay the license fee, do what you will with the design".
'Vagina' is not anatomically correct in the context of the description within the github text. 'Vulva' would be more appropriate. The vagina is the internal passage onwards from the vulva, terminating at the cervix. The vagina wouldn't be visible under normal circumstances; the vulva would.
Memcpying and executing code can also surface micro-architectural realities of the underlying CPU and memory subsystem that may need attention from the programmer.
For example (a code sketch follows this list):
- On most RISCy Arm CPUs with Harvard-style split instruction and data caches, special architecture-specific actions need to be taken after the memcpy to ensure that any code still lingering in the data cache is cleaned/pushed out to the intended destination memory immediately (instead of at the next cache line eviction).
- Any stale code that happened to be cached from the destination (either by design or coincidence) needs to be invalidated in the instruction cache.
- Depending on the CPU micro-architecture, speculative prefetching into the caches (invisible to the programmer) as a result of the previous two actions may also need attention.
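To make the first two bullets concrete, here is a minimal, hedged sketch of the usual "copy then execute" dance on AArch64 Linux. The two instruction words (mov w0, #42; ret) and the mmap/mprotect flow are purely illustrative; the key part is the __builtin___clear_cache() call, which GCC/Clang expand into the required data-cache clean and instruction-cache invalidate for the range:

```c
#define _DEFAULT_SOURCE
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

int main(void) {
    /* AArch64 encoding of:  mov w0, #42 ; ret  */
    static const uint32_t code[] = { 0x52800540u, 0xd65f03c0u };

    /* Get a writable page first; avoid writable+executable mappings. */
    void *buf = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return 1;

    memcpy(buf, code, sizeof code);

    /* Clean the new code out of the data cache and invalidate any stale
     * lines in the instruction cache for this range (the first two
     * bullets above). */
    __builtin___clear_cache((char *)buf, (char *)buf + sizeof code);

    /* Only now make it executable (and no longer writable). */
    if (mprotect(buf, 4096, PROT_READ | PROT_EXEC) != 0)
        return 1;

    int (*fn)(void) = (int (*)(void))buf;
    printf("%d\n", fn());       /* prints 42 */

    munmap(buf, 4096);
    return 0;
}
```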
If using paging, you may need to invalidate the TLB entry that holds the execute permission for the page.
On x86 if using segments, after changing segment attributes you need to reload the segment selectors.
The execution pipeline may need to be flushed, using a serialising instruction.
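For the x86 points, a rough sketch of what the kernel-side pieces can look like with GCC-style inline assembly — invlpg is privileged, so user space relies on the OS doing this as part of mprotect() and friends, and cpuid is used here purely as a convenient serialising instruction (segment selector reloading is omitted):

```c
#include <stdint.h>

/* Drop the stale TLB entry for one page (ring-0 only: invlpg is privileged). */
static inline void flush_tlb_page(void *va)
{
    __asm__ volatile("invlpg (%0)" :: "r"(va) : "memory");
}

/* cpuid is architecturally serialising: it drains the pipeline so no
 * stale pre-decoded instructions survive the code modification. */
static inline void serialize_pipeline(void)
{
    uint32_t eax = 0, ebx, ecx, edx;
    __asm__ volatile("cpuid"
                     : "+a"(eax), "=b"(ebx), "=c"(ecx), "=d"(edx)
                     :: "memory");
}
```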
When modifying code in place that another thread on another core may be executing at the same time, some modifications may trigger CPU errata.
On particular CPUs there may be other kinds of caches or state invalidation required, but hopefully the OS provides a "flush I-cache" function that covers all of them.
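Those OS/toolchain facilities do exist and are worth using instead of hand-rolled cache maintenance. A hedged sketch of a small wrapper over the ones I know of (the API names are real; the #ifdef coverage is illustrative rather than exhaustive):

```c
#include <stddef.h>

#if defined(_WIN32)
#  include <windows.h>
#elif defined(__APPLE__)
#  include <libkern/OSCacheControl.h>   /* sys_icache_invalidate() */
#endif

static void flush_icache_range(void *start, size_t len)
{
#if defined(_WIN32)
    FlushInstructionCache(GetCurrentProcess(), start, len);
#elif defined(__APPLE__)
    sys_icache_invalidate(start, len);
#else
    /* GCC/Clang builtin; expands to the right cache maintenance for the
     * target (and to nothing on x86, where the hardware keeps the
     * instruction and data caches coherent). */
    __builtin___clear_cache((char *)start, (char *)start + len);
#endif
}
```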