Rust is great. While working with Secure Enclaves isn’t quite the same as working with a microcontroller per se (many SEs work in tandem with powerful chipsets), at MobileCoin we’ve found a lot of benefits from using Rust.
Rust appears to provide the ability to largely move to a world where the bugs are in logic and not in buffer/array verification or multi-threading. It feels like a real advancement to move up a layer of abstraction from plumbing to design.
Time will tell; it’s certainly not impossible to have bugs in Rust, by any stretch of the imagination, but I have found that bugs in Rust code tend to be more interesting than “out of bounds memory exception at line X.”
The author cites "concern about binary size" bloat as a valid concern his colleagues have about generics, explaining that they would normally write distinct implementations. Am I wrong for thinking that generics shouldn't result in more bloat than the equivalent hand-duplicated code? And conceivably the compiler could recognize that a List<&Foo> and a List<&Bar> are parameterized by equivalently-sized types and thus can share a single copy of the same generated code (although this might break inlining optimizations)?
Yes, if you were to duplicate the code by hand, you'd be doing the same thing as monomorphization. But it's possible that they wouldn't actually duplicate it, and would instead do something smaller, like casting to a void pointer.
The compiler can do some optimizations like this, yes, but there's a lot more work to do in this area.
Why split this into two functions? You can get smaller code size this way, because you're "hand-de-duplicating" the parts that aren't generic. Once we call .as_ref(), everything else is actually identical, but the compiler isn't yet good enough to do this itself, so we do it by hand.
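As a sketch of the pattern (the function names and bodies here are made up for illustration): a thin generic wrapper converts to a concrete type right away and delegates to a non-generic inner function, so the bulk of the code is compiled only once no matter how many caller types exist:

```rust
use std::path::Path;

// Generic outer function: monomorphized once per caller type, but tiny,
// since all it does is the `.as_ref()` conversion.
pub fn read_config<P: AsRef<Path>>(path: P) -> String {
    inner(path.as_ref())
}

// Concrete inner function: a single copy, shared by every instantiation
// of `read_config`. This is where the real (non-generic) work lives.
fn inner(path: &Path) -> String {
    format!("loading {}", path.display())
}

fn main() {
    // Both calls monomorphize only the thin wrapper; `inner` is shared.
    println!("{}", read_config("app.toml"));
    println!("{}", read_config(String::from("user.toml")));
}
```

This is the same shape the standard library uses for functions like `std::fs::read`.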
I wouldn't say this technique is super common or well-known, except in the standard library (because it's used everywhere) and among folks who are sensitive to code size, like embedded or WASM people.
Additionally, when you make something easy, people do it more than they would if it were annoying and manual. You might imagine that if you had to hand-roll everything, you'd be more cognizant of how many copy/pastes you did.
This is exactly what I meant. Thanks Steve for the clear explanation!
In the next part of the article I'll hopefully explore some drawbacks of choosing generics all the way down. Either driver would likely have performed a little better if developed fully independently, but by a smaller margin than I would have expected.
Nope; dyn is dynamic dispatch, this is statically dispatched.
It is true that dyn means you don't get monomorphization, which can help with binary sizes.
EDIT: thinking about this some more, I wanted to say that it does feel similar, but one big difference that’s easy to explain is that dyn will change the way the value is represented in memory, and this will not. Conceptually, both do “cast and then call this single function”, but in the dyn case, the cast would be to a trait object, whereas this casts directly to &Path.
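One quick way to see the representation difference (a minimal sketch; `AsRef<Path>` here just stands in for whatever trait is involved): a reference to a sized type is a single machine word, while a `&dyn` reference also carries a vtable pointer, so the value's in-memory shape changes.

```rust
use std::mem::size_of;
use std::path::Path;

fn main() {
    // A reference to a sized type like String is one machine word.
    assert_eq!(size_of::<&String>(), size_of::<usize>());
    // A trait object reference is a (data pointer, vtable pointer) pair,
    // so it is two words: the representation itself has changed.
    assert_eq!(size_of::<&dyn AsRef<Path>>(), 2 * size_of::<usize>());
    println!("sizes check out");
}
```

The `.as_ref()` version, by contrast, changes nothing about how the value is stored; it just converts at the call site before dispatching statically.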
Very interesting. Does this mean certain optimizations run before the monomorphization step? Do you know why? Compilation performance is the obvious thing that comes to mind.
Yes there are optimizations that run before monomorphization; this one occurs during macro expansion.
It's a bit of a chicken-and-egg problem. To monomorphize clone(), someone must emit an implementation first. But an optimal implementation requires analyses that aren't available until later in the pipeline. Here the optimization kicks in for types deriving Copy, but a generic parameter is enough to defeat it.
IIRC, that "optimization" mostly avoids wasting time compiling a complex `Clone` implementation when simply returning `*self` suffices (there are some crates with a lot of `#[derive(Copy, Clone)]` types). We try to avoid having a lot of logic like that too early, for precisely the reasons you mention.
I'd be interested in an example where LLVM can't optimize the general version, as it means we might want to do this through MIR shims instead (which can be generated when collecting the monomorphic instances to codegen - this is what happens when you clone a tuple or closure, for example).
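For context, here is a guess at the shape of the functions being compared (the names match the assembly listing below, but the exact type definitions are an assumption on my part). The point is that a generic parameter hides `Copy` from the derive, so it emits a field-by-field `clone` instead of the simple `*self` body:

```rust
// Concrete type: the derive sees `Copy` directly and emits `*self`.
#[derive(Copy, Clone, PartialEq, Debug)]
enum Concrete {
    A(bool),
    B(u8),
}

// Generic type: the derive can't assume `T: Copy` when generating code,
// so it emits a match with per-field `.clone()` calls instead.
#[derive(Copy, Clone, PartialEq, Debug)]
enum Abstract<T> {
    A(T),
    B(u8),
}

pub fn clone_concrete(x: Concrete) -> Concrete {
    x.clone() // compiles down to a plain copy
}

pub fn clone_abstract(x: Abstract<bool>) -> Abstract<bool> {
    x.clone() // the derived match body, which LLVM may fail to flatten
}

fn main() {
    assert_eq!(clone_concrete(Concrete::A(true)), Concrete::A(true));
    assert_eq!(clone_abstract(Abstract::B(7)), Abstract::B(7));
}
```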
The behavior differs between (the present) nightly and rustc 1.45.2; the nightly available when this link was posted matched the 1.45.2 behavior.
The output with 1.45.2 is as follows:
example::clone_concrete:
mov eax, edi
ret
example::clone_abstract:
mov ecx, edi
and ecx, -256
xor eax, eax
xor edx, edx
cmp dil, 1
sete dl
cmove eax, ecx
or eax, edx
ret
Fascinating coincidence! It was probably the LLVM upgrade (https://github.com/rust-lang/rust/pull/73526) landing, likely before the comment was even posted (though the nightly would only ship with the upgraded LLVM the next day).
In C, because you often use void * for containers, there may be less code bloat than with a generic container whose accessors are monomorphized for each type. In practice this is a pretty trivial difference, and it can be bounded and understood well enough.
I love this article; it gives some great examples of decent abstractions to adopt in different situations to achieve the goals of the project. Great storytelling.
Right at the beginning this is a great quote:
"I was wrong."
We'd all do well to say that more often when it happens, and that's not about Rust, just life generally.
I'm hoping that someone will one day build an RTOS in Rust that can run on M0 (or smaller) devices, comes with decent tooling (so you won't need projects like Platformio to compensate for the horrific tooling that embedded platforms have), and abstractions that are "just so".
I took a class in it at one of the RustConfs, and that made me feel, even as a novice in the embedded space, that I would be able to do some great things. If I remember correctly, the class was on an M0 board.
What is needed is to get chip manufacturers onboard - and to make them understand that they need to take software seriously. Right now the situation is a bit depressing. The ones that are ahead of the pack choose based more on politics than on technical merit. And most see this as a "solved problem" because people are provably able to write software for their devices.
A lot of embedded developers don't really care that the toolchains are awkward because it "works for them". Which isn't helpful. And people easily get annoyed if you suggest that "perhaps this could be done less clumsily". There really isn't any easy way to say "this needs improvement" without provoking a defensive response. And people tend to be fairly frustrated when they get to the point where they reach out to get some help to solve problems.
I still know of companies that skip using any RTOS and create their own OS-like infrastructure because the existing offerings are just too much of a time-waster. (Try to use Zephyr for a year on fairly new chipsets. The amount of time spent getting builds to work after upgrades is astonishing and reveals just how sketchy and fragile the toolchains are - which gives even seasoned embedded developers, used to sketchy and/or expensive toolchains, pause.)
There have been "bare metal rust" initiatives for a while, but I haven't seen any that have gotten critical mass yet. Or buy-in from the hardware industry. I'm not sure why. I suspect the relative newness of Rust might be a factor. Perhaps people that have invested decades in C/C++ are afraid of starting over with a new language because they want to stick to what is familiar.
I think we need to understand better what is needed to make this a success. I'm tempted to think that starting with the tooling might be a good idea. And assume that it needs to support multiple "kernels" (that might not even need compatible interfaces).
One thing that is a bit interesting to look at is the difference in how Espressif and ARM have approached this. Espressif devices became popular in the hobbyist segment, where Arduino (or rather Wiring) was the "gold standard". Rather than fight it, they embraced and supported it. In not that many years, this familiarity led to lots of commercial-grade stuff being made with Espressif chipsets - because they are cheap and there are lots of solutions on the web that can get you prototyping fast. This led to widespread use of their devices and people "graduating" to using their FreeRTOS-based SDK. An SDK that now has pretty impressive support for anything from audio to motor control.
(As for ARM: their tooling, and their sort-of restart of mBed years ago - causing confusion, regressions in tooling usability, etc. - made me ditch them and look elsewhere. Since the "professionals" didn't even use mBed, it just wasn't worth the investment in time.)
With regard to tooling, it is worth looking at Platformio and ponder why there is a need for a tool to manage embedded toolchains - and what functionality the tool provides. Platformio is perhaps the most convenient system I've used for managing embedded projects. It still has some shortcomings, but it is miles beyond where the industry is right now.
I know you're not excited about ARM, but they are taking an active interest in Rust, going so far as to contribute employee time and company money to the project to get ARM platforms to an equal level of support as X86. They don't make the chips of course, but I'm hoping it trickles down to the manufacturers anyway. Things are already pretty good, I'm excited for them to get even better.
So yeah, they've given us hardware to be able to run CI, which is a huge step. This also includes statements like "Arm intends to further donate newer and more capable hardware to this initiative."
They've had employees doing testing and submitting patches and fixing bugs.
The focus is on aarch64-unknown-linux-gnu to start, but there's a lot of procedural kinks to work out. Future targets should go much more smoothly.
Totally, Rust code has been running on that chip specifically for at least a year. ARM just wanted to start support at the big end of the spectrum. I'm personally working in the M4/M7 range, which is a bit bigger than you...
With Apple moving to ARM it would make sense to focus there first since ARM probably have a decent shot at growing their high end market with Apple's help.
See RTIC (real time interrupt-driven concurrency) formerly known as RTFM (real time for the masses - renamed due to confusion with a similar acronym about reading manuals).
If your C++ code uses the C ABI (application binary interface), then you can call it easily too. There is a tool called rust-bindgen that will automatically generate the required Rust code from header files.
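As a small sketch of what that generated code looks like: for a header declaring something like `int abs(int);`, bindgen's output is roughly the extern block below. (I'm using the C standard library's `abs` here so the example actually links and runs on its own; for your own library you'd link against it and let bindgen emit the declarations from your headers.)

```rust
// Roughly what rust-bindgen emits for a C declaration `int abs(int);`.
// The C standard library is linked into Rust programs by default, so this
// example is self-contained.
extern "C" {
    fn abs(input: i32) -> i32;
}

fn main() {
    // Foreign calls are `unsafe` because the compiler can't verify
    // anything about the code on the other side of the ABI boundary.
    let magnitude = unsafe { abs(-42) };
    println!("{}", magnitude); // prints 42
}
```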
Redox (https://www.redox-os.org) is still under heavy development, and isn't intended for widespread use yet, but it is based on a microkernel architecture. I don't know if that's close enough to what you'd like to see.
Isn't that made for somewhat beefier systems? Wouldn't Linux fill much of that niche today (modulo RT properties)?
Remember that a lot of embedded systems run on really constrained chipsets, where memory is measured in kilobytes and the underlying architecture isn't necessarily a von Neumann architecture.
It looks like you can even run Rust on a small BluePill board featuring an STM32F103C8T6 ARM Cortex M3 with a modest 64 Kbytes of Flash memory and only 20 Kbytes of SRAM:
https://github.com/TeXitoi/blue-pill-quickstart
I think the boards I used for http://yager.io/vumeter/vu.html had even less memory, like 8 KB of SRAM or something. Rust wasted a few kB on error-reporting functions, IIRC, but other than that everything fit without too much effort, as long as I didn't try to include soft-float emulation or something.
Hey! You just made my day. James Mickens (that article in particular) was a huge inspiration, especially the part about crying blood. May have been a bit too dramatic...
Considering the article title references sacred geometry and so does the Lateralus lyrics, I thought it was apropos, if a bit silly. Guess some of you disagreed.
I'm a huge fan of Rust, but I was under the impression that the domains where Ada is most commonly used are very risk-averse, so I presume Rust is still a bit young for a lot of the Ada use cases.
I agree. Ada has been around and picked apart for high-integrity software jobs. I am personally learning SPARK2014, a subset of Ada. It has formal verification as part of the toolset. I would think Rust would need to develop similar formal tools, and not just the PL, to be used in the same industries as Ada/SPARK2014. I know somebody is going to link to the use of Rust for something in aerospace. I know Julia is touting something to do with air traffic control, but the core of these high-integrity applications needs the full gamut of design tools around the PL. I still like Rust, but I am also playing with Zig.
> It has formal verification as part of the toolset. I would think Rust would need to develop similar formal tools and not just the PL to be used in the same industries as Ada/SPARK2014.
Thank you. I hadn't heard of it before. I would like to see it happen. I do think it will take many man-years to catch up with Ada/SPARK2014 and win industry favor and confidence, but it's good to see.
Ignoring that the engineering sphere that generally uses Ada works on multi-decade timelines, making Rust a bit too young and immature...
Ada and Rust don't have 1:1 comparison on their safety guarantees.
Rust makes a lot of guarantees around memory safety that Ada does not, whilst Ada is far more focused on correctness. Whilst related, these are very different.
SPARK is a formal proving system for Ada. Formal correctness is what they focus on, that Ada can safely prove mathematically how a function will behave, and how a body of functions will behave.
If you're using Ada without SPARK, it will try to enforce these same guarantees, but generally with a runtime contract system rather than at compile time.
As of Ada 2012, the redesign means that the compiler can make use of the contract information; in fact, that was one of the main reasons for a lot of the changes.
Lively, but proprietary. A lot of SPARK stuff tends to be automotive, factory designs, fabrication and the sort of stuff where people working there are buried under dozens of NDAs preventing them from discussing their work in any way.
But projects tend to have a longer shelf-life. Contracts expected to take a decade to complete aren't unusual, and the product can be expected to be in use for twice that, or longer.