Making a RISC-V Operating System Using Rust

azhenley · on Oct 2, 2019

I've been pushing the author to write this up for a year now (we both work at University of Tennessee). He teaches a lot of OS-oriented courses and makes them for fun so he is perfect to write this.

Next I'm trying to convince him to make an OS in Zig for Raspberry Pi with a focus on graphics. If you have any suggestions for him, shoot him an email!

jermaustin1 · on Oct 2, 2019

What would a "focus on graphics" mean? A GUI and windowing system, or just graphics processing?

azhenley · on Oct 2, 2019

I want it to be a video game console :) I think our students would have a lot of fun with that.

mrec · on Oct 2, 2019

If you do, definitely ping flohofwoe, who a) maintains a Zig port of his lightweight Sokol graphics API and b) has a penchant for retro consoles.

https://github.com/floooh/sokol-zig

flohofwoe · on Oct 2, 2019

The zig wrapper is only a minimal experiment and work in progress now, also I've hit a little bug :)

https://github.com/ziglang/zig/issues/3211

BUT the sokol_app.h and sokol_gfx.h headers require an underlying platform 3D-API (e.g. GL, D3D11 or Metal) and window system to set up the swap chain. On a bare metal machine with only a 2D framebuffer they're not all that useful unfortunately.

mrec · on Oct 2, 2019

Ah yes, I didn't really think that one through. I know Redox OS (Rust) has OpenGL working, but I think they're just using the Mesa software renderer.

monocasa · on Oct 2, 2019

On a related note, MIT seems to have ported xv6 to riscv.

https://github.com/mit-pdos/xv6-riscv

duckqlz · on Oct 2, 2019

Wow! What an awesome tutorial/book. I look forward to reading more chapters. This is already an extremely useful resource since it covers cross compilation from an x86 system. I love to play around with this kind of stuff in my free time but connecting to dev boards and constantly shifting paradigms was annoying enough to limit my enthusiasm. It took a lot of research to find out how to do what the author fits in a single chapter. Thank you for this awesome post!

Lichtso · on Oct 2, 2019

Looks similar to my effort to interface RISC-V emulators and bare metal C: https://github.com/Lichtso/riscv-llvm-templates

ashort11 · on Oct 3, 2019

I took Programming Languages from him, as well as TA'd for his Operating Systems class. He is an amazing lecturer, and I am glad he is releasing something like this to the public! Definitely something worthwhile to read. Looking forward to more content being released.

coldnose · on Oct 2, 2019

Rust is built around the expectation that allocations happen automatically and can never fail. I’m curious to see how they deal with this in a kernel...

steveklabnik · on Oct 2, 2019

Rust, the language, strictly speaking, knows nothing about allocation at all.

The standard library assumes infallible allocation currently. Kernels generally do not use the standard library.

(Okay, STRICTLY speaking there’s a hack for Box<T> in the language right now but it’s not about the allocation part and Box<T> is technically defined in the standard library with Magic(tm) and so isn’t relevant in an OS dev context, as you won’t use the standard library and therefore Box<T>. Rust-the-language never implicitly inserts heap allocations, including Box<T>, anywhere.)

SAI_Peregrinus · on Oct 2, 2019

The Rust standard library is. The Rust Core library (the far more minimal library that gets used when you mark a crate as #![no_std]) is not. That's how the other "OS in Rust" and "allocator in Rust" projects work.

wahern · on Oct 2, 2019

Well, this OS in Rust project simply commits the same sins as the standard library by using a custom allocation routine that panics on allocation failure: https://os.phil-opp.com/heap-allocation/#allocations-in-rust

Saying that Rust the language doesn't require a non-failing allocator misses the point--the ergonomics of the language make dealing with allocation failure difficult; sufficiently difficult that none of the projects I've seen actually bother attempting it.

See https://cs.brown.edu/research/pubs/theses/ugrad/2015/light.a..., which explains idiomatic Rust instructs developers to return by value, relying on caller assignment to types like Box (which uses exchange_malloc under the hood), to handle heap allocation. Basically, the strategy for dynamic object management in Rust is predicated on hidden heap allocations.

So of course it's not necessary. But good luck writing an entire operating system otherwise. Even Redox OS doesn't bother trying to fight the language in this regard: https://gitlab.redox-os.org/redox-os/slab_allocator/blob/mas...

comex · on Oct 3, 2019

That's not quite right. Rust has two ways to allocate a Box.

One is the `box` keyword: this guarantees that the object will be constructed in place on the heap, but doesn't work with fallible allocators (or, in fact, anything but the default allocator). This is what the research paper you linked is talking about. However, all these years later, `box` has never been stabilized; nor will it be in its current form, because it's considered too inflexible. Whatever form of in-place construction does eventually get stabilized will likely support fallibility.

The other way to allocate a Box is `Box::new`. This is not compiler magic; it's simply a regular function, implemented in Rust, that calls the allocator and then moves (i.e. memcpys) an existing object into the new allocation. If you write your own Box-like type, there's nothing stopping you from making your `new` function fallible.

What about optimizations? Does `Box::new` get optimized in ways that a fallible version won't? Well, no. The compiler will inline `Box::new`, and if you call it with a fresh stack allocation as an argument, LLVM can theoretically, sometimes, elide the stack allocation and the memcpy altogether, instead initializing the object directly on the heap. Theoretically. The paper claims that it always does so, but the paper is wrong. [1] In fact, the compiler doesn't do so even in relatively easy cases. [2] It would be nice if LLVM did better here, but it doesn't seem to be a big source of overhead in Rust programs in practice. If LLVM did improve the optimization, it would probably work equally well for a fallible allocator as for `Box::new`, because they're equally complex from its perspective: `Box::new` can panic, and LLVM treats panics as branches.

(Box does have compiler magic for a different case: the ability to move out of it. Not being able to replicate that in a custom type is suboptimal, but not the end of the world.)

As for why those OS projects panic on allocation failure:

The phil-opp.com one implements the standard allocator interface in order to use the standard library container types. I think it sucks that Rust's standard containers don't support allocation failure, but you don't need to use them...

For Redox... I'm not actually sure what they're doing, but I think "fn oom" is a hook for users of the allocator to call if they want to panic on out-of-memory, not something that mandates panicking. At least, that's the behavior of `std::alloc::handle_alloc_error` [3], which was moved there from being a trait method on `GlobalAlloc` named `oom`. However, they're implementing a different `oom` on an old version of the `Alloc` (not `GlobalAlloc`) trait, which is unstable; that method was removed entirely well over a year ago, so I guess the code must be out of date.

[1] https://users.rust-lang.org/t/how-to-create-large-objects-di...

[2] https://play.rust-lang.org/?version=stable&mode=release&edit... (press "..." -> "ASM")

[3] https://doc.rust-lang.org/nightly/std/alloc/fn.handle_alloc_...

jimktrains2 · on Oct 2, 2019

> the ergonomics of the language make dealing with allocation failure difficult; sufficiently difficult that none of the projects I've seen actually bother attempting it.

Honest question: what languages make this ergonomic and can you share any projects that handle this gracefully?

wahern · on Oct 2, 2019

Off-hand I don't know of any that make this ergonomic, at least none that don't make use of exceptions. Rust isn't unique in this regard.

What makes it a potential impediment in Rust is that the constraints and burdens of the borrow checker are offset by mechanisms like Box. The fact that all the extant examples choose the convenience of Box over handling OOM, even in situations where not handling OOM is obviously a deal breaker for production systems, speaks volumes about the significance of the problem and how the language shapes people's choices. Async/await is in the same boat--technically doesn't require a non-failing heap allocation, but who's going to bother making it work? You technically don't need async/await to do asynchronous programming, either, but the whole point was that this is the type of thing that needs to be addressed by the core language with some primitive (i.e. generators) that does the heavy lifting and which can be built upon.

I don't know enough Rust to know how easy it would be to create a Box-like implementation that rewrites the AST to automatically propagate allocation failure via the idiomatic Result<T, E> protocol. But that seems roughly what the proper solution might look like in Rust; either that or finally getting over the anxiety and aversion about exceptions.

Lua handles OOM quite well. Lua doesn't have try/catch, just "protected calls" which are not as light-weight as regular function calls. (A pcall initializes a recovery point with _setjmp.) In Lua you tend to use protected calls at, effectively, transactional boundaries--the points in your call graph where you're willing and able to rollback application state for non-specific, otherwise unrecoverable errors. AFAIU you could technically do the same in Rust, except Rust makes unwinding optional at compile-time[1], so it's not the kind of thing people will make a habit of deliberately designing for in their libraries. Which makes me think the likely solution for Rust, if any, is to permit libraries to opt-in to OOM recovery with an allocator pragma that does something like the Result<T, E> propagation mentioned above.

[1] Lua is GC'd. The cost of running destructors outside the normal call/return protocol is fixed and independent of whether an application uses protected calls.
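
A rough Rust analogue of a Lua protected call, as a minimal sketch: `std::panic::catch_unwind` is a real standard library function, while `protected_call` is a hypothetical helper name used only for illustration. It only works when the program is compiled with unwinding enabled, and on stable Rust a real out-of-memory condition in the standard allocator path aborts by default rather than unwinding, so the panic below merely simulates the failure.

```rust
use std::panic::{self, AssertUnwindSafe};

/// Hypothetical helper: run `f` and turn a panic into an `Err` carrying the
/// panic message, roughly like a Lua pcall at a transactional boundary.
fn protected_call<T>(f: impl FnOnce() -> T) -> Result<T, String> {
    // AssertUnwindSafe: the caller promises to discard or roll back any state
    // the closure may have left half-updated.
    panic::catch_unwind(AssertUnwindSafe(f)).map_err(|payload| {
        if let Some(msg) = payload.downcast_ref::<&str>() {
            msg.to_string()
        } else if let Some(msg) = payload.downcast_ref::<String>() {
            msg.clone()
        } else {
            "unknown panic payload".to_string()
        }
    })
}

fn main() {
    // Normal completion passes the value through.
    let ok = protected_call(|| 2 + 2);
    // A panic inside the closure is caught at the boundary instead of
    // tearing down the whole program.
    let failed = protected_call(|| -> () { panic!("simulated out-of-memory") });
    println!("{:?} / {:?}", ok, failed);
}
```

With `panic = "abort"` the second call would terminate the process instead, which is the compile-time opt-out mentioned above.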

monocasa · on Oct 3, 2019

Writing a KBox that calls a kmalloc and where new() returns a Result<KBox<T>, KAllocFailure> should be pretty trivial.
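
A minimal sketch of what such a type could look like. Everything here is an assumption for illustration: `kmalloc` and `kfree` are stand-ins built on `std::alloc` (a real kernel would call its own allocator under `#![no_std]`), and `KMALLOC_LIMIT` is an artificial cap so the failure path can be shown without exhausting memory.

```rust
use std::alloc::{alloc, dealloc, Layout};
use std::ops::{Deref, DerefMut};
use std::ptr::{self, NonNull};

/// Artificial cap (assumption) so that allocation failure is easy to trigger.
const KMALLOC_LIMIT: usize = 4096;

/// Stand-in for a kernel's kmalloc: returns null on failure.
/// Callers must pass a layout with non-zero size.
unsafe fn kmalloc(layout: Layout) -> *mut u8 {
    if layout.size() > KMALLOC_LIMIT {
        return ptr::null_mut();
    }
    unsafe { alloc(layout) }
}

/// Stand-in for kfree.
unsafe fn kfree(ptr: *mut u8, layout: Layout) {
    unsafe { dealloc(ptr, layout) }
}

#[derive(Debug, PartialEq)]
pub struct KAllocFailure;

/// Owning pointer whose constructor reports allocation failure as a value.
pub struct KBox<T> {
    ptr: NonNull<T>,
}

impl<T> KBox<T> {
    pub fn new(value: T) -> Result<KBox<T>, KAllocFailure> {
        let layout = Layout::new::<T>();
        let ptr = if layout.size() == 0 {
            // Zero-sized types need no storage.
            NonNull::dangling()
        } else {
            let raw = unsafe { kmalloc(layout) } as *mut T;
            // Null means the allocator failed; `value` is dropped normally.
            NonNull::new(raw).ok_or(KAllocFailure)?
        };
        // Move the value into the allocation (this is the memcpy step).
        unsafe { ptr::write(ptr.as_ptr(), value) };
        Ok(KBox { ptr })
    }
}

impl<T> Deref for KBox<T> {
    type Target = T;
    fn deref(&self) -> &T {
        unsafe { self.ptr.as_ref() }
    }
}

impl<T> DerefMut for KBox<T> {
    fn deref_mut(&mut self) -> &mut T {
        unsafe { self.ptr.as_mut() }
    }
}

impl<T> Drop for KBox<T> {
    fn drop(&mut self) {
        let layout = Layout::new::<T>();
        unsafe {
            // Run T's destructor, then release the storage if there was any.
            ptr::drop_in_place(self.ptr.as_ptr());
            if layout.size() != 0 {
                kfree(self.ptr.as_ptr() as *mut u8, layout);
            }
        }
    }
}

fn main() {
    let mut boxed = KBox::new(41u32).expect("small allocation should succeed");
    *boxed += 1;
    println!("boxed = {}", *boxed);

    // 8 KiB exceeds the artificial limit, so the caller sees an Err
    // instead of a panic.
    match KBox::new([0u8; 8192]) {
        Ok(_) => println!("unexpectedly succeeded"),
        Err(e) => println!("allocation failed: {:?}", e),
    }
}
```

The one thing this sketch cannot replicate is moving a value back out of the box by dereferencing, which is special-cased for the built-in `Box`.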

Rusky · on Oct 2, 2019

You deal with this in a Rust kernel in exactly the same way you deal with it in a C kernel: by using different APIs for allocation.

Linux doesn't use malloc; a Rust kernel need not use Box::new either.

tal8d · on Oct 2, 2019

Try harder: https://doc.rust-lang.org/std/alloc/trait.GlobalAlloc.html#t...

You can implement your own global allocator, use an existing crate, or (in the case of user applications) use the system's. That is for the stdlib level, not even core...
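
For illustration, a minimal sketch of implementing `GlobalAlloc` by hand. `BumpAlloc`, `ARENA_SIZE`, and the fixed arena are hypothetical names and choices, not taken from any of the linked projects. The allocator is called directly here rather than registered with `#[global_allocator]`, so that the null return on exhaustion is visible.

```rust
use std::alloc::{GlobalAlloc, Layout};
use std::cell::UnsafeCell;
use std::ptr;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Size of the fixed arena (an arbitrary choice for this sketch).
const ARENA_SIZE: usize = 1024;
/// Largest alignment the arena itself guarantees.
const ARENA_ALIGN: usize = 16;

#[repr(align(16))]
struct Arena(UnsafeCell<[u8; ARENA_SIZE]>);

/// Bump allocator: hands out consecutive slices of the arena, never reuses them.
pub struct BumpAlloc {
    arena: Arena,
    next: AtomicUsize,
}

// The arena is only handed out in disjoint pieces via the atomic offset.
unsafe impl Sync for BumpAlloc {}

impl BumpAlloc {
    pub const fn new() -> Self {
        BumpAlloc {
            arena: Arena(UnsafeCell::new([0; ARENA_SIZE])),
            next: AtomicUsize::new(0),
        }
    }
}

unsafe impl GlobalAlloc for BumpAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        if layout.align() > ARENA_ALIGN {
            return ptr::null_mut();
        }
        let base = self.arena.0.get() as *mut u8;
        let mut current = self.next.load(Ordering::Relaxed);
        loop {
            // Round the offset up to the requested alignment.
            let start = (current + layout.align() - 1) & !(layout.align() - 1);
            let end = match start.checked_add(layout.size()) {
                Some(end) if end <= ARENA_SIZE => end,
                // Out of space: GlobalAlloc signals failure with a null pointer.
                _ => return ptr::null_mut(),
            };
            match self
                .next
                .compare_exchange_weak(current, end, Ordering::Relaxed, Ordering::Relaxed)
            {
                Ok(_) => return unsafe { base.add(start) },
                Err(observed) => current = observed,
            }
        }
    }

    unsafe fn dealloc(&self, _ptr: *mut u8, _layout: Layout) {
        // A bump allocator never frees individual blocks.
    }
}

fn main() {
    let heap = BumpAlloc::new();
    let small = Layout::from_size_align(64, 8).unwrap();
    let big = Layout::from_size_align(4096, 8).unwrap();
    let p = unsafe { heap.alloc(small) };
    let q = unsafe { heap.alloc(big) };
    println!("small ok: {}, big ok: {}", !p.is_null(), !q.is_null());
}
```

Registering such a type with `#[global_allocator]` changes where memory comes from, but not what callers such as `Box::new` do when `alloc` returns null.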

muricula · on Oct 2, 2019

Nope, the GP is still correct. If your custom allocator fails, anything created with Box::new or similar will panic.
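
For contrast, a small sketch of what a fallible path can look like in the standard library itself. This relies on `Vec::try_reserve_exact`, which was stabilized in Rust 1.57 and so postdates this thread; `try_zeroed` is a hypothetical helper name. It covers only `Vec`, not `Box::new`, where a failed allocation still ends in `handle_alloc_error`.

```rust
use std::collections::TryReserveError;

/// Hypothetical helper: build a zero-filled buffer, reporting allocation
/// failure to the caller through Result instead of panicking or aborting.
fn try_zeroed(len: usize) -> Result<Vec<u8>, TryReserveError> {
    let mut buf = Vec::new();
    // The only allocation happens here, and `?` propagates its failure.
    buf.try_reserve_exact(len)?;
    // Capacity is already reserved, so this does not allocate again.
    buf.resize(len, 0);
    Ok(buf)
}

fn main() {
    match try_zeroed(16) {
        Ok(buf) => println!("got {} bytes", buf.len()),
        Err(e) => println!("allocation failed: {}", e),
    }
    // A request larger than isize::MAX bytes is rejected up front as a
    // capacity overflow, without touching the allocator.
    match try_zeroed(usize::MAX) {
        Ok(_) => println!("unexpectedly succeeded"),
        Err(e) => println!("allocation failed: {}", e),
    }
}
```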

tomlong · on Oct 2, 2019

I see the author and I share a naming convention for our miscellaneous scripts, and `do.sh` appears prominently in both our work.

I subsequently branched out to e.g. `do-backups.sh`, where despite the `do-` being superfluous, I quite like the aesthetic.

Slightly more on topic: I really enjoy blog series like these, with plenty of detail on esoteric topics I really have no idea about, or on the face of it much interest in. They're a fun way to increase the breadth of topics I have a superficial knowledge of.

loquor · on Oct 2, 2019

For someone with a standard CS undergraduate education, how much effort (in hrs/week) would such an endeavour require?

DannyB2 · on Oct 2, 2019

Old wives' tale from ancient times, before CompuServe:

The first 20% of the effort gets you about 80% of the results. So everything seems exciting. But to get that last 20% of the results requires 80% of the total work.

There is a ton of un-fun, un-glamorous work getting a gazillion device drivers written, for example.

nickpsecurity · on Oct 2, 2019

You can always use a hypervisor, a driver OS that re-uses Windows/Linux drivers, and the new OS in a VM using virtual drivers. That's what some L4-based setups, including commercial OKL4, did. They also let you write native drivers directly on the microkernel or in OS VM's for situations where effort was justified. I'm surprised more haven't done this.

Rump kernels are the closest trend.

panpanna · on Oct 3, 2019

> I'm surprised more haven't done this.

Because this is generally not secure. The driver OS will have access to hardware that can bypass the memory restrictions set upon it by the microkernel.

There is sometimes special hardware to address this but they are too complex to manage from the kernel.

nickpsecurity · on Oct 5, 2019

"Because this is generally not secure."

Most OS's aren't designed to be that secure, though. That's why I wonder why it hasn't been tried more for usability. A security-focused setup certainly has more to be concerned about. Like I advocated with Xen, a good start would be making the host OpenBSD. They should be able to get hardware that would be compatible.

GRBurst · on Oct 2, 2019

Since there are missing parts: Do you have a scheduling plan for the remaining posts?

azhenley · on Oct 2, 2019

One per week.

turblety · on Oct 2, 2019

Amazing tutorial, and really exciting for RISC-V. I hope he creates a patreon or some other donation platform, as it must take some time to write this up. I'm sure some people would be willing to send a bit his way.

azhenley · on Oct 3, 2019

Done! Thanks for the suggestion. The Patreon link is at the top of the blog.

etskinner · on Oct 3, 2019

One suggestion: Your patreon page only mentions the Aarch64 project, not the Rust on RISC-V project. That might alienate people coming from the Rust blog; they might think they landed in the wrong spot (I sure did)

mastrsushi · on Oct 2, 2019

Making X.....but using Rust!!!

GRBurst · on Oct 2, 2019

I really really like the first sentence

> RISC-V ("risk five") and the Rust programming language both start with an R, so naturally they fit together

agumonkey · on Oct 2, 2019

I expect the OS to be fully based around an r7rs-loaded Racket, Erlang and R.

DannyB2 · on Oct 2, 2019

When RISC-V inevitably renames itself CISC-V, then what language would naturally fit together?

And don't say cobol.

mindcrime · on Oct 2, 2019

Ceylon?[1]

Crystal?[2]

Church?[3]

Coq?[4]

[1]: https://ceylon-lang.org/

[2]: https://crystal-lang.org/

[3]: https://en.wikipedia.org/wiki/Church_(programming_language)

[4]: https://en.wikipedia.org/wiki/Coq

Someone · on Oct 2, 2019

COMTRAN would be a way better choice than COBOL :-) (https://en.wikipedia.org/wiki/COMTRAN)

Common Lisp would be a decent choice, too, with

    CISC:RISC = Common Lisp:Scheme

antoinealb · on Oct 2, 2019

madmulita · on Oct 2, 2019

Obviously: Clojure.

kbenson · on Oct 2, 2019

Why not CRISC-V? Or CaRISC-V? Or (eww) RaCISC-V?

It's not like current "CISC" architectures haven't taken a lot of the good aspects of RISC already too...

astrobe_ · on Oct 2, 2019

Well it would be "less risk", maybe even "riskless" so I guess you'd need a new language named "restless"?

swsieber · on Oct 2, 2019

V

https://vlang.io/

You could vlog about it.

gpm · on Oct 2, 2019

Cython with a rust core?

pnako · on Oct 3, 2019

C

Worse is better, forever.

phkahler · on Oct 2, 2019

"Because I wanted to" should be reason enough. But for those who demand a better justification, this is very specific ;-)