Was anyone able to make out what the questions from the audience were at the end? I might need to watch this again with headphones. Great talk. Thank you for sharing this.
The last question was about using Rust for Servo, and in particular whether there had been any major pain points. Patrick Walton helped to answer (45:48):
~ The overall discipline hasn't been too difficult to follow.
~ Most of the issues we hit are issues in the implementation (e.g. the precise way the borrow-checker checks invariants, reasons about lifetimes, and whatnot). These kinds of issues are fixable, and we continue to improve them all the time.
~ I don't speak for the entire Servo team, but I feel like the discipline, the overall type-system strategy that Rust enforces, has been pretty friendly.
~ We still have a lot of unsafe code, but a lot of it is unavoidable, for calling C libraries. And also we're doing things like: we have Rust objects which are managed by the SpiderMonkey [JavaScript] garbage collector. Which is really cool that we can do that, but the interface has to be written in the unsafe dialect [of Rust].
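For anyone curious what that "unsafe dialect" looks like for plain C calls, here's a minimal sketch (not taken from Servo; strlen is used only because it's available everywhere):

// Declaring a foreign C function. The compiler cannot verify what the C code
// does with the pointer, so every call to it has to be wrapped in `unsafe`.
extern "C" {
    fn strlen(s: *const std::os::raw::c_char) -> usize;
}

fn main() {
    let s = std::ffi::CString::new("hello").unwrap();
    let len = unsafe { strlen(s.as_ptr()) };
    println!("strlen says: {}", len);
}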
The first set of questions was about how the borrow checker understands vectors.
~ How do you tie the ownership of [the element array] to the vector? How does the compiler know that when you take a reference into the element array, [it should treat the vector itself as borrowed]?
~ What happens if I write my own library class [instead of using one that's part of the standard library like vec]?
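Roughly, the situation being asked about looks like this (a minimal sketch, not from the talk). As I understand it, the answer is that the lifetime in the indexing method's signature ties the returned reference to the borrow of the vector itself, so the compiler needs no special knowledge of Vec, and a user-defined type gets exactly the same treatment:

fn main() {
    let mut v = vec![1, 2, 3];
    // Taking a reference to an element borrows the whole vector.
    let first = &v[0];
    // Mutating the vector while that borrow is alive is rejected, because a
    // push could reallocate the element array and leave `first` dangling:
    // v.push(4);   // error: cannot borrow `v` as mutable
    println!("{}", first);
}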
This talk also doubles as a description of how to understand the memory management concepts in Rust—if you've ever wondered how to use references and boxes in Rust, this is the talk to watch.
The syntax of Rust was one major thing that made me avoid studying the language. Now, this talk encourages me to go and play with it.
One question that comes to my mind is how Rust concurrency compares to Go concurrency. After this talk, and a little use of “goroutines”, Rust seems to be the choice of the wise, for being seemingly more sage while still as easy to use as Go's. I'd love to read a reply from someone who knows concurrency/parallelism, who is most definitely not me.
I'm not an expert about concurrency, but from my understanding Go still does not solve the safety issues relating to data races. In addition, Rust's concurrency 'primitives' are built as libraries rather than into the language itself, making them far more powerful and extensible. That said, Rust pays for this power and safety with a steeper learning curve and a more complex type system, but in my opinion it's worth it.
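For a flavour of the library-based approach, here's a minimal sketch using today's standard library (std::thread and std::sync::mpsc; the task API from the era of this talk has since changed):

use std::sync::mpsc;
use std::thread;

fn main() {
    // Channels and threads are ordinary library types, not language syntax.
    let (tx, rx) = mpsc::channel();

    let worker = thread::spawn(move || {
        // `move` transfers ownership of `tx` into the spawned thread, so the
        // compiler knows nothing is shared unsafely.
        for i in 0..5 {
            tx.send(i).expect("receiver hung up");
        }
    });

    // The receiving loop ends once the sender has been dropped.
    for value in rx {
        println!("got {}", value);
    }

    worker.join().unwrap();
}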
I really dislike abbreviations like “fn”, “mut”, “proc”, etc., not only in Rust but anywhere. I'd prefer actual words. At a first, very shallow look, I thought Rust was C++ with shorter keywords (I didn't know anything about the features covered in the talk until today, though). Combined with (afaik) everything being an expression, I thought Rust code would become unreadable very quickly.
I actually find the shorter keywords aid readability. Code, just like prose, has a maximum line length after which readability starts to decline. Given that we only have so much horizontal space to work with, shorter keywords allow one to use more descriptive variable and function names. Given that the number of keywords is fixed and the number of variables and functions is unlimited, I'd rather be terse in the former and verbose in the latter.
Also, Rust does care a great deal about readability. You might scoff at this statement, but it's true: Rust's syntax works hard to strike a balance between clarity, consistency, regularity, readability, and familiarity. As the language gains wider use, understanding of these constraints improves; modern Rust has removed 95% of the sigils that characterized ancient Rust. There used to be tons of sigils, and many were largely inscrutable: ~, @, +, ++, -, and more that I've forgotten. Now there are only two: & (for references, a sigil taken directly from C++) and * (for unsafe pointers, a sigil taken directly from C). And there are even some people who would like to see the * sigil for unsafe pointers go away.
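As a quick illustration of the two remaining sigils (a minimal sketch):

fn main() {
    let x: i32 = 42;
    let r: &i32 = &x;        // & : a borrowed reference, as in C++
    let p: *const i32 = &x;  // * : a raw ("unsafe") pointer, as in C
    println!("{}", *r);
    // Dereferencing a raw pointer is only allowed inside `unsafe`.
    unsafe { println!("{}", *p); }
}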
This is a fantastic video. I'd been reading rapidly about Rust and its types of memory pointers before, but this talk really helps me see the big picture (aliasing & mutability, and the borrowing metaphor). Great, great talk.
Rust doesn't really have zero-cost abstractions (runtime bounds checking, symbol mangling, exception handling frames, split-stacks (this may have changed?)). This was and still is my complaint about it; the one way to fix it is death by a thousand knobs that modify functionality, but by then it's not even the same language. Also, the recent change adding -ffunction-sections -fdata-sections (a hack imo) shows that the language or the implementation has an issue. I may be wrong; if so, let me know.
Bounds checking is only required for random access indexing into an array. Sequential access is very efficient (and perfectly safe) using iterators. Furthermore, if you're sure an access is always valid, you can call the unchecked indexing method. (This requires an `unsafe` block to call: i.e. risk of unsafety is opt-in, but it is still easily possible on a case-by-case basis.)
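A minimal sketch of the three options (the function names here are made up for illustration):

fn sum(v: &[u32]) -> u32 {
    // Sequential access through an iterator: safe, and no per-element
    // bounds check is needed.
    v.iter().sum()
}

fn third(v: &[u32]) -> u32 {
    // Random access with [] is bounds-checked at runtime and panics if the
    // index is out of range.
    v[2]
}

fn third_unchecked(v: &[u32]) -> u32 {
    // Opting out of the check requires an `unsafe` block; the caller is
    // responsible for guaranteeing that the index is valid.
    unsafe { *v.get_unchecked(2) }
}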
Symbol mangling and exception handling can be disabled.
Split stacks are gone, but, at the moment, the function prologue that checks stack bounds is still emitted to protect against stack overflow (I believe this can also be disabled).
In practice this is quite rare because of iterators, which do not need to do bounds checking. In the rare case in which you need to use indexes, you can always opt out of the safety without the use of a compile-time switch, exactly like the choice between C++'s "operator[]" and ".at()". The only difference is that the default is safe, and the unsafe version requires you to jump through the "unsafe{}" hoop, unlike C++.
Also, with MPX on Skylake even the bounds checking may well become zero-cost.
> symbol mangling
How does symbol mangling make your program go slower?
> exception handling frames
Also enabled by default in C++, and you can turn it off just as in C++. LLVM will also optimize it out if you don't use it and use LTO. On most architectures exception handling is zero-cost via table-driven unwinding (although it can inhibit optimizations).
> split-stacks (this may have changed?))
They're gone.
> This was and still is my complaint about it; the one way to fix it is death by a thousand knobs that modify functionality, but by then it's not even the same language.
Nothing needs to be fixed, as none of the things you mention are an issue.
> Also, the recent change adding -ffunction-sections -fdata-sections (a hack imo) shows that the language or the implementation has an issue.
Why? Rust's compilation model is exactly the same as C++ "unity builds" (which are becoming the norm in large projects).
Preference, in all honesty. What, besides the obvious, requires mangling?
> split-stacks
What took the place of split stacks? Guard pages?
> ffunction-sections fdata-sections
An example is to use weak aliasing of dummy functions, but this is sort of ELF-dependent (I have no idea how or whether weak aliasing works in PE). I think this is an area that needs to be improved.
> Preference in all honesty, what besides the obvious requires mangling?
We have a module system. So we have to do name mangling to allow you to use multiple functions with the same name in different modules. This allows you to program without fear of annoying name collisions.
mod foo {
    fn f() {}
}

mod bar {
    fn f() {}
}
> What took the place of split stacks? Guard pages?
Stack checks that abort if you run out of stack (like MSVC's __chkstk). It would be nice to just replace it with guard pages where available, and we can probably do that in future versions.
> An example is to use weak aliasing of dummy functions, but this is sort of ELF-dependent (I have no idea how or whether weak aliasing works in PE). I think this is an area that needs to be improved.
How does this allow the linker to throw away unused functions? The reason we use function and data sections is to allow the static linker to remove dead code (which is a lot for many programs).
Modules don't really need symbol mangling, though; a simple mod_fn, or mod_fn_type for parametric polymorphism, would do, but I'm probably missing something. Weak aliasing to a dummy creates a stub if the strong-symbol function is not used, thus not pulling it in. I think this is a language-level issue and shouldn't be punted to the linker to figure out.
Rust's symbol mangling allows multiple versions of the same library to be installed on the same system and peacefully coexist. A single Rust program can even link against two different versions of the same library (which comes in handy when two of your dependencies rely on the same library, but are updated at different rates).
You can explicitly disable mangling via an attribute:
#[no_mangle]
fn this_symbol_wont_be_mangled() { /* ... */ }
> What took the place of split stacks?
So, a version of GNU symbol versioning. Does it have the same issue as GNU symbol versioning? If multiple versions of a function are supported, it defaults to the lowest version if none is specified, thus possibly introducing broken behaviour.
If Rust had compile-time analysis of stack usage, wouldn't that remove the need for bounds checking and the iterator abstraction, as well as guard pages (or whatever is in use)?
Statically verifying stack sizes is a hard problem, and I've never heard of a programming language that does it (Ada might?).
In any case, it wouldn't obviate the need for bounds checking. It's possible to index into an array given information known only at runtime, which means that the check for whether that index is valid needs to happen at runtime.
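A minimal sketch (the names are made up): no amount of compile-time stack analysis can tell whether this access is valid, because the index only exists at runtime.

fn lookup(table: &[u32], user_supplied_index: usize) -> Option<u32> {
    // `get` performs the bounds check at runtime and returns None instead
    // of panicking when the index is out of range.
    table.get(user_supplied_index).copied()
}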
Rust has bounds checking when indexing into arrays, but those checks can usually be bypassed by using iterators. There are also unsafe methods if you want to avoid bounds checking (but they are not recommended for most code). Split stacks were removed a long time ago.