I have been complaining about Rust's "unsafe" for years. There's too much "unsaf...

Cladode · on Jan 18, 2020

    there are things that you either 
    can't express, or are very hard
    to express, in safe Rust.

This is true (and in some sense inevitable by Rice's theorem) but it is unclear what can be done about this. Bear in mind that Rust's design constraints are

- decidable (indeed fast) type-inference/type-checking

- simple typing system

- you don't pay for features you don't use

It is easy to get rid of unsafe blocks if you throw complexity at Rust, e.g. use dependent types and / or theorem provers, but all the obvious ways of doing this immediately violate the constraints above. Doing it within Rust's constraints is an open research problem.

   potentially fixable.

What would you use as an objective, measurable criterion that we can use to decided whether a purposed fix is in fact a fix? If you have a concrete design proposal that works (in particular w/o destroying Rust's fast (bidirectional) type-inference/type-checking and the simplicity of Rust's types), and evidence that it's widely applicable (i.e. doesn't just allow to type some trivial edge cases of back-links and partially initialised arrays) that would easily publishable in top PL conferences (e.g. POPL) and a working prototype would help towards getting this into Rust.

I would be happy to help with implementing & writing up such work. I have quite a bit of experience with sub-structural types a la Rust. I have spend some time thinking about this problem space, but didn't manage to crack it. I have an incoming PhD student whom I could ask to help with this.

Animats · on Jan 19, 2020

Oh, not Rice's theorem again. That only applies to infinite systems. There are useful decidable properties for a reasonably large set of programs. Yes, they're "uninteresting" mathematically, but we're talking about memory safety here. If your program is anywhere near undecidable in that area, you're doing it wrong. Any time you're close to the edge on that, you can get decidability with a run-time check. The Microsoft Static Verifier people have been down that road with some success. I went there 40 years ago, with the Pascal-F Verifier.

The question I'm asking is, what do you really need to do that you can't do in safe code. On the memory safety front, for pure Rust (no calls to C), probably not all that much. Maybe only this:

- Syntax for talking about partially initialized arrays. For a collection, you have a variable that indicates how much of the array has been initialized, and you need a statement in the language that ties that variable to a partially initialized array. Then you can do simple inductive proofs to show that as you initialize another element and add 1 to the counter, the entire array is initialized. Non-consecutive initialization would require more complex proofs. Is there a real use case for that? One with major performance implications, where initializing the data to some null value would be too slow?

Un-initialized blocks of types such as raw bytes aren't a problem, because those types are "fully mapped" - all bit combinations are valid. Only types that are not fully mapped an initialization guarantee.

- Backlinks. Backlinks in doubly-linked lists and trees obey some simple invariants. You need syntax to say "this is a backlink". That implies an invariant that if A points to B, B must point back to A. Everything which updates A or B must preserve that invariant. The pointer from B back to A then no longer needs to have ownership of A. (It's a problem that Rust doesn't have "objects", because such invariants are usually required to be true when control is outside the object, but can be broken momentarily while control is inside it. The inside/outside distinction is important for this, and Rust is a bit soft on that. Spec# took inside/outside seriously, but viewed it dynamically, rather than being tied to static scope. Haven't looked at that in years.

What else do you really need? Remember that some knotty ownership problems can be resolved by allocating the objects in an array which owns them, and using array indices for relationships between the objects. It's not like you have to use pointers.

Concurrency is a related issue, but I've said enough for one night.

Cladode · on Jan 20, 2020

I'm still unclear about this, should this be done by the typing system or by an external static verifier:

   simple inductive proofs to show 
   that as you initialize 
   another element and add 1 to 
   the counter

If the latter, you can already do this with Rust as it is, it's just a question of building such a static verifier. If the former, I really doubt that this can fit in Rust's typing system without major surgery. We know well how to add arithmetic to typing systems, but the natural approach to doing so (dependent types) will probably make type-checking / -inference undecidable.

akamaka · on Jan 17, 2020

I’m replying to your message using software written in C, so do you think that it might be a little bit unfair to call it “broken”?

I suppose I’d agree if you’re making a general argument that most of the software we use is broken to some degree, but it has an entirely different meaning to directly tell someone “Your library is broken”.

thu2111 · on Jan 18, 2020

Are you? All web browsers I know of are written in C++ and always have been. In modern C++ you don't use char* to mean "pointer to byte array of unknown size", you use std::vector. Moreover Chrome actually uses garbage collected C++ (they call it Oilpan).

aws_ls · on Jan 20, 2020

They may be using Linux as OS.

layoutIfNeeded · on Jan 17, 2020

>You're sending a reference to an array of unknown size.

The next argument is literally the length of the array.

jdbernard · on Jan 17, 2020

No, that's what the caller says is the length of the array and as we already established, they're lying. ;)

In all seriousness there is quite a difference in structures that allow themselves to have guarantees built into the language/tooling (like including the size of the array in the languages basic "array" building block) vs. the more manual book-keeping the C code above requires.