I have one trick up my sleeve for memory safety of locals. I'm looking forward to experimenting with it during an upcoming release cycle of Zig. However, this release cycle (0.10.0) is all about polishing the self-hosted compiler and shipping it. I'll be sure to make a blog post about it exploring the tradeoffs - it won't be a silver bullet - and I'm sure it will be a lively discussion. The idea is (1) escape analysis and (2) in safe builds, secretly heap-allocate possibly-escaped locals with a hardened allocator and then free the locals at the end of their declared scope.
I would prefer the compiler to tell me: "Hey, this stack-allocated variable is escaping the function's scope, I can't do that! Allocate it somewhere outside the stack."
Maybe the compiler could offer me a simple way to fix the declaration somehow. But being explicit and transparent here feels important to me; if I wanted to second-guess the compiler and meditate over disassembly, I could pick C++.
I feel like the most user-friendly solution for Use After Free is to just use Reference Counting. Basically, just copy Objective-C's "weak_ptr" design.
For every allocation, also store on the heap a "reference object", that keeps track of this new reference.
struct reference_t {
    // the ptr returned by malloc
    void* ptr;
    // the stack/heap location that ptr was written to,
    // i.e. the reference location.
    void* referenceAddr;
};
Reference tracking: every time ptr is copied or overwritten, create and delete reference objects as needed.
If ptr is freed, visit every reference and set *referenceAddr = NULL, turning all of these references into NULL pointers.
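For what it's worth, Rust's Rc/Weak pair behaves a lot like this at the library level; a minimal sketch (just an analogy, not the raw malloc/free scheme above):

use std::rc::{Rc, Weak};

fn main() {
    let owner: Rc<String> = Rc::new(String::from("payload"));
    let reference: Weak<String> = Rc::downgrade(&owner); // the "reference object"

    assert!(reference.upgrade().is_some()); // still alive
    drop(owner);                            // the "free"
    assert!(reference.upgrade().is_none()); // the reference observes null instead of dangling
}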
Reference counting is an incomplete solution in itself - it leaks circular references, and in multithreaded programs it is quite slow. Though some optimizations could be done statically to improve on that.
All in all, a mark-and-sweep collector will likely be faster (in throughput at least), let alone a good GC.
The programmer would still decide when to free the object, not some automatic system. Manual memory management, with Ref Counting just to add some additional Use After Free checks.
I don't know the precise details of what Andrew has in mind but the compiler can know how much memory is required for this kind of operation at compile time. This is different from normal heap allocation where you only know how much memory is needed at the last minute.
At least in simple cases, this means that the memory for escaped variables could be allocated all at once at the beginning of the program, not too differently from how the program allocates memory for the stack.
Static allocation at the beginning of the program like that can only work for single threaded programs with non-recursive functions though, right?
I’d hazard a guess that the implementation will rely on use-after-free faulting, meaning that the use of any escaped variable will fault rather than corrupting the stack.
No need for recursion or multi-threading: If you call the function in a loop and don't release the escaped local until after the loop, and if the number of loop iterations isn't statically known, then it's impossible to pre-allocate heap storage for that local variable.
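A minimal sketch of that shape (in Rust for concreteness; Box stands in for whatever heap slot the compiler would have to find for the escaped local):

fn make_escaped(i: u32) -> Box<u32> {
    let local = i * 2; // would live on the stack if it didn't escape
    Box::new(local)    // escapes: the caller keeps it alive
}

fn main() {
    let n = std::env::args().count(); // iteration count not known statically
    let mut kept = Vec::new();
    for i in 0..n as u32 {
        kept.push(make_escaped(i)); // one live "escaped local" per iteration
    }
    println!("{}", kept.iter().map(|b| **b).sum::<u32>());
}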
Assuming the mythical existence of the perfect developer who never makes mistakes about which handle to use, which certainly wasn't the case on Psion or Symbian projects.
I would like Zig to do more to protect users from dangling stack pointers somehow. I am almost entirely done writing such bugs, but I catch them in code review frequently, and I recently moved these lines out of main() into some subroutine:
var fba = std.heap.FixedBufferAllocator.init(slice_for_fba);
gpa = fba.allocator();
slice_for_fba is a heap-allocated byte slice. gpa is a global. fba was local to main(), which coincidentally made it live as long as gpa, but then it became local to some setup subroutine called by main(). gpa contains an internal pointer to fba, so you run into trouble pretty quickly when you later try to allocate: the pointer now points at whatever happens to occupy that part of the stack, instead of your FixedBufferAllocator.
Many of the dangling stack pointers I've caught in code review don't really look like the above. Instead, they're dangling pointers that are intended to be internal pointers, so they would be avoided if we had non-movable/non-copyable types. I'm not sure such types are worth the trouble otherwise though. Personally, I've just stopped making structs that use internal pointers. In a typical case, instead of having an internal array and a slice into the array, a struct can have an internal heap-allocated slice and another slice into that slice. Like I said, I'd like these thorns to be less thorny somehow.
Alternatively, use offset values instead of internal pointers. Now your structs are trivially relocatable, and you can use smaller integer types instead of pointers, which allows you to more easily catch overflow errors.
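A minimal sketch of that (a hypothetical Record type, written in Rust, but the pattern is language-agnostic):

struct Record {
    data: Vec<u8>,
    // offsets into `data` instead of a slice borrowed from it,
    // so Record stays trivially movable
    name_start: u32,
    name_len: u32,
}

impl Record {
    fn name(&self) -> &[u8] {
        let start = self.name_start as usize;
        &self.data[start..start + self.name_len as usize]
    }
}

fn main() {
    let rec = Record { data: b"hello world".to_vec(), name_start: 6, name_len: 5 };
    let moved = rec; // moving is fine: nothing points into the old location
    assert_eq!(moved.name(), b"world");
}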
> so they would be avoided if we had non-movable/non-copyable types.
There is a proposal for this that was accepted a while ago[0]. However, the devs have been focused on the self-hosted compiler recently, so they're behind on actually implementing accepted proposals.
A meta point to make here, but I don't quite understand the pushback that Rust has gotten. How often does a language come around that flat out eliminates certain errors statically, and at the same time manages to stay in that low-level-capable pocket? And doesn't require a PhD (or heck, a scholarly stipend) to use? Honestly that might be a once in a lifetime kind of thing.
But not requiring a PhD (hyperbole) is not enough: it should be Simple as well.
But unfortunately Rust is (mamma mia) Complex and only pointy-haired Scala type architects are supposed to gravitate towards it.
But think of what the distinction between no-found-bugs (testing) and no-possible-bugs (a certain class of bugs) buys you; you don’t ever have to even think about those kinds of things as long as you trust the compiler and the Unsafe code that you rely on.
Again, I could understand if someone thought that this safety was not worth it if people had to prove their code safe in some esoteric metalanguage. And if the alternatives were fantastic. But what are people willing to give up this safety for? A whole bunch of new languages which range from improved-C to high-level languages with low-level capabilities. And none of them seem to give some alternative iron-clad guarantees. In fact, one of their selling points is mere optionality: you can have some safety and/or you can turn it off in release. So you get runtime checks which you might (culturally/technically) be encouraged to turn off when you actually want your code to run out in the wild, where users give all sorts of unexpected input (not just your "asdfg" input) and get your program into weird states that you didn't have time to even think of. (Of course Rust does the same thing with certain non-memory-safety bug checks like integer overflow.)
There is a recurring trend of sound C programs turning into unsound Rust programs, because shared mutability is often necessary but it's difficult to avoid creating &mut, and Stacked Borrows places strict conditions on constructing &mut T (they invalidate some but not all aliasing *const T).
I don't think this is a great example of "sound C program turning into unsound Rust program". The crate isn't "unsound" in the way a C program would be - it's unsound in the sense that, given either 'unsafe' elsewhere or changes to how Rust constructs work (that are not guaranteed) a consumer of this crate could accidentally violate one of the necessary guarantees.
For a Rust program the bar is "has to be safe, even if some other part of the program uses unsafe". That seems like it's arguably a higher bar than C where everything is already "unsafe" in that same way.
I invested a lot of time porting some parsing code I had written to Rust, with the vision that Rust is the memory-safe future. The code I was porting from used arenas, so I tried to use arenas in Rust also.
Using arenas required a bunch of lifetime annotations everywhere, but I was happy to do it if I could get provable memory safety.
I got everything working, but the moment I tried to wrap it in Python, it failed. The lifetime annotation on my struct was a problem. I tried to work around this by using ouroboros and the self-referencing struct pattern. But then I ran into another problem: the Rust arena I was using (Bumpalo) is not Sync, which means references to the arena are not Send. All of my arena-aware containers were storing references to the Arena and therefore were not Send, but wrapping in Python requires it to be Send. I wrote more about these challenges here: https://blog.reverberate.org/2021/12/19/arenas-and-rust.html
You might say "well don't use an arena, use Box, Rc, etc." But now you're telling me to write more complicated and less efficient code, just to make it work with Rust. That is a hard pill to swallow for what is supposed to be a systems language.
I did the Crafting Interpreters book to learn Rust after I did The Rust book.
I faced the same problem as you. Somehow it felt like Arenas went beyond Rust's philosophy and added huge amounts of complexity to the interpreter.
Tree traversal and mutating the environment of a `block` internally was an issue I spent like 2-3 days on. I was porting Java code to Rust after all. Somehow got it working in a Rust way. I used unsafe at one place. But I was left heavily unsatisfied. Something about graphs/trees and Rust don't match up.
I’ve written a JVM in Rust where I was pretty much unable to work without unsafe, and while I know the general consensus is that it is a sin to use, I managed to get away with a few usages wrapped tightly inside a safe API. Sure, I had memory problems during writing, but with MIRI and some debug assertions they were not hard to hunt down and I really only had to get that few lines of code inside unsafe blocks right.
What I'm trying to say with all that: do not be afraid to use unsafe in Rust. It is part of the language for a reason. Sure, do use Arc for some non-performance-critical whatever, because it frankly doesn't really matter. But where it matters, and you decided to use a low-level language, then go for unsafe if that's the only reasonable way. The result will still be much safer than the other low-level languages. I believe the problem here is the same as what C++ tried to achieve: making people believe it is a high-level language. That is just dishonest, and really should not be the goal of Rust.
You can tell pyo3 that your type isn't Send and then it'll panic if the object is accessed from multiple threads. Given that that's the only safe option, that seems fine? You say that that's not acceptable for a production library but I don't see the issue.
You have the same restrictions in C++ except with worse consequences.
> You have the same restrictions in C++ except with worse consequences.
In C++ it is safe because the arena is only used from one thread at a time.
To model this C++ pattern in Rust, what I would really want is:
1. Arena should be Sync, and not use interior mutability.
2. Arena::alloc() should do a dual borrow: (a) a mut borrow of the Arena metadata, only for the duration of the alloc() call, and (b) a non-mut (shared) borrow of the Arena data.
Because this kind of split borrow cannot be expressed in Rust AFAICS, (2) is not possible, so (1) is not feasible. This forces Bumpalo to be !Sync, which makes a direct Rust port of the C++ pattern impossible.
I've heard this called the "factory problem" for Rust: you cannot easily make a factory type in Rust that returns references, because if the create() operation mutates the factory, then the returned reference will have a mutable borrow on the factory.
The alternative would be to make a truly thread-safe arena/factory, which could be Sync with interior mutability, but that is an efficiency compromise due to synchronization overhead.
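A minimal sketch of the factory problem (a hypothetical Arena type, not Bumpalo's API); this is exactly the code shape the borrow checker rejects:

struct Arena {
    storage: Vec<u32>,
}

impl Arena {
    fn alloc(&mut self, val: u32) -> &u32 {
        self.storage.push(val);
        self.storage.last().unwrap()
    }
}

fn main() {
    let mut arena = Arena { storage: Vec::new() };
    let a = arena.alloc(1);
    let b = arena.alloc(2); // rejected: cannot borrow `arena` as mutable more than once,
                            // because `a` keeps the first &mut borrow alive
    println!("{} {}", a, b);
}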
Rust has been my primary language for the past 5 years, but it's moving in a direction that gets it farther away from my own values about what software ought to be like. As more features are added to the language, the ways they interact with each other increases the overall complexity of the language and it becomes hard to keep up.
I really like the safety guarantees that Rust provides and I want to keep enjoying them, but the language -- and more importantly, its ecosystem -- is moving from something that was relatively simple to a weird mish-mash of C++, JavaScript, and Haskell, and I'm keeping an eye out for a possible escape hatch.
Zig, Odin, or Hare are not on the same plane of existence as Rust when it comes to out-of-the-box safety (or, at the moment, out-of-the-box suitability for writing production-grade software), but they are simpler and intend to remain that way. That really jibes with my values. Yes, this means that some of the complexity of writing software is pushed back onto me, the programmer, but I feel that I have a better shot at writing good software with a simple language than with a complex language where I only superficially understand the features.
> have a better shot at writing good software with a simple language than with a complex language where I only superficially understand the features.
That's exactly how I feel too. No matter how much I use Rust, I find it nearly impossible to claim I understand a lot of its features. Zig OTOH is basically what C would look like if designed today! The improvements it offers over C are very compelling to me... it remains to be seen if the lack of formal guarantees that Rust gives still makes Zig programs similarly as buggy as C programs, but my current impression is that Zig programs are going to be very far away from C's in terms of safety issues... the features the blog post mention go a long way.
I think you are too dismissive of the importance of simplicity. Programming is hard. That Rust takes away certain problems doesn't change that. A lot of coding is just reading and understanding some code. If you have problems understanding some code, then that is also code you are more likely to not catch bugs in.
A compiler cannot be a substitute for your brain. The ability to read code and think clearly about it is a massively important feature because humans at the end of the day are the ones who have to understand code and fix it.
It depends on the person. Programmers are different. Rust works great for some. To me it looks too much like C++ which is something I want to put behind. I know it is a different language but it has a lot of that same focus as C++ that leads to slow compilers and complex looking code.
If I was younger I might have put in the effort, but I am not willing to make the same wrong bet I did with C++. I sunk so much time into perfecting C++ skills and realizing afterwards when using other languages that it was a huge waste.
> ... I am not willing to make the same wrong bet I did with C++. I sunk so much time into perfecting C++ skills ...
What are your top complaints about C++? What parts/patterns are wasteful, and must be avoided?
I have encountered C++ a couple of times in my career. And both those times I was barely able to survive in the short periods of time I spent in those jobs. I'm pretty good at C, but for the life of me, I just can't deal with the hidden behaviours of C++.
So far, I have seen only one example of well-written, understandable C++ code: LLVM. I dabbled in learning LLVM for a side project, and the tutorials on writing compiler passes, and the LLVM's own code, seems to use only the rudimentary features of C++ (primarily, single-inheritance). And this absence of complex C++ features made me feel comfortable looking at, reading, and understanding the LLVM code.
I am of the firm belief now that good code can be written in any bad language, and bad code can be written in any good language. Perhaps those couple of times that I encountered difficult C++ codebases, they were just instances of bad code, and should not be used as an indictment of the C++ language.
Good code => readable, understandable, maintainable, extensible, rewritable.
Bad code => !Good code.
> LLVM's own code, seems to use only the rudimentary features of C++ (primarily, single-inheritance). And this absence of complex C++ features made me feel comfortable looking at, reading, and understanding the LLVM code.
Uh, what parts of LLVM were you reading? Because LLVM uses pretty much every C++14 feature allowed by their coding standard. The fact that you felt like no complex machinery was being used is actually a testament of the powers of abstraction in C++.
Readable code is important? Then keep in mind the context: low-level programming, and writing Safe Rust versus writing in some not-quite-memory-safe language because one would rather be bitten by undefined behavior now and again than have to learn the borrow checker.
Knowing for sure that the code you write is at least memory safe is a certain kind of readability win and I don’t see how anyone can claim that it’s not.
> Of course Rust does the same thing with certain non-memory-safety bug checks like integer overflow.
The problem with getting lost too much in the ironclad certainties of Rust is that you start forgetting that simplicity (papa pia) protects you from other problems. You can get certain programs in pretty messed up states with an unwanted wrap around.
Programming is hard. Rust is cool, very cool, but it's not a universal silver bullet.
Nothing Is Perfect is a common refrain and non-argument.
If option A has 20 defects and option B has the superset of 25 defects, then option A is better; the fact that option A has defects at all is completely beside the point with regard to relative measurements.
But if Option A has 20 defects and takes a lot of effort to go down to 15 defects, yet Option B has 25 defects and offers a quick path to go down to 10 defects, then which option is superior? You can't take this in isolation. The cognitive load of Rust takes a lot of defects out of the picture completely, but going off the beaten path in Rust takes a lot of design and patience.
People have been fighting this fight forever. Should we use static types which make it slower to iterate or dynamic types that help converge on error-free behavior with less programmer intervention? The tradeoffs have become clearer over the years but the decision remains as nuanced as ever. And as the decision space remains nuanced, I'm excited about languages exploring other areas of the design space like Zig or Nim.
> But if Option A has 20 defects and takes a lot of effort to go down to 15 defects, yet Option B has 25 defects and offers a quick path to go down to 10 defects, then which option is superior?
Yes. If you change the entire premise of my example then things are indeed different.
Rust eliminates some defects entirely. Most other low-level languages do not. You would have to use a language like ATS to even compete.
That’s where the five-less-defects thing comes from.
Go down to ten defects? What are you talking about?
I'm not strongly opinionated on Rust specifically but I'm not sure:
> Rust eliminates some defects entirely.
Is really a true premise, and to the extent it is true, it's not clear to me that it makes Rust better or safer than languages that don't eliminate this class of bugs.
Unsafe exists, is widely used, and importantly is used in places where the hairiest versions of these bugs tend to live anyways.
For safe code, I think it's clear that what we as programmers really care about is "how much does the language reduce the number of defects weighted by severity". If Rust reduces the number of memory bugs but increases other types of bugs due to increased complexity, that might not be a net win.
Personally my guess is that Rust is a net win in this regard, but I don't think we have any evidence of that.
I've worked in heavily monadic Scala codebases where changes in object structure require minor refactors due to strict typing in the codebase. Changes that could be small in other languages would necessitate changes in our monad transformer stacks and have us changing types throughout the project. This eventually led to engineer anxiety ("small changes take forever so I'm not willing to work on this codebase and explain to my manager why a schema change is taking so long") and aversion to project ownership.
This codebase did materially have fewer defects than other, looser, codebases my team worked with. We got paged fewer times for this codebase than others. Unfortunately, other than a handful of engineers who loved working with monadic FP and were also busy, nobody wanted to touch the code and once the original authors left the team, the codebase went untouched. You could make the case that management should have let these engineers take more time to make these changes but in the meantime, other teams at the company built up high reliability ways of working in other languages and paradigms with higher velocity and similarly low defect rate.
Defects alone aren't everything. You need to look at the big picture.
I love rust, I think there's a good chance zig in its current state isn't the answer, but saying rust is totally memory safe is wrong. You drop into unsafe and people make the same errors C/C++ devs make.
The surface of exposure is much lower in Rust than in C/C++ however. It is unlikely you are writing your entire program in unsafe Rust, so you can still get a significant benefit from memory-safe safe Rust.
In C/C++ world, _everything_ is unsafe, so you can't even limit your exposure in the safer parts because you can do unsafe operations anywhere.
Zig keeps overflow checks on in its main release mode (ReleaseSafe), while Rust disables them by default in release builds, where overflow silently wraps. This means that Rust is not a strict superset of Zig in terms of safety, if you want to go down that route.
I personally am not interested at all in abstract discussions about sets of errors. Reality is much more complicated, each error needs to be evaluated with regards to the probability of causing it and the associated cost. Both things vary wildly depending on the project at hand.
ReleaseSafe is the main release mode because Zig has a small community that is largely ideologically aligned with it.
I have absolutely no faith that, in a future where Zig is popular, it will remain so. "Well, it passed our unit tests and it didn't fall down when we fuzzed it, so let's squeeze a couple free % of extra perf out of it and ship." Already in this post we have people talking about how safer mallocs or sanitizers are too much of a hit to expect people to use in the wild.
Then you don't want Zig or Rust. Use a language with a GC. Exploratory programming is a lot more pleasant when you don't have to worry about calling free() at the right time. I've had success with PHP and Elixir for productive, exploratory programming, not just because of their GCs, but also because they both support REPL-driven development and hot code reloading.
> Then you don't want Zig or Rust. Use a language with a GC.
Zig allows custom allocators pretty much everywhere, right? Would it be impossible to provide it with a GC-based allocator for increased convenience at the cost of a little performance for programs (or parts of programs) where convenience is preferred? Perhaps libgc could be an inspiration here.
I might still want Zig just fine. E.g. because I know C well, and it's a better C. Or because of the ease of interfacing with C/native libs. And several other reasons.
Then just use C? Heck, if you really don't give a fuck about defects, you can just have all your code in main(). You really can't beat that in terms of simplicity to explore/experiment quickly, ease of use, and time to market.
>Heck, if you really don't give a fuck about defects, you can just have all your code in main(). You really can't beat that in terms of simplicity to explore/experiment quickly, ease of use, and time to market.
I probably can "beat that", because e.g. breaking things into functions increases simplicity and being able to explore/experiment quickly.
Whereas eliminating defects with a type/lifetime checker you need to hand-wrestle, not so much.
>A meta point to make here but I don’t quite understand the pushback that Rust has gotten.
The non-CS "human" answer to this is that so much of tech and programming is unfortunately tied to identity. There are developers who view their choices as bordering on religion (from editors to languages to operating systems and so on) and across the entire industry you can see where some will take the slightest hint that things could be better as an affront to their identity.
The more that Rust grows and winds up in the industry, the more this will continue to happen.
This is a concise summary of why I'm betting on Rust as the future of performant and embedded computing. You or I could poke holes in it for quite some time. Yet, I imagine the holes would be smaller and less numerous than in any other language capable in these domains.
I think some of the push back is from domains where Rust isn't uniquely suited. Eg, You see a lot of complexity in Rust for server backends; eg async and traits. So, someone not used to Rust may see these, and assume Rust is overly complex. In these domains, there are alternatives that can stand toe-to-toe with it. In lower-level domains, it's not clear there are.
> I think some of the push back is from domains where Rust isn't uniquely suited. Eg, You see a lot of complexity in Rust for server backends; eg async and traits. So, someone not used to Rust may see these, and assume Rust is overly complex. In these domains, there are alternatives that can stand toe-to-toe with it. In lower-level domains, it's not clear there are.
The big win for rust in these domains is startup time, memory usage, and distributable size.
It may be that these things outweigh the easier programming of go or java.
Now if you have a big long-running server with lots of hardware at your disposal, then rust doesn't make a whole lot of sense. However, if you want something like an AWS Lambda or rapid up/down scaling based on load, rust might start to look a lot more tempting.
It's pretty simple. Rust's safety features (and other language choices) have a productivity cost. For me I found the cost surprisingly high, and I'm not alone (though I'm sure I'll get replies from people who say the borrow checker never bothers them anymore and made them a better programmer, let's just agree there's room to disagree).
Although I'm a big fan of safety, since experiencing Rust my opinion is that low-pause GC is a better direction for the future of safe but high performance programming. And there's also a niche for languages that aren't absolutely safe which I think Zig looks like a great fit for.
My comment was about preferring other new, low-level languages over Rust when they don’t give the same safety guarantees. If you can deal with a GC then fine—my comment has got nothing to do with that.
So it was much, much more narrow than making a case for Rust in general.
Rust and GC both eliminate certain defects. And if you can use a GC then you don’t need Rust (w.r.t. memory safety).
Admittedly maybe I could have made it more clear that my comment does not make an argument against new low-level capable languages when used with some kind of automatic memory management scheme, like I guess Nim.
After having seen GC after GC fail to live up to expectations... I'm still voting for Rust. So much more control over whether you even want to allocate or not. I know where you are coming from, but I see it differently I guess.
Whether you allocate or not is a property of the language, not the GC. A lot of GC'd languages encourage or even force allocations all over the place. But maybe we could do better in a new language.
I see a lot of people put rust down for what it is today, and then in the next breath wistfully wonder about the future of zig, or some other language, etc. It's just not fair.
Rust today lets you avoid allocations; if you want smart pointers you can have them today, and if you want a GC inside rust, you can have that today as well. Don't need to wait for another language lol.
Enough of Rust is set in stone now that it will never be the language I want. There's nothing fair or unfair about it. That's just the way it is. I still think it is a good language.
But as someone that's dabbled a bit in both Zig and Rust, I think there's a lot of incidental complexity in Rust.
For example, despite having used them and read the docs, I'm still not exactly sure how namespaces work in Rust. It takes 30s to understand exactly what is going on in Zig.
The module system is something that some people struggle with for a while, but all the feedback I've heard so far is "it actually makes complete sense, I don't know why I didn't get it earlier". There just seems to be a dead end somewhere that people frequently think themselves into.
The premise of eliminating entire classes of errors in the abstract is nice and all, and definitely something we should do, but it isn't the sole deciding factor in choosing an implementation language:
- If the language is not well known, that’s bad. It will be harder to hire a proficient team. More time will be spent on learning the language.
- If the syntax is needlessly verbose, that’s bad. It increases the chance for typos and time spent fixing typos to get things to compile. Eventually it leads to ide completion based programming which results in the degradation of the skill set of the pool of programmers that know that language.
- If the concepts the language uses are difficult to manipulate and remember, it takes more time to engage with any given piece of code
- If you often need to drop into unsafe modes that’s also not great because now you effectively are using two languages not just one. The safe language and the unsafe language. they interact but play by totally different rules. yikes.
I think rust is a great language and the borrow checker is an amazing innovation. However I think rust has a lot of warts that will harm its success in the long run. I think the next language that leverages the borrow checker idea but does so with a bit better ergonomics will really take off.
Yes, for tagged unions specifically (which the linked post refers to for that row), Nim raises an exception at runtime when trying to access the wrong field (or trying to change the discriminant).
A lot of embedded devices and safety critical software sometimes don't even use a heap, and instead use pre-allocated chunks of memory whose size is calculated beforehand. It's memory safe, and has much more deterministic execution time.
This is also a popular approach in games, especially ones with entity-component-system architectures.
I'm excited about Zig for these use cases especially, it can be a much easier approach with much less complexity than using a borrow checker.
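A minimal sketch of that style (in Rust, but the idea is language-agnostic): a hypothetical fixed pool that is sized once at startup and hands out indices instead of calling malloc/free afterwards.

struct Pool<T> {
    slots: Vec<Option<T>>, // sized once at startup, never grown
}

impl<T> Pool<T> {
    fn with_capacity(n: usize) -> Self {
        Pool { slots: (0..n).map(|_| None).collect() }
    }

    fn acquire(&mut self, value: T) -> Option<usize> {
        let idx = self.slots.iter().position(|s| s.is_none())?;
        self.slots[idx] = Some(value);
        Some(idx)
    }

    fn release(&mut self, idx: usize) {
        self.slots[idx] = None;
    }
}

fn main() {
    let mut pool: Pool<[u8; 64]> = Pool::with_capacity(1024); // all memory reserved up front
    let id = pool.acquire([0u8; 64]).expect("pool exhausted");
    pool.release(id);
}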
This is almost what we do for TigerBeetle, a new distributed database being written in Zig. All memory is statically allocated at startup [1]. Thereafter, there are zero calls to malloc() or free(). We run a single-threaded control plane for a simple concurrency model, and because we use io_uring—multithreaded I/O is less of a necessary evil than it used to be.
I find that the design is more memory efficient because of these constraints, for example, our new storage engine can address 100 TiB of storage using only 1 GiB of RAM. Latency is predictable and gloriously smooth, and the system overall is much simpler and fun to program.
> for example, our new storage engine can address 100 TiB of storage using only 1 GiB of RAM.
I’m a little confused by this statement. I assume by “address” you mean indexing, and the size of an index is related to the number of entries, not the amount of data being indexed. (For example, you could trivially address 100TiB using 1 address width of memory if all 100TiB belongs to the same key).
> I’m a little confused by this statement. I assume by “address” you mean indexing, and the size of an index is related to the number of entries, not the amount of data being indexed.
Thanks for the question!
What's in view here is an LSM-tree database storage engine. In general, these typically store keys between 8 and 32 bytes and values up to a few MiB.
In our case, the question is how much memory is required for the LSM-tree to be able to index 100 TiB worth of key/value storage, where:
* keys are between 8 and 32 bytes,
* values are between 8 and 128 bytes,
* keys and values are stored in tables up to 64 MiB,
* each table requires between 128 and 256 bytes of metadata to be kept in memory,
* auxiliary data structures such as a mutable/immutable table must be kept in memory, and where
* all memory required by the engine must be statically allocated.
That's a lot of small keys and values!
Typically, a storage system might require at least an order of magnitude more than 1 GiB of memory to keep track of that many keys using an LSM-tree as index, even using dynamic allocation, which only needs to allocate as needed.
Another way to think of this is as a filesystem, since it's a very similar problem. Imagine you stored 100 TiB worth of 4096 byte files in ZFS. How much RAM would that require for ZFS to be able to keep track of everything?
Even in this environment, you can still have dangling pointers to freed stack frames. There's no way around having a proper lifetime system, or a GC, if you want memory safety.
Yep, or generational references [0] which also protect against that kind of thing ;)
The array-centric approach is indeed more applicable at the high levels of the program.
Sometimes I wonder if a language could use an array-centric approach at the high levels, and then an arena-based approach for all temporary memory. Elucent experimented with something like this for Basil once [1] which was fascinating.
> Yep, or generational references [0] which also protect against that kind of thing ;)
First off, thank you for posting all your great articles on Vale!
Second off, I just read the generational references blog post for the 3rd time and now it makes complete sense, like stupid obvious why did I have problems understanding this before sense. (PS: The link to the benchmarks is dead :( )
I hope some of the novel ideas in Vale make it out to the programming language world at large!
Thank you! I just fixed the link, thanks for letting me know! And if any of my articles are ever confusing, feel welcome to swing by the discord or file an issue =)
I'm pretty excited about all the memory safety advances languages have made in the last few years. Zig is doing some really interesting things (see Andrew's thread above), D's new static analysis for zero-cost memory safety hit the front page yesterday, we're currently prototyping Vale's region borrow checker, and it feels like the space is really exploding. Good time to be alive!
It feels like it would work really well (you could even swap between arenas per frame). I've been wanting to try something similar but it's early days.
> Even in this environment, you can still have dangling pointers to freed stack frames.
How frequently does this happen in real software? I learned not to return pointers to stack allocated variables when I was 12 years old.
> There's no way around having a proper lifetime system, or a GC, if you want memory safety.
If you're building an HTTP caching program where you know the expiration times of objects, a Rust-style borrow-checker or garbage collector is not helping anyone.
> > Even in this environment, you can still have dangling pointers to freed stack frames.
> How frequently does this happen in real software? I learned not to return pointers to stack allocated variables when I was 12 years old.
This happens rarely. However, the reason it isn't an issue is because C programmers are (and have to be) extremely paranoid about this kind of thing.
Rust, however, lets you recklessly pass around pointers to local variables while guaranteeing that you won't accidentally use one as a return value. One example is scoped thread pools which let you spawn a bunch of worker threads and then pass them pointers to stack allocated variables that get concurrently accessed by all the threads. The Rust type system/borrow checker ensures both thread safety and memory safety.
Would you trust a novice C programmer to use something like that?
Yep! We've entered a grey area, where some Rust embedded libs are expanding the definitions of memory safety, and what the borrow checker should evaluate, beyond what you might guess. Eg, structs that represent peripherals that are now checked for ownership, the intent being to prevent race conditions. And traits being used to enforce pin configuration.
The approach can reuse old elements for new instances of the same type, so to speak. Since the types are the same, any use-after-free becomes a plain ol' logic error. We use this approach in Rust a lot, with Vecs.
But if you have a structure that contains offsets into another buffer somewhere, or an index, whatever - the wrong value here could be just as bad as a use-after-free. I don’t see how this is any safer.
If you use memory after free from a malloc, with any luck you'll hit a page fault and your app will crash. If you have an index/pointer into another structure, you could still end up reading past the end of that structure into the unknown.
That's just a logic error, and not memory unsafety which might risk UB or vulnerabilities. The type system enforces that if we use-after-"free" (remember, we're not free'ing or malloc'ing), we just get a different instance of the same type, which is memory-safe.
You do bring up a valid broader concern. Ironically, this is a reason that GC'd systems can sometimes be better for privacy than Ada or Rust which uses a lot more Vec+indexes. An index into a Vec<UserAccount> is riskier than a Java List<UserAccount>; a Java reference can never suddenly point to another user account like an index could.
But that aside, we're talking about memory safety, array-centric approaches in Zig and Rust can be appropriate for a lot of use cases.
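A minimal sketch of that failure mode, reusing a Vec slot:

fn main() {
    let mut accounts = vec![String::from("alice")];
    let alice_idx = 0;                  // stale "pointer" we keep around
    accounts.swap_remove(alice_idx);    // "free" alice's slot
    accounts.push(String::from("bob")); // the slot is reused for bob
    // memory-safe, but a logic error: the stale index now names the wrong account
    println!("{}", accounts[alice_idx]); // prints "bob"
}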
In high-integrity computing, that logic error is pretty much safety-related if it causes someone to die due to corrupt data, like using the wrong radiation value.
But, Java has exactly the same behaviour, the typical List in Java is ArrayList which sure enough has an indexed get() method.
There seems to be no practical difference here. Rust can do a reference to UserAccount, and Java can do an index into an ArrayList of UserAccounts. Or vice versa. As you wish.
Right, it sounds like you are circumventing the borrow checker and then experiencing some of the classes of bugs it was supposed to prevent. And this seems common: https://www.youtube.com/watch?v=4t1K66dMhWk
Programs often require inherent state with data that refers to other data. In these cases, one must circumvent the borrow checker, whether it be with indices, IDs, Rc, or whatever. The borrow checker simply does not allow changing data when someone has a reference to it (except for the rare case where we can use Cell).
It's a myth that we can rewrite any program to not circumvent the borrow checker.
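For reference, the Cell exception looks like this in a minimal sketch: shared references, but the value can still be replaced.

use std::cell::Cell;

fn main() {
    let counter = Cell::new(0);
    let a = &counter; // two shared references to the same data
    let b = &counter;
    a.set(a.get() + 1); // ...and we can still mutate through either one
    b.set(b.get() + 1);
    assert_eq!(counter.get(), 2);
}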
These are 1000 times worse than even a segfault. These are the bugs you won’t notice until they crop up at a wildly different place, and you will have a very hard time tracking it back to their origin (slightly easier in Rust, as you only have to revalidate the unsafe parts, but it will still suck)
Normally you do this on embedded so that you know exactly what your memory consumption is. You never have to worry about Out of Memory and you never have to worry about Use After Free since there is no free. That memory is yours for eternity and what you do with it is up to you.
It doesn't, however, prevent you from accidentally scribbling over your own memory (buffer overflow, for example) or from scribbling over someone else's memory.
Safe enough. You can use `std.testing.allocator` and it will report leaks etc in your test cases.
What rust does sounds like a good idea in theory. In practice it rejects too many valid programs, over-complicates the language, and makes me feel like a circus animal being trained to jump through hoops. Zigs solution is hands down better for actually getting work done, plus it's so dead simple to use arena allocation and fixed buffers that you're likely allocating a lot less in the first place.
Rust tries to make allocation implicit, leaving you confused when it detects an error. Zig makes memory management explicit but gives you amazing tools to deal with it - I have a much clearer mental model in my head of what goes on.
Full disclaimer, I'm pretty bad at systems programming. Zig is the only one I've used where I didn't feel like memory management was a massive headache.
>Zigs solution is hands down better for actually getting work done
Rust has seen significant usage in large companies; they wouldn't be using it unless it was usable for "real work".
>Full disclaimer, I'm pretty bad at systems programming. Zig is the only one I've used where I didn't feel like memory management was a massive headache.
I'd say this about Rust, though. Rust's mental model is very straightforward if you accept the borrow-checker and stop fighting it. Can you list any examples of what you think is a headache...?
>In practice it rejects too many valid programs, over-complicates the language, and makes me feel like a circus animal being trained to jump through hoops.
I've found that jumping through those hoops leads to things running in production that don't make me get up in the middle of the night. Can you show me a "valid program" that Rust rejects?
>Rust has seen significant usage in large companies; they wouldn't be using it unless it was usable for "real work".
I didn't say it wasn't usable. I said I found Zig more usable.
>I'd say this about Rust, though. Rust's mental model is very straightforward if you accept the borrow-checker and stop fighting it. Can you list any examples of what you think is a headache...?
Mate, I didn't start learning rust in order to wage war against the borrow checker. I had no idea what the hell it wanted a lot of the time. Each time I fixed an error I thought I got it, and each time I was wrong. The grind got boring.
As for specific examples no, I've tried to put rust out of my mind. I certainly can't remember specific issues from 3 months ago.
>I've found that jumping through those hoops leads to things running in production that don't make me get up in the middle of the night. Can you show me a "valid program" that Rust rejects?
Yeah, that's how rust was sold: the compiler is your friend, and once your stuff compiles it will never fail.
In reality the compiler was so irritating I hardly got anything done at all. The output wasn't super reliable software, it was no software.
Sounds like you got annoyed trying to learn something new, quit, and decided it's not worth your time. When people tell you it takes a few weeks to get the hang of rust they aren't kidding. Most people aren't the exception to that, but once you do get it, it's really great... Not kidding about that either...
I think I spent about two months. You're right, I do resent the time I spent on it. In a way I'd like to warn others.
The turning point came for me when talking to someone using rust pretty much since it first came out. He basically said "yeah, I still don't know why it doesn't compile either sometimes". He's smarter than me, and all he's managed to do in years of learning it is have a knack for fixing errors without understanding them.
I need some brain capacity left over for actually solving a problem.
I agree with the parent. Rust is hard because it fundamentally inverts the semantics of pretty much every other programming language on earth by making move semantics the default instead of copying.
Yet there's no syntax to indicate this. Worse, actual copies are hidden behind a trait, and you have no way of knowing whether a particular externally defined type implements it outside of reading documentation. A lot of Rust's important mechanics are underrepresented syntactically, which makes the language harder to get used to imo. I agree with the parent that in general it's better for things to be obvious as you're writing them. If rust had syntax that screamed "you're moving this thing" or "you're copying this thing because it implements Copy", that'd be a lot easier to get used to than what beginners are currently stuck with, which is a cycle of "get used to the invisible semantics by having the compiler yell at you at build time until you've drilled it into your head past the years of accumulated contrary models". And as soon as you have to use another language this model becomes useless, so expertise in it does not translate to other domains (though that will hopefully change in the future).
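A minimal sketch of how invisible that distinction is at the use site (hypothetical types):

#[derive(Clone)]
struct NotCopy(String);

#[derive(Clone, Copy)]
struct IsCopy(i32);

fn main() {
    let a = NotCopy(String::from("hi"));
    let b = a;  // this line moves `a`...
    // println!("{}", a.0);  // ...so this would be an error: borrow of moved value

    let x = IsCopy(1);
    let y = x;  // this identical-looking line copies `x`
    println!("{} {} {}", b.0, x.0, y.0); // `x` is still usable
}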
> they wouldn't be using it unless it was usable for "real work".
We had to introduce Rust at work because we really needed some WASM functionality in one of our mostly JS frontends... anyway, I was excited about Rust and all and pushed the idea, implemented the whole thing and made presentations for other developers about Rust. I was thinking everyone would be as excited as I was and would jump at the chance of maintaining the Rust module.
In reality, only one of the 20+ devs ever even tried to touch the Rust code. Everyone else thought the code looked like Greek (no offence to my Greek friends!)... today when the code needs to change, I am pretty much the only one who can do it, or the other guy (who is more of a novice than me in Rust, so it takes him a lot longer to do anything, but at least there's someone else).
For reference: we write code in Java/Kotlin/Groovy/Erlang. So, we're not a system programming shop in any way, so I can't speak for places where C and C++ were previously being used.
Everyone seems alright with Kotlin, and Swift seems close enough to Kotlin that I think they would do fine (we do have a few frontend devs that do Swift BTW).
By creating `f` you're in essence trying to borrow something that's mutably borrowed already, which the borrow checker doesn't allow. I guess I could see some logic for this being possible, but in practice I've never encountered this in any Rust codebase I've gone through.
The trivial example fix is just to... ensure it can copy, and tweak the match line:
#[derive(Copy, Clone, Debug)]
struct Foo {
    a: i32
}

fn thing(foo: &mut Foo) {
    match *foo {
        f @ Foo { a } if a > 5 => {
            println!("{:?}", f)
        }
        _ => {}
    }
}

fn main() {
    let mut x = Foo { a: 1 };
    thing(&mut x);
}
If the struct was bigger and/or had types that couldn't copy, I'd refrain from trying to shoehorn matching like that entirely.
The borrow checker does allow that, though, as long as the uses don't interleave and the references are created correctly. As long as `f` is not used between `a`'s creation and last use, and `a` comes from `f`, it's valid for that alias to exist. You can see that with this code example, which is accepted:
#[derive(Debug)]
pub struct Foo {
    a: i32
}

impl Foo {
    fn get_a(&mut self) -> &mut i32 {
        &mut self.a
    }
}

pub fn thing(mut foo: &mut Foo) {
    let f = &mut foo;
    let a = f.get_a();
    if *a > 5 {
        println!("{:?}", f);
    }
}
That's why I was so surprised the compiler rejected it.
This was a great read, with an important point: there's always a tradeoff to be made, and we can make it (e.g. never freeing memory to obtain temporal memory safety without static lifetime checking).
One thought:
> Never calling free (practical for many embedded programs, some command-line utilities, compilers etc)
This works well for compilers and embedded systems, but please don't do it for command-line tools that are meant to be scripted against! It would be very frustrating (and a violation of the pipeline spirit) to have a tool that works well for `N` independent lines of input but not `N + 1` lines.
There are some old-hand approaches to this which work out fine.
An example would be a generous rolling buffer, with enough room for the data you're working on. Most tools which are working on a stream of data don't require much memory; they're either doing a peephole transformation or building up data with filtration and aggregation, or some combination.
You can't have a use-after-free bug if you never call free, treating the OS as your garbage collector for memory (not other resources please) is fine.
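A minimal sketch of that style (in Rust, a trivial uppercasing filter with one reused line buffer, so memory use is bounded by the longest line rather than by the input size):

use std::io::{self, BufRead, Write};

fn main() -> io::Result<()> {
    let stdin = io::stdin();
    let stdout = io::stdout();
    let mut input = stdin.lock();
    let mut output = stdout.lock();
    let mut line = String::new();
    while input.read_line(&mut line)? != 0 {
        // the peephole transformation: uppercase each line
        output.write_all(line.to_uppercase().as_bytes())?;
        line.clear(); // reuse the same buffer for the next line
    }
    Ok(())
}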
Yeah, those are the approaches that I've used (back when I wrote more user tools in C). I wonder how those techniques translate to a language like Zig, where I'd expect the naive approach to be to allocate a new string for each line/datum (which would then never truly be freed, under this model.)
I've been writing a toy `wordcount` recently, and it seems like if I wanted to support inputs much larger than the ~5GB file I'm testing against, or inputs that contain a lot more unique strings per input file size, I would need to realloc, but I would not need to free.
Is that `wordcount` in Zig? My understanding (which could be wrong) is that reallocation in Zig would leave the old buffer "alive" (from the allocator's perspective) if it couldn't be expanded, meaning that you'd eventually OOM if a large enough contiguous region couldn't be found.
It's in zig but I just call mmap twice at startup to get one slab of memory for the whole file plus all the space I'll need. I am not sure whether Zig's GeneralPurposeAllocator or PageAllocator currently use mremap or not, but I do know that when realloc is not implemented by a particular allocator, the Allocator interface provides it as alloc + memcpy + free. So I think I would not OOM. In safe builds when using GeneralPurposeAllocator, it might be possible to exhaust the address space by repeatedly allocating and freeing memory, but I wouldn't expect to run into this on accident.
> This was a great read, with an important point: there's always a tradeoff to be made, and we can make it (e.g. never freeing memory to obtain temporal memory safety without static lifetime checking).
I.e. we can choose to risk running out of memory? I don’t understand how this is a viable strategy unless you know you only will process a certain input size.
Yes. There are many domains where you know exactly how much memory you’ll need (even independent of input size), so just “leaking” everything is a perfectly valid technique.
You will have to explain this to me. From the original mention (article) it seems that they mean that compilers in general can be written in this way. Is that what they mean? Or do they mean that compilers can be written in that way if they know something about the inputs that it will be fed?
"Temporal" and "spatial" is a good way to break this down, but it might be helpful to know the subtext that, among the temporal vulnerabilities, UAF and, to an extent, type confusion are the big scary ones.
Race conditions are a big ugly can of worms whose exploitability could probably be the basis for a long, tedious debate.
When people talk about Zig being unsafe, they're mostly reacting to the fact that UAFs are still viable in it.
As you know, buffer bleeds like Heartbleed and Cloudbleed can happen even in a memory safe language, they're hard to defend against (padding is everywhere in most formats!), easier to pull off than a UAF, often remotely accessible, difficult to detect, remain latent for a long time, and the impact is devastating. All your RAM are belong to us.
For me, this can of worms is the one that sits on top of the dusty shelf, it gets the least attention, and memory safe languages can be all the more vulnerable as they lull one into a false sense of safety.
I worked on a static analysis tool to detect bleeds in outgoing email attachments, looking for non-zero padding in the ZIP file format.
It caught different banking/investment systems written in memory safe languages leaking server RAM. You could sometimes see the whole intranet web page, that the teller or broker used to generate and send the statement, leaking through.
Bleeds terrify me, no matter the language. The thing with bleeds is that they're as simple as a buffer underflow, or forgetting to zero padding. Not even the borrow checker can provide safety against that.
I am skeptical until I see the details, and strongly suspect you are dealing with a "safe-ish" language rather than one which has Rust-level guarantees. Uninitialized memory reads are undefined behavior in basically all memory models in the C tradition. In Rust it is not possible to make a reference to a slice containing uninitialized memory without unsafe (and the rules around this have tightened relatively recently, see MaybeUninit).
I say this as someone who is doing a lot of unsafe for graphics programming - I want to be able to pass a buffer to a shader without necessarily having zeroed out all the memory, in the common case I'm only using some of that buffer to store the scene data etc. I have a safe-ish abstraction for this (BufWriter in piet-gpu, for the curious), but it's still possible for unsafe shaders to do bad things.
Hackers exploit any avenue (and usually come in through the basement!), regardless of how skeptical we might be that they won't. They don't need the details, they'll figure it out. You give them a scrap and they'll get the rest. It's a different way of thinking that we're not used to, and don't understand unless we're exposed to it first-hand, e.g. through red-teaming.
For example, another way to think of this is that you have a buffer of initialized memory, containing a view onto some piece of data, from which you serve a subset to the user, but you get the format of the subset wrong, so that parts of the view leak through. That's a bleed.
Depending on the context, the bleed may be enough or it might be less severe, but the slightest semantic gap can be chained and built up into something major. Even if it takes 6 chained hoops to jump through, that's a low bar for a determined attacker.
> For example, another way to think of this is that you have a buffer of initialized memory (no unsafe), containing a view onto some piece of data, from which you serve a subset to the user, but you get the format of the subset wrong, so that parts of the view leak through. That's a bleed.
If there's full initialization then this is just a logic error, no? Apart from some kind of capability typing over ranges of bytes (not very ergonomic), this would be a very difficult subtype of "bleed" to statically describe, much less prevent.
Yes, exactly. That's what I was driving at. It's just a logic error, that leaks sensitive information, by virtue of leaking the wrong information. File formats in particular can make this difficult to get right. For example, the ZIP file format (that I have at least some experience with bleeds in) has at least 9 different places where a bleed might happen, and this can depend on things like: whether files are added incrementally to the archive, the type of string encoding used for file names in the archive etc.
Makes sense! My colleagues work on some research[1] that's intended to be the counterpart to this: identifying which subset of a format parser is actually activated by a corpus of inputs, and automatically generating a subset parser that only accepts those inputs.
I think you mentioned WUFFS before the edit; I find that approach very promising!
Thanks! Yes, I did mention WUFFS before the edit, but then figured I could make it a bit more detailed. WUFFS is great.
The SafeDocs program and approach looks incredible. Installing tools like this at border gateways for SMTP servers, or as a front line defense before vulnerable AV engine parsers (as Pure is intended to be used), could make such a massive dent against malware and zero days.
Thanks for the explanation. I would consider that type of logic error more or less impossible to defend at the language level, but I can see how analysis tools can be helpful.
"I would consider that type of logic error more or less impossible to defend at the language level, but I can see how analysis tools can be helpful."
This is how I see it too, as a logic error that exposes sensitive memory in a way that can be as unsafe as a UAF. It's a great leveler of languages, and why I get excited about enabling checked arithmetic by default in safe builds, explicit control flow, and minimizing complexity, abstractions and dependencies—this all helps to reduce the probability of semantic gaps and leaks.
I would imagine that the scenario is simply reuse of some buffer without clearing it, maybe in an attempt to save on allocations. It can happen across so many (even safe) languages. It doesn't matter what guarantees you have around uninitialised memory if you're reusing an initialised buffer yourself.
This error is possible in Rust, but not easy to cause.
Uninitialised memory is considered unsafe. There aren't any loopholes even for "just bytes". You can't accidentally make an uninitialised buffer, and you can't expose one without explicit `unsafe{}`.
Such unsafe code is typically wrapped in safe interfaces, so the risk is limited to the implementation side and not spread to every usage site.
For filling buffers, the language pushes you towards a growable Vec whose length tracks how much has been initialised, or towards collecting straight from an iterator, which is foolproof.
Custom buffers are not even used that often, because things like zip writing or compression use io::Write streaming interface.
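For concreteness, here's a minimal sketch of those two safe filling patterns (the function names and the 4096 are my own, purely illustrative): the Vec starts empty and its length only grows as bytes are actually produced, so there is never a window where stale or uninitialised bytes are observable.
use std::io::Read;

fn fill_from_reader(mut src: impl Read) -> std::io::Result<Vec<u8>> {
    // Reserve capacity up front, but start with len() == 0; the length only
    // grows as bytes are actually read, so nothing stale is ever exposed.
    let mut buf = Vec::with_capacity(4096);
    src.read_to_end(&mut buf)?;
    Ok(buf)
}

fn fill_from_iterator() -> Vec<u8> {
    // Collecting from an iterator: every element is produced before it
    // becomes observable.
    (0u8..=255).collect()
}

fn main() -> std::io::Result<()> {
    let from_reader = fill_from_reader(&b"hello"[..])?; // &[u8] implements Read
    let from_iter = fill_from_iterator();
    assert_eq!(from_reader, b"hello".to_vec());
    assert_eq!(from_iter.len(), 256);
    Ok(())
}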
It's actually not very hard to cause. It would happen if your system reuses buffers without actually going through the allocator. Something like this:
let mut buf = vec![0; DEFAULT_BUFFER_SIZE];
let database_frame_len = database_socket.read(&mut buf)?; // use 1
let database_result = Database::parse(&buf[..database_frame_len])?;
let html_template = HtmlTemplate::new(database_result);
let html_len = html_template.write(&mut buf)?; // use 2
socket.write(&buf[..html_len])?;
The above code makes the tenuous assumption that HtmlTemplate::write actually clobbers everything it's supposed to write. But it doesn't even require unsafe code for it to not do that, because the underlying buffer is entirely initialized according to the type system.
The only advice I can really give to avoid this kind of bug is: don't reuse buffers unless it's actually a bottleneck.
Couldn't you fix that by just clearing the buffer in between the two calls? It won't be measurably slower to do that because the buffer should retain its allocated capacity.
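Something like this, splicing one line into the hypothetical snippet above (a sketch only: it papers over the symptom, since the write-too-few-bytes logic error is still there, but what leaks is zeros rather than stale data):
let database_frame_len = database_socket.read(&mut buf)?; // use 1
let database_result = Database::parse(&buf[..database_frame_len])?;
let html_template = HtmlTemplate::new(database_result);
buf.fill(0); // scrub the stale database bytes; the allocation and capacity are untouched
let html_len = html_template.write(&mut buf)?; // use 2
socket.write(&buf[..html_len])?; // at worst this now serves zeros, not leaked data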
The tool is called Pure [1]. It was originally written in JavaScript and open-sourced, then rewritten for Microsoft in C at their request for performance (running sandboxed) after it also detected David Fifield's “A Better Zip Bomb” as a zero day.
I'd love to rewrite it in Zig to benefit from the checked arithmetic, explicit control flow and spatial safety—there are no temporal issues for this domain since it's all run-to-completion single-threaded.
Got to admit I'm a little embarrassed it's still in C!
Would that work in the case of Java, for example? It nulls every field as per the specification (observably, at least), so unless someone manually writes some byte mangling I don't necessarily see it working out.
Is it possible, in principle, to use comptime to obtain Rust-like safety? If this was a library, could it be extended to provide even stronger guarantees at compile time, as in a dependent type system used for formal verification?
Of course, this does not preclude a similar approach in Rust or C++ or other languages; but comptime's simplicity and generality seem like they might be beneficial here.
Not as it is (it would require mutating the type's "state"), but hypothetically, comptime could be made to support even more programmable types. But could doesn't mean should. Zig values language simplicity and explicitness above many other things.
Thanks, that's informative. This was meant to clarify the bounds of Zig's design rather than as a research proposal. Otherwise, one might read it as an open invitation to just the sort of demonic meta-thinking that its users abhor.
Somebody implemented part of it in the past, but it was based on the ability to observe the order of execution of comptime blocks, which is going to be removed from the language (probably already is).
Why would the mere existence of some static-eval capability give you that affordance?
Researchers have been working on these three things for decades. Yes, “comptime” isn’t some Zig invention but a somewhat limited (and anachronistic, to a degree) version of what researchers have added to research versions of ML and OCaml. So can it implement all the static-language goodies of Rust and give you dependent types? Sure, why not? After all, computer scientists never had the idea that you could evaluate values and types at compile time. All those research papers about static programming-language design will wither on the vine now that people can just use the simplicity and generality of `comptime` to prove programs correct.
Falsely representing the state of C & C++ doesn't really lead to a convincing argument. All those safety checks Zig supports are easily enabled in C++, and widely used. Sometimes they are even on by default.
I’m not sure I understand the value of an allocator that doesn’t reuse allocations as a bug-prevention thing. Is it just for performance? (Since it's never reused, allocation can simply be incrementing an offset by the size of the allocation.) Because beyond that, you can get the same benefit in C by simply never calling free on the memory you want to “protect” against use-after-free.
Not sure on the details here. I'd have to try it out and see.
Larger allocations will be page-aligned, but if you make a bunch of very small allocations, they may go into the same pages, and freeing all but one allocation per page may leave those pages still mapped. I've skimmed the GeneralPurposeAllocator code and know it has this sort of behavior at least sometimes, but I'm not really familiar with which things change in safe builds.
Thanks for the reply! I wasn't sure if you'd see mine, since it was two days late :)
So I suppose the "use after free" protection is "best effort": it will try to trap such use by unmapping the page, but there's no guarantee that it will definitely be the case.
I suppose that's OK, since use after free is a bug anyway. We want to catch the bugs and protect against data corruption, but the programmer is still on the hook for not making the bugs in the first place, so if the protection does miss one, well, at least it tried -- it's just an extra layer. It would certainly be wasteful if allocating a few bytes used up an entire page just so it could be unmapped when the memory is freed.
I believe it is only for performance, as malloc will have to find a place for the allocation, while for a certain kind of allocator it's only a pointer bump.
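Roughly this shape, to illustrate what "only a pointer bump" means (my own Rust sketch, not Zig's actual allocator; alignment is ignored, and the returned borrow ties up the allocator in a way a real arena avoids with interior mutability or unsafe code):
struct Bump {
    storage: Vec<u8>, // one big upfront allocation
    offset: usize,
}

impl Bump {
    fn new(capacity: usize) -> Self {
        Bump { storage: vec![0; capacity], offset: 0 }
    }

    // Hand out a fresh slice of `len` bytes, or None once the arena is full.
    fn alloc(&mut self, len: usize) -> Option<&mut [u8]> {
        let start = self.offset;
        let end = start.checked_add(len)?;
        if end > self.storage.len() {
            return None;
        }
        self.offset = end; // the "bump": allocation is a single addition
        Some(&mut self.storage[start..end])
    }

    // Deliberately no free(): memory only comes back when the whole Bump is
    // dropped, so a freed-and-reused pointer simply cannot exist.
}

fn main() {
    let mut bump = Bump::new(1 << 20); // reserve 1 MiB up front
    let a = bump.alloc(16).unwrap();
    a.fill(0xAA);
    assert_eq!(a.len(), 16);
}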
> In practice, it doesn't seem that any level of testing is sufficient to prevent vulnerabilities due to memory safety in large programs. So I'm not covering tools like AddressSanitizer that are intended for testing and are not recommended for production use.
I closed the window right there. Digs like this (the "not recommended" bit is a link to a now famous bomb thrown by Szabolcs on the oss-sec list, not to any kind of industry consensus piece) tell me that the author is grinding an axe and not taking the subject seriously.
Security is a spectrum. There are no silver bullets. It's one thing to say something like "Rust is better than Zig+ASan because ..."; it's quite another to refuse to even make the comparison and to pretend that hardening tools don't exist.
This is fundamentally a strawman: the author wants to argue against a crippled toolchain that is easier to beat instead of one that gets used in practice.
As a Zig fan, I disagree. I think it's really important to examine the toolchain that beginners are going to use.
> I'm also focusing on software as it is typically shipped, ignoring eg bounds checking compilers like tcc or quarantining allocators like hardened_malloc which are rarely used because of the performance overhead.
To advertise that Zig is perfectly safe because things like ASan exist would be misleading, because that's not what users get out of the box. Zig is up-front and honest about the tradeoffs between safety and performance, and this evaluation of Zig holds no surprises if you're familiar with how Zig describes itself.
> To advertise that Zig is perfectly safe because things like ASan exist would be misleading
Exactly! And for the same reason. You frame your comparison within the bounds of techniques that are used in practice. You don't refuse to compare a tool ahead of time, especially when doing so reinforces your priors.
To be blunt: ASan is great. ASan finds bugs. Everyone should use ASan. Everyone should advocate for ASan. But doing that cuts against the point the author is making (which is basically the same maximalist Rust screed we've all heard again and again), so... he skipped it. That's not good faith comparison, it's spin.
ASAN doesn’t add memory safety to the base language. It catches problems during testing, assuming those problems occur during the testing run (they don’t always! ASAN is not a panacea!). It’s perfectly fair to rule it out of bounds for this sort of comparison.
> You frame your comparison within the bounds of techniques that are used in practice.
Well, is ASan used in practice, by the relevant target audience (i.e. mainstream C++ developers)? My guess is that the vast majority of the people both Rust and Zig are aiming for are people who don't use ASan with C++ today and wouldn't use ASan with Rust or Zig if they switched to them.
I think Zig has a lot more footguns due to its explicit nature. When you don't hide the details away from the human, you increase the risk of writing bad code, and it becomes increasingly harder to make the compiler detect each potentially bad decision.
Rust did it, but they had to rethink the whole problem from the ground up. Rust is safe, but that safety came with quite a learning cost compared to, say, Zig or Go. The good thing about this is that you can't use many of the bad practices from other languages that have become habits.
But Zig is still in beta/alpha stage, so let's see how they increase the overall safety in the coming months/years. My experience with Zig has left me quite satisfied, especially the comptime features, but the explicitness sometimes gets in the way of readability.
Why does Go always get brought up when talking about Rust or Nim or now Zig? It's got GC, a large std lib, and lower latency than Java but noticeably slower throughput, and it's not suited for embedded or drivers. It's a completely different language for a completely different niche than Zig or Rust, though maybe I could see a comparison to Nim... maybe. I do like Go for what it is, a batteries-included backend language with nicer syntax than Java, but I mostly do embedded HPC, so I don't reach for Go often.
I can believe that abstractions prevent errors, but can't a language allow footguns but have syntax that makes them easy to detect?
I am not a big Go fan, yet whether it's suited for embedded or drivers is a mindset question, as proven by USB hardware keys shipped by F-Secure with Go firmware, gVisor, the Android GPU debugger, ARM's official support for TinyGo...
And since you speak of Java, companies like PTC, Aicas, MicroEJ and Excelsior JET (now gone) have been happily doing business on embedded, with bare-metal or RTOS-based deployments.
And then there is Meadow/Netduino in the .NET world, and Astrobe for Oberon as a commercial product as well. Astrobe has been in business for 20 years now.
Go is a compiled language with performance comparable to Zig/Rust/C in a lot of use cases. It's highly ergonomic for network/server-side programming, it's safe, and it's easy to use with a tiny footprint.
Rust is not only or even primarily used in embedded or driver programming. Indeed, I see more user-space software built in Rust than embedded software nowadays. As for Zig, it's so new and alpha that it hasn't even found a niche yet. Rust also hasn't found a specific niche; it's all over the place in GUI, networking, embedded, etc., which isn't a bad thing, to be clear.
It might be harder to do embedded in Go but that's not the point here, is it? It's about safety. Go has a special place in that it is compiled, easy to learn, quite performant for most tasks, and safe to boot.
> I can believe that abstractions prevent errors, but can't a language allow footguns but have syntax that makes them easy to detect?
What would be the point though? If a language can detect footguns, it's time to prevent them which is what Rust does, essentially. Footguns are rarely useful and there's always an alternative way. For these rare cases, Rust includes the unsafe escape hatch but then all bets are off. Don't expect the compiler to help you if you are intent on going down that road.
A very good point about Rust (and C++ and C) being used outside embedded and systems. That's fair. But when people do that, they do it for performance, and Go is easily 5x slower in my experience. It feels faster than Java because the lower latency makes it feel more responsive, but it takes half again as long to finish the same tasks. And C# absolutely blows its doors off; Go takes closer to twice as long as C#. And C# feels responsive as well. And don't all those languages make the same safety guarantees as Go?
About detecting footguns, I was honestly asking; I thought maybe you could have a borrow checker but none of the zero-cost abstractions that Rust uses to make it mostly tolerable. I'm not saying I want to use that language, but I do like explicit languages if they aren't too verbose. Go is a good example of that, I think, until you get into the abstractions around parallelism.
Ignoring the programs using x86 intrinsics to do vectorized math, the top-performing Rust, Go, Java, and C# programs are all written in a simple, straightforward style; each is practically a direct translation from the other. The Rust program is fastest, but the others come in at 1.6x, 1.7x, and 1.7x the Rust program, respectively. These are not significant differences for the vast majority of applications.
Of course this is one artificial benchmark, and on that website you will find others where the Java, C#, or Go program will do especially well or especially poorly. But it's clear that those three are all roughly in the same performance class.
A simple compiler probably can. The most complex ones probably cannot (I don't want to imagine a Rust compiler that never frees memory: it has 7 layers of lowering (source code -> tokens -> AST -> HIR -> THIR -> MIR -> monomorphized MIR, excluding the final LLVM IR) and also allocates a lot while type-checking and borrow-checking).
What is most interesting to me is the average compiler. Does somebody have statistics on the average amount compilers allocate and free?
I worked on one of the many Microsoft compiler teams, though as a software engineer in test, not directly on the compiler itself, and I believe the lead dev told me they don't free any memory, though I could be misremembering since it was my first job out of college.
Remember, C compilers often work one file at a time (plus a LOLWTF number of includes), the majority of the work goes into producing a single output file, and then you are done. Freeing memory would just take time; better to just hand it all back to the OS.
Also compilers are obsessed with correctness, generating incorrect code is to be avoided at all costs. Dealing with memory management is just one more place where things can go wrong. So why bother?
I do remember running out of memory using link time code gen though, back when everything was 32bit.
Related, I miss the insane dedication to quality that team had. Every single bug had a regression test created for it. We had regression tests 10-15 years old that would find a bug that would have otherwise slipped through. It was a great way to start my career off, just sad I haven't seen testing done at that level since then!
>Related, I miss the insane dedication to quality that team had. Every single bug had a regression test created for it.
Compilers in particular are usually easy to have rigorous regression testing for. Investigating any issue usually forces you to produce a minimal repro because of how complicated a compiler is. Also, all the inputs and outputs are known. Then it's just one tiny extra step of putting that all together into a checked-in test.
Bootstrapping aside, a compiler written in a GC'd language would make perfect sense. It really doesn't have any reason to go lower-level than that (other, of course, than wanting to bootstrap it in the same language, when that language happens to be a low-level one).
A good GC will not really increase the execution time at all -- it kicks in only after a significant "headroom" of allocations. For short runs it will hardly do any work.
Also, most of the work will be done in parallel, and I really wouldn't rule out that a generational GC's improved cache behavior (moving still-used objects close together) might even improve performance (all other things being equal, though of course they never are). All in all, do not assume that just because a runtime has a GC it will necessarily be slower; that's a myth.
If a compiler has to be run multiple times in the same process, it may use an arena allocator to track all memory, so you can free it all in one go once you're done with a compilation. Delaying all freeing until the end effectively eliminates temporal memory issues.
I imagine plenty of compilers do call free, but here's a 2013 article by Walter Bright on modifying the dmd compiler to never free, and to use a simple pointer-bump allocator (rather than a proper malloc) resulting in a tremendous improvement in performance. [0] (I can't speak to how many layers dmd has, or had at the time.)
The never-free pattern isn't just for compilers of course, it's also been used in missile-guidance code.
I presume he means that compilers could be written to never call `free()`. I'm sure that most of them are not written like that, though they do tend to be very leaky and just `exit()` at the end rather than clean everything up neatly (partly because it's faster).
LLVM uses a mixed strategy: there's both RAII and lots of globally allocated context that only gets destroyed at program cleanup. I believe GCC is the same.
Rustc is written entirely in Rust, so I would assume that it doesn't do that.
//! The arena, a fast but limited type of allocator.
//!
//! Arenas are a type of allocator that destroy the objects within, all at
//! once, once the arena itself is destroyed. They do not support deallocation
//! of individual objects while the arena itself is still alive. The benefit
//! of an arena is very fast allocation; just a pointer bump.
The other thing (not specifically mentioned in this comment, but mentioned elsewhere, and important to understanding why it works the way it does) is that if everything in the arena gets freed at once, it implies that you can soundly treat everything in the arena as having exactly the same lifetime.
You can see an example of how every ty::Ty<'tcx> in rustc winds up with the same lifetime, and an entry point for understanding it more, here in the dev guide: https://rustc-dev-guide.rust-lang.org/memory.html
However, arena allocation doesn't cover all of the dynamic allocation in rustc. Rustc uses a mixed strategy: there's both RAII and lots of arena allocated context that only gets destroyed at the end of a particular phase.
Yep -- arenas compose very nicely with lifetimes, and basically accomplish the same thing as global allocation (in effect, a 'static arena) but with more control.
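For example, a sketch using the typed_arena crate rather than rustc's own arena types (Ty here is a made-up stand-in, not rustc's ty::Ty): every reference handed out borrows from the arena, so the borrow checker treats them all as living exactly as long as the arena itself.
use typed_arena::Arena;

struct Ty {
    name: &'static str,
}

fn intern<'tcx>(arena: &'tcx Arena<Ty>) -> Vec<&'tcx Ty> {
    let int_ty: &Ty = arena.alloc(Ty { name: "i32" });
    let bool_ty: &Ty = arena.alloc(Ty { name: "bool" });
    // Both references borrow from `arena`, so both get the same lifetime 'tcx.
    vec![int_ty, bool_ty]
}

fn main() {
    let arena = Arena::new();
    let tys = intern(&arena);
    assert_eq!(tys[0].name, "i32");
} // the arena is dropped here: every Ty it owns is destroyed at once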
So it seems that null-pointer dereferences and integer overflows can be checked at runtime in C. Besides, there should be production-ready C compilers that offer bounds checking.
One-liner summary: Zig has run-time protection against out-of-bounds heap access and integer overflow, and partial run-time protection against null pointer dereferencing and type mixup (via optionals and tagged unions); and nothing else.
I like Zig, but this is taking a page out of the Rust book and exaggerating the problems of C and C++.
Clang and GCC will both tell you at runtime if you go out of bounds, have an integer overflow, use after free, etc. You need to turn on the sanitizers. You can't have them all on at the same time, because the code would be unnecessarily slow (e.g. having ThreadSanitizer on in a single-threaded app is pointless).
While I'm generally in favor of the proposition that C++ is an intrinsically dangerous language, pointing at one of the largest possible projects that uses it isn't the best argument. If I pushed a button and magically for free Chrome was suddenly in 100% pure immaculate Rust, I'm sure it would still have many issues and problems that few other projects would have, just due to its sheer scale. I would still consider it an open question/problem as to whether Rust can scale up to that size and still be something that humans can modify. I could make a solid case that the difficulty of working in Rust would very accurately reflect a true and essential difficulty of working at that scale in general, but it could still be a problem.
(Also Rust defenders please note I'm not saying Rust can't work at that scale. I'm just saying, it's a very big scale and I think it's an open problem. My personal opinion and gut say yes, it shouldn't be any worse than it has to be because of the sheer size (that is, the essential complexity is pretty significant no matter what you do), but I don't know that.)
You're right that Chromium* is a very difficult task, but I disagree with the conclusion you draw. I think Chromium is one of the best examples we can consider.
There would absolutely be issues, including security issues. But there is also very good evidence that the issues that are most exploited in browsers and operating systems relate to memory safety. Alex Gaynor's piece that the author linked is good on this point.
While securing Chromium is huge and a difficult task, it and consumer operating systems are crucial for individual security. Until browsers and consumer operating systems are secure, individuals ranging from persecuted political dissidents to Jeff Bezos won't be secure.
* Actually not sure why I said Chromium rather than Chrome. Nothing hangs on the distinction, afaict.
I don't think this does much for your initial claim. Take the most generous reading you can--Rust isn't any better at preventing UAF than C/C++. That doesn't make safe C/C++ a thing, it means that Rust isn't an appropriate solution.
With Zig and Rust you have to explicitly opt-out with `ReleaseFast` and `unsafe` respectively, that makes a big difference. Rust has the added safety that you cannot (to my knowledge at least) gain performance by opting out with a flag at compile-time, it has to be done with optimized `unsafe` blocks directly in the code.
Lazy C++ is unsafe, lazy Zig is safe-ish, lazy Rust is safe. Given how lazy most programmers are, I consider that a strong argument against C++.
It does. The original code compiled because the borrow is computed using `unsafe`. That `unsafe` is the opt-out.
> Zig, Rust and no language saves you when you write incorrect unsafe code. My original point is disqualifying c tools is misleading and everything suffers from incorrect unsafe code
And the other people's point is that if one language defaults to writing unsafe code and the other language requires opting out of safety to write unsafe code, then the second language has merit over the first.
I think the big question is, whether two teams writing software on a fixed budget using Rust or C using modern tools and best practices would end up with a safer product. I think this is not clear at all.
I think it's very clear for anything other than a no-true-Scotsman definition of "modern tools and best practices" (which is sadly the only one that seems to exist).
Neither Clang nor GCC has perfect bounds or lifetime analysis, since the language semantics forbid it: it's perfectly legal at compile time to address at some offset into a supplied pointer, because the compiler has no way of knowing that the memory there isn't owned and initialized.
Sanitizers are great; I love sanitizers. But you can't run them in production without a significant performance hit, and that's where they're needed most. I don't believe this post blows that problem out of proportion, and it's correct in noting that we can solve this without runtime instrumentation and overhead.
State-of-the-art sanitizing is pretty consistently in the <50% overhead range (e.g. SANRAZOR), with things like UBSAN coming in under 10%. If you can't afford even that, tools like ASAP have been around for 7-ish years now to make the overhead arbitrarily low by trading off increased false negatives in hot code paths.
Yes, the "just-enable-the-compiler-flags" approach can be expensive, but the tools exist to allow most people to be sanitizing most of the time. Devs simply don't know what's available to them.
I'd consider even 10% to be a significant performance hit. People scream bloody murder when CPU-level mitigations cause even 1-2% regressions. The marginal cost of mitigations when memory safe code can run without them is infinite.
But let's say, for the sake of argument, that I can tolerate programs that run twice as long in production. This doesn't improve much:
* I'm not going to be deploying SoTA sanitizers (SANRAZOR is currently a research artifact; it's not available in mainline LLVM as far as I can tell.)
* No sanitizer that I know of guarantees that execution corresponds to memory safety. ASan famously won't detect reads of uninitialized memory (MSan will, but you can't use both at the same time), and it similarly won't detect layout-adjacent overreads/writes.
That's a lot of words to say that I think sanitizers are great, but they're not a meaningful alternative to actual memory safety. Not when I can have my cake and eat it too.
I think we basically agree. Hypothetically ideal memory safety is strictly better, but sanitizers are better than nothing for code using fundamentally unsafe languages. My personal experience is that people are dissuaded from sanitizer usage more by hypothetical (and manageable) issues like overhead than by real implementation problems.
If you can afford a 10-50% across-the-board performance reduction, why would you not use a higher-level, actually safe language like Ruby or Python? Remember that the context of this article is Zig vs other languages, so the assumption is you’re writing new code.
I work in real time, often safety critical environments. High level interpreted languages aren't particularly useful there. The typical options are C/C++, hardware (e.g. FPGAs), or something more obscure like Ada/Spark.
But in general, sanitizers are also something you can do to legacy code to bring it closer to safety and you can turn them off for production if you absolutely, definitely need those last few percent (which few people do). It's hard to overstate how valuable all of that is. A big part of the appeal of zig is its interoperability with C and the ability to introduce it gradually. Compare to the horrible contortions you have to do with CFFI to call Python from C.
> I'd consider even 10% to be a significant performance hit. People scream bloody murder when CPU-level mitigations cause even 1-2% regressions. The marginal cost of mitigations when memory safe code can run without them is infinite.
What people? And in my experience Rust has always been much more than a 2% regression.
> People scream bloody murder when CPU-level mitigations cause even 1-2% regressions
For a particular simulation on a particular Cascade Lake chip, mitigations collectively cause it to run about 30% slower. So I won't scream about 1%, but that's a lot of 1%s.
> it's perfectly legal at compile time to address at some offset into a supplied pointer, because the compiler has no way of knowing that the memory there isn't owned and initialized.
In embedded land everything is a flat memory map, odds are malloc isn't used at all, and memory is possibly zeroed on boot.
It is perfectly valid to just start walking all over memory. You have a bunch of #defines with known memory addresses in them and you can just index from there.
Fun fact: the Microsoft Band writes crash dumps to a known location in SRAM, and because SRAM doesn't instantly lose its contents on reboot, after a crash the runtime checks for crash dump data at that known address and, if present, uploads the crash dump to servers for analysis and then zeroes out that memory. [1]
Embedded rocks!
[1] There is a bit more to it to ensure we aren't just reading random data after a long power-off, but I wasn't part of the design; I just benefited from a 256KB-RAM wearable having crash dumps that we could download debugging symbols for.
> clang and gcc will both tell you at runtime if you go out of bounds [...] You can't have them all on at the same time because code will be unnecessarily slow
Yeah, so clang and gcc don't actually tell you at runtime if you go out of bounds. How many programs ship production binaries with asan or ubsan enabled, to say nothing of msan or tsan?
Also, you can't have them all on at the same time because they're not necessarily compatible with one another[0]; you literally can't run with both asan and msan, or asan and tsan.
ASAN isn't just "for testing". A lot of people went straight to the chart (like me) and it reeks of bullshit: double free is the same as use after free, null pointer dereference is essentially the same as type confusion since a nullable pointer is confused with a non-null pointer, invalid stack read/write is the same as array out of bounds (or invalid pointers), etc.
I've also never heard of a data race existing without a race condition existing. That's a pointless metric, like many of the ones I mentioned above.
One interesting distinction is that it sounds as if, for Zig, this is a language feature and not a toolchain feature. Although if there's only one toolchain for Zig, maybe that's a distinction without a difference. At least it's not opt-in; that's really nice. Believe it or not, there are lots of people who write and debug C/C++ code who don't know about sanitizers, or who know about them and never decide to use them.
I think it would be interesting to see Zig move towards an annotation-based compile-time lifetime-checking plugin (ideally in-toolchain, but alternatively as a library). You could choose to turn it on selectively for security-critical pathways, turn it off for "trust me" functions, or run it only on some recompilations, as desired.
Haha, yes. I love knowing I'm in bounds, but unfortunately saying anything about C++ that isn't a criticism is out of bounds, and my comment got downvoted enough that I don't feel like saying more.