Lovely article, I am amazed at the lengths compilers go to for the best performance. Thanks for writing this!
Reading the part about pure functions (including the extension [[gnu::pure]]) made me feel that this is a part where the C++ default is actually wrong. Not just because it could be better/hindsight/other languages have it/etc. but because it is in conflict with its own motto, don't pay for what you don't use. I really hope I can enable a compiler flag some time in the future that would make my variables const, and my functions pure by default.
I can see how the author came to the conclusion that libmodem written in rust would have prevented the issue, but isn't it simply pushing the problem further down the stack?
The author needed to use unsafe in order to pass his pointer to libmodem, but libmodem itself is going to require a pointer with static lifetime. Supplying one would have prevented the issue in the first place.
I can see why you wouldn't want to use static, it hinders testability, but that means you need to ensure that the pointer you supply libmodem outlives libmodem. I would use RAII to do that in C++ and I am sure in rust you could/would do the same.
I guess I am asking, is there anything here that a libmodem written in rust would have magically solved? It feels like wishful thinking, but I am open to learn where I am mistaken.
In any case, kudos for finding this bug. Having worked with Zephyr/NRF connect SDK and this exact chip myself I can definitely relate to the pain they (can) bring.
The existing C interface doesn't have means to describe the lifetime of the data being passed in. It just takes a pointer. An experienced C programmer would often understand what's happening by convention and not encounter the problem.
But the custom Rust wrapper was composed as a game of telephone (ugh), with the author blindly mimicking "Jonathan" who seemed to have been blindly mimicking a sloppy (and later repaired) example from Nordic.
The argument is that if the library and its internals were originally written in Rust, which has richer semantics for object lifetimes, Rust would have been able to formally convey that the input data needed to outlive the individual function call, throwing an error at compile time.
The wrapper could have enforced this constraint itself, as it probably does now, but the handoff between Rust and C needs somebody to account for and understand the by convention stuff in C so that it can be expressed formally in Rust, and that human process failed to happen here.
Part of the issue is that there's not really a convention in C. If it's not documented, you should probably read the source code to find out. (C programmers often think there's a convention, but that's because there's one option that's obvious to them but then other programmers will have a different 'obvious' option, which is why this is so often not documented at all)
If I read the article correctly, Nordic changed the rules on this function without saying anything. It used to work with a stack-allocated config and now it doesn’t. The only way for the caller to know about that in C is a comment.
That's possible, but unlikely given conventions in embedded development and how something like this interface would generally need to work.
More likely (but not necessarily), Nordic's early example was either bugged or conditionally valid (benefiting from other implicit details of their implementation) and then was revised either because the mistake was identified or because something else about the example changed.
That's all pretty common in this domain. Inadvertently stumbling because you uncritically followed some vendor example is also pretty common and completely understandable. Better tools, like using a language with richer semantics, are indeed something that can help with that.
I dunno, in my experience if you see something worded the way this was:
int something_init(const something_init_params* params);
the convention is that the params are temporary -- really just a way of passing a bunch of parameters to the init function. It would be a surprise that the params are expected to be static. E.g., the whole STM32 HAL is done this way, and it would be a disaster if you thought the init structs all had to be static!
BTW, you can see the assumptions of the non-embedded programmers talking about "taking ownership" being the default interpretation of a signature like that...if you don't have a heap, what does that even mean?
In any case, C is a mess, embedded is a mess, no argument there!
Not to mention that one of the things experienced people learn is that vendor code is hot flaming garbage and must never be trusted. Writing a Rust implementation based on vendor code is like building a skyscraper on a landfill. Don't do that. If you have to do that, tread bloody carefully!
I am more on the hardware side these days, but Nordic's hardware docs are pretty crap. As in, they're pretty, and they're crap. (The prettiness lulls people, especially managers, into a false sense of confidence. Don't fall for the trap!) There are obvious poor choices in there, and if you call FAEs out on them, they say to just follow the docs. Experienced engineers should not follow the docs.
> I guess I am asking, is there anything here that a libmodem written in rust would have magically solved?
I'm not following your comment, but I think the point is simply "the lifetime of the config is in the function signature, rather than hopefully (sometimes) being in the documentation, and hopefully (sometimes) correct".
It sounds like one function in libmodem accepts a pointer to a configuration struct, then stores that pointer (or an interior pointer from within it), which is then later used by another libmodem function later. If all of libmodem were written in Rust, this could be done without any use of unsafe, but it would require the lifetime on the original "reference" to provably outlive the second function getting called, probably by being static.
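To make that concrete, here is a minimal sketch of what "the lifetime is in the function signature" could look like. `ModemConfig`, `Modem`, and their fields are hypothetical stand-ins invented for illustration, not libmodem's real API:

```rust
// Hypothetical stand-ins for illustration only; not libmodem's real API.
pub struct ModemConfig {
    pub shared_mem_base: usize,
    pub shared_mem_len: usize,
}

pub struct Modem {
    // The library stores the reference, so the borrow has to outlive `Modem`.
    config: &'static ModemConfig,
}

impl Modem {
    // `'static` in the signature plays the role of the missing documentation:
    // a reference to a stack-local config is not accepted here.
    pub fn init(config: &'static ModemConfig) -> Modem {
        Modem { config }
    }

    // A later call that reads the stored config, like the second library
    // function described above.
    pub fn shared_mem_end(&self) -> usize {
        self.config.shared_mem_base + self.config.shared_mem_len
    }
}

static CONFIG: ModemConfig = ModemConfig {
    shared_mem_base: 0x2000_0000,
    shared_mem_len: 4096,
};

fn main() {
    let modem = Modem::init(&CONFIG);
    println!("{:#x}", modem.shared_mem_end());

    // Rejected at compile time, because `local` does not live long enough:
    // let local = ModemConfig { shared_mem_base: 0, shared_mem_len: 0 };
    // let modem = Modem::init(&local);
}
```

None of this needs unsafe; the constraint only gets lost again at the point where a wrapper hands a raw pointer to C.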
The author mentioned in the first chapter that everything works fine in rust, since it solves all problems. So I guess they throw "better in rust" against every problem.
The assumption nobody ever makes mistakes is mistake one.
I'm sure, but could you point me in the direction of these tools?
I would love to supply GDB with some aliases for much used templates so I can have less verbose outputs
I'm not a C++ dev, but here are a couple of IDEs that I know have strong C++ support and much more user friendly debugging than using GDB via the command line...
I love it too, it tends to bend to my will, for better or worse. It feels like it doesn't stand in my way, and that further translates to feeling like there is nothing between my program and the hardware it runs on.
No need to school me on how that isn't true, I'm just describing my feeling and why I love C++.
This is what keeps tripping me up: it is very difficult to be both general and not opinionated. C++ is general, you can use it however you want, including in many unsafe ways. Like it or not, this is great news for a general low-level language. A language that tells me how to build, where my files should live and what tools I am to use may be a great user experience, but it is not very general. Still not sure what to make of this and where I land on this.
I do not give a flying fuck. Language is a tool. I work with many languages. If a client wants language XYZ then XYZ it is. My private opinion does not matter in this case.
I don't care how much you pay me (well, not in a practical sense), I'm not going to bash a bunch of screws in with a hammer. In my industry, the languages and tools do in fact matter and impact the quality of the project. It's hard enough as is without some unreasonable clients saying "oh yeah, can you make this in powerpoint?"
I'm exaggerating with that metaphor, but it's honestly not as much as I'd wish. I envy your clients. But I'm not too surprised, either. That's just a result of my domain, sadly.
The comment in your frobulate code suggests one is to forbulate, yet reading the code it is very clear that there will only be frobulation happening in the outlined proceedings. Perhaps a minor revision is required.
can you show me how rust does this? I'm genuinely curious.
I've made a toy example to show how c++ checks for undefined behavior at compile time, I am unaware of rust being able to do the same without runtime costs (however small they may be, this is a toy example after all)
https://godbolt.org/z/cT9bqz8z7
The point is that Option in Rust doesn't have undefined behavior in any case, even if the values aren't known at compile time. Exhaustiveness is always checked at compile time, unlike C++ where operator* offers an escape hatch where nothing is checked in non-constexpr contexts.
"Make everything constexpr" isn't a real solution to UB, in the same way that "make all functions pure" isn't a solution for managing side effects.
Not adding UB to your APIs, on the other hand, is a real solution.
You don't have to write this, it already exists as the (unsafe of course) method Option::unwrap_unchecked
Because all of Rust's methods can be called as free functions, you can literally write Option::unwrap_unchecked for the same behaviour, or you can write some_option.unwrap_unchecked(). In both cases you will need to be in an unsafe context for this to be allowed, and you should write a SAFETY comment explaining why you're sure it's correct.
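A small sketch of both spellings, in case it helps. The helper function names are made up for this example; `Option::unwrap_unchecked` itself is the standard library method:

```rust
// Method-call form.
fn method_form() -> u32 {
    let some_option: Option<u32> = Some(7);
    // SAFETY: `some_option` was constructed as `Some(7)` on the line above.
    unsafe { some_option.unwrap_unchecked() }
}

// Path (free-function) form; same behaviour as the method call.
fn path_form() -> u32 {
    let some_option: Option<u32> = Some(7);
    // SAFETY: `some_option` was constructed as `Some(7)` on the line above.
    unsafe { Option::unwrap_unchecked(some_option) }
}

fn main() {
    println!("{} {}", method_form(), path_form());
}
```

Calling either form on a `None` is undefined behaviour, which is why both need the unsafe block.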
That matches the 'static_assert' portion of my sample code. The implied claim of the parent I replied to was that rust could do this even for runtime values, such as the one I am using in the main of my sample.
In c++ it is the same function running both the compile time check and the unchecked runtime variant, so there is zero overhead at runtime. I can't possibly think of a way how rust would be able to make the same code in my sample safe without adding runtime checks. If I am mistaken here I sure would like to know.
You’re correct. Rust can’t statically prove which enum variant is inhabited. You do need a runtime switch, the difference is (at least in safe code) it statically forces you to indeed do that runtime switch.
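As a rough illustration of "forces you to do the runtime switch" (the `describe` function is a made-up name for this sketch):

```rust
// Safe code has to handle both variants before it can touch the payload.
// Deleting either arm is a compile error (non-exhaustive patterns).
fn describe(value: Option<u32>) -> String {
    match value {
        Some(n) => format!("got {n}"),
        None => String::from("nothing"),
    }
}

fn main() {
    // Only known at run time: parsed from the first command-line argument.
    let runtime_value = std::env::args()
        .nth(1)
        .and_then(|s| s.parse::<u32>().ok());
    println!("{}", describe(runtime_value));
}
```

The branch is a runtime check, so this is not zero-cost in the way the constexpr example is; the trade is that there is no path through safe code that reads a missing value.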
You aren't mistaken. I should've written "runtime overhead" - my point is that there is no runtime performance penalty for getting rid of the UB in the Option API.
An equivalent API with no UB is just strictly better.
This borrow checker runs at runtime, which I find not as interesting. Everything starts to look a lot like std::unique_ptr, which I think is mostly unneeded as it adds pointer indirection.
Could someone explain to me when one would use this? Is it for educational purposes perhaps?
I don't think it is intended to be used in a real system, this was more of an experiment to see what was possible. C++ as a language isn't well-suited to supporting a compile-time borrow checker. The difficulty of retrofitting C++20 modules to the language is probably just a glimmer of the pain that would be involved in making a borrow checker work.
There is a place for runtime borrow checking. Some safe cases in well-designed code are intrinsically un-checkable at compile-time. C++ is pretty amenable to addressing these cases using the type system to dynamically guarantee that references through a unique_ptr-like object are safe at the point of dereference. Much of what the borrow checker does at compile-time could potentially be done at runtime with the caveat that it has an overhead.
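For comparison, and not as a claim about how rusty.hpp works: Rust's own standard library ships a runtime borrow checker in `std::cell::RefCell`, which tracks borrows dynamically and reports a conflict at the point of access. A minimal sketch (the helper function name is made up):

```rust
use std::cell::RefCell;

// Returns true if a second mutable borrow is refused while the first is alive.
fn second_mutable_borrow_fails(cell: &RefCell<Vec<i32>>) -> bool {
    let _first = cell.borrow_mut(); // first mutable borrow, held until return
    cell.try_borrow_mut().is_err() // checked at run time, not compile time
}

fn main() {
    let cell = RefCell::new(vec![1, 2, 3]);
    println!("conflict detected: {}", second_mutable_borrow_fails(&cell));

    // The first guard was dropped when the function returned,
    // so borrowing mutably works again.
    cell.borrow_mut().push(4);
    println!("{:?}", cell.borrow());
}
```

The overhead mentioned above shows up here as the borrow counter that every `borrow` and `borrow_mut` call has to update and test.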
This has more than a passing resemblance to how deadlock-free locking systems work. They don't actually prevent the possibility of deadlocks, as that may not be feasible, but they can detect deadlock conditions and automatically edit/repair the execution graph to eliminate the deadlock instance. If a deadlock occurs in a database and no one notices, did it really happen?
Hey, I am the author of this,
I made this mostly for the purpose of experimenting, playing around and trying things out rather than actually using it for production projects. Making a proper compile-time checker is pretty complicated (possibly impossible) without actually getting into the compiler; this just intends to emulate that behavior to some extent and have a similar interface.
"educational purposes" -> well kinda, I had some free time and had an interesting idea perhaps
C++ cannot because it does not have the necessary information present in its syntax. It’s really that simple. C++ could add such syntax, but outside of what Circle is doing, I’m not aware of any real proposal to add it.
Also, Google (more specifically, the Chrome folks) tried to make it work via templates, but found that it was not possible. There’s a limit to template magic, even.
Although it's not as extensive as Rust's lifetime management, Nim manages to infer lifetimes without specific syntax, so is it really a syntax issue?
As you say, though, C++ template magic definitely has its limits.
Nim is stack allocated unless you specifically mark a type as a reference, and "does not use classical GC algorithms anymore but is based on destructors and move semantics": https://nim-lang.org/docs/destructors.html
Where Rust won't compile when a lifetime can't be determined, IIRC Nim's static analysis will make a copy (and tell you), so it's more of a performance optimisation than a correctness check.
Regardless of the details and extent of the borrow checking, however, it shows that it's possible in principle to infer lifetimes without explicit annotation. So, perhaps C++ could support it.
As you say, it's the semantics of the syntax that matter. I'm not familiar with C++'s compiler internals though so it could be impractical.
I did not hear that Nim made ORC the default, thanks for that!
I still think that my overall point stands: sure, you can treat this as an optimization pass, but that kind of overhead isn't acceptable in the C++/Rust world. And syntax is how you communicate programmer intent, to resolve the sorts of ambiguous cases described in some other comments here.
> Where Rust won't compile when a lifetime can't be determined, IIRC Nim's static analysis will make a copy (and tell you), so it's more as a performance optimisation than for correctness.
Wait, how does that work? For example, take the following Rust function with insufficient lifetime specifiers:
pub fn lt(x: &i32, y: &i32) -> &i32 {
    if x < y { x } else { y }
}
You're saying Nim will change one/all of those references to copies and will also emit warnings saying it did that?
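For reference, this is roughly what Rust wants from the programmer before it accepts that function: an explicit lifetime tying the returned reference to both inputs. A sketch of one valid way to annotate it:

```rust
// `'a` says the returned reference may point at either argument,
// so it must not outlive the shorter-lived of the two.
pub fn lt<'a>(x: &'a i32, y: &'a i32) -> &'a i32 {
    if x < y { x } else { y }
}

fn main() {
    let a = 3;
    let b = 5;
    println!("{}", lt(&a, &b));
}
```

The annotation adds no runtime cost; it is only the piece of information the compiler refuses to guess.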
It will not emit warnings saying it did that. The static analysis is not very transparent. (If you can get the right incantation of flags working to do so and it works, let me know! The last time I did that it was quite bugged.)
Writing an equivalent program is a bit weird because: 1) Nim does not distinguish between owned and borrowed types in the parameters (except wrt. lent which is bugged and only for optimizations), 2) Nim copies all structures smaller than $THRESHOLD regardless (the threshold is only slightly larger than a pointer but definitely includes all integer types - it's somewhere in the manual) and 3) similarly, not having a way to explicitly return borrows cuts out much of the complexity of lifetimes regardless, since it'll just fall back on reference counting. The TL;DR here though is no, unless I'm mistaken, Nim will fall back on reference counting here (were points 1 and 2 changed).
For clarity as to Nim's memory model: it can be thought of as ownership-optimized reference counting. It's basically the same model as Koka (a research language from Microsoft). If you want to learn more about it, because it is very neat and an exceptionally good tradeoff between performance/ease of use/determinism IMO, I would suggest reading the papers on Perseus as the Nim implementation is not very well-documented. (IIRC the main difference between Koka and Nim's implementation is that Nim frees at the end of scope while Koka frees at the point of last use.)
Oh, that's interesting. I think not distinguishing between owned and borrowed types clears things up for me; it makes a lot more sense for copying to be an optimization here if reference-ness is not (directly?) exposed to the programmer.
Thanks for the explanation and the reading suggestions! I'll see about taking a look.
> Could someone explain to me when one would use this? Is it for educational purposes perhaps?
The goal/why is, as almost always, explained in the README:
> rusty.hpp as the time or writing this is a very experimental thing. Its primary purpose is to experiment and test out different coding styles and exploring a different than usual C++ workspace.
I feel your pain. For me the biggest hurdle was taken away once I realized the following 3 things:
- everything in Rust-land is named differently. They are not different, but use different names.
- nothing wants to be a value
- nothing is implemented by default.
If you can get those 3 items into your brain, you can start thinking in Rust. From one C++ dev to another: you may still not like the language. I know I don't, even though I agree with the premise that defaults should be ‘safe’.
I've had an electric one (heater in the base) for more than 15 years and it recently broke in a way that was not repairable. I really want to buy a new electric one with a reasonable volume, but they seem to be very rare. I'm thinking of brazing a heater element to a normal one; the automatic shut-off feature and consistent temperature profile are must-haves for me.