Hey, thank you for spreading the joy of the borrow checker beyond Rust; awesome stuff, sounds very interesting, challenging, and useful!
One question that came to mind as a single-track-Rust-mind kind of person: in D generally, or in your experience specifically, when you find that the borrow checker doesn't work for a data structure, what alternative memory management strategy do you usually choose? Is it garbage collection, or manual memory management without a borrow checker?
Personally, I frankly do not need the borrow checker. I have been writing manual memory management code for so long I have simply internalized how to avoid having problems with it. I've been called arrogant for saying this, but it's true.
But I still like the borrow checker style of programming because it makes the code easier to understand.
I find it convenient in the D compiler implementation to use the GC for the AST memory management, as the algorithms that manipulate it are easier if they needn't concern themselves with memory management. A borrow checker approach doesn't fit it comfortably, either.
Many of the data structures persist to the end of the program, as a compiler is a batch program. No memory management strategy is even necessary for those.
Hello. I assume tracepoints mean kprobes/uprobes or something along those lines? Just this weekend I worked on implementing/adapting a crate for DTrace USDTs (aka DTrace probes) so that it also works on Linux by generating SystemTap SDTs (which are the same mechanism under another name).
This is probably a little different from tracepoints in the kernel space, but I'm somewhat interested in going deeper into the kernel side of things. Let me know if you have any pointers as to where I might be of concrete assistance to you!
QEMU has several tracepoint providers; the main ones are (a slightly fancy version of) printf and USDT. There is a Python program that generates the C code for the chosen backend(s), so the thing to do would be to adjust the script to produce either an FFI bridge or the equivalent Rust code.
This crate gives you a nifty macro that translates Rust probe definitions into the correct ELF sections and other related assembly. Other ways to declare probes are also supported, though.
Their support is currently only for illumos and OS X, but I managed to get Linux SystemTap SDTs working as well, and there is also a FreeBSD PR that has been languishing in review purgatory for two years. I imagine QEMU might enjoy this wide support (if those land).
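For concreteness, here is a rough sketch of what declaring and firing a probe looks like, assuming the crate in question is the `usdt` crate (an assumption on my part; check its docs for the exact API):

    // Hedged sketch, assuming the `usdt` crate.
    #[usdt::provider]
    mod my_provider {
        // Each function here becomes a probe point emitted into the
        // appropriate ELF note sections.
        fn request_start(id: u64, path: &str) {}
    }

    fn main() {
        // Register the probes with the tracing machinery at startup.
        usdt::register_probes().unwrap();

        // Arguments are passed via a closure, so they are only evaluated
        // when a tracer has actually enabled the probe.
        my_provider::request_start!(|| (1, "/index.html"));
    }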
We do have all safe integers inline (and most doubles too).
I answered about NaN boxing somewhere here but basically, we get quite a bit of mileage from our tagged union / enum / ADT based Value, so I don't think I'd change to NaN boxing now even if I could.
That's how I got interested in this kind of memory layout in the first place. I wanted a nice Lisp for WebAssembly and had recently gotten into Zig. When I started defining the word structure I remembered Andy Kelley's talk about using data-oriented design to make the Zig compiler fast, so I thought I'd try it, and the more I thought about it the more reasonable it seemed.
There are like a dozen object types with different growing multiarrays. Words are 32 bits: 1 bit for GC state, 27 for the index, and the rest for the type tag. Ints are 28 bits. Byte arrays have their own heap too, as do general 32-bit vectors.
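If I read that layout right, the packing could look something like this (the exact bit positions are my guess, not necessarily wisp's actual layout):

    // Sketch of one possible packing for the 32-bit word described above.
    const GC_MARK: u32 = 1 << 31;          // 1 bit of GC state
    const TAG_SHIFT: u32 = 27;             // 4-bit type tag below the GC bit
    const INDEX_MASK: u32 = (1 << 27) - 1; // 27-bit index payload

    fn pack(tag: u32, index: u32) -> u32 {
        debug_assert!(tag < 16 && index <= INDEX_MASK);
        (tag << TAG_SHIFT) | index
    }

    fn tag(word: u32) -> u32 {
        (word & !GC_MARK) >> TAG_SHIFT
    }

    fn index(word: u32) -> u32 {
        word & INDEX_MASK
    }

Immediate 28-bit ints would then presumably use a shorter tag so the payload gains an extra bit.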
Well I'll be damned! That sounds very much like what I want Nova to eventually be :) We don't have fields split apart at present, mostly because Rust doesn't make that quite as easy as I'd want. Otherwise it sounds like it's very much all the same, in a good way.
I'll definitely be taking a look at wisp, thank you very much for the link! If you ever have the time, I'd love to see a comparison of this sort of engine design against a more traditional one.
Nope. Rather, it explains the memory ownership of a JavaScript heap in a way that Rust understands: the heap owns all data in the heap, and a JavaScript object holding a "reference" to another object does not imply memory ownership in the Rust sense.
So, safety properties are not being silenced: the indexes definitely _are_ unreliable from Rust's point of view, in a way that a pointer wouldn't be, so bounds checks need to be done. But memory safety is not under threat here.
This does mean that we have to take care of garbage collection ourselves; Rust will not do that for us. But that was the case anyhow, since Rust doesn't have a garbage collector we could use (thank heavens). If we make mistakes here, it will lead to the JavaScript heap being corrupted from the JS code's point of view, but from the engine's point of view the memory is still fully safe: the worst thing that can happen is a panic from out-of-bounds vector indexing.
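Concretely, every heap access goes through a bounds-checked vector index; a sketch of the general shape, with made-up names:

    // Made-up names for illustration. A stale index can at worst panic
    // or read a different live object's data; it can never touch freed
    // memory, because Vec indexing is bounds checked.
    struct ArrayHeapData { /* elements, length, etc. */ }

    struct Heap {
        arrays: Vec<ArrayHeapData>,
    }

    impl Heap {
        fn array(&self, index: u32) -> &ArrayHeapData {
            // Panics on out-of-bounds instead of invoking UB.
            &self.arrays[index as usize]
        }
    }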
I think the point is that "memory safety", understood broadly, goes beyond that. You still have the notion of dangling pointers and such with indices, and while it doesn't allow for e.g. stack corruption, it still exposes many similar classes of bugs - for example, a "dangling" index to a deallocated object can potentially allow the code to read or write into a completely different object than it was originally pointing to. Hence, silent logic errors, potentially exploitable as well.
Yeah, sure, in that sense this partially turns off Rust's safety features. That being said: a big part of making the engine safe for interleaved GC is about using ZSTs with lifetimes held inside them to bind any JavaScript Values when they appear on the stack, getting back parts of those security guarantees.
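To illustrate the general pattern (a minimal sketch of the idea, not Nova's actual types):

    use std::marker::PhantomData;

    // A zero-sized token whose lifetime "brands" every Value handed out
    // while GC is known not to run.
    struct NoGcScope<'gc>(PhantomData<&'gc ()>);

    #[derive(Clone, Copy)]
    struct Value<'gc> {
        index: u32,                // index into some heap vector
        _gc: PhantomData<&'gc ()>, // ties the Value to the scope
    }

    impl<'gc> NoGcScope<'gc> {
        fn bind(&self, index: u32) -> Value<'gc> {
            Value { index, _gc: PhantomData }
        }
    }

    // The closure must accept *any* 'gc, so a Value<'gc> cannot escape it:
    // returning one from the closure simply fails to compile.
    fn enter<R>(f: impl for<'gc> FnOnce(NoGcScope<'gc>) -> R) -> R {
        f(NoGcScope(PhantomData))
    }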
We can still make mistakes, especially in the garbage collector, but that is again somewhat helped by code sharing and coding conventions enabled by Rust, i.e. using destructuring in the GC to make sure we don't forget any part of the heap data. (Coding conventions are not a solution, they are a mitigation at most. I _can_ write the heap GC as a map from heap data with an 'old lifetime to 'new, but that leads to worse code generation than mutating heap data with a 'static lifetime :( )
I don't know what "security safety" is, so I must've gotten confused. If you mean type safety, then we do make sure to stay on top of that: our JS Value is an enum that contains either stack data or a typed index that corresponds to the tag. So the Array variant holds an Array index, etc. It is thus not possible to take one type of index and turn it into another type of index without a transmute.
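In sketch form (illustrative names, not our exact definitions):

    // Illustrative sketch: each index newtype can only point into its
    // own heap vector, so the tag and the index type always agree.
    #[derive(Clone, Copy)]
    struct ArrayIndex(u32);
    #[derive(Clone, Copy)]
    struct StringIndex(u32);

    enum Value {
        // Stack data, no heap allocation at all:
        Integer(i32),
        Bool(bool),
        // Typed indexes into the corresponding heap vectors:
        Array(ArrayIndex),
        String(StringIndex),
    }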
If you refer to referential safety, so that your reference to object X still refers to X later on, then that is indeed something we "lose" because we need to implement GC ourselves. But that wouldn't meaningfully change with pointers either, as updating pointers after a move would need to be done manually as well.
Using references is right out because we cannot explain the JavaScript memory ownership model to Rust: The two are simply not compatible. There are of course safe GC crates that give you reference APIs but they do the pointer updating manually on the inside (if they have moving GC anyway), so the situation doesn't meaningfully change.
Thank you for asking! I haven't implemented this yet, and thus haven't proved it in action, but my thinking is that this interacts very well indeed: each heap vector can designate an index that marks the beginning of the young generation. We don't need separate old and new spaces; instead, promotion is just the act of moving the young generation's beginning index up.
Side note: I have a corollary on the "most objects die young" that is very much at the heart of Nova: Most objects live together. If they are created at the same time, then they're likely also used together. Hence why I don't swap items around in the heap vectors, or use a free list for allocation: It would mess up the temporal order of items in the vectors, leading to less chances at useful cache line sharing.
Don’t you need to move the surviving young generation objects after the ones they’re surrounded by die? Otherwise the array is going to end up rather sparse, with a lot of unused array entries.
Without either a free list or compaction, I don’t really see how you’re collecting garbage at all.
Yes, I do need to compact the young generation during GC. E.g. let's say I have the young generation starting at index 1000 and I do GC with 1100 items, with 10 items surviving. I'll have to compact the remaining 10 items into the 1000..1010 span of the vector, after which I can also decide to promote the bottom two young generation indexes to the old generation, making the next young generation start index 1002.
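A sketch of that compaction step on a single heap vector (my own simplification; updating the indexes stored elsewhere in the heap is left to the caller):

    // Compact the young region [young_start..] of one heap vector in
    // place. `survived[i]` says whether item `young_start + i` is live.
    // Returns a remap table from old young indexes to new ones, used to
    // update indexes stored elsewhere in the heap.
    fn compact_young<T>(
        heap: &mut Vec<T>,
        young_start: usize,
        survived: &[bool],
    ) -> Vec<Option<u32>> {
        let mut remap = vec![None; heap.len() - young_start];
        let mut write = young_start;
        for read in young_start..heap.len() {
            if survived[read - young_start] {
                heap.swap(read, write);
                remap[read - young_start] = Some(write as u32);
                write += 1;
            }
        }
        heap.truncate(write); // drop the dead tail
        remap
    }

In the example above, the survivors end up in 1000..1010, and promoting the bottom two is then just bumping the young generation start index to 1002.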
Excellent question: in a theoretical sense the answer would be that we cannot know, since it depends on the JavaScript being run. But in practice I think that is indeed the case: the more common an object type is, the likelier it is to be used in conjunction with others around it, and at the same time the more important its memory placement becomes.
E.g. say you have a JS program that has about 100 DataViews: I'd say it's unlikely these are used in conjunction with one another very often, but they're also only a small part of the program, so their placement hardly matters.
Now what if that number is a million instead? Now I'm betting they're mostly all created together, used together, and that their placement is critical to the program's performance.
So I'm betting that trading worse random memory access performance for a guarantee that data created together stays together, with improved linear memory performance, will be an overall win.
Whether this is true data-oriented design is then in the eye of the beholder: Maybe someone will think I'm wrong, my assumptions are wrong, and I'm thus not doing things in a data-oriented way.
Hey, thank you for the comment! I'll try to answer as best I can.
A pointer is 64 bits, though it carries much less useful payload than that. A JavaScript engine only rarely deals with more than 4 GiB of memory, so a 32-bit integer would be enough to index all the memory needed. If you turn that thought into indexes, a 32-bit index can refer to 4 billion separate items: most programs never have that many distinct heap items alive at the same time. Note that this index no longer really corresponds to indexable memory, so we're not bound by the 4 GiB limit either.
We actually do keep the 64-bit Value, though! We just use all that spare room to store a lot of data on the stack, avoiding heap allocations altogether.
> That just sounds like a pointer.
A pointer points to one place and one place only: an index can point to as many places as there are "parallel vectors" associated with it. E.g. think of a table: a row index refers to as many cells as there are columns, whereas a cell pointer only identifies one cell.
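In code, the table analogy is just parallel vectors sharing one index (a sketch with made-up column names):

    // One object index addresses a "row" across all of these columns.
    struct Objects {
        prototypes: Vec<u32>, // column: each object's prototype
        keys: Vec<u32>,       // column: start of the object's property keys
        values: Vec<u32>,     // column: start of the object's property values
    }

    impl Objects {
        // The same row index picks out one cell in each column.
        fn prototype(&self, i: u32) -> u32 {
            self.prototypes[i as usize]
        }
    }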
> The last case also seems like a security hole, not protection.
Usually JS engines don't consider the JS-accessible contents of the JS heap itself part of the threat model: any object in the heap is liable to be leaked by the JS code running in the engine anyway. E.g. V8's object placement is fairly static and easy to exploit. The important thing for safety is to avoid type confusion, which can be used to create read/write primitives that punch out of the sandbox. So: an attacker can freely read through the heap data by creating heap indexes out of thin air, but they cannot use that to reinterpret one type of data as another type and then feed it back to the engine to cause it to misbehave.
> But this I don't get. If you are splitting things by type, how can you cluster them by how they are related? An object like {a: 1, b: 2} is an object with two strings and two numbers, presumably spread out over three different type-specific heaps?
Yes, this would split into the ordinary object vector, and the object property vector. If the keys were longer they'd end up in the strings vector and if the values were heap allocated doubles then they'd end up in yet another vector. Looking at it one thing at a time, it is split here and there.
That being said, this doesn't differ much from how traditional engines do it: strings are not going to be near the objects that use them as keys, nor are heap numbers, and (added) properties also go into a separate backing store which is likely not next to the object. Worst of all, even if all of these were next to the object, they'd span multiple cache lines and wouldn't really benefit from being close to each other, as they're pointer-chased and thus get few guarantees of prefetching.
When you look at multiple objects, however, you'll see that Nova's object data is still found in those 4 vectors, whereas the traditional engine design... it may have tried its best to keep the data together, but it's probably still spread out here and there. And you're loading all the unnecessary stuff, like the elements pointer (for indexed properties) and any other inline properties, together with the properties that you actually wanted to read.
Sorry, this ended up a bit disjointed. Let me know if you have more questions! Thanks.
Cheers!