I'm impressed by Filip's knack for explaining complex topics in an approachable manner. His previous post on WebKit's locking infrastructure[1] is a similarly good read.
A few amusing parts of this post that stood out to me:
* A single sentence containing half a dozen links to Filip's previous work on garbage collection.
* "See here for the proof", linking to a 200+ line comment in WebKit's source.
* "But since this is JavaScript, we get to have a lot more fun".
While that proof is amusing, I was hoping for... an actual proof that I could hand to a verifier. Ah well. A hundred or so lines of comments is probably right.
Still, this looks to be impressive work, and I did enjoy the post.
This post, and others I've read on machine-checked proof, suggest that the computer-centric approach can catch quite a few problems with higher assurance of correctness. The peer-review problem in science also makes me think this is important, given that I can't be sure huge proofs will be adequately checked by humans. As far as reliability on the computer end goes, I've read about verified processors, proof assistants, compilers, and so on. Much of the risk can be knocked out, but the black box that is the human brain is another story.
Some accounting only needs to happen when visiting the object for the first time. The complete barrier is simply:
    object->field = newValue;
    if (object->cellState == Old)
        remember(object);
This surely elides some synchronization? One would think that either this has to communicate with the collector (to prevent a racing concurrent collection from missing the referent), or happen in the opposite order.
Or perhaps it is relying on the calling mutator thread necessarily holding a reference to the young object in question? Even then there is some non-trivial ordering work to be done here (in particular on weakly ordered architectures).
I'm sure this is well handled, but since this is one of the main difficulties in implementing good write barriers, I am curious what they did.
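To illustrate the concern: on a weakly ordered architecture, something like an explicit fence would be needed to keep the field store from being reordered relative to the cellState check. A hypothetical sketch (the fence choice is an assumption, not WebKit's actual code; requires <atomic>):

    object->field = newValue;
    // Hypothetical: keep the store above from being reordered past
    // the cellState load below on weakly ordered architectures.
    std::atomic_thread_fence(std::memory_order_seq_cst);
    if (object->cellState == Old)
        remember(object);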
The barrier code you quote is for our old generational GC. Keep reading to see what the Riptide barrier looks like. See the section on retreating wavefront.
Aha, thanks; I didn't realize the first generational section was for an older version.
However, the Riptide barrier description still isn't 100% clear on how it avoids terminating a sweep (I believe "drain" in your terminology) in the middle of a barrier execution. In particular, I'm not concerned about the data race between the field write and the mark bit, but about the GC reading mark bits, believing the drain has terminated, and freeing memory while a mutator thread has written a field but not yet written the mark bit.
Are you just relying on the stop-the-world behavior relative to roots that you mention briefly?
The collector can only flip at safepoints. Safepoints cannot happen inside the barrier or in between the store the barrier protects and the barrier itself.
Flip = term of art for when the GC terminates and decides which objects are free.
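In other words, the store and its barrier form an indivisible unit with respect to safepoints. An illustrative sketch (not WebKit's actual code):

    // Illustrative only: safepoint polls occur between complete
    // store+barrier units, never inside one, so the collector cannot
    // flip between a field write and its barrier.
    object->field = newValue;  // the protected store
    barrier(object);           // runs before any safepoint can occur
    safepointPoll();           // the collector may only flip here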
Sometimes I still wish Apple would release Safari for Windows.
On another note, WebKit seems to be operating in a much more open manner, and has shifted toward optimizing for Web Apps rather than Web Pages.
You are listing Web browser engines. This article relates to JavaScript engines (e.g. SpiderMonkey (Firefox, Servo), ChakraCore (Edge), V8 (Blink)).
V8 has some details about their GC (garbage collector) here[0]; ChakraCore here[1]; SpiderMonkey here[2].
Now, they are all generational, incremental and concurrent, but each with particular tweaks.
The approach detailed in this blog post is similar to the one taken in ChakraCore, except that ChakraCore sets barriers on fixed-size blocks of memory, while Riptide tracks individual objects.
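For contrast, a block-granularity (card-table-style) barrier might look something like this; the card size and names are illustrative assumptions, not ChakraCore's actual code (requires <cstdint>):

    // Hypothetical card-table barrier: dirty the fixed-size block
    // ("card") covering the written object, rather than remembering
    // the object itself.
    static const uintptr_t cardShift = 9; // assumed 512-byte cards
    extern uint8_t cardTable[];           // one dirty byte per card

    object->field = newValue;
    cardTable[reinterpret_cast<uintptr_t>(object) >> cardShift] = 1;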
In principle the downsides compared to compacting collectors are reduced cache locality, fragmentation, somewhat higher allocation overhead, and higher memory footprint.
On the other hand it is a lot easier to implement concurrent collectors when objects are not moved.
> The changes that transformed WebKit’s collector were landed over the past six months, starting with the painful work of removing WebKit’s previous use of copying.
It is curious that Mozilla chose the opposite, doing the painful work of turning a conservative, non-moving collector into a precise, compacting one.
> In principle the downsides compared to compacting collectors are reduced cache locality, fragmentation, somewhat higher allocation overhead, and higher memory footprint.
In actuality the downsides of not copying are:
- Your allocator has to be tuned to resemble first-fit in its allocation order. This is empirically enough to control memory usage.
- You better have more virtual address space than physical address space. This means that the Robson bound for memory usage goes from M log M to M log P, where M is memory size, P is page size, and log M or log P is the fragmentation wastage multiplier. (For example, taking base-2 logs, M = 2^32 and P = 2^14 gives a worst-case multiplier of 14 instead of 32.) People needed compaction really badly back when the state-of-the-art GCs ran on maxed-out 32-bit address spaces.
- Mark-sweep collectors have to sweep; copying collectors don't. But this is a fixable problem: sweeping can be made concurrent, parallel, incremental, or very cheap. We picked very cheap, using bump'n'pop (a rough sketch follows below).
Initially we had some regressions from removing copying, but we fixed them by making sure that the first and third mitigation listed above were in place. WebKit rarely runs on a maxed-out 32-bit address space.
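Here is a rough sketch of what a bump'n'pop allocator could look like; this is an illustration of the idea, not WebKit's actual code:

    #include <cstddef>

    // Illustrative bump'n'pop allocator for fixed-size cells: bump
    // through a contiguous free span when we have one, otherwise pop
    // from a free list built when the block was swept.
    struct FreeCell { FreeCell* next; };

    struct BlockAllocator {
        char* bumpCursor;    // next unallocated byte in the free span
        char* bumpLimit;     // end of the current free span
        FreeCell* freeList;  // cells recovered by sweeping
        size_t cellSize;

        void* allocate()
        {
            // Fast path 1: bump through a contiguous free span.
            if (bumpCursor + cellSize <= bumpLimit) {
                void* result = bumpCursor;
                bumpCursor += cellSize;
                return result;
            }
            // Fast path 2: pop a previously freed cell.
            if (FreeCell* cell = freeList) {
                freeList = cell->next;
                return cell;
            }
            // Slow path: sweep another block or get a fresh one.
            return nullptr;
        }
    };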
Clarification: our previous collector did conservative root scanning, used a less evolved mark-sweep system for some objects, and used Bartlett-style quick-release compaction for other objects.
Riptide removes the copying because concurrent or even incremental copying is sure to introduce overhead. At a minimum you need a barrier on either all pointer reads from the heap or all pointer writes to the heap, and none of those barriers resemble our old generational barrier, so this would be new overhead. Riptide only barriers writes of pointers to the heap, and uses a single unified barrier for both concurrency and generations to get maximum throughput. Riptide's concurrency actually improves end-to-end execution time: the overhead of the concurrency is negligible, so you just get to enjoy the benefit of continuing to run while the GC runs, and you run faster overall. The speed-up was like 5% on the splay test, and I measured speed-ups in other things I cared about, like page load times.
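As a rough illustration of what such a unified write barrier could look like (the state name below is an assumption, not WebKit's actual encoding):

    // Illustrative unified barrier: one cheap check covers both the
    // generational case (storing into an old object) and the concurrent
    // case (storing into an already-marked object while the GC runs).
    object->field = newValue;
    if (object->cellState != DefinitelyNew)
        barrierSlowPath(object); // remember or re-grey as appropriate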
"Riptide is an improvement to WebKit’s collector and retains most of the things that made the old algorithm great. The changes that transformed WebKit’s collector were landed over the past six months, starting with the painful work of removing WebKit’s previous use of copying. Riptide combines Guy Steele’s classic retreating wavefront write barrier with a mature sticky-mark-sweep collector and lots of concurrency tricks to get a useful combination of high GC throughput and low GC latency."
Personally, I'm a fan of explicit lifetime management, such as RAII in C++ and especially in Rust. It solves the general problem of resource management, as opposed to only memory management. Most languages that use garbage collection leave generalized resource management in a C-like state of malloc and free. A few (not JS) have syntax for binding the lifetime of an object to a scope, such as "with" in Python or "using" in C#, but the burden is still on the consumer of an interface to know that they need to use it (you need to know that an object must be .close()d, .unlock()ed, etc.), and you need to remember at each and every call site; programmers make errors.
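A minimal sketch of the RAII idea in C++: the resource is released when the guard goes out of scope, so no call site can forget to unlock.

    #include <mutex>

    std::mutex m;

    void update()
    {
        std::lock_guard<std::mutex> guard(m); // lock acquired here
        // ... mutate shared state ...
    }   // lock released here, even if an exception is thrown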
That said, changing from GC to explicit lifetime management requires wide-reaching language changes that I don't see how you would apply to JavaScript, or to nearly any GC language. It's a very fundamental aspect of the language, and not easily changed. Hence the work on better collectors in the languages that do GC; the parent's comment is not really enlightening.
I know the root's comment wasn't actually useful. People who have decided they hate GC like this will always do so. Much like vim/emacs :D
It was more a question of whether the commenter had any better solutions to propose. I was /really/ trying not to just be a dick and say "cool, so you work out how to do that without changing the language at all".
The post is written/formatted in a confusing way for those not already versed in garbage collection. In particular, the important but idiosyncratic term "draining" is not given a reference link, but is instead given a different font that makes it fade into the background.
[1]: https://webkit.org/blog/6161/locking-in-webkit/