Bonus points if you can effectively implement the "copy on write" behavior of the Linux kernel, sending over to the remote machine only the pages that are changed in either the local or remote fork, or read in the remote fork.
An rsync-like diff algorithm might also substantially reduce the copied pages if the same or a similar process is teleforked multiple times.
Many processes have a lot of memory which is never read or written, and there's no reason that should be moved, or at least no reason it should be moved quickly.
Using that, you ought to be able to resume the remote fork in milliseconds rather than seconds.
userfaultfd() or mapping everything to files on a FUSE filesystem both look like promising implementation options.
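To make the userfaultfd() option concrete, here's a minimal C sketch (not from the article) of the receiving side: it registers an anonymous region for missing-page faults and services a single fault by copying a page in. fetch_remote_page is a made-up placeholder for the network fetch a real telefork would do, and a real handler would of course loop; compile with -pthread.

    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <linux/userfaultfd.h>
    #include <poll.h>
    #include <pthread.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/ioctl.h>
    #include <sys/mman.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    static long page_size;

    /* Placeholder: a real implementation would pull this page of the parent's
       memory over the network. Here it just fills the buffer with a marker. */
    static void fetch_remote_page(void *dst, unsigned long addr)
    {
        (void)addr;
        memset(dst, 'A', page_size);
    }

    /* Simulates the restored child: its first read of an unpopulated page
       blocks until the fault is serviced below. */
    static void *toucher(void *region)
    {
        volatile char *mem = region;
        printf("child read byte: %c\n", mem[0]);
        return NULL;
    }

    int main(void)
    {
        page_size = sysconf(_SC_PAGESIZE);

        /* May need privileges or vm.unprivileged_userfaultfd=1 on newer kernels. */
        int uffd = syscall(SYS_userfaultfd, O_CLOEXEC | O_NONBLOCK);
        struct uffdio_api api = { .api = UFFD_API };
        ioctl(uffd, UFFDIO_API, &api);

        /* Reserve the child's address range without populating it. */
        size_t len = 16 * page_size;
        char *region = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

        /* Ask for an event whenever a missing page in the range is touched. */
        struct uffdio_register reg = {
            .range = { .start = (unsigned long)region, .len = len },
            .mode  = UFFDIO_REGISTER_MODE_MISSING,
        };
        ioctl(uffd, UFFDIO_REGISTER, &reg);

        pthread_t t;
        pthread_create(&t, NULL, toucher, region);

        /* Service one fault: fetch the page, then UFFDIO_COPY installs it and
           wakes the faulting thread. */
        struct pollfd pfd = { .fd = uffd, .events = POLLIN };
        poll(&pfd, 1, -1);
        struct uffd_msg msg;
        if (read(uffd, &msg, sizeof msg) != sizeof msg)
            return 1;
        unsigned long addr = msg.arg.pagefault.address & ~(page_size - 1);

        char *page = aligned_alloc(page_size, page_size);
        fetch_remote_page(page, addr);
        struct uffdio_copy copy = {
            .dst = addr, .src = (unsigned long)page,
            .len = (unsigned long)page_size, .mode = 0,
        };
        ioctl(uffd, UFFDIO_COPY, &copy);

        pthread_join(t, NULL);
        return 0;
    }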
If you just pull things on demand, you're going to get a lot of round-trip-time penalties to page things in.
I think you should still push the memory as fast as you can, but maybe start the child while that's still in progress, and prioritize sending the stuff the child asks for (reorder the queue to send that stuff "next") if you haven't already sent it.
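A toy sketch of that reordering idea (all names invented, no real networking): the sender streams pages in order in the background, but a demand request from the child jumps the queue.

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdio.h>

    #define NPAGES 8

    /* Stand-in for actually shipping page `idx` over the wire. */
    static void send_page(size_t idx) { printf("sent page %zu\n", idx); }

    static bool sent[NPAGES];

    /* Send one page: prefer an explicit request (a page the child faulted on),
       otherwise the next unsent page in linear order. Returns false once
       everything has been pushed. */
    static bool push_next(long requested)
    {
        if (requested >= 0 && requested < NPAGES && !sent[requested]) {
            sent[requested] = true;
            send_page((size_t)requested);
            return true;
        }
        for (size_t i = 0; i < NPAGES; i++) {
            if (!sent[i]) {
                sent[i] = true;
                send_page(i);
                return true;
            }
        }
        return false;
    }

    int main(void)
    {
        /* Background push, with one demand request (page 5) arriving mid-stream. */
        push_next(-1);
        push_next(-1);
        push_next(5);            /* child faulted on page 5: it jumps the queue */
        while (push_next(-1))
            ;
        return 0;
    }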
Yah that is indeed a super important optimization for avoiding round trips. CRIU does this and calls it "pre-paging"; their wiki also mentions that they adapt their page streaming to pre-stream pages around the ones that have been faulted: https://en.wikipedia.org/wiki/Live_migration#Post-copy_memor...
edit: lol, I didn't realize that isn't CRIU's wiki, since they just linked to a Wikipedia page and both use MediaWiki software. This is the actual CRIU wiki page, and it's way harder to tell there whether they do this, although I suspect they do and it's in the "copy images" step of the diagram: https://criu.org/Userfaultfd
That’s a great idea. One of my thoughts was to “pre-heat” the process by executing it a bit locally with side effects disabled, to see what would get immediately accessed, and send that first.
If your systems strictly match somehow (a machine image with auto-update disabled? or regularly hashing and timestamping files on both systems), you can also cheat by mapping some of the files locally on the other side.
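A rough sketch of that cheat, assuming the sender ships a (path, hash) pair for each file-backed mapping and the receiver maps its local copy on a match. The FNV hash and the map_if_identical helper are just illustrative; a real version would want a cryptographic hash plus size/mtime checks.

    #include <fcntl.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    /* Illustrative content hash (FNV-1a). */
    static uint64_t hash_file(const char *path)
    {
        uint64_t h = 1469598103934665603ULL;
        unsigned char buf[4096];
        ssize_t n;
        int fd = open(path, O_RDONLY);
        if (fd < 0)
            return 0;
        while ((n = read(fd, buf, sizeof buf)) > 0)
            for (ssize_t i = 0; i < n; i++)
                h = (h ^ buf[i]) * 1099511628211ULL;
        close(fd);
        return h;
    }

    /* If the local copy of a file hashes to the value the sender reported,
       map it read-only instead of streaming its pages over the network. */
    static void *map_if_identical(const char *path, uint64_t sender_hash, size_t *len)
    {
        if (hash_file(path) != sender_hash)
            return NULL;
        int fd = open(path, O_RDONLY);
        struct stat st;
        fstat(fd, &st);
        *len = (size_t)st.st_size;
        void *p = mmap(NULL, *len, PROT_READ, MAP_PRIVATE, fd, 0);
        close(fd);
        return p == MAP_FAILED ? NULL : p;
    }

    int main(void)
    {
        /* Demo: pretend the sender reported the hash of /etc/hostname. */
        size_t len = 0;
        uint64_t sender_hash = hash_file("/etc/hostname");
        void *p = map_if_identical("/etc/hostname", sender_hash, &len);
        if (p)
            printf("mapped %zu bytes locally, nothing to transfer\n", len);
        else
            printf("hash mismatch, stream the pages instead\n");
        return 0;
    }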
I do in fact mention this idea in the article. userfaultfd was actually added to the kernel so that CRIU and KVM live migration could implement exactly this.
Another cool project that does something like this is https://github.com/gamozolabs/chocolate_milk, a fuzzing hypervisor kernel that can back a VM snapshot's memory mapping over the network, pulling down only the pages the VM actually reads during a fuzz case.
If you ever needed to bring the process back, you could use the soft-dirty bit[1] to determine which pages were modified since forking and only transfer those. CRIU uses it for incremental snapshots (in fact, they wrote the kernel patch, afaik).
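For anyone curious, the soft-dirty mechanism is easy to poke at from userspace: writing "4" to /proc/PID/clear_refs resets the bits, and bit 55 of each page's /proc/PID/pagemap entry then tells you whether it has been written since. A rough sketch (error handling omitted; reading pagemap can require privileges on some kernel configurations):

    #include <fcntl.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <unistd.h>

    #define NPAGES 4

    /* Clear all soft-dirty bits for this process ("4" is the documented
       command for /proc/PID/clear_refs). */
    static void clear_soft_dirty(void)
    {
        int fd = open("/proc/self/clear_refs", O_WRONLY);
        write(fd, "4", 1);
        close(fd);
    }

    /* Bit 55 of a pagemap entry is the soft-dirty flag; the entry for a
       virtual page lives at offset (vaddr / page_size) * 8. */
    static int page_soft_dirty(int pagemap_fd, uintptr_t vaddr, long page_size)
    {
        uint64_t entry = 0;
        off_t off = (off_t)(vaddr / (uintptr_t)page_size) * 8;
        pread(pagemap_fd, &entry, sizeof entry, off);
        return (int)((entry >> 55) & 1);
    }

    int main(void)
    {
        long page_size = sysconf(_SC_PAGESIZE);
        char *mem = mmap(NULL, NPAGES * page_size, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        memset(mem, 1, NPAGES * page_size);   /* make all four pages resident */

        clear_soft_dirty();                   /* "snapshot" point */
        mem[2 * page_size] = 42;              /* dirty only page 2 afterwards */

        int pagemap_fd = open("/proc/self/pagemap", O_RDONLY);
        for (int i = 0; i < NPAGES; i++)
            printf("page %d soft-dirty: %d\n", i,
                   page_soft_dirty(pagemap_fd, (uintptr_t)(mem + i * page_size),
                                   page_size));
        close(pagemap_fd);
        return 0;
    }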