So does this mean that creating a process (or forking one) can be fast if you skip the Win32 state table and its initialisation for the new process? And if so, do you think it would be possible to get fast forking by using native NT processes instead of going through the Linux emulation layer? That assumes NT actually exposes an API to CoW-map an existing process's memory, which might instead be internal to the Linux-on-Windows subsystem.