Hi everyone. I'd like to figure out something that I don't quite understand: why did we ever have the fork function call?
Back in my youth, when I was first taught the fork call, I got how it worked: two copies of the process return from the function call and they both continue in parallel. The only thing that bothered me was the long list of caveats my textbook discussed.
It told me that a copy of the process was made, with caveats around file handles, and that the memory wasn't really copied until one of the two processes tried to modify something. It all seemed terribly complicated, but I figured there was a good reason for it that I didn't yet understand, grasshopper.
Time passed and I started working in embedded systems and later coding for Windows. I never used fork beyond those juvenile programs I made. These OSes started new processes by passing an executable filename to the OS and telling it to start a new program. That new process started with a clean slate: no memory, empty stack, no open file handles except stdin/out/err. Simple.
Now I've just been reminded of the fork call in Unix, and I'm prompted to ask: why was it ever there? Who wants the ability to fork when simpler ways of starting a new process exist?
Nearly all the uses of fork I've seen are immediately followed by an exec call. So the OS goes to all that trouble setting up a duplicate process, only for that hard work to be thrown away by running exec.
Even when concurrency within a program is needed, the thread model seems far more useful, with far fewer complications.
So please, I need to know, why fork?
I actually think it is one of the most elegant system calls in Unix.
Think of all the clunky alternatives that OSes before Unix had to use to start a process at a given depth into the program: lots of flags to make sure that you started off where you left off in the 'parent', to recreate all or most of the state required for the child process. Fork passes all that state 'for free'. And copy-on-write makes it fast.
It's a bit like biology. Split the cell, then let them both specialize a bit towards what they have to become. The moment of splitting is almost 100% symmetrical, the only difference being who is the 'parent' and who is the 'child' process.
Other ways of starting new processes feel clunky in comparison: you have to specify a binary to run, you have to know all kinds of details about the parameters to pass, and so on.
Fork essentially abstracts the creation of a sub-process to the absolute minimum.
Fork is atomic: it takes zero parameters and returns a single integer. When it fails, it is for a simple reason, typically that no process slots or memory are left.