How a process is started in Unix
First, let us define what a process is. There are many definitions of the term, but for the purposes of this post, let’s go with the most basic one: “A process is a program in execution”.
OK, now that we know what a process means, let us try to imagine for a second how a program resident on secondary storage, e.g., a hard drive, can make the transition to “processhood”. What does this transition entail?
Remember that a program is, essentially, made up of three main segments: a code segment containing the instructions to be executed (usually referred to as the text segment in Unix), a data segment containing global variables, and a stack segment used for function execution.
On the face of it, creating a new process should entail creating a new data structure to represent the process in the OS, and then allocating memory locations for the text, data and stack segments of the process. The addresses of these memory locations should then be stored in the data structure representing the process so that they can be accessed in the future.
The content of these segments should then be retrieved from the hard disk and loaded into memory — i.e., the binary file representing the program that we wish to execute should be read from the disk and the read values should be used to populate the memory segments reserved for the process.
Simple enough, right? On first examination, you may think that a single system call — e.g., spawn_process() — should be used to read an image from the hard disk, create the necessary data structure to represent the new process, allocate memory locations for each segment in the program and load the values read from the disk into those segments.
While some operating systems do in fact use a single system call for this, Unix does things a bit differently. Two system calls are used to create a new process. First, the fork() system call is issued to create a copy of the current process. Let us discuss this statement for a bit. Note that I say that fork() creates a copy of the current process. This implies two things: first, a new process can only be started from within another process, and second, the created process is going to be a copy of the process that created it.
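To make the "copy of the current process" idea concrete, here is a minimal sketch using Python's os module, which wraps the underlying fork() call on Unix-like systems. The function name fork_once and the exit status 7 are my own choices for illustration. The point to notice is that a single call to fork() returns twice: once in the child with a return value of 0, and once in the parent with the child's process ID.

```python
import os


def fork_once():
    """Fork a child and return (child_pid, child_exit_status) to the parent."""
    pid = os.fork()
    if pid == 0:
        # Child: fork() returned 0. We are a copy of the parent, running
        # the same code from the same point. Exit with a marker status.
        os._exit(7)
    # Parent: fork() returned the process ID of the new child.
    _, status = os.waitpid(pid, 0)
    exit_status = os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1
    return pid, exit_status


if __name__ == "__main__":
    child_pid, exit_status = fork_once()
    print(f"parent {os.getpid()} created child {child_pid}")
    print(f"child exited with status {exit_status}")
```

The child uses os._exit() rather than returning, so that it terminates immediately instead of carrying on as a second copy of the calling program.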
An observant reader will be wondering about the boot process at this point. If processes can only be started from within other processes, how on Earth does a PC manage to boot up? Well, the answer to that question is that the boot loader starts the kernel, which then starts a special process known as the init process. The init process then starts all the processes that are needed to get the system up and running. Those processes start further processes in turn, so init is, essentially, the ancestor of every other process on your system.
What about processes you start from the command line? What process is responsible for starting them? Well, the command line you see in front of you is a shell, and the shell is a process. So the shell is responsible for calling fork() to create a new process.
This brings us to the second point mentioned. Remember that I said that the fork() system call creates a copy of the current process. So, when init or the shell calls fork(), the process created is a copy of init or the shell. This means that the newly created process has the same code, data and stack segments as the process that forked it. How do we manage to morph all these copies into the processes that we actually want to start? The answer to that question lies in the second system call that Unix uses to start processes — exec().
The exec() family of system calls loads an executable image from disk and overlays it on the address space of the process making the call. In layman’s terms, this means that exec() reads the contents of a program on disk and stores the loaded values in the memory locations allocated to the current process. Once this is done, the current process is no longer a copy of the process that forked it. Instead, it is now a running instance of the program that exec() loaded into its address space.
So, to summarize, starting a process in Unix is a two-step procedure. First, create a copy of the process from which you wish to start the new process, using the fork() system call. Then, overlay the address space of the newly created process with the image of the program you need to execute, using the exec() family of system calls.
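The two steps can be sketched in a few lines. This is only an illustration, again using Python's os module as a thin wrapper over the system calls. The helper name spawn and the exit status 127 for a failed exec are assumptions of mine (127 mirrors a common shell convention), not something the system calls require.

```python
import os
import sys


def spawn(argv):
    """Start argv[0] as a new process via fork() + exec(); return its exit status."""
    pid = os.fork()  # Step 1: create a copy of the current process.
    if pid == 0:
        try:
            # Step 2 (child only): replace our image with the new program.
            # On success, execvp() never returns.
            os.execvp(argv[0], argv)
        except OSError:
            # The program could not be found or executed.
            os._exit(127)
    # Parent: wait for the child to finish and collect its exit status.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1


if __name__ == "__main__":
    code = spawn([sys.executable, "-c", "print('hello from the new program')"])
    print("child exited with", code)
```

This is roughly what a shell does for every command you type: fork, exec in the child, and wait in the parent.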
An attentive reader will probably be thinking that all this copying is highly inefficient. After all, the work done by the fork() system call — copying the address space of the invoking process into the newly created process — is a total waste of time. We are going to overwrite the address space with the image loaded in the exec() system call anyway, so why waste time copying the address space of the original process?
Most modern Unixes have Copy On Write (COW) fork() semantics. This means that when fork() is called, no actual copying is performed. Instead, the original process and the forked process are marked as sharing the same memory pages. As soon as either process writes to a shared page, the kernel copies that page, so that each process ends up with its own version: the process that performed the write sees the modification, and the other process still sees the page as it was. That way, copying is limited to the pages that have actually been written to by either process, which greatly speeds things up.
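COW itself happens inside the kernel and is not directly visible to a program, but its guarantee is: a write in one process never shows up in the other. The sketch below demonstrates only that visible behavior, not the page bookkeeping behind it. The function name and the pipe used to report the child's view back to the parent are my own illustrative choices.

```python
import os


def child_write_is_private():
    """Return (parent_view, child_view) of a value the child modifies after fork()."""
    data = ["original"]
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: modify our copy of the data and report what we see.
        try:
            os.close(read_fd)
            data[0] = "changed by child"
            os.write(write_fd, data[0].encode())
        finally:
            os._exit(0)
    # Parent: read the child's report, then look at our own copy.
    os.close(write_fd)
    with os.fdopen(read_fd) as reader:
        child_view = reader.read()
    os.waitpid(pid, 0)
    return data[0], child_view


if __name__ == "__main__":
    parent_view, child_view = child_write_is_private()
    print("parent sees:", parent_view)
    print("child saw:  ", child_view)
```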
Since exec() is usually called almost immediately after fork() when starting a new program, COW ensures that the initial fork() does almost no page copying at all. Note that fork() also has uses that do not involve an immediate call to exec(), and in those cases the newly forked process may really need its own copy of its parent’s address space. That is why fork() has copy semantics in the first place, and why COW is an optimization on top of those semantics rather than a change in behavior.
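As one example of fork() without exec(), a process can hand work to a child that keeps running the same program and relies on the data it inherited. The following is a minimal sketch under that assumption. The name sum_in_child and the use of a pipe to return the result are hypothetical choices for illustration.

```python
import os


def sum_in_child(numbers):
    """Compute sum(numbers) in a forked child and return the result to the parent."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: no exec() here. We keep running the same program and use
        # our inherited copy of `numbers` directly.
        try:
            os.close(read_fd)
            os.write(write_fd, str(sum(numbers)).encode())
        finally:
            os._exit(0)
    # Parent: collect the result the child wrote to the pipe.
    os.close(write_fd)
    with os.fdopen(read_fd) as reader:
        result = int(reader.read())
    os.waitpid(pid, 0)
    return result


if __name__ == "__main__":
    print(sum_in_child([1, 2, 3, 4]))
```

Nothing had to be serialized to get the input into the child. It was simply there after the fork, which is exactly the copy semantics described above.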
Hope this has clarified things for those asking about the fork/exec process creation semantics in Unix. I will create another blog post soon to show you how this can be practically applied on Linux, the most popular Unix-like OS.