Saturday, December 8, 2012

The fork function


UNIX / Linux Processes: C fork() Function

by HIMANSHU ARORA on MAY 18, 2012
Every running instance of a program is known as a process. The concept of processes is fundamental to the UNIX / Linux operating systems. A process has its own identity in form of a PID or a process ID. This PID for each process is unique across the whole operating system. Also, each process has its own process address space where memory segments like code segment, data segment, stack segment etc are placed. The concept of process is very vast and can be broadly classified into process creation, process execution and process termination.

Linux Processes Series: part 1part 2part 3, part 4 (this article).
In this article we will concentrate on the process creation aspect from programming point of view. We will focus on the fork() function and understand how it works.

The fork() Function

The fork() function is used to create a new process by duplicating the existing process from which it is called. The existing process from which this function is called becomes the parent process and the newly created process becomes the child process. As already stated that child is a duplicate copy of the parent but there are some exceptions to it.
  • The child has a unique PID like any other process running in the operating system.
  • The child has a parent process ID which is same as the PID of the process that created it.
  • Resource utilization and CPU time counters are reset to zero in child process.
  • Set of pending signals in child is empty.
  • Child does not inherit any timers from its parent
Note that the above list is not exhaustive. There are a whole lot of points mentioned in the man page of fork(). I’d strongly recommend readers of this article to go through those points in the man page of fork() function.

The Return Type

Fork() has an interesting behavior while returning to the calling method. If the fork() function is successful then it returns twice. Once it returns in the child process with return value ’0′ and then it returns in the parent process with child’s PID as return value. This behavior is because of the fact that once the fork is called, child process is  created and since the child process  shares the text segment with parent process and continues execution from the next statement in the same text segment so fork returns twice (once in parent and once in child).

C Fork Example

Lets take an example to illustrate the use of fork function.
#include <unistd.h>
#include <sys/types.h>
#include <errno.h>
#include <stdio.h>
#include <sys/wait.h>
#include <stdlib.h>

int var_glb; /* A global variable*/

int main(void)
{
    pid_t childPID;
    int var_lcl = 0;

    childPID = fork();

    if(childPID >= 0) // fork was successful
    {
        if(childPID == 0) // child process
        {
            var_lcl++;
            var_glb++;
            printf("\n Child Process :: var_lcl = [%d], var_glb[%d]\n", var_lcl, var_glb);
        }
        else //Parent process
        {
            var_lcl = 10;
            var_glb = 20;
            printf("\n Parent process :: var_lcl = [%d], var_glb[%d]\n", var_lcl, var_glb);
        }
    }
    else // fork failed
    {
        printf("\n Fork failed, quitting!!!!!!\n");
        return 1;
    }

    return 0;
}
In the code above :
  • A local and a global variable  (var_lcl and var_glb ) declared and initialized with value ’0′
  • Next the fork() function is called and its return value is saved in variable childPID.
  • Now, the value of childPID is checked to make sure that the fork() function passed.
  • Next on the basis of the value of childPID, the code for parent and child is executed.
  • One thing to note here is why the variables var_lcl and var_glb are used?
  • They are used to show that both child and parent process work on separate copies of these variables.
  • Separate values of these variables are used to show the above stated fact.
  • In linux, a copy-on-write mechanism is used where-in both the child and the parent keep working on the same copy of variable until one of them tries to change its value.
  • After fork, whether the child will run first or parent depends upon the scheduler.
Now, when the above code is compiled and run :
$ ./fork

Parent process :: var_lcl = [10], var_glb[20]

Child Process :: var_lcl = [1], var_glb[1]
We see that in the output above, both the child and parent process got executed and logs show separate values of var_lcl and var_glb. This concludes that both parent and child had their own copy of var_lcl and var_glb.
NOTE: A function getpid() can be used to fetch the process ID of the calling process and the function getppid() can be used to fetch the PID of the parent process.

No comments:

Post a Comment