Process
2026-02-19 Thu
As was earlier stated, a process is just a running program. On a computer, there’s a limited amount of CPUs that seemingly run hundreds or thousands of processes at the same time. This illusion of limitless CPUs is created by the OS by virtualizing the CPU. It’s done by the time sharing mechanism of the CPU, in which the OS runs one process, stops it and switches to a different one providing the illusion that multiple programs are running simultaneously at the cost of performance. On top of this mechanism resides a policy which is an algorithm for making decisions within the OS.
A mechanism answers a how question (lower-level mechanism) about a system, whereas the policy provides the answer to a which question (high-level). For example: how does an OS perform a context switch v/s which process should it run right now.
It is necessary to understand the machine state of a process to understand what constitutes it. One of the obvious components of the state is the memory that the process can address (aka address space). Also part of the state are the registers: special ones being program counter (PC aka instruction pointer IP), stack pointer, frame pointer etc.
Process API
- Create: New processes.
- Destory: Said processes forcefully.
- Wait: For graceful exit/completion of processes.
- Miscellaenous Control: Process suspension, resuming etc.
- Status: How long did a process run for, what’s its current state.
Process Creation
The first things the OS must do is to load the program code and static data (initialized variables) in memory, into the address space of the process. In early OSes, loading was done eagerly i.e. the entire program was loaded all at once but modern OS do lazy loading where only the pieces of data that’s needed is loaded during execution (done via paging/swapping).
Once the data’s loaded, some memory must be allocated for program’s stack (used for local variables, function parameters, return addresses). The OS also initializes the stack with arguments i.e. fill parameters to main() function. Then it must also allocate some heap memory, used for explicitly requested dynamically-allocated data.
The OS must also do I/O related initialization, i.e. setup the three file descriptors for the process to read input from terminal and print output/error to the screen. Once all the setup is done, program execution can begin by jumping to the main() routine, and the control is shifted from CPU to the newly-created process.
Process States
A process can be in Running, Ready or Blocked state. Moving from Ready to Running means the process has been scheduled, if a running process gets blocked due to I/O, the OS doesn’t move it to ready unless the blocking event finishes.
stateDiagram-v2
Running --> Ready: Descheduled
Ready --> Running: Scheduled
Running --> Blocked: I/O initiated
Blocked --> Ready: I/O done
Process Data Structures
OS must have a process list to keep track of state of all processes, which ones are ready/running or blocked. For blocked processes, the I/O events must also be tracked somehow. The register context holds the contents of registers for a stopped process, which can then be used to restore the values in physical registers when running of the process resumes.
struct context {
int eip;
int esp;
...
int ebp;
};
enum proc_state { UNUSED, EMBRYO, SLEEPING,
RUNNABLE, RUNNING, ZOMBIE };
struct proc {
char *mem; // Start of proc mem
uint sz; // Size of proc mem
char *kstack; // Process kernel stack bottom
enum proc_state state; // Proc state
int pid;
struct proc *parent; // Parent process
void *chan; // If !zero, sleeping on chan
int killed; // If !zero, has been killed
struct file *ofile[NOFILE]; // Open files
struct inode *cwd;
struct context context;
struct trapframe *tf; // Trap frame for current interrupt
};
Homework
The below program process-run.py allows for observing how process states change as programs run
process-run.py
#! /usr/bin/env python
from __future__ import print_function
import sys
from optparse import OptionParser
import random
# to make Python2 and Python3 act the same -- how dumb
def random_seed(seed):
try:
random.seed(seed, version=1)
except:
random.seed(seed)
return
# process switch behavior
SCHED_SWITCH_ON_IO = 'SWITCH_ON_IO'
SCHED_SWITCH_ON_END = 'SWITCH_ON_END'
# io finished behavior
IO_RUN_LATER = 'IO_RUN_LATER'
IO_RUN_IMMEDIATE = 'IO_RUN_IMMEDIATE'
# process states
STATE_RUNNING = 'RUNNING'
STATE_READY = 'READY'
STATE_DONE = 'DONE'
STATE_WAIT = 'BLOCKED'
# members of process structure
PROC_CODE = 'code_'
PROC_PC = 'pc_'
PROC_ID = 'pid_'
PROC_STATE = 'proc_state_'
# things a process can do
DO_COMPUTE = 'cpu'
DO_IO = 'io'
DO_IO_DONE = 'io_done'
class scheduler:
def __init__(self, process_switch_behavior, io_done_behavior, io_length):
# keep set of instructions for each of the processes
self.proc_info = {}
self.process_switch_behavior = process_switch_behavior
self.io_done_behavior = io_done_behavior
self.io_length = io_length
return
def new_process(self):
proc_id = len(self.proc_info)
self.proc_info[proc_id] = {}
self.proc_info[proc_id][PROC_PC] = 0
self.proc_info[proc_id][PROC_ID] = proc_id
self.proc_info[proc_id][PROC_CODE] = []
self.proc_info[proc_id][PROC_STATE] = STATE_READY
return proc_id
# program looks like this:
# c7,i,c1,i
# which means
# compute for 7, then i/o, then compute for 1, then i/o
def load_program(self, program):
proc_id = self.new_process()
for line in program.split(','):
opcode = line[0]
if opcode == 'c': # compute
num = int(line[1:])
for i in range(num):
self.proc_info[proc_id][PROC_CODE].append(DO_COMPUTE)
elif opcode == 'i':
self.proc_info[proc_id][PROC_CODE].append(DO_IO)
# add one compute to HANDLE the I/O completion
self.proc_info[proc_id][PROC_CODE].append(DO_IO_DONE)
else:
print('bad opcode %s (should be c or i)' % opcode)
exit(1)
return
def load(self, program_description):
proc_id = self.new_process()
tmp = program_description.split(':')
if len(tmp) != 2:
print('Bad description (%s): Must be number <x:y>' % program_description)
print(' where X is the number of instructions')
print(' and Y is the percent change that an instruction is CPU not IO')
exit(1)
num_instructions, chance_cpu = int(tmp[0]), float(tmp[1])/100.0
for i in range(num_instructions):
if random.random() < chance_cpu:
self.proc_info[proc_id][PROC_CODE].append(DO_COMPUTE)
else:
self.proc_info[proc_id][PROC_CODE].append(DO_IO)
# add one compute to HANDLE the I/O completion
self.proc_info[proc_id][PROC_CODE].append(DO_IO_DONE)
return
def move_to_ready(self, expected, pid=-1):
if pid == -1:
pid = self.curr_proc
assert(self.proc_info[pid][PROC_STATE] == expected)
self.proc_info[pid][PROC_STATE] = STATE_READY
return
def move_to_wait(self, expected):
assert(self.proc_info[self.curr_proc][PROC_STATE] == expected)
self.proc_info[self.curr_proc][PROC_STATE] = STATE_WAIT
return
def move_to_running(self, expected):
assert(self.proc_info[self.curr_proc][PROC_STATE] == expected)
self.proc_info[self.curr_proc][PROC_STATE] = STATE_RUNNING
return
def move_to_done(self, expected):
assert(self.proc_info[self.curr_proc][PROC_STATE] == expected)
self.proc_info[self.curr_proc][PROC_STATE] = STATE_DONE
return
def next_proc(self, pid=-1):
if pid != -1:
self.curr_proc = pid
self.move_to_running(STATE_READY)
return
for pid in range(self.curr_proc + 1, len(self.proc_info)):
if self.proc_info[pid][PROC_STATE] == STATE_READY:
self.curr_proc = pid
self.move_to_running(STATE_READY)
return
for pid in range(0, self.curr_proc + 1):
if self.proc_info[pid][PROC_STATE] == STATE_READY:
self.curr_proc = pid
self.move_to_running(STATE_READY)
return
return
def get_num_processes(self):
return len(self.proc_info)
def get_num_instructions(self, pid):
return len(self.proc_info[pid][PROC_CODE])
def get_instruction(self, pid, index):
return self.proc_info[pid][PROC_CODE][index]
def get_num_active(self):
num_active = 0
for pid in range(len(self.proc_info)):
if self.proc_info[pid][PROC_STATE] != STATE_DONE:
num_active += 1
return num_active
def get_num_runnable(self):
num_active = 0
for pid in range(len(self.proc_info)):
if self.proc_info[pid][PROC_STATE] == STATE_READY or \
self.proc_info[pid][PROC_STATE] == STATE_RUNNING:
num_active += 1
return num_active
def get_ios_in_flight(self, current_time):
num_in_flight = 0
for pid in range(len(self.proc_info)):
for t in self.io_finish_times[pid]:
if t > current_time:
num_in_flight += 1
return num_in_flight
def check_for_switch(self):
return
def space(self, num_columns):
for i in range(num_columns):
print('%10s' % ' ', end='')
def check_if_done(self):
if len(self.proc_info[self.curr_proc][PROC_CODE]) == 0:
if self.proc_info[self.curr_proc][PROC_STATE] == STATE_RUNNING:
self.move_to_done(STATE_RUNNING)
self.next_proc()
return
def run(self):
clock_tick = 0
if len(self.proc_info) == 0:
return
# track outstanding IOs, per process
self.io_finish_times = {}
for pid in range(len(self.proc_info)):
self.io_finish_times[pid] = []
# make first one active
self.curr_proc = 0
self.move_to_running(STATE_READY)
# OUTPUT: headers for each column
print('%s' % 'Time', end='')
for pid in range(len(self.proc_info)):
print('%14s' % ('PID:%2d' % (pid)), end='')
print('%14s' % 'CPU', end='')
print('%14s' % 'IOs', end='')
print('')
# init statistics
io_busy = 0
cpu_busy = 0
while self.get_num_active() > 0:
clock_tick += 1
# check for io finish
io_done = False
for pid in range(len(self.proc_info)):
if clock_tick in self.io_finish_times[pid]:
io_done = True
self.move_to_ready(STATE_WAIT, pid)
if self.io_done_behavior == IO_RUN_IMMEDIATE:
# IO_RUN_IMMEDIATE
if self.curr_proc != pid:
if self.proc_info[self.curr_proc][PROC_STATE] == STATE_RUNNING:
self.move_to_ready(STATE_RUNNING)
self.next_proc(pid)
else:
# IO_RUN_LATER
if self.process_switch_behavior == SCHED_SWITCH_ON_END and self.get_num_runnable() > 1:
# this means the process that issued the io should be run
self.next_proc(pid)
if self.get_num_runnable() == 1:
# this is the only thing to run: so run it
self.next_proc(pid)
self.check_if_done()
# if current proc is RUNNING and has an instruction, execute it
instruction_to_execute = ''
if self.proc_info[self.curr_proc][PROC_STATE] == STATE_RUNNING and \
len(self.proc_info[self.curr_proc][PROC_CODE]) > 0:
instruction_to_execute = self.proc_info[self.curr_proc][PROC_CODE].pop(0)
cpu_busy += 1
# OUTPUT: print what everyone is up to
if io_done:
print('%3d*' % clock_tick, end='')
else:
print('%3d ' % clock_tick, end='')
for pid in range(len(self.proc_info)):
if pid == self.curr_proc and instruction_to_execute != '':
print('%14s' % ('RUN:'+instruction_to_execute), end='')
else:
print('%14s' % (self.proc_info[pid][PROC_STATE]), end='')
# CPU output here: if no instruction executes, output a space, otherwise a 1
if instruction_to_execute == '':
print('%14s' % ' ', end='')
else:
print('%14s' % '1', end='')
# IO output here:
num_outstanding = self.get_ios_in_flight(clock_tick)
if num_outstanding > 0:
print('%14s' % str(num_outstanding), end='')
io_busy += 1
else:
print('%10s' % ' ', end='')
print('')
# if this is an IO start instruction, switch to waiting state
# and add an io completion in the future
if instruction_to_execute == DO_IO:
self.move_to_wait(STATE_RUNNING)
self.io_finish_times[self.curr_proc].append(clock_tick + self.io_length + 1)
if self.process_switch_behavior == SCHED_SWITCH_ON_IO:
self.next_proc()
# ENDCASE: check if currently running thing is out of instructions
self.check_if_done()
return (cpu_busy, io_busy, clock_tick)
#
# PARSE ARGUMENTS
#
parser = OptionParser()
parser.add_option('-s', '--seed', default=0, help='the random seed', action='store', type='int', dest='seed')
parser.add_option('-P', '--program', default='', help='more specific controls over programs', action='store', type='string', dest='program')
parser.add_option('-l', '--processlist', default='', help='a comma-separated list of processes to run, in the form X1:Y1,X2:Y2,... where X is the number of instructions that process should run, and Y the chances (from 0 to 100) that an instruction will use the CPU or issue an IO (i.e., if Y is 100, a process will ONLY use the CPU and issue no I/Os; if Y is 0, a process will only issue I/Os)', action='store', type='string', dest='process_list')
parser.add_option('-L', '--iolength', default=5, help='how long an IO takes', action='store', type='int', dest='io_length')
parser.add_option('-S', '--switch', default='SWITCH_ON_IO', help='when to switch between processes: SWITCH_ON_IO, SWITCH_ON_END', action='store', type='string', dest='process_switch_behavior')
parser.add_option('-I', '--iodone', default='IO_RUN_LATER', help='type of behavior when IO ends: IO_RUN_LATER, IO_RUN_IMMEDIATE', action='store', type='string', dest='io_done_behavior')
parser.add_option('-c', help='compute answers for me', action='store_true', default=False, dest='solve')
parser.add_option('-p', '--printstats', help='print statistics at end; only useful with -c flag (otherwise stats are not printed)', action='store_true', default=False, dest='print_stats')
(options, args) = parser.parse_args()
random_seed(options.seed)
assert(options.process_switch_behavior == SCHED_SWITCH_ON_IO or options.process_switch_behavior == SCHED_SWITCH_ON_END)
assert(options.io_done_behavior == IO_RUN_IMMEDIATE or options.io_done_behavior == IO_RUN_LATER)
s = scheduler(options.process_switch_behavior, options.io_done_behavior, options.io_length)
if options.program != '':
for p in options.program.split(':'):
s.load_program(p)
else:
# example process description (10:100,10:100)
for p in options.process_list.split(','):
s.load(p)
assert(options.io_length >= 0)
if options.solve == False:
print('Produce a trace of what would happen when you run these processes:')
for pid in range(s.get_num_processes()):
print('Process %d' % pid)
for inst in range(s.get_num_instructions(pid)):
print(' %s' % s.get_instruction(pid, inst))
print('')
print('Important behaviors:')
print(' System will switch when ', end='')
if options.process_switch_behavior == SCHED_SWITCH_ON_IO:
print('the current process is FINISHED or ISSUES AN IO')
else:
print('the current process is FINISHED')
print(' After IOs, the process issuing the IO will ', end='')
if options.io_done_behavior == IO_RUN_IMMEDIATE:
print('run IMMEDIATELY')
else:
print('run LATER (when it is its turn)')
print('')
exit(0)
(cpu_busy, io_busy, clock_tick) = s.run()
if options.print_stats:
print('')
print('Stats: Total Time %d' % clock_tick)
print('Stats: CPU Busy %d (%.2f%%)' % (cpu_busy, 100.0 * float(cpu_busy)/clock_tick))
print('Stats: IO Busy %d (%.2f%%)' % (io_busy, 100.0 * float(io_busy)/clock_tick))
print('')
# '-l': 'a comma-separated list of processes to run, in the form X1:Y1,X2:Y2,...
# where X is the number of instructions that process should run
# Y the chances (from 0 to 100) that an instruction will use the CPU or issue an IO
# i.e., if Y is 100, a process will ONLY use the CPU and issue no I/Os;
# if Y is 0, a process will only issue I/Os
python process-run.py -l 5:100,5:100 -c -p
Output
Time PID: 0 PID: 1 CPU IOs 1 RUN:cpu READY 1 2 RUN:cpu READY 1 3 RUN:cpu READY 1 4 RUN:cpu READY 1 5 RUN:cpu READY 1 6 DONE RUN:cpu 1 7 DONE RUN:cpu 1 8 DONE RUN:cpu 1 9 DONE RUN:cpu 1 10 DONE RUN:cpu 1 Stats: Total Time 10 Stats: CPU Busy 10 (100.00%) Stats: IO Busy 0 (0.00%)
python process-run.py -l 4:100,1:0 -c -p
Output
Time PID: 0 PID: 1 CPU IOs 1 RUN:cpu READY 1 2 RUN:cpu READY 1 3 RUN:cpu READY 1 4 RUN:cpu READY 1 5 DONE RUN:io 1 6 DONE BLOCKED 1 7 DONE BLOCKED 1 8 DONE BLOCKED 1 9 DONE BLOCKED 1 10 DONE BLOCKED 1 11* DONE RUN:io_done 1 Stats: Total Time 11 Stats: CPU Busy 6 (54.55%) Stats: IO Busy 5 (45.45%)
# switching the order changes the total time taken
python process-run.py -l 1:0,4:100 -c -p
Output
Time PID: 0 PID: 1 CPU IOs 1 RUN:io READY 1 2 BLOCKED RUN:cpu 1 1 3 BLOCKED RUN:cpu 1 1 4 BLOCKED RUN:cpu 1 1 5 BLOCKED RUN:cpu 1 1 6 BLOCKED DONE 1 7* RUN:io_done DONE 1 Stats: Total Time 7 Stats: CPU Busy 6 (85.71%) Stats: IO Busy 5 (71.43%)
# -S determines how system reacts when process issues I/O
python process-run.py -l 1:0,4:100 -c -S SWITCH_ON_END
Output
Time PID: 0 PID: 1 CPU IOs 1 RUN:io READY 1 2 BLOCKED READY 1 3 BLOCKED READY 1 4 BLOCKED READY 1 5 BLOCKED READY 1 6 BLOCKED READY 1 7* RUN:io_done READY 1 8 DONE RUN:cpu 1 9 DONE RUN:cpu 1 10 DONE RUN:cpu 1 11 DONE RUN:cpu 1
python process-run.py -l 1:0,4:100 -c -S SWITCH_ON_IO
Output
Time PID: 0 PID: 1 CPU IOs 1 RUN:io READY 1 2 BLOCKED RUN:cpu 1 1 3 BLOCKED RUN:cpu 1 1 4 BLOCKED RUN:cpu 1 1 5 BLOCKED RUN:cpu 1 1 6 BLOCKED DONE 1 7* RUN:io_done DONE 1
# -I defines the type of behavior when IO ends
python process-run.py -l 3:0,5:100,5:100,5:100 -S SWITCH_ON_IO -I IO_RUN_LATER -c -p
Output
Time PID: 0 PID: 1 PID: 2 PID: 3 CPU IOs 1 RUN:io READY READY READY 1 2 BLOCKED RUN:cpu READY READY 1 1 3 BLOCKED RUN:cpu READY READY 1 1 4 BLOCKED RUN:cpu READY READY 1 1 5 BLOCKED RUN:cpu READY READY 1 1 6 BLOCKED RUN:cpu READY READY 1 1 7* READY DONE RUN:cpu READY 1 8 READY DONE RUN:cpu READY 1 9 READY DONE RUN:cpu READY 1 10 READY DONE RUN:cpu READY 1 11 READY DONE RUN:cpu READY 1 12 READY DONE DONE RUN:cpu 1 13 READY DONE DONE RUN:cpu 1 14 READY DONE DONE RUN:cpu 1 15 READY DONE DONE RUN:cpu 1 16 READY DONE DONE RUN:cpu 1 17 RUN:io_done DONE DONE DONE 1 18 RUN:io DONE DONE DONE 1 19 BLOCKED DONE DONE DONE 1 20 BLOCKED DONE DONE DONE 1 21 BLOCKED DONE DONE DONE 1 22 BLOCKED DONE DONE DONE 1 23 BLOCKED DONE DONE DONE 1 24* RUN:io_done DONE DONE DONE 1 25 RUN:io DONE DONE DONE 1 26 BLOCKED DONE DONE DONE 1 27 BLOCKED DONE DONE DONE 1 28 BLOCKED DONE DONE DONE 1 29 BLOCKED DONE DONE DONE 1 30 BLOCKED DONE DONE DONE 1 31* RUN:io_done DONE DONE DONE 1 Stats: Total Time 31 Stats: CPU Busy 21 (67.74%) Stats: IO Busy 15 (48.39%)
python process-run.py -l 3:0,5:100,5:100,5:100 -S SWITCH_ON_IO -I IO_RUN_IMMEDIATE -c -p
Output
Time PID: 0 PID: 1 PID: 2 PID: 3 CPU IOs 1 RUN:io READY READY READY 1 2 BLOCKED RUN:cpu READY READY 1 1 3 BLOCKED RUN:cpu READY READY 1 1 4 BLOCKED RUN:cpu READY READY 1 1 5 BLOCKED RUN:cpu READY READY 1 1 6 BLOCKED RUN:cpu READY READY 1 1 7* RUN:io_done DONE READY READY 1 8 RUN:io DONE READY READY 1 9 BLOCKED DONE RUN:cpu READY 1 1 10 BLOCKED DONE RUN:cpu READY 1 1 11 BLOCKED DONE RUN:cpu READY 1 1 12 BLOCKED DONE RUN:cpu READY 1 1 13 BLOCKED DONE RUN:cpu READY 1 1 14* RUN:io_done DONE DONE READY 1 15 RUN:io DONE DONE READY 1 16 BLOCKED DONE DONE RUN:cpu 1 1 17 BLOCKED DONE DONE RUN:cpu 1 1 18 BLOCKED DONE DONE RUN:cpu 1 1 19 BLOCKED DONE DONE RUN:cpu 1 1 20 BLOCKED DONE DONE RUN:cpu 1 1 21* RUN:io_done DONE DONE DONE 1 Stats: Total Time 21 Stats: CPU Busy 21 (100.00%) Stats: IO Busy 15 (71.43%)
# -s is the seed
python process-run.py -s 1 -l 3:50,3:50 -c -p
python process-run.py -s 2 -l 3:50,3:50 -c -p
python process-run.py -s 3 -l 3:50,3:50 -c -p
Output
Time PID: 0 PID: 1 CPU IOs 1 RUN:cpu READY 1 2 RUN:io READY 1 3 BLOCKED RUN:cpu 1 1 4 BLOCKED RUN:cpu 1 1 5 BLOCKED RUN:cpu 1 1 6 BLOCKED DONE 1 7 BLOCKED DONE 1 8* RUN:io_done DONE 1 9 RUN:io DONE 1 10 BLOCKED DONE 1 11 BLOCKED DONE 1 12 BLOCKED DONE 1 13 BLOCKED DONE 1 14 BLOCKED DONE 1 15* RUN:io_done DONE 1 Stats: Total Time 15 Stats: CPU Busy 8 (53.33%) Stats: IO Busy 10 (66.67%) Time PID: 0 PID: 1 CPU IOs 1 RUN:io READY 1 2 BLOCKED RUN:cpu 1 1 3 BLOCKED RUN:io 1 1 4 BLOCKED BLOCKED 2 5 BLOCKED BLOCKED 2 6 BLOCKED BLOCKED 2 7* RUN:io_done BLOCKED 1 1 8 RUN:io BLOCKED 1 1 9* BLOCKED RUN:io_done 1 1 10 BLOCKED RUN:io 1 1 11 BLOCKED BLOCKED 2 12 BLOCKED BLOCKED 2 13 BLOCKED BLOCKED 2 14* RUN:io_done BLOCKED 1 1 15 RUN:cpu BLOCKED 1 1 16* DONE RUN:io_done 1 Stats: Total Time 16 Stats: CPU Busy 10 (62.50%) Stats: IO Busy 14 (87.50%) Time PID: 0 PID: 1 CPU IOs 1 RUN:cpu READY 1 2 RUN:io READY 1 3 BLOCKED RUN:io 1 1 4 BLOCKED BLOCKED 2 5 BLOCKED BLOCKED 2 6 BLOCKED BLOCKED 2 7 BLOCKED BLOCKED 2 8* RUN:io_done BLOCKED 1 1 9* RUN:cpu READY 1 10 DONE RUN:io_done 1 11 DONE RUN:io 1 12 DONE BLOCKED 1 13 DONE BLOCKED 1 14 DONE BLOCKED 1 15 DONE BLOCKED 1 16 DONE BLOCKED 1 17* DONE RUN:io_done 1 18 DONE RUN:cpu 1 Stats: Total Time 18 Stats: CPU Busy 9 (50.00%) Stats: IO Busy 11 (61.11%)
System Calls
Upon fork() ing, the child isn’t an exact copy so it has its own address space and the value it returns to the caller is different. (Mandatory read: A fork() in the road: Baumann et al.)
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
int main(int argc, char *argv[]) {
printf("hello world (pid: %d)\n", (int) getpid());
fflush(stdout); // Without flushing, the child was getting copy of stdout buffer
int rc = fork();
if (rc < 0) {
// fork failed
fprintf(stderr, "fork failed\n");
exit(1);
} else if (rc == 0) {
// child (new process)
printf("hello, I am child (pid: %d)\n", (int) getpid());
} else {
// parent goes down this path (main)
printf("hello, I am parent of %d (pid: %d)\n", rc, (int) getpid());
}
return 0;
}
hello world (pid: 86839) hello, I am parent of 86840 (pid: 86839) hello, I am child (pid: 86840)
While the parent receives PID of newly-created child, the child receives rc 0. (Why does the second hello world also refer to the parent Fixed by flushing the output, without this the child process was getting a copy of the buffered pid in its “hello world”?stdout)
man posix_spawn | col -b
Output
POSIX_SPAWN(2) System Calls Manual POSIX_SPAWN(2)
NAME
posix_spawn posix_spawnp – spawn a process
SYNOPSIS
#include <spawn.h>
int
posix_spawn(pid_t *restrict pid, const char *restrict path,
const posix_spawn_file_actions_t *file_actions,
const posix_spawnattr_t *restrict attrp, char *const argv[restrict],
char *const envp[restrict]);
int
posix_spawnp(pid_t *restrict pid, const char *restrict file,
const posix_spawn_file_actions_t *file_actions,
const posix_spawnattr_t *restrict attrp, char *const argv[restrict],
char *const envp[restrict]);
DESCRIPTION
The posix_spawn() function creates a new process from the executable
file, called the new process file, specified by path, which is an
absolute or relative path to the file. The posix_spawnp() function is
identical to the posix_spawn() function if the file specified contains a
slash character; otherwise, the file parameter is used to construct a
pathname, with its path prefix being obtained by a search of the path
specified in the environment by the “PATH variable”. If this variable
isn't specified, the default path is set according to the _PATH_DEFPATH
definition in <paths.h>, which is set to “/usr/bin:/bin”. This pathname
either refers to an executable object file, or a file of data for an
interpreter; execve(2) for more details.
The argument pid is a pointer to a pid_t variable to receive the pid of
the spawned process; if this is NULL, then the pid of the spawned process
is not returned. If this pointer is non-NULL, then on successful
completion, the variable will be modified to contain the pid of the
spawned process. The value is undefined in the case of a failure.
The argument file_actions is either NULL, or it is a pointer to a file
actions object that was initialized by a call to
posix_spawn_file_actions_init(3) and represents zero or more file
actions.
File descriptors open in the calling process image remain open in the new
process image, except for those for which the close-on-exec flag is set
(see close(2) and fcntl(2)). Descriptors that remain open are unaffected
by posix_spawn() unless their behaviour is modified by particular spawn
flags or a file action; see posix_spawnattr_setflags(3) and
posix_spawn_file_actions_init(3) for additional information.
The argument attrp is either NULL, or it is a pointer to an attributes
object that was initialized by a call to posix_spawnattr_init(3) and
represents a set of spawn attributes to apply. If NULL, then the default
attributes are applied; otherwise, these attributes can control various
aspects of the spawned process, and are applied prior to the spawned
process beginning execution; see posix_spawnattr_init(3) for more
information.
The argument argv is a pointer to a null-terminated array of character
pointers to null-terminated character strings. These strings construct
the argument list to be made available to the new process. At least
argv[0] must be present in the array, and should contain the file name of
the program being spawned, e.g. the last component of the path or file
argument.
The argument envp is a pointer to a null-terminated array of character
pointers to null-terminated strings. A pointer to this array is normally
stored in the global variable environ. These strings pass information to
the new process that is not directly an argument to the command (see
environ(7)).
Signals set to be ignored in the calling process are set to be ignored in
the new process, unless the behaviour is modified by user specified spawn
attributes. Signals which are set to be caught in the calling process
image are set to default action in the new process image. Blocked
signals remain blocked regardless of changes to the signal action, unless
the mask is overridden by user specified spawn attributes. The signal
stack is reset to be undefined (see sigaction(2) for more information).
By default, the effective user ID and group ID will be the same as those
of the calling process image; however, this may be overridden to force
them to be the real user ID and group ID of the parent process by user
specified spawn attributes (see posix_spawnattr_init(3) for more
information).
If the set-user-ID mode bit of the new process image file is set (see
chmod(2)), the effective user ID of the new process image is set to the
owner ID of the new process image file. If the set-group-ID mode bit of
the new process image file is set, the effective group ID of the new
process image is set to the group ID of the new process image file. (The
effective group ID is the first element of the group list.) The real
user ID, real group ID and supplementary group IDs of the new process
image remain the same as the calling process image. After any set-user-
ID and set-group-ID processing, the effective user ID is recorded as the
saved set-user-ID, and the effective group ID is recorded as the saved
set-group-ID. These values may be used in changing the effective IDs
later (see setuid(2)).
The new process also inherits the following attributes from the calling
process:
parent process ID see getppid(2)
process group ID see getpgrp(2), posix_spawnattr_init(3)
access groups see getgroups(2)
working directory see chdir(2)
root directory see chroot(2)
control terminal see termios(4)
resource usages see getrusage(2)
interval timers see getitimer(2)
resource limits see getrlimit(2)
file mode mask see umask(2)
signal mask see sigaction(2), sigsetmask(2),
posix_spawnattr_init(3)
When a program is executed as a result of a posix_spawn() or
posix_spawnp() call, it is entered as follows:
int
main(int argc, char **argv, char **envp);
where argc is the number of elements in argv (the ``arg count'') and argv
points to the array of character pointers to the arguments themselves.
RETURN VALUES
If the pid argument is NULL, no pid is returned to the calling process;
if it is non-NULL, then posix_spawn() and posix_spawnp() functions return
the process ID of the child process into the pid_t variable pointed to by
the pid argument and return a 0 on success. If an error occurs, they
return a non-zero error code as the function return value, and no child
process is created.
ERRORS
The posix_spawn() and posix_spawnp() functions will fail and return to
the calling process if:
[EINVAL] The value specified by file_actions or attrp is
invalid.
[E2BIG] The number of bytes in the new process's argument list
is larger than the system-imposed limit. This limit
is specified by the sysctl(3) MIB variable
KERN_ARGMAX.
[EACCES] Search permission is denied for a component of the
path prefix.
[EACCES] The new process file is not an ordinary file.
[EACCES] The new process file mode denies execute permission.
[EACCES] The new process file is on a filesystem mounted with
execution disabled (MNT_NOEXEC in ⟨sys/mount.h⟩).
[EFAULT] The new process file is not as long as indicated by
the size values in its header.
[EFAULT] Path, argv, or envp point to an illegal address.
[EIO] An I/O error occurred while reading from the file
system.
[ELOOP] Too many symbolic links were encountered in
translating the pathname. This is taken to be
indicative of a looping symbolic link.
[ENAMETOOLONG] A component of a pathname exceeded {NAME_MAX}
characters, or an entire path name exceeded {PATH_MAX}
characters.
[ENOENT] The new process file does not exist.
[ENOEXEC] The new process file has the appropriate access
permission, but has an unrecognized format (e.g., an
invalid magic number in its header).
[ENOMEM] The new process requires more virtual memory than is
allowed by the imposed maximum (getrlimit(2)).
[ENOTDIR] A component of the path prefix is not a directory.
[ETXTBSY] The new process file is a pure procedure (shared text)
file that is currently open for writing or reading by
some process.
[EBADARCH] The new process file has no architectures appropriate
for the current system.
[EBADF] Bad file descriptor for one or more file_actions.
Additionally, they may fail for any of the reasons listed in fork(2) or
exec(3).
CAVEAT
If a program is setuid to a non-super-user, but is executed when the real
uid is ``root'', then the program has some of the powers of a super-user
as well.
SEE ALSO
exit(2), fork(2), execl(3), sysctl(3), environ(7),
posix_spawnattr_init(3), posix_spawn_file_actions_init(3),
STANDARDS
Version 3 of the Single UNIX Specification (“SUSv3”) [SPN]
HISTORY
The posix_spawn() and posix_spawnp() function calls appeared in Version 3
of the Single UNIX Specification (“SUSv3”) [SPN].
Mac OS X November 2, 2010 Mac OS X
Sometimes its useful for parent to wait() for child process to finish. In the below program, the child will print first, thus introducing a bit of determinism.
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>
int main(int argc, char *argv[]) {
printf("hello world (pid: %d)\n", (int) getpid());
fflush(stdout); // Without flushing, the child was getting copy of stdout buffer
int rc = fork();
if (rc < 0) { // fork failed; exit
fprintf(stderr, "fork failed\n");
exit(1);
} else if (rc == 0) { // child (new process)
printf("hello, I am child (pid: %d)\n", (int) getpid());
} else { // parent goes down this path (main)
int rc_wait = wait(NULL);
printf("hello, I am parent of %d (rc_wait: %d) (pid: %d)\n",
rc, rc_wait, (int) getpid());
}
return 0;
}
hello world (pid: 86584) hello, I am child (pid: 86586) hello, I am parent of 86586 (rc_wait: 86586) (pid: 86584)
The exec() syscall if useful when one wants to run a program different from the calling program. Here execvp() runs wc which shows how many lines, words and bytes are in process-run.py.
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <sys/wait.h>
int main(int argc, char *argv[]) {
printf("hello world (pid: %d)\n", (int) getpid());
fflush(stdout); // Without flushing, the child was getting copy of stdout buffer
int rc = fork();
if (rc < 0) { // fork failed; exit
fprintf(stderr, "fork failed\n");
exit(1);
} else if (rc == 0) { // child (new process)
printf("hello, I am child (pid: %d)\n", (int) getpid());
fflush(stdout); // Always flush!
char *myargs[3];
myargs[0] = strdup("wc");
myargs[1] = strdup("process-run.py");
myargs[2] = NULL; // marks end of array
execvp(myargs[0], myargs); // runs `wc`
} else { // parent goes down this path (main)
int rc_wait = wait(NULL);
printf("hello, I am parent of %d (rc_wait: %d) (pid: %d)\n",
rc, rc_wait, (int) getpid());
}
return 0;
}
hello world (pid: 2019) hello, I am child (pid: 2020) 342 1174 13351 process-run.py hello, I am parent of 2020 (rc_wait: 2020) (pid: 2019)
Here the exec() call, loads code (and static data) from the executable passed to it and overwrites its current code segment (and current static data) with the loaded one; the heap/stack and other parts of memory space of program are re-initialized and the OS just runs the new program. It doesn’t even create a new process, just transforms the existing one (fork_wait_exec_p3) into the exec’d program (wc).
man execve | col -b
Output
EXECVE(2) System Calls Manual EXECVE(2)
NAME
execve – execute a file
SYNOPSIS
#include <unistd.h>
int
execve(const char *path, char *const argv[], char *const envp[]);
DESCRIPTION
execve() transforms the calling process into a new process. The new
process is constructed from an ordinary file, whose name is pointed to by
path, called the new process file. This file is either an executable
object file, or a file of data for an interpreter. An executable object
file consists of an identifying header, followed by pages of data
representing the initial program (text) and initialized data pages.
Additional pages may be specified by the header to be initialized with
zero data; see a.out(5).
An interpreter file begins with a line of the form:
#! interpreter [arg ...]
When an interpreter file is execve()'d, the system runs the specified
interpreter. If any optional args are specified, they become the first
(second, ...) argument to the interpreter. The name of the originally
execve()'d file becomes the subsequent argument; otherwise, the name of
the originally execve()'d file is the first argument. The original
arguments to the invocation of the interpreter are shifted over to become
the final arguments. The zeroth argument, normally the name of the
execve()'d file, is left unchanged.
The argument argv is a pointer to a null-terminated array of character
pointers to null-terminated character strings. These strings construct
the argument list to be made available to the new process. At least one
argument must be present in the array; by custom, the first element
should be the name of the executed program (for example, the last
component of path).
The argument envp is also a pointer to a null-terminated array of
character pointers to null-terminated strings. A pointer to this array
is normally stored in the global variable environ. These strings pass
information to the new process that is not directly an argument to the
command (see environ(7)).
File descriptors open in the calling process image remain open in the new
process image, except for those for which the close-on-exec flag is set
(see close(2) and fcntl(2)). Descriptors that remain open are unaffected
by execve().
Signals set to be ignored in the calling process are set to be ignored in
the new process. Signals which are set to be caught in the calling
process image are set to default action in the new process image.
Blocked signals remain blocked regardless of changes to the signal
action. The signal stack is reset to be undefined (see sigaction(2) for
more information).
If the set-user-ID mode bit of the new process image file is set (see
chmod(2)), the effective user ID of the new process image is set to the
owner ID of the new process image file. If the set-group-ID mode bit of
the new process image file is set, the effective group ID of the new
process image is set to the group ID of the new process image file. (The
effective group ID is the first element of the group list.) The real
user ID, real group ID and other group IDs of the new process image
remain the same as the calling process image. After any set-user-ID and
set-group-ID processing, the effective user ID is recorded as the saved
set-user-ID, and the effective group ID is recorded as the saved set-
group-ID. These values may be used in changing the effective IDs later
(see setuid(2)).
The new process also inherits the following attributes from the calling
process:
process ID see getpid(2)
parent process ID see getppid(2)
process group ID see getpgrp(2)
access groups see getgroups(2)
working directory see chdir(2)
root directory see chroot(2)
control terminal see termios(4)
resource usages see getrusage(2)
interval timers see getitimer(2)
resource limits see getrlimit(2)
file mode mask see umask(2)
signal mask see sigaction(2), sigsetmask(2)
When a program is executed as a result of an execve() call, it is entered
as follows:
int
main(int argc, char **argv, char **envp);
where argc is the number of elements in argv (the ``arg count'') and argv
points to the array of character pointers to the arguments themselves.
RETURN VALUES
As the execve() function overlays the current process image with a new
process image, the successful call has no process to return to. If
execve() does return to the calling process, an error has occurred; the
return value will be -1 and the global variable errno is set to indicate
the error.
ERRORS
execve() will fail and return to the calling process if:
[E2BIG] The number of bytes in the new process's argument list
is larger than the system-imposed limit. This limit
is specified by the sysctl(3) MIB variable
KERN_ARGMAX.
[EACCES] Search permission is denied for a component of the
path prefix.
[EACCES] The new process file is not an ordinary file.
[EACCES] The new process file mode denies execute permission.
[EACCES] The new process file is on a filesystem mounted with
execution disabled (MNT_NOEXEC in ⟨sys/mount.h⟩).
[EFAULT] The new process file is not as long as indicated by
the size values in its header.
[EFAULT] Path, argv, or envp point to an illegal address.
[EIO] An I/O error occurred while reading from the file
system.
[ELOOP] Too many symbolic links were encountered in
translating the pathname. This is taken to be
indicative of a looping symbolic link.
[ENAMETOOLONG] A component of a pathname exceeded {NAME_MAX}
characters, or an entire path name exceeded {PATH_MAX}
characters.
[ENOENT] The new process file does not exist.
[ENOEXEC] The new process file has the appropriate access
permission, but has an unrecognized format (e.g., an
invalid magic number in its header).
[ENOMEM] The new process requires more virtual memory than is
allowed by the imposed maximum (getrlimit(2)).
[ENOTDIR] A component of the path prefix is not a directory.
[ETXTBSY] The new process file is a pure procedure (shared text)
file that is currently open for writing or reading by
some process.
CAVEAT
If a program is setuid to a non-super-user, but is executed when the real
uid is ``root'', then the program has some of the powers of a super-user
as well.
SEE ALSO
exit(2), fork(2), execl(3), sysctl(3), environ(7)
HISTORY
The execve() function call appeared in 4.2BSD.
BSD 4 January 24, 1994 BSD 4
The separation and interface of fork() and exec() allows the shell to run code after forking, but before execing; giving it the ability to alter the environment of the program that is about to run. What happens in a shell is that after getting the command, it finds out where the executable resides, fork() s a new child process to run the command, runs exec() for the command and then wait() s for it to return. To illustrate: in wc redir.c > newfile.txt, the output redirection happens by the shell closing stdout and opens newfile.txt before execing.
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <fcntl.h>
#include <sys/wait.h>
int main(int argc, char *argv[]) {
int rc = fork();
if (rc < 0) { // fork failed; exit
fprintf(stderr, "fork failed\n");
exit(1);
} else if (rc == 0) { // child (new process)
// child: redirect standard output to a file
close(STDOUT_FILENO);
open("./newfile.txt", O_CREAT|O_WRONLY|O_TRUNC, S_IRWXU);
char *myargs[3];
myargs[0] = strdup("wc");
myargs[1] = strdup("redir.c");
myargs[2] = NULL; // marks end of array
execvp(myargs[0], myargs); // runs `wc`
} else { // parent goes down this path (main)
int rc_wait = wait(NULL);
}
return 0;
}
gcc -o redir redir.c
./redir
wait
cat newfile.txt
rm redir newfile.txt
27 97 786 redir.c
Other syscalls, such as kill() is used to signal to a process to pause, die etc. Some shell keys also sends these signals, for example: C-c sends SIGINT (interrupt, normal termination), C-z sends SIGSTP (stop, can resume with fg). This signal() subsystem allows for sending external events to processes. The process must “catch” these signals, suspend normal execution and run some code in response.
Homework
fork.py
#! /usr/bin/env python3
# from this sort of thing
# a forks b
# a forks c
# a forks d
# b forks e
# b forks f
# d forks g
# to a process tree
#a --- b --- e
# | |
# | |- f
# |- c
# |
# |- d --- g
from __future__ import print_function
import random
import string
from optparse import OptionParser
#
# to make Python2 and Python3 act the same -- how dumb
#
def random_seed(seed):
try:
random.seed(seed, version=1)
except:
random.seed(seed)
return
def random_randint(low, hi):
return int(low + random.random() * (hi - low + 1))
def random_choice(L):
return L[random_randint(0, len(L)-1)]
#
# class Forker does all the work
#
class Forker:
def __init__(self, fork_percentage, actions, action_list, show_tree, just_final,
leaf_only, local_reparent, print_style, solve):
self.fork_percentage = fork_percentage
self.max_actions = actions
self.action_list = action_list
self.show_tree = show_tree
self.just_final = just_final
self.leaf_only = leaf_only
self.local_reparent = local_reparent
self.print_style = print_style
self.solve = solve
# root process is always this
self.root_name = 'a'
# process list: names of all active processes
self.process_list = [self.root_name]
# for each process, it's children
self.children = {}
self.children[self.root_name] = []
# and track parents
self.parents = {}
# this is for pretty process names...
self.name_length = 1
self.base_names = string.ascii_lowercase + string.ascii_uppercase
self.curr_names = self.base_names
self.curr_index = 1
return
def grow_names(self):
new_names = []
for b1 in self.curr_names:
for b2 in self.base_names:
new_names.append(b1 + b2)
self.curr_names = new_names
self.curr_index = 0
return
def get_name(self):
if self.curr_index == len(self.curr_names):
self.grow_names()
name = self.curr_names[self.curr_index]
self.curr_index += 1
return name
def walk(self, p, level, pmask, is_last):
print(' ', end='')
if self.print_style == 'basic':
for i in range(level):
print(' ', end='')
print('%2s' % p)
for child in self.children[p]:
self.walk(child, level + 1, {}, False)
return
elif self.print_style == 'line1':
chars = ('|', '-', '+', '|')
elif self.print_style == 'line2':
chars = ('|', '_', '|', '|')
elif self.print_style == 'fancy':
# these characters taken from 'treelib', a fun printing package for trees
# https://github.com/caesar0301/treelib
# chars = ('\u2502', '\u2500', '\u251c', '\u2514')
chars = (u'\u2502', u'\u2500', u'\u251c', u'\u2514')
else:
print('bad style %s' % self.print_style)
exit(1)
# print stuff before node
if level > 0:
# main printing
for i in range(level-1):
if pmask[i]:
# '| '
print('%s ' % chars[0], end='')
else:
print(' ', end='')
if pmask[level-1]:
# '|__'
if is_last:
print('%s%s%s ' % (chars[3], chars[1], chars[1]), end='')
else:
print('%s%s%s ' % (chars[2], chars[1], chars[1]), end='')
else:
# '___'
print(' %s%s%s ' % (chars[1], chars[1], chars[1]), end='')
# print node
print('%s' % p)
# undo parent verticals
if is_last:
pmask[level-1] = False
# recurse
pmask[level] = True
for child in self.children[p][:-1]:
self.walk(child, level + 1, pmask, False)
for child in self.children[p][-1:]:
self.walk(child, level + 1, pmask, True)
return
def print_tree(self):
return self.walk(self.root_name, 0, {}, False)
def do_fork(self, p, c):
self.process_list.append(c)
self.children[c] = []
self.children[p].append(c)
self.parents[c] = p
return '%s forks %s' % (p, c)
def collect_children(self, p):
if self.children[p] == []:
return [p]
else:
L = [p]
for c in self.children[p]:
L += self.collect_children(c)
return L
def do_exit(self, p):
# remove the process from the process list
if p == self.root_name:
print('root process: cannot exit')
exit(1)
exit_parent = self.parents[p]
self.process_list.remove(p)
# for each orphan, set its parent to exiting proc's parent or root
if self.local_reparent:
for orphan in self.children[p]:
self.parents[orphan] = exit_parent
self.children[exit_parent].append(orphan)
else:
# should set ALL descendants to be child of ROOT
descendents = self.collect_children(p)
descendents.remove(p)
for d in descendents:
self.children[d] = []
self.parents[d] = self.root_name
self.children[self.root_name].append(d)
# remove the entry from its parent child list
self.children[exit_parent].remove(p)
self.children[p] = -1 # should never be used again
self.parents[p] = -1 # should never be used again
# remove the entry for this proc from children
return '%s EXITS' % p
def bad_action(self, action):
print('bad action (%s), must be X+Y or X- where X and Y are processes' % action)
exit(1)
return
def check_legal(self, action):
if '+' in action:
tmp = action.split('+')
if len(tmp) != 2:
self.bad_action(action)
return [tmp[0], tmp[1]]
elif '-' in action:
tmp = action.split('-')
if len(tmp) != 2:
self.bad_action(action)
return [tmp[0]]
else:
self.bad_action(action)
return
def run(self):
print(' Process Tree:')
self.print_tree()
print('')
if self.action_list != '':
# use specific action list
action_list = self.action_list.split(',')
else:
action_list = []
actions = 0
temp_process_list = [self.root_name]
level_list = {}
level_list[self.root_name] = 1
# this got a little yucky, too much re-doing of the work
while actions < self.max_actions:
if random.random() < self.fork_percentage:
# FORK:: pick random parent, add child to it
fork_choice = random_choice(temp_process_list)
new_child = self.get_name()
action_list.append('%s+%s' % (fork_choice, new_child))
temp_process_list.append(new_child)
else:
# EXIT:: pick random child, remove it
# exception: no killing root process, sorry
exit_choice = random_choice(temp_process_list)
if exit_choice == self.root_name:
continue
temp_process_list.remove(exit_choice)
action_list.append('%s-' % exit_choice)
actions += 1
for a in action_list:
tmp = self.check_legal(a)
if len(tmp) == 2:
fork_choice, new_child = tmp[0], tmp[1]
if fork_choice not in self.process_list:
self.bad_action(a)
action = self.do_fork(fork_choice, new_child)
else:
exit_choice = tmp[0]
if exit_choice not in self.process_list:
self.bad_action(a)
if self.leaf_only and len(self.children[exit_choice]) > 0:
action = '%s EXITS (failed: has children)' % exit_choice
else:
action = self.do_exit(exit_choice)
# if we got here, we actually did an action...
if self.show_tree:
# SHOW TREES (guess actions)
if self.solve:
print('Action:', action)
else:
print('Action?')
# print('Process Tree:')
if not self.just_final:
self.print_tree()
else:
# SHOW ACTIONS (guess tree)
print('Action:', action)
if not self.just_final:
if self.solve:
# print('Process Tree:')
self.print_tree()
else:
print('Process Tree?')
if self.just_final:
if self.show_tree:
print('\n Final Process Tree:')
self.print_tree()
print('')
else:
if self.solve:
print('\n Final Process Tree:')
self.print_tree()
print('')
else:
print('\n Final Process Tree?\n')
return
#
# main
#
parser = OptionParser()
parser.add_option('-s', '--seed', default=-1, help='the random seed', action='store', type='int', dest='seed')
parser.add_option('-f', '--forks', default=0.7, help='percent of actions that are forks (not exits)', action='store', type='float', dest='fork_percentage')
parser.add_option('-A', '--action_list', default='', help='action list, instead of randomly generated ones (format: a+b,b+c,b- means a fork b, b fork c, b exit)', action='store', type='string', dest='action_list')
parser.add_option('-a', '--actions', default=5, help='number of forks/exits to do', action='store', type='int', dest='actions')
parser.add_option('-t', '--show_tree', help='show tree (not actions)', action='store_true', default=False, dest='show_tree')
parser.add_option('-P', '--print_style', help='tree print style (basic, line1, line2, fancy)', action='store', type='string', default='fancy', dest='print_style')
parser.add_option('-F', '--final_only', help='just show final state', action='store_true', default=False, dest='just_final')
parser.add_option('-L', '--leaf_only', help='only leaf processes exit', action='store_true', default=False, dest='leaf_only')
parser.add_option('-R', '--local_reparent', help='reparent to local parent', action='store_true', default=False, dest='local_reparent')
parser.add_option('-c', '--compute', help='compute answers for me', action='store_true', default=False, dest='solve')
(options, args) = parser.parse_args()
if options.seed != -1:
random_seed(options.seed)
if options.fork_percentage <= 0.001:
print('fork_percentage must be > 0.001')
exit(1)
print('')
print('ARG seed', options.seed)
print('ARG fork_percentage', options.fork_percentage)
print('ARG actions', options.actions)
print('ARG action_list', options.action_list)
print('ARG show_tree', options.show_tree)
print('ARG just_final', options.just_final)
print('ARG leaf_only', options.leaf_only)
print('ARG local_reparent', options.local_reparent)
print('ARG print_style', options.print_style)
print('ARG solve', options.solve)
print('')
f = Forker(options.fork_percentage, options.actions, options.action_list, options.show_tree, options.just_final, options.leaf_only, options.local_reparent, options.print_style, options.solve)
f.run()
# -s is for specifying the seed
python fork.py -s 10 -c
Output
ARG seed 10
ARG fork_percentage 0.7
ARG actions 5
ARG action_list
ARG show_tree False
ARG just_final False
ARG leaf_only False
ARG local_reparent False
ARG print_style fancy
ARG solve True
Process Tree:
a
Action: a forks b
a
└── b
Action: a forks c
a
├── b
└── c
Action: c EXITS
a
└── b
Action: a forks d
a
├── b
└── d
Action: a forks e
a
├── b
├── d
└── e
# -f is for controlling the fork percentage
python fork.py -a 20 -f 0.1 -c
echo "------------------------"
python fork.py -a 20 -f 0.9 -c
Output
ARG seed -1
ARG fork_percentage 0.1
ARG actions 20
ARG action_list
ARG show_tree False
ARG just_final False
ARG leaf_only False
ARG local_reparent False
ARG print_style fancy
ARG solve True
Process Tree:
a
Action: a forks b
a
└── b
Action: b EXITS
a
Action: a forks c
a
└── c
Action: c forks d
a
└── c
└── d
Action: d EXITS
a
└── c
Action: c EXITS
a
Action: a forks e
a
└── e
Action: e EXITS
a
Action: a forks f
a
└── f
Action: f EXITS
a
Action: a forks g
a
└── g
Action: g EXITS
a
Action: a forks h
a
└── h
Action: h EXITS
a
Action: a forks i
a
└── i
Action: i EXITS
a
Action: a forks j
a
└── j
Action: j EXITS
a
Action: a forks k
a
└── k
Action: k EXITS
a
------------------------
ARG seed -1
ARG fork_percentage 0.9
ARG actions 20
ARG action_list
ARG show_tree False
ARG just_final False
ARG leaf_only False
ARG local_reparent False
ARG print_style fancy
ARG solve True
Process Tree:
a
Action: a forks b
a
└── b
Action: b forks c
a
└── b
└── c
Action: a forks d
a
├── b
│ └── c
└── d
Action: c forks e
a
├── b
│ └── c
│ └── e
└── d
Action: c forks f
a
├── b
│ └── c
│ ├── e
│ └── f
└── d
Action: c forks g
a
├── b
│ └── c
│ ├── e
│ ├── f
│ └── g
└── d
Action: b forks h
a
├── b
│ ├── c
│ │ ├── e
│ │ ├── f
│ │ └── g
│ └── h
└── d
Action: a forks i
a
├── b
│ ├── c
│ │ ├── e
│ │ ├── f
│ │ └── g
│ └── h
├── d
└── i
Action: c forks j
a
├── b
│ ├── c
│ │ ├── e
│ │ ├── f
│ │ ├── g
│ │ └── j
│ └── h
├── d
└── i
Action: j forks k
a
├── b
│ ├── c
│ │ ├── e
│ │ ├── f
│ │ ├── g
│ │ └── j
│ │ └── k
│ └── h
├── d
└── i
Action: e forks l
a
├── b
│ ├── c
│ │ ├── e
│ │ │ └── l
│ │ ├── f
│ │ ├── g
│ │ └── j
│ │ └── k
│ └── h
├── d
└── i
Action: j EXITS
a
├── b
│ ├── c
│ │ ├── e
│ │ │ └── l
│ │ ├── f
│ │ └── g
│ └── h
├── d
├── i
└── k
Action: i forks m
a
├── b
│ ├── c
│ │ ├── e
│ │ │ └── l
│ │ ├── f
│ │ └── g
│ └── h
├── d
├── i
│ └── m
└── k
Action: g forks n
a
├── b
│ ├── c
│ │ ├── e
│ │ │ └── l
│ │ ├── f
│ │ └── g
│ │ └── n
│ └── h
├── d
├── i
│ └── m
└── k
Action: h forks o
a
├── b
│ ├── c
│ │ ├── e
│ │ │ └── l
│ │ ├── f
│ │ └── g
│ │ └── n
│ └── h
│ └── o
├── d
├── i
│ └── m
└── k
Action: i forks p
a
├── b
│ ├── c
│ │ ├── e
│ │ │ └── l
│ │ ├── f
│ │ └── g
│ │ └── n
│ └── h
│ └── o
├── d
├── i
│ ├── m
│ └── p
└── k
Action: g forks q
a
├── b
│ ├── c
│ │ ├── e
│ │ │ └── l
│ │ ├── f
│ │ └── g
│ │ ├── n
│ │ └── q
│ └── h
│ └── o
├── d
├── i
│ ├── m
│ └── p
└── k
Action: a forks r
a
├── b
│ ├── c
│ │ ├── e
│ │ │ └── l
│ │ ├── f
│ │ └── g
│ │ ├── n
│ │ └── q
│ └── h
│ └── o
├── d
├── i
│ ├── m
│ └── p
├── k
└── r
Action: b forks s
a
├── b
│ ├── c
│ │ ├── e
│ │ │ └── l
│ │ ├── f
│ │ └── g
│ │ ├── n
│ │ └── q
│ ├── h
│ │ └── o
│ └── s
├── d
├── i
│ ├── m
│ └── p
├── k
└── r
Action: n forks t
a
├── b
│ ├── c
│ │ ├── e
│ │ │ └── l
│ │ ├── f
│ │ └── g
│ │ ├── n
│ │ │ └── t
│ │ └── q
│ ├── h
│ │ └── o
│ └── s
├── d
├── i
│ ├── m
│ └── p
├── k
└── r
# -t is for showing the process tree and not actions
python fork.py -t
Output
ARG seed -1
ARG fork_percentage 0.7
ARG actions 5
ARG action_list
ARG show_tree True
ARG just_final False
ARG leaf_only False
ARG local_reparent False
ARG print_style fancy
ARG solve False
Process Tree:
a
Action?
a
└── b
Action?
a
├── b
└── c
Action?
a
├── b
└── c
└── d
Action?
a
├── b
└── c
Action?
a
└── c
# -A action list: a+b => a fork b; b- => b exit
python fork.py -A a+b,b+c,c+d,c+e,c- -c
# even though c exits, d & e are running with a as parent
# -R reparents to local parent i.e. b
python fork.py -A a+b,b+c,c+d,c+e,c- -R -c
Output
ARG seed -1
ARG fork_percentage 0.7
ARG actions 5
ARG action_list a+b,b+c,c+d,c+e,c-
ARG show_tree False
ARG just_final False
ARG leaf_only False
ARG local_reparent False
ARG print_style fancy
ARG solve True
Process Tree:
a
Action: a forks b
a
└── b
Action: b forks c
a
└── b
└── c
Action: c forks d
a
└── b
└── c
└── d
Action: c forks e
a
└── b
└── c
├── d
└── e
Action: c EXITS
a
├── b
├── d
└── e
ARG seed -1
ARG fork_percentage 0.7
ARG actions 5
ARG action_list a+b,b+c,c+d,c+e,c-
ARG show_tree False
ARG just_final False
ARG leaf_only False
ARG local_reparent True
ARG print_style fancy
ARG solve True
Process Tree:
a
Action: a forks b
a
└── b
Action: b forks c
a
└── b
└── c
Action: c forks d
a
└── b
└── c
└── d
Action: c forks e
a
└── b
└── c
├── d
└── e
Action: c EXITS
a
└── b
├── d
└── e
# -F skip intermediate steps, show final tree
python fork.py -t -F
Output
ARG seed -1
ARG fork_percentage 0.7
ARG actions 5
ARG action_list
ARG show_tree True
ARG just_final True
ARG leaf_only False
ARG local_reparent False
ARG print_style fancy
ARG solve False
Process Tree:
a
Action?
Action?
Action?
Action?
Action?
Final Process Tree:
a
├── b
└── d
└── e
Code Homework
Write a program that calls fork while setting a variable value, what happens to variable when both child and parent change this value?
p1.C
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
int main(int argc, char *argv[]) {
int x = 100;
printf("hello, x is '%d' (pid: %d)\n", x, (int) getpid());
fflush(stdout);
int rc = fork();
x += 5;
if (rc < 0) { // fork failed
fprintf(stderr, "fork failed\n");
exit(1);
} else if (rc == 0) { // child (new process)
x -= 10;
printf("hello, I am child, x is '%d' (pid: %d)\n", x, (int) getpid());
} else { // parent goes down this path (main)
printf("hello, I am parent of %d, x is '%d' (pid: %d)\n", rc, x, (int) getpid());
}
return 0;
}
hello, x is '100' (pid: 90319) hello, I am parent of 90320, x is '105' (pid: 90319) hello, I am child, x is '95' (pid: 90320)
Write a program that opens a file and then forks. Can child and parent access fd returned by open()? What happens when both write to file concurrently? How to ensure child process prints before parent, can it be done without using wait()?
p2-p3.C
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/types.h>
#include <assert.h>
int main(int argc, char *argv[]) {
int fd = open("/tmp/file_ostep", O_WRONLY|O_CREAT|O_TRUNC, S_IRWXU);
assert(fd > -1);
printf("hello, fd is '%d' (pid: %d)\n", fd, (int) getpid());
fflush(stdout);
int rc = fork();
if (rc < 0) {
fprintf(stderr, "fork failed\n");
exit(1);
} else if (rc == 0) {
int rcw = write(fd, "hello from child\n", 17); // 17 = nbyte
assert(rcw == 17);
printf("hello, I am child, rc is '%d', rcw is '%d' (pid: %d)\n",
rc, rcw, (int) getpid());
} else {
// wait(NULL);
sleep(2); // To ensure child writes first without using `wait()`
int rcw = write(fd, "goodbye from parent\n", 20); // 20 = nbyte
assert(rcw == 20);
printf("hello, I am parent of %d, rcw is '%d', (pid: %d)\n",
rc, rcw, (int) getpid());
}
close(fd);
return 0;
}
hello, fd is '3' (pid: 18508) hello, I am child, rc is '0', rcw is '17' (pid: 18509) hello, I am parent of 18509, rcw is '20', (pid: 18508)
cat /tmp/file_ostep
hello from child goodbye from parent
Write program that forks, then calls various variants of exec() to run /bin/ls.
p4.C
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
int main(int argc, char *argv[]) {
int rc = fork();
if (rc < 0) {
fprintf(stderr, "fork failed\n");
exit(1);
} else if (rc == 0) {
printf("hello, I am child, rc is '%d' (pid: %d)\n", rc, (int) getpid());
char * args[] = {"ls", "-l", NULL}; // List of pointers to null-terminated strings
char * env[] = {NULL}; // Environment of exectured process
execl("/bin/ls", "ls", "-l", (char *) 0);
execv("/bin/ls", args);
execle("/bin/ls", "ls", "-l", (char *) 0, env);
execve("/bin/ls", args, env);
execlp("ls", "ls", "-l", (char *) 0);
execvp("ls", args);
} else {
printf("hello, I am parent of %d, (pid: %d)\n", rc, (int) getpid());
}
return 0;
}
hello, I am parent of 29461, (pid: 29460) total 352 -rw-r--r--@ 1 IABDCEF staff 12109 2 Mar 23:35 fork.py -rw-r--r--@ 1 IABDCEF staff 14194 19 Feb 01:52 intro.org -rwx------@ 1 IABDCEF staff 20 2 Mar 17:59 newfile.txt -rw-r--r--@ 1 IABDCEF staff 13351 2 Mar 12:14 process-run.py -rw-r--r--@ 1 IABDCEF staff 88866 3 Mar 02:33 process.org -rwxr-xr-x@ 1 IABDCEF staff 33896 2 Mar 17:59 redir -rw-r--r--@ 1 IABDCEF staff 786 2 Mar 17:59 redir.c