OSTEP - Book Club
Schedule: https://eatonphil.com/2026-ostep.html
Homepage: https://pages.cs.wisc.edu/~remzi/OSTEP/ Code: https://github.com/remzi-arpacidusseau/ostep-code Homework: https://github.com/remzi-arpacidusseau/ostep-homework Projects: https://github.com/remzi-arpacidusseau/ostep-projects
Introduction
In a Socratic fashion, the first chapter is a dialogue between a professor and a student that lays out the key ideas of the book and how best to learn from it.
The three easy pieces are virtualization, concurrency, and persistence. These ideas teach one how an OS works: how it decides which program to run next, how it handles memory overload in a virtual memory system, how virtual machine monitors work, how it manages information on disks, and so on.
The next chapter introduces the three pieces with illustrative examples, showing how the OS takes physical resources, virtualizes them, handles concurrency, and stores files persistently. The goals in designing such a system are high performance, i.e. minimizing the overheads (extra time or space) imposed by the OS; protection, i.e. isolating applications from one another and the OS from applications; and reliability.
The book then discusses how programs execute and tries to define the operating system.
A running program executes instructions. The processor fetches an instruction from memory, decodes it (figures out which instruction it is), and executes it; it then moves on to the next instruction, and so on until the program finishes. The body of software that manages the running of programs, ensuring they run correctly and efficiently, is known as the Operating System.
To make the system's resources easier to manage, the OS takes a physical resource (CPU, memory, or disk) and transforms it into a more general virtual form, earning the OS the nicknames virtual machine and resource manager. The first part of the book deals with how this virtualization is done.
The chapter ends with a history of how operating systems developed, from early libraries to multiprogramming to PCs running DOS, Linux, and macOS.
CPU Virtualization
#ifndef __common_h__
#define __common_h__
#include <sys/time.h>
#include <sys/stat.h>
#include <assert.h>
double GetTime() {
struct timeval t;
int rc = gettimeofday(&t, NULL);
assert(rc == 0);
return (double) t.tv_sec + (double) t.tv_usec/1e6;
}
// Repeatedly checks the time and returns once `howlong` seconds have passed.
void Spin(int howlong) {
double t = GetTime();
while ((GetTime() - t) < (double) howlong)
; // do nothing in loop
}
#endif // __common_h__
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
#include <sys/time.h>
#include <unistd.h> // for getpid()
<<include-common>>
int main(int argc, char *argv[]) {
if (argc != 2) {
fprintf(stderr, "usage: cpu <string>\n");
exit(1);
}
char *str = argv[1];
setbuf(stdout, NULL); // disable buffering
double start = GetTime();
while(GetTime() - start < 5.0) {
Spin(1);
printf("(%d) %s\n", getpid(), str);
}
return 0;
}
Compiling and executing the above gives the result:
gcc -o cpu cpu.c -I.
./cpu "A" & ./cpu "B" &
wait
Output
A B A B A B A B A B
Even though there's only one processor, the two programs ./cpu "A" and ./cpu "B" appear to execute at the same time; this is CPU virtualization at work. But for two programs to run "simultaneously", the OS needs a policy to decide which should run when.
Memory Virtualization
Physical memory is just an array of bytes; to read or write memory, an address must be specified to access the data there.
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
<<include-common>>
int main(int argc, char *argv[]) {
int *p = malloc(sizeof(int)); // memory allocation
assert(p != NULL);
printf("(%d) address pointed to by p -> %p\n", getpid(), p); // mem addr
*p = 0; // write 0 to location specified by memory address
double start = GetTime();
setbuf(stdout, NULL); // disable buffering
while(GetTime() - start < 5.0) {
Spin(1);
*p = *p + 1; // update value at the location by 1
printf("(%d) p => %d addr -> %p\n", getpid(), *p, p);
}
return 0;
}
gcc -o mem mem.c -I.
./mem & ./mem &
wait
Output
(19998) address pointed to by p -> 0x1047cd830
(19997) address pointed to by p -> 0x10263d830
(19997) p => 1 addr -> 0x10263d830
(19998) p => 1 addr -> 0x1047cd830
(19997) p => 2 addr -> 0x10263d830
(19998) p => 2 addr -> 0x1047cd830
(19997) p => 3 addr -> 0x10263d830
(19998) p => 3 addr -> 0x1047cd830
(19997) p => 4 addr -> 0x10263d830
(19998) p => 4 addr -> 0x1047cd830
(19997) p => 5 addr -> 0x10263d830
(19998) p => 5 addr -> 0x1047cd830
In the textbook example, memory was allocated at the same address in both instances of the program. This illustrates that each running program has its own private address space that the OS maps onto physical memory (memory virtualization), rather than sharing physical memory directly with other programs.
However, for security purposes macOS enables address-space layout randomization (ASLR) by default, hence the different addresses for the two processes here.
Concurrency
#ifndef __common_threads_h__
#define __common_threads_h__
#include <pthread.h>
#include <assert.h>
#include <sched.h>
#ifdef __linux__
#include <semaphore.h>
#endif
// wrapper around pthread_create() that ensures the return value is success
#define Pthread_create(thread, attr, start_routine, arg) assert(pthread_create(thread, attr, start_routine, arg) == 0);
#define Pthread_join(thread, value_ptr) assert(pthread_join(thread, value_ptr) == 0);
#define Pthread_mutex_lock(m) assert(pthread_mutex_lock(m) == 0);
#define Pthread_mutex_unlock(m) assert(pthread_mutex_unlock(m) == 0);
#define Pthread_cond_signal(cond) assert(pthread_cond_signal(cond) == 0);
#define Pthread_cond_wait(cond, mutex) assert(pthread_cond_wait(cond, mutex) == 0);
#define Mutex_init(m) assert(pthread_mutex_init(m, NULL) == 0);
#define Mutex_lock(m) assert(pthread_mutex_lock(m) == 0);
#define Mutex_unlock(m) assert(pthread_mutex_unlock(m) == 0);
#define Cond_init(cond) assert(pthread_cond_init(cond, NULL) == 0);
#define Cond_signal(cond) assert(pthread_cond_signal(cond) == 0);
#define Cond_wait(cond, mutex) assert(pthread_cond_wait(cond, mutex) == 0);
#ifdef __linux__
#define Sem_init(sem, value) assert(sem_init(sem, 0, value) == 0);
#define Sem_wait(sem) assert(sem_wait(sem) == 0);
#define Sem_post(sem) assert(sem_post(sem) == 0);
#endif // __linux__
#endif // __common_threads_h__
#include <stdio.h>
#include <stdlib.h>
<<include-common>>
<<include-common_threads>>
volatile int counter = 0;
int loops;
void *worker(void *arg) {
for (int i = 0; i < loops; i++) {
counter++;
}
return NULL;
}
int main(int argc, char *argv[]) {
if (argc != 2) {
fprintf(stderr, "usage: threads <value>\n");
exit(1);
}
loops = atoi(argv[1]);
pthread_t p1, p2;
// setbuf(stdout, NULL); // disable buffering
printf("Initial value: %d\n", counter);
Pthread_create(&p1, NULL, worker, NULL);
Pthread_create(&p2, NULL, worker, NULL);
// The pthread_join() function suspends execution of the calling thread (main)
// until the target thread (p1/p2) terminates unless the target thread
// has already terminated.
Pthread_join(p1, NULL);
Pthread_join(p2, NULL);
printf("Final value: %d\n", counter);
return 0;
}
Think of a thread as a function running within the same memory space as other functions, with more than one of them active at a time. Here each thread starts running in the worker() routine, which increments counter for loops iterations. With loops set to 1000, the expectation is that the final value of counter will be 2000, but that's not guaranteed.
gcc -o threads threads.c -I. -pthread
./threads 1000
wait
./threads 100000
wait
./threads 100000
wait
Initial value: 0
Final value: 1172
Initial value: 0
Final value: 100051
Initial value: 0
Final value: 101961
The final counter value differs each time the program is executed, even with the same loops value. This is because updating the counter in the worker() routine isn't atomic; three separate instructions are involved:
- Load the counter value into a register
- Increment the value
- Store it from the register back to memory
If instructions aren't atomic and multiple concurrent threads execute within the same memory space, how can the program's correctness and efficiency be ensured?
To see the documentation for pthread_create check the following man page.
man 3 pthread_create | col -b | head -n 45
Output
PTHREAD_CREATE(3) Library Functions Manual PTHREAD_CREATE(3)
NAME
pthread_create – create a new thread
SYNOPSIS
#include <pthread.h>
int
pthread_create(pthread_t *thread, const pthread_attr_t *attr,
void *(*start_routine)(void *), void *arg);
DESCRIPTION
The pthread_create() function is used to create a new thread, with
attributes specified by attr, within a process. If attr is NULL, the
default attributes are used. If the attributes specified by attr are
modified later, the thread's attributes are not affected. Upon
successful completion pthread_create() will store the ID of the created
thread in the location specified by thread.
The thread is created executing start_routine with arg as its sole
argument. If the start_routine returns, the effect is as if there was an
implicit call to pthread_exit() using the return value of start_routine
as the exit status. Note that the thread in which main() was originally
invoked differs from this. When it returns from main(), the effect is as
if there was an implicit call to exit() using the return value of main()
as the exit status.
Upon thread exit the storage for the thread must be reclaimed by another
thread via a call to pthread_join(). Alternatively, pthread_detach() may
be called on the thread to indicate that the system may automatically
reclaim the thread storage upon exit. The pthread_attr_setdetachstate()
function may be used on the attr argument passed to pthread_create() in
order to achieve the same effect as a call to pthread_detach() on the
newly created thread.
The signal state of the new thread is initialized as:
• The signal mask is inherited from the creating thread.
• The set of signals pending for the new thread is empty.
RETURN VALUES
If successful, the pthread_create() function will return zero. Otherwise
an error number will be returned to indicate the error.
Persistence
The final major theme, persistence, concerns storing data durably rather than depending on volatile memory. The software in the OS that manages the disk is called the file system. Unlike the CPU and memory abstractions, the expectation when it comes to files is that users often want to share the data in files across processes, so the OS does not create a private virtualized disk for each application.
#include <stdio.h>
#include <unistd.h>
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
int main(int argc, char *argv[]) {
// The file name specified by path is opened for reading and/or writing, as
// specified by the argument oflag; the file descriptor is returned to the
// calling process.
// O_WRONLY: open for writing only (O_RDONLY, O_RDWR, O_SEARCH, O_EXEC)
// O_CREAT: create file if not exist, O_TRUNC: truncate size to 0 if exist
// S_IRWXU: 00700 user (file owner) has read, write & execute permission
int fd = open("/tmp/file_ostep", O_WRONLY|O_CREAT|O_TRUNC, S_IRWXU);
// If successful, open() returns a non-negative integer, termed a file
// descriptor. It returns -1 on failure, and sets errno to indicate the
// error.
assert(fd > -1);
// write() attempts to write nbyte of data to the object referenced by the
// descriptor fd from the buffer pointed to by buf. Upon successful completion
// the number of bytes which were written is returned otherwise -1.
int rc = write(fd, "hello world\n", 12); // 12 = nbyte
assert(rc == 12);
// The close() call deletes a descriptor from the per-process object
// reference table. If this is the last reference to the underlying object,
// the object will be deactivated. `man 2 close`
close(fd);
return 0;
}
gcc -o io io.c
./io
wait
cat /tmp/file_ostep
hello world
Here, open(), write() and close() are system calls; they are routed to the file system, which handles the requests and returns a response to the caller.