CSAPP Chapter 1 Excerpt

Processes

When a program runs on a modern system, the operating system provides the illusion that the program is the only one running on the system. The program appears to have exclusive use of the processor, main memory, and I/O devices. The processor appears to execute the instructions in the program, one after the other, without interruption. And the code and data of the program appear to be the only objects in the system’s memory. These illusions are provided by the notion of a process.
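
As a concrete illustration (not part of the excerpt), here is a minimal sketch, assuming a POSIX system such as Linux: fork creates a second process, and each process then runs in its own context with its own process ID, while the operating system interleaves them transparently.

    /* processes.c -- minimal sketch, assumes a POSIX system; cc processes.c */
    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void)
    {
        pid_t pid = fork();              /* create a second process */
        if (pid == 0) {
            /* Child: runs in its own process context. */
            printf("child  pid=%d\n", (int)getpid());
        } else if (pid > 0) {
            /* Parent: the OS schedules both processes transparently. */
            printf("parent pid=%d, child pid=%d\n", (int)getpid(), (int)pid);
            wait(NULL);                  /* reap the child */
        } else {
            perror("fork");
            return 1;
        }
        return 0;
    }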

The Kernel

  • The kernel is the portion of the operating system code that is always resident in memory.
  • The kernel is not a separate process. Instead, it is a collection of code and data structures that the system uses to manage all the processes. Applications request its services through system calls (see the sketch below).
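
To make the system-call route concrete (not from the excerpt), a minimal sketch assuming Linux with glibc, where write(2) is the usual C library wrapper and syscall(2) exposes the raw system-call interface:

    /* syscall_sketch.c -- assumes Linux with glibc; cc syscall_sketch.c */
    #define _GNU_SOURCE
    #include <unistd.h>
    #include <sys/syscall.h>

    int main(void)
    {
        const char msg[] = "hello from user code\n";

        /* Usual route: a library wrapper that traps into the kernel. */
        write(STDOUT_FILENO, msg, sizeof(msg) - 1);

        /* The same request made through the raw system-call interface. */
        syscall(SYS_write, STDOUT_FILENO, msg, sizeof(msg) - 1);

        return 0;
    }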

Threads

A process can actually consist of multiple execution units, called threads, each running in the context of the process and sharing the same code and global data.

  • Because threads share the same code and data, it is easier to share data between them than between separate processes, and threads are typically more efficient than processes (see the sketch below).
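
A minimal sketch (assuming POSIX threads; compile with -pthread): two threads run in the same process and both see the same global counter, which is exactly the sharing described above. The mutex is needed because the shared data is visible to both threads at once.

    /* threads.c -- minimal POSIX threads sketch; cc threads.c -pthread */
    #include <pthread.h>
    #include <stdio.h>

    long counter = 0;                        /* global data shared by all threads */
    pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

    void *worker(void *arg)
    {
        (void)arg;
        for (int i = 0; i < 100000; i++) {
            pthread_mutex_lock(&lock);       /* both threads touch the same counter */
            counter++;
            pthread_mutex_unlock(&lock);
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t t1, t2;
        pthread_create(&t1, NULL, worker, NULL);
        pthread_create(&t2, NULL, worker, NULL);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        printf("counter = %ld\n", counter);  /* 200000: both threads updated the same data */
        return 0;
    }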

Virtual Memory

Virtual memory is an abstraction that provides each process with the illusion that it has exclusive use of the main memory. Each process has the same uniform view of memory, which is known as its virtual address space.

                  +-----------------------+  ^
                  |                       |  |
                  | Kernel virtual memory |  | Memory invisible to user code
                  |                       |  |
                  +-----------------------+  |
                  |      User stack       |
                  | (created at run time) |
                  +-----+-----------------+
                  |     |                 |
                  |     v          ^      |
                  |                |      |
                  +----------------+------+
                  | Memory-mapped region  |
                  | for shared libraries  |  e.g., the printf function
                  +-----------------------+
                  |                       |
                  |           ^           |
                  |           |           |
                  +-----------+-----------+
                  |     Run-time heap     |
                  |  (created by malloc)  |
                  +-----------------------+
                  |                       | --+
                  |    Read/write data    |   |
                  |                       |   |
                  +-----------------------+   +-- Loaded from the executable file
                  |Read-only code and data|   |
                  +-----------------------+   |
Program start --->|                       | --+
                 0+-----------------------+

In Linux, the topmost region of the address space is reserved for code and data in the operating system that is common to all processes. The lower region of the address space holds the code and data defined by the user’s process.

  • Program code and data. The code and data areas are initialized directly from the contents of an executable object file.
  • Heap. The code and data areas are followed immediately by the run-time heap. Unlike the code and data areas, which are fixed in size once the process begins running, the heap expands and contracts dynamically at run time as a result of calls to C standard library routines such as malloc and free.
  • Shared libraries. Near the middle of the address space is an area that holds the code and data for shared libraries such as the C standard library and the math library.
  • Stack. At the top of the user’s virtual address space is the user stack that the compiler uses to implement function calls. Like the heap, the user stack expands and contracts dynamically during the execution of the program.
  • Kernel virtual memory. The top region of the address space is reserved for the kernel. Application programs are not allowed to read or write the contents of this area or to call functions defined in the kernel code directly; instead, they must invoke the kernel to perform such operations. (A sketch that prints an address from each user-visible region follows this list.)
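
A minimal sketch (assuming Linux/x86-64; exact addresses vary from run to run because of address-space layout randomization): it prints one address from each user-visible region in the figure, so the relative placement of code, data, heap, shared libraries, and stack can be observed.

    /* address_space.c -- assumes a Linux system; cc address_space.c */
    #include <stdio.h>
    #include <stdlib.h>

    int global_initialized = 1;            /* read/write data, loaded from the executable */

    int main(void)                         /* read-only code, loaded from the executable */
    {
        int local = 0;                     /* lives on the user stack */
        int *heap = malloc(sizeof(int));   /* lives on the run-time heap */

        /* Casting function pointers to void* is for display only. */
        printf("code    (main)   : %p\n", (void *)main);
        printf("data    (global) : %p\n", (void *)&global_initialized);
        printf("heap    (malloc) : %p\n", (void *)heap);
        printf("library (printf) : %p\n", (void *)printf);
        printf("stack   (local)  : %p\n", (void *)&local);

        free(heap);
        return 0;
    }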

Files

A file is a sequence of bytes, nothing more and nothing less. Every I/O device, including disks, keyboards, displays, and even networks, is modeled as a file, which provides applications with a uniform view of all the varied I/O devices that might be contained in the system.
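
A minimal sketch of this uniform view (assuming POSIX Unix I/O): the same open/read/write/close calls work whether the path names a regular file, a terminal, or a device file; here the bytes are simply copied to standard output.

    /* unix_io.c -- assumes POSIX Unix I/O; cc unix_io.c && ./a.out somefile */
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(int argc, char *argv[])
    {
        /* Open whatever the path names -- regular file, terminal, or device --
         * and treat it as a plain sequence of bytes. */
        int fd = (argc > 1) ? open(argv[1], O_RDONLY) : STDIN_FILENO;
        if (fd < 0) {
            perror("open");
            return 1;
        }

        char buf[4096];
        ssize_t n;
        while ((n = read(fd, buf, sizeof(buf))) > 0)   /* read raw bytes */
            write(STDOUT_FILENO, buf, (size_t)n);      /* write raw bytes */

        if (fd != STDIN_FILENO)
            close(fd);
        return 0;
    }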

Concurrency and Parallelism

We use the term concurrency to refer to the general concept of a system with multiple, simultaneous activities, and the term parallelism to refer to the use of concurrency to make a system run faster.
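
A minimal sketch of using concurrency for speed (assuming POSIX threads; compile with -pthread): the array is split in half and each half is summed by its own thread, so on a multi-core machine the two halves can proceed in parallel.

    /* parallel_sum.c -- assumes POSIX threads; cc parallel_sum.c -pthread */
    #include <pthread.h>
    #include <stdio.h>

    #define N 1000000
    static int data[N];

    struct task { int lo, hi; long sum; };

    void *sum_range(void *arg)
    {
        struct task *t = arg;
        for (int i = t->lo; i < t->hi; i++)
            t->sum += data[i];                      /* each thread sums its own half */
        return NULL;
    }

    int main(void)
    {
        for (int i = 0; i < N; i++)
            data[i] = 1;

        struct task a = { 0, N / 2, 0 }, b = { N / 2, N, 0 };
        pthread_t ta, tb;
        pthread_create(&ta, NULL, sum_range, &a);   /* two simultaneous activities ... */
        pthread_create(&tb, NULL, sum_range, &b);   /* ... that can run on two cores */
        pthread_join(ta, NULL);
        pthread_join(tb, NULL);

        printf("sum = %ld\n", a.sum + b.sum);       /* 1000000 */
        return 0;
    }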

Hyperthreading

Hyperthreading, sometimes called simultaneous multi-threading, is a technique that allows a single CPU to execute multiple flows of control. As an example, the Intel Core i7 processor can have each core executing two threads, and so a four-core system can actually execute eight threads in parallel.
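
A minimal sketch (assuming a system where sysconf supports _SC_NPROCESSORS_ONLN, a common but non-standard extension on Linux and macOS): it reports how many logical processors the operating system sees, which on a hyperthreaded four-core part of this kind would be eight.

    /* cpu_count.c -- assumes _SC_NPROCESSORS_ONLN is available (Linux, macOS, ...) */
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        /* Logical processors currently online: cores x hardware threads per core. */
        long n = sysconf(_SC_NPROCESSORS_ONLN);
        if (n < 1) {
            perror("sysconf");
            return 1;
        }
        printf("logical processors online: %ld\n", n);
        return 0;
    }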