# CSAPP Chapter 2 Excerpt

## Information Storage

Suppose the variable x of type int and at address 0x100 has a hexadecimal value of 0x01234567. The ordering of the bytes within the address range 0x100 through 0x103 depends on the type of machine:

Big endian

                    0x100      0x101      0x102       0x103
+---------------+----------+----------+----------+----------+--------------+
|               |          |          |          |          |              |
|      ...      |    01    |    23    |    45    |    67    |     ...      |
|               |          |          |          |          |              |
+---------------+----------+----------+----------+----------+--------------+


Little endian

                    0x100      0x101      0x102       0x103
+---------------+----------+----------+----------+----------+--------------+
|               |          |          |          |          |              |
|      ...      |    67    |    45    |    23    |    01    |     ...      |
|               |          |          |          |          |              |
+---------------+----------+----------+----------+----------+--------------+


Most Intel-compatible machines operate exclusively in little-endian mode. On the other hand, most machines from IBM and Oracle (arising from their acquisition of Sun Microsystems in 2010) operate in big-endian mode.

Many recent microprocessor chips are bi-endian, meaning that they can be configured to operate as either little- or big-endian machines.

In practice, byte ordering becomes fixed once a particular operating system is chosen. For example, ARM microprocessors, used in many cell phones, have hardware that can operate in either little- or big-endian mode, but the two most common operating systems for these chips—Android (from Google) and IOS (from Apple)—operate only in little-endian mode.

### Boolean Algebra

The simplest Boolean algebra is defined over the two-element set {0, 1}.

Operations of Boolean algebra

~         &   0 1     |   0 1      ^   0 1
-----     --------    --------     --------
0   1     0   0 0     0   0 1      0   0 1

1   0     1   0 1     1   1 1      1   1 0


### Logical Operations in C

C also provides a set of logical operators ||, &&, and !, which correspond to the or, and, and not operations of logic. These can easily be confused with the bit-level operations, but their behavior is quite different. The logical operations treat any nonzero argument as representing true and argument 0 as representing false. They return either 1 or 0, indicating a result of either true or false, respectively.

Expression      Result
----------------------
!0x41           0x00
!0x00           0x01
!!0x41          0x01
0x69 && 0x55    0x01
0x69 || 0x55    0x01


### Shift Operations in C

C also provides a set of shift operations for shifting bit patterns to the left and to the right.

Operation              Value 1      Value 2
---------------------------------------------
Argument x            [01100011]   [10010101]
x << 4                [00110000]   [01010000]
x >> 4 (logical)      [00000110]   [00001001]
x >> 4 (arithmetic)   [00000110]   [11111001]


In practice, almost all compiler/machine combinations use arithmetic right shifts for signed data, and many programmers assume this to be the case. For unsigned data, on the other hand, right shifts must be logical.

In contrast to C, Java has a precise definition of how right shifts should be performed. The expression x >> k shifts x arithmetically by k positions, while x >>> k shifts it logically.

Shift amounts should be kept less than the word size.

## Integer Representations

### Unsigned Encodings

Definition of binary to unsigned function

For vector $x = [x_{w-1},x_{w-2},…,x_{0}]$:

$$B2U_{w}(\vec{x}) \doteq \sum_{i=0}^{w-1} x_{i}2^{i}$$

Examples:

$$B2U_{4}([0001]) = 0 \cdot 2^{3} + 0 \cdot 2^{2} + 0 \cdot 2^{1} + 1 \cdot 2^{0} = 0 + 0 + 0 + 1 = 1$$

$$B2U_{4}([0101]) = 0 \cdot 2^{3} + 1 \cdot 2^{2} + 0 \cdot 2^{1} + 1 \cdot 2^{0} = 0 + 4 + 0 + 1 = 5$$

$$B2U_{4}([1011]) = 1 \cdot 2^{3} + 0 \cdot 2^{2} + 1 \cdot 2^{1} + 1 \cdot 2^{0} = 8 + 0 + 2 + 1 = 11$$

$$B2U_{4}([1111]) = 1 \cdot 2^{3} + 1 \cdot 2^{2} + 1 \cdot 2^{1} + 1 \cdot 2^{0} = 8 + 4 + 2 + 1 = 15$$

Uniqueness of unsigned encoding Function $B2U_{w}$ is a bijection.

### Two’s-Complement Encodings

For many applications, we wish to represent negative values as well. The most common computer representation of signed numbers is known as two’s-complement form. This is defined by interpreting the most significant bit of the word to have negative weight. We express this interpretation as a function $B2T_{w}$ (for “binary to two’s complement” length $w$):

Definition of binary to two’s complement

For vector $x = [x_{w-1},x_{w-2},…,x_{0}]$:

$$B2T_{w}(\vec{x}) \doteq -x_{w-1}2^{w-1}+ \sum_{i=0}^{w-2} x_{i}2^{i}$$

Examples:

$$B2T_{4}([0001]) = -0 \cdot 2^{3} + 0 \cdot 2^{2} + 0 \cdot 2^{1} + 1 \cdot 2^{0} = 0 + 0 + 0 + 1 = 1$$

$$B2T_{4}([0101]) = -0 \cdot 2^{3} + 1 \cdot 2^{2} + 0 \cdot 2^{1} + 1 \cdot 2^{0} = 0 + 4 + 0 + 1 = 5$$

$$B2T_{4}([1011]) = -1 \cdot 2^{3} + 0 \cdot 2^{2} + 1 \cdot 2^{1} + 1 \cdot 2^{0} = -8 + 0 + 2 + 1 = -5$$

$$B2T_{4}([1111]) = -1 \cdot 2^{3} + 1 \cdot 2^{2} + 1 \cdot 2^{1} + 1 \cdot 2^{0} = -8 + 4 + 2 + 1 = -1$$

Uniqueness of two’s-complement encoding Function $B2T_{w}$ is a bijection.

#### Alternative representations of signed numbers

There are two other standard representations for signed numbers:

Ones’ complement. This is the same as two’s complement, except that the most significant bit has weight $−(2^{w−1} − 1)$ rather than $−2^{w−1}$:

$$B2O_{w}(\vec{x}) \doteq -x_{w-1}(2^{w-1}-1)+ \sum_{i=0}^{w-2} x_{i}2^{i}$$

Sign magnitude. The most significant bit is a sign bit that determines whether the remaining bits should be given negative or positive weight:

$$B2S_{w}(\vec{x}) \doteq (-1)^{x_{w-1}} \cdot \sum_{i=0}^{w-2} x_{i}2^{i}$$

Both of these representations have the curious property that there are two different encodings of the number 0.For both representations, $[00 . . . 0]$ is interpreted as +0. The value −0 can be represented in sign-magnitude form as $[10 . . . 0]$ and in ones’ complement as $[11 . . . 1]$. Although machines based on ones’-complement representations were built in the past, almost all modern machines use two’s complement. We will see that sign-magnitude encoding is used with floating-point numbers.

Note the different position of apostrophes: two’s complement versus ones’ complement. The term “two’s complement” arises from the fact that for nonnegative $x$ we compute a $w$-bit representation of $−x$ as $2^{w − x}$ (a single two.) The term “ones’ complement” comes from the property that we can compute $−x$ in this notation as $[111 . . . 1] −x$ (multiple ones).

### Conversions between Signed and Unsigned

Definition of Conversion from two’s complement to unsigned

For $x$ such that $\textit{TMin}_{w} \leq x \leq \textit{TMax}_{w}$ :

$$T2U_{w}(x)= \begin{cases} x+2^{w}, & x< 0 \\ x, & x \geq 0 \end{cases}$$

Definition of Unsigned to two’s-complement conversion

For $u$ such that $0 \leq u \leq \textit{UMax}_{w}$ :

$$U2T_{w}(u)= \begin{cases} u, & u \leq \textit{TMax}_{w} \\ u-2^{w}, & u > \textit{TMax}_{w} \end{cases}$$

## Integer Arithmetic

For $x$ and $y$ such that $0 ≤ x, y < 2_{w}$ :

$$x+_{w}^{u}y= \begin{cases} x+y, & x + y < 2^{w} & \textrm{Normal} \\ x+y-2^{w}, & 2_{w} \leq x + y \geq 2^{w+1} & \textrm{Overflow} \end{cases}$$

### Unsigned negation

Definition of Unsigned negation

For any number $x$ such that $0 ≤ x < 2^{w}$ , its $w$-bit unsigned negation $-^{u}_{w} x$ is given by the following:

$$-^{u}_{w} x= \begin{cases} x, & x = 0 \\ 2^{w}-x, & x > 0 \end{cases}$$

For integer values $x$ and $y$ in the range $−2^{w−1} ≤ x, y ≤ 2^{w-1}−1$:

$$x+_{w}^{t}y= \begin{cases} x+y-2^{w}, & 2^{w-1} \leq x + y & \textrm{Positive overflow} \\ x+y, & -2^{w-1} \leq x + y \geq 2^{w-1} & \textrm{Normal} \\ x+y+2^{w}, & x + y < -2^{w-1} & \textrm{Negative overflow} \end{cases}$$

### Two’s-Complement Negation

Definition of Two’s-Complement Negation

For $x$ in the range $\textit{TMin}_{w} ≤ x ≤ \textit{TMax}_{w}$ , its two’s-complement negation $-_{w}^{t} x$ is given by the formula:

$$-_{w}^{t} x= \begin{cases} \textit{TMin}, & x = \textit{TMin} \\ -x, & x > \textit{TMin} \end{cases}$$

### Unsigned Multiplication

Definition of Unsigned multiplication

For $x$ and $y$ such that $0 ≤ x, y ≤ \textit{UMax}_w$ :

$$x *_{w}^{u} y = (x \cdot y) \textrm{ mod } 2^{w}$$

### Two’s-Complement Multiplication

Definition of Two’s-Complement Multiplication

For $x$ and $y$ such that $\textit{TMin}_w ≤ x, y ≤ \textit{TMax}_w$ :

$$x *_{w}^{t} y = U2T_{w}((x \cdot y) \textrm{ mod } 2^{w})$$

## Floating Point

### Fractional Binary Numbers

Decimal notation uses a representation of the form:

$$d_{m} d_{m−1} . . . d_{1} d_{0} . d_{−1} d_{−2} . . . d_{−n}$$

Digits to the left of the binary point have weights of the form $2^{i}$ , while those to the right have weights of the form $1/2^{i}$.

   +------------------------------- 2^{m}
|
|      +------------------------ 2^{m-1}
|      |
|      |         +-------------- 4
|      |         |
|      |   ...   |     +-------- 2
|      |         |     |
|      |         |     |    +--- 1
|      |         |     |    |

b_{m} b_{m-1} ... b_{2} b_{1} b_{0} . b_{-1} b_{-2} b_{-3} ... b_{-n+1} b_{-n}

|     |      |           |       |
1/2 ---+     |      |           |       |
|      |           |       |
1/4 ---------+      |    ...    |       |
|           |       |
1/8 ----------------+           |       |
|       |
|       |
1/2^{-n+1} ----------------------------+       |
|
|
1/2^{-n} ------------------------------------+


For $b_{i}$ ranges between 0 and 1, $b$ can be defined as:

$$b=\sum_{i=-n}^{m}2^{i} \times b_{i}$$

### IEEE Floating-Point Representation

Positional notation such as considered in the previous section would not be efficient for representing very large numbers. For example, the representation of 5 $\times$ 2100 would consist of the bit pattern 101 followed by 100 zeros. Instead, we would like to represent numbers in a form $x \times 2^{y}$ by giving the values of $x$ and $y$.

The IEEE floating-point standard represents a number in a form $V = (−1)^{s} \times M \times 2^{E}$ :

• The sign s determines whether the number is negative ($s = 1$) or positive ($s = 1$), where the interpretation of the sign bit for numeric value 0 is handled as a special case.
• The significand M is a fractional binary number that ranges either between 1 and 2 − $\epsilon$ or between 0 and 1 − $\epsilon$.
• The exponent E weights the value by a (possibly negative) power of 2.
Single precision (32-bit)
31 30                23 22                                                       0
+---+------------------+----------------------------------------------------------+
| s |      exp         |                         frac                             |
+---+------------------+----------------------------------------------------------+

Double precision (64-bit)
63 62                         52 51                                             32
+---+---------------------------+-------------------------------------------------+
| s |      exp                  |                   frac(51:32)                   |
+---+---------------------------+-------------------------------------------------+
31                                                                               0
+---------------------------------------------------------------------------------+
|                                 frac (31:0)                                     |
+---------------------------------------------------------------------------------+


The bit representation of a floating-point number is divided into three fields to encode these values:

• The single sign bit s directly encodes the sign s.
• The k-bit exponent field exp $= e_{k−1} . . . e_{1}e_{0}$ encodes the exponent E.
• The n-bit fraction field frac $= f_{n−1} . . . f_{1}f_{0}$ encodes the significand M, but the value encoded also depends on whether or not the exponent field equals 0.

In the single-precision floating-point format (a float in C), fields s, exp, and frac are 1, k = 8, and n = 23 bits each, yielding a 32-bit representation. In the double-precision floating-point format (a double in C), fields s, exp, and frac are 1, k = 11, and n = 52 bits each, yielding a 64-bit representation.

### Rounding

Floating-point arithmetic can only approximate real arithmetic, since the representation has limited range and precision. Thus, for a value $x$, we generally want a systematic method of finding the “closest” matching value $x^{’}$ that can be represented in the desired floating-point format. This is the task of the rounding operation. One key problem is to define the direction to round a value that is halfway between two possibilities. For example, if I have $1.50 and want to round it to the nearest dollar, should the result be$1 or $2? An alternative approach is to maintain a lower and an upper bound on the actual number. For example, we could determine representable values$x^{−}$and$x^{+}$such that the value x is guaranteed to lie between them:$x^{-} \leq x \geq x^{+}$.The IEEE floating-point format defines four different rounding modes. The default method finds a closest match, while the other three can be used for computing upper and lower bounds. Mode$1.40 $1.60$1.50 $2.50$–1.50
----------------- ----- ----- ----- ----- ------
Round-to-even     $1$2    $2$2    $-2 Round-toward-zero$1    $1$1    $2$-1
Round-down        $1$1    $1$2    $-2 Round-up$1    $2$2    $3$-1


Round-to-even (also called round-to-nearest) is the default mode.It attempts to find a closest match. Thus, it rounds $1.40 to$1 and $1.60 to$2, since these are the closest whole dollar values. The only design decision is to determine the effect of rounding values that are halfway between two possible results. Round-to-even mode adopts the convention that it rounds the number either upward or downward such that the least significant digit of the result is even. Thus, it rounds both $1.50 and$2.50 to $2. The other three modes produce guaranteed bounds on the actual value. These can be useful in some numerical applications. Round-toward-zero mode rounds positive numbers downward and negative numbers upward, giving a value$\hat{x}$such that$|\hat{x}| \geq |x| $. Round-down mode rounds both positive and negative numbers downward, giving a value$x^{-}$such that$x^{-} \geq x $. Round-up mode rounds both positive and negative numbers upward, giving a value$x^{+}$such that$x \geq x^{+} \$.

Round-to-even at first seems like it has a rather arbitrary goal—why is there any reason to prefer even numbers? Why not consistently round values halfway between two representable values upward? The problem with such a convention is that one can easily imagine scenarios in which rounding a set of data values would then introduce a statistical bias into the computation of an average of the values. The average of a set of numbers that we rounded by this means would be slightly higher than the average of the numbers themselves. Conversely, if we always rounded numbers halfway between downward, the average of a set of rounded numbers would be slightly lower than the average of the numbers themselves. Rounding toward even numbers avoids this statistical bias in most real-life situations. It will round upward about 50% of the time and round downward about 50% of the time.

Round-to-even rounding can be applied even when we are not rounding to a whole number. We simply consider whether the least significant digit is even or odd. For example, suppose we want to round decimal numbers to the nearest hundredth. We would round 1.2349999 to 1.23 and 1.2350001 to 1.24, regardless of rounding mode, since they are not halfway between 1.23 and 1.24. On the other hand, we would round both 1.2350000 and 1.2450000 to 1.24, since 4 is even.

Similarly, round-to-even rounding can be applied to binary fractional numbers. We consider least significant bit value 0 to be even and 1 to be odd. In general, the rounding mode is only significant when we have a bit pattern of the form XX . . . X.Y Y . . . Y 100 . . ., where X and Y denote arbitrary bit values with the rightmost Y being the position to which we wish to round. Only bit patterns of this form denote values that are halfway between two possible results.