Dive Into Systems

4.5. Integer Overflow

Although the number of integers is mathematically infinite, in practice, numeric types in a computer’s memory occupy a fixed number of bits. As we’ve hinted throughout this chapter, using a fixed number of bits implies that programs might be unable to represent values that they’d like to store. For example, the discussion of addition showed that adding two legitimate values can produce a result that can’t be represented. A computation that lacks the storage to represent its result has overflowed.

4.5.1. Odometer Analogy

To characterize overflow, consider an example from the non-computing world: a car’s odometer. An odometer counts the number of miles a car has driven, and whether it’s digital or analog, it can display only so many (base 10) digits. If the car drives more miles than the odometer can represent, the odometer "rolls over" back to zero, since the true value can’t be expressed. For example, with a standard six-digit odometer, the maximum value it represents is 999999. Driving just one additional mile should display 1000000, but like the overflowing addition example, the 1 carries out from the six available digits, leaving only 000000.

For simplicity, let’s continue analyzing an odometer that’s limited to just one decimal digit. That is, the odometer represents the range [0, 9], so after every 10 miles the odometer resets back to zero. Illustrating the odometer’s range visually, it might look like Figure 1.

A circle with the values 0 to 9 arranged around it.

Figure 1. A visual depiction of a one-digit odometer’s potential values

Because a one-digit odometer rolls over upon reaching 10, drawing a circular shape emphasizes the discontinuity at the top of the circle (and only at the top). Specifically, by adding one to any value other than nine, the result lands on the expected value. On the other hand, adding one to nine jumps to a value that doesn’t naturally follow it (zero). More generally, when performing any arithmetic that crosses the discontinuity between nine and zero, the computation will overflow. For example, consider adding 8 + 4, as in Figure 2.

A circle with the values 0 to 9 arranged around it. The gap between 0 and 9 is labeled as the location where overflow can occur. Arrows show that adding 4 to 8 causes the arithmetic to jump across the marked overflow location.

Figure 2. The result of adding 8 + 4 with only one decimal digit. Crossing the discontinuity between 0 and 9 indicates that an overflow has occurred.

Here, the sum yields 2 instead of the expected 12. Note that many other values added to 8 (for example, 8 + 14) would also land on two, with the only difference being that the computations would take additional trips around the circle. Consequently, it doesn’t matter whether the car drives 2, 12, or 152 miles — in the end, the odometer will read 2 regardless.

Any device that behaves like an odometer performs modular arithmetic. In this case, all arithmetic is modular with respect to a modulus of 10, since one decimal digit represents only 10 values. Therefore, given any number of miles traveled, we can compute what the odometer will read by dividing the distance by 10 and taking the remainder as the result. If the odometer had two decimal digits instead of one, the modulus would change to 100, since it could represent a larger range of values: [0, 99]. Similarly, clocks perform modular arithmetic with an hour modulus of 12.

4.5.2. Binary Integer Overflow

Having seen a familiar form of overflow, let’s turn to binary number encodings. Recall that N bits of storage represent 2^N unique bit sequences and that those sequences can be interpreted in different ways (as unsigned or signed). Some operations that yield correct results under one interpretation may exhibit overflow according to the other, so the hardware needs to recognize overflow differently for each.

For example, suppose that a machine is using four-bit sequences to compute 0b0010 (2) - 0b0101 (5). Running this operation through the subtraction procedure produces a binary result of 0b1101. Interpreting this result as a signed value produces -3 (-8 + 4 + 1), the expected result for 2 - 5 without overflow. Alternatively, interpreting it as an unsigned value yields 13 (8 + 4 + 1), which is incorrect and clearly indicative of overflow. Scrutinizing this example further, it instinctively makes some sense — the result should be negative, and a signed interpretation allows for negatives, whereas unsigned does not.

Unsigned Overflow

Unsigned numbers behave similarly to the decimal odometer examples given that both represent only non-negative values. N bits represent unsigned values in the range [0, 2^N - 1], making all arithmetic modular with respect to 2^N. Figure 3 illustrates an arrangement of the unsigned interpretations of four-bit sequences into a modular space.

The numbers 0 to 15 are arranged in a circle. The gap between 15 and 0 (at the top of the circle) is labeled as the location where overflow can occur.

Figure 3. An arrangement of four-bit unsigned values into a modular space. All arithmetic is modular with respect to 2⁴ (16).

Given that unsigned interpretations can’t hold negative values, the discontinuity again sits between the maximum value and zero. Therefore, unsigned overflow results from any operation that crosses the divide between 2^N-1 and 0. Stated more plainly, if performing addition (which should make the result larger) produces a smaller result, the addition caused unsigned overflow. Symmetrically, if performing subtraction (which should make the result smaller) produces a larger result, the subtraction caused unsigned overflow.

As a shortcut for detecting unsigned overflow for addition and subtraction, recall the carry out and carry in bits of those operations. A carry out is a carry from the most significant bit in the result of the computation. When set, a carry in increments the value of the result by carrying one into the least significant bit of the arithmetic operation. The carry in is only set to 1 for subtraction as part of the negation procedure.

The shortcut for unsigned arithmetic is: the carry out must match the carry in, otherwise the operation causes overflow. Intuitively, this shortcut works because:

For addition (carry in = 0), the result should be larger than (or equal to) the first operand. However, if the sum requires an extra bit of storage (carry out = 1), truncating that extra bit from the sum yields a smaller result (overflow). For example, in the unsigned four-bit number space, adding 0b1100 (12) + 0b1101 (13) requires five bits to store the result 0b11001 (25). When truncated to only four bits, the result represents 0b1001 (9), which is smaller than the operands (therefore, overflow).
For subtraction (carry in = 1), the result should be smaller than (or equal to) the first operand. Because subtraction executes as a combination of addition and negation, the addition subproblem should produce a smaller result. The only way addition can end up with a smaller value is by truncating its sum (carry out = 1). If it doesn’t require truncation (carry out = 0), the subtraction yields a larger result (overflow).

Let’s examine two examples of four-bit subtraction: one that overflows, and one that doesn’t. First, consider 0b0111 (7) - 0b1001 (9). The subtraction procedure treats this computation as:

Problem Setup	Converted to Addition	Worked Example
0111 - 1001	1 (carry in) 0111 + 0110 (bits flipped)	1 (carry in) 0111 + 0110 (bits flipped) Result: 1110 Carry out: 0

Problem Setup

Converted to Addition

Worked Example

 
  0111
- 1001

     1 (carry in)
  0111
+ 0110 (bits flipped)

                1 (carry in)
             0111
           + 0110 (bits flipped)

   Result:   1110
Carry out:  0

The computation did not carry out of d₃, so no truncation occurs and the carry in (1) fails to match the carry out (0). The result, 0b1110 (14), is larger than either operand and thus clearly incorrect for 7 - 9 (overflow).

Next, consider 0b0111 (7) - 0b0101 (5). The subtraction procedure treats this computation as:

Problem Setup	Converted to Addition	Worked Example
0111 - 0101	1 (carry in) 0111 + 1010 (bits flipped)	1 (carry in) 0111 + 1010 (bits flipped) Result: 0010 Carry out: 1

Problem Setup

Converted to Addition

Worked Example

 
  0111
- 0101

     1 (carry in)
  0111
+ 1010 (bits flipped)

                1 (carry in)
             0111
           + 1010 (bits flipped)

   Result:   0010
Carry out:  1

The computation carries out a bit to d₄, causing the carry in (1) to match the carry out (1). The truncated result, 0b0010 (2), correctly represents the expected outcome of the subtraction operation (no overflow).

Signed Overflow

The same intuition behind overflow applies to signed binary interpretations: there exists a discontinuity in the modular number space. However, because a signed interpretation allows for negatives, the discontinuity doesn’t occur around 0. Recall that two’s complement "rolls over" cleanly from -1 (0b1111…111) to 0 (0b0000…000). Thus, the discontinuity exists at the other end of the number space, where the largest positive value and smallest negative value meet.

Figure 4 shows an arrangement of the signed interpretations of four-bit sequences into a modular space. Note that half the values are negative, the other half are non-negative, and that the discontinuity lies at the min/max divide between them.

The numbers 0 to 7 are arranged on the right half of a circle, and the numbers -1 to -8 are arranged on the left half. The gap between 7 and -8 (at the bottom of the circle) is labeled as the location where overflow can occur.

Figure 4. An arrangement of four-bit signed values into a modular space. Because a signed interpretation allows for negative values, the discontinuity no longer resides at zero.

When performing signed arithmetic, it’s always safe to generate a result that moves closer to zero. That is, any operation that reduces the absolute value of the result cannot overflow, because the overflow discontinuity resides where the magnitude of the representable values is the largest.

Consequently, systems detect overflow in signed addition and subtraction by comparing the most significant bit of the operands with the most significant bit of the result. For subtraction, first rearrange the arithmetic in terms of addition (e.g., rewrite 5 - 2 as 5 + -2).

If the addition’s operands have different high-order bit values (i.e., one operand is negative and the other is positive), there can be no signed overflow, because the absolute value of the result must be smaller than (or equal to) either operand. The result is moving toward zero.
If the addition’s operands have the same high-order bit value (i.e., both are positive or both are negative), a correct result must also have the same high-order bit value. Thus, when adding two operands with the same sign, a signed overflow occurs if the result’s sign differs from that of the operands.

Consider the following four-bit signed binary examples:

5 - 4 is equivalent to 5 + -4. The first operand (5) is positive, whereas the second (-4) is negative, so the result must be moving toward zero where no overflow is possible.
4 + 2 (both positive) yields 6 (also positive), so no overflow occurs.
-5 - 1 is equivalent to -5 + -1 (both negative) and yields -6 (also negative), so no overflow occurs.
4 + 5 (both positive) yields -7 (negative). Because the operands have the same sign and it doesn’t match the result’s sign, this operation overflows.
-3 - 8 is equivalent to -3 + -8 (both negative) and yields 5 (positive). Because the operands have the same sign and it doesn’t match the result’s sign, this operation overflows.

4.5.3. Overflow Summary

In general, integer overflow occurs when an arithmetic operation moves between the minimum and maximum values that its result can represent. If you’re ever in doubt about the rules for signed versus unsigned overflow, consider the minimum and maximum values of an N-bit sequence:

The minimum unsigned value is 0 (because unsigned encodings can’t represent negative numbers) and the maximum unsigned value is 2^N-1 (because one bit sequence is reserved for zero). Therefore the discontinuity is between 2^N-1 and 0.
The minimum signed value is -2^N-1 (because half of the sequences are reserved for negative values) and the maximum is 2^N-1-1 (because in the other half, one value is reserved for zero). Therefore, the discontinuity is between 2^N-1-1 and -2^N-1.

4.5.4. Overflow Consequences

While you may not run into integer overflow frequently, overflows have the potential to break programs in notable (and potentially devastating) ways.

For example, in 2014, PSY’s popular Gangnam Style music video threatened to overflow the 32-bit counter that YouTube used to track video hits. As a result, YouTube switched to using a 64-bit counter.

Another relatively harmless example shows up in the 1980 arcade game Pac-Man. The game’s developers used an unsigned eight-bit value to track the player’s progress through the game’s levels. As a result, if an expert player makes it beyond level 255 (the maximum value of an eight-bit unsigned integer), half of the board ends up glitching significantly, as shown in Figure 5.

The right half of the game board is completely corrupted with nonsense.

Figure 5. The Pac-Man game board "freaks out" upon reaching level 256

A much more tragic example of overflow appears in the history of the Therac-25 radiation therapy machine of the mid 1980s. The Therac-25 suffered from several design problems, including one that incremented a truth flag variable rather than setting it to a constant. After enough uses, the flag overflowed, causing it to erroneously roll over to zero (false) and bypass safety mechanisms. The Therac-25 ultimately caused serious harm to (and in some cases killed) six patients.