Created Wednesday 05 February 2020
Chapter 2 Lecture Notes
Today we'll be going over numerical data representation in computers. Floats, chars, etc.
A bit is the most basic unit of info on a computer: on or off. A byte is made up of 8 bits. On most machines it is the smallest addressable unit of storage.
A word is a contiguous group of bytes. It can be any number of bits or bytes, but word sizes of 16, 32, or 64 bits are most common. In a word-addressable system, a word is the smallest addressable unit of storage.
A group of four bits is called a nibble. A byte therefore consists of two nibbles: a low-order nibble and a high-order nibble.
Bytes store numbers using the position of each bit to represent a power of 2. This is the binary system, or base-2. The decimal system we use every day is base-10.
Radix is another word for base: base 10, 2, 16, etc.
Observe the two patterns below:
947 = 9 * 10^2 + 4 * 10^1 + 7 * 10^0
11001_2 = 25_10 = 1 * 2^4 + 1 * 2^3 + 0 * 2^2 + 0 * 2^1 + 1 * 2^0 = 16 + 8 + 0 + 0 + 1
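The two patterns above can be sketched in a few lines of Python. This is only an illustration: the function name positional_value and the choice to pass digits as a list are assumptions made for these notes.

```python
def positional_value(digits, radix):
    """Sum digit * radix**position, with position 0 at the rightmost digit."""
    total = 0
    for position, digit in enumerate(reversed(digits)):
        if not 0 <= digit < radix:
            raise ValueError(f"digit {digit} is not valid in base {radix}")
        total += digit * radix ** position
    return total


print(positional_value([9, 4, 7], 10))       # 947
print(positional_value([1, 1, 0, 0, 1], 2))  # 25
```

The same function works for any radix because only the base of the powers changes.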
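We'll always apply the above pattern in the same way, whatever radix we're using.
When the base of a number is anything other than 10, we write the base as a subscript after the number (N_2, N_16, and so on). Decimal values can be marked the same way, as N_10, when it helps to be explicit.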
We also need to know how to convert between bases. For example, 190_10 = 21001_3.
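One common way to do this conversion by hand is repeated division: divide by the target radix, record the remainder, and read the remainders from last to first. Here is a minimal sketch of that method, assuming non-negative integers and bases from 2 to 16. The name to_base is made up for these notes.

```python
DIGITS = "0123456789ABCDEF"


def to_base(n, radix):
    """Convert a non-negative integer to a digit string in the given radix."""
    if not 2 <= radix <= 16:
        raise ValueError("radix must be between 2 and 16")
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return "0"
    remainders = []
    while n > 0:
        n, r = divmod(n, radix)  # quotient carries on, remainder is the next digit
        remainders.append(DIGITS[r])
    return "".join(reversed(remainders))


print(to_base(190, 3))  # 21001
```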
You should remember these to help with conversion:
2^-1 = 1/2^1 = 1/2 = 0.5
2^-2 = 1/2^2 = 1/4 = 0.25
2^-3 = 1/2^3 = 1/8 = 0.125
2^-4 = 1/2^4 = 1/16 = 0.0625
Converting 0.8125_10 to binary gives 0.1101_2, because 0.8125 = 0.5 + 0.25 + 0.0625 = 2^-1 + 2^-2 + 2^-4.
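A fraction can also be converted mechanically by repeated multiplication by 2: the integer part of each product is the next binary digit. The sketch below assumes a value in the range 0 <= x < 1 and uses Fraction so the arithmetic is exact. The name fraction_to_binary and the max_bits cutoff are assumptions for illustration.

```python
from fractions import Fraction


def fraction_to_binary(x, max_bits=16):
    """Convert 0 <= x < 1 to a binary fraction string, stopping after max_bits digits."""
    frac = Fraction(x)
    if not 0 <= frac < 1:
        raise ValueError("x must satisfy 0 <= x < 1")
    bits = []
    while frac != 0 and len(bits) < max_bits:
        frac *= 2
        if frac >= 1:
            bits.append("1")
            frac -= 1
        else:
            bits.append("0")
    return "0." + ("".join(bits) or "0")


print(fraction_to_binary("0.8125"))  # 0.1101
```

Some decimal fractions, such as 0.1, never terminate in binary, which is why the sketch cuts off after max_bits digits.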
The binary system plays well with computers because everything is 1s and 0s. Sadly, binary values are difficult for us to read. For compactness and ease of reading, we use hexadecimal, or base-16.
It's easy to convert between base 2 and base 16 because 16 = 2^4. We just take the binary digits in groups of 4, starting from the right, and convert each group to one hexadecimal digit. A group of four binary digits is called a hextet.
Using hextets, the binary number 11010100011011_2 = 13595_10 = 351B_16 (the leftmost group is padded with leading zeros):
0011 = 3
0101 = 5
0001 = 1
1011 = B
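The grouping above can be sketched as follows. The name binary_to_hex is made up for these notes, and the input is assumed to be a string of 0s and 1s.

```python
HEX_DIGITS = "0123456789ABCDEF"


def binary_to_hex(bits):
    """Pad to a multiple of 4 bits, then map each group of 4 to one hex digit."""
    width = -(-len(bits) // 4) * 4           # round length up to a multiple of 4
    padded = bits.zfill(width)               # leading zeros do not change the value
    groups = [padded[i:i + 4] for i in range(0, width, 4)]
    return "".join(HEX_DIGITS[int(group, 2)] for group in groups)


print(binary_to_hex("11010100011011"))  # 351B
```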
There are three ways in which signed binary integers may be expressed:
- Signed magnitude
- One's complement
- Two's complement
In an 8-bit word, signed magnitude representation places the absolute value of the number in the 7 bits to the right of the sign bit.
00000011_2 = +3
10000011_2 = -3
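A minimal sketch of the encoding, assuming the sign bit is 0 for positive and 1 for negative as in the example above. The function name to_signed_magnitude is hypothetical.

```python
def to_signed_magnitude(value, bits=8):
    """Return the signed magnitude bit string: one sign bit, then bits-1 magnitude bits."""
    magnitude = abs(value)
    if magnitude >= 1 << (bits - 1):
        raise ValueError(f"{value} does not fit in {bits}-bit signed magnitude")
    sign = "1" if value < 0 else "0"
    return sign + format(magnitude, f"0{bits - 1}b")


print(to_signed_magnitude(3))   # 00000011
print(to_signed_magnitude(-3))  # 10000011
```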
Binary addition is rather simple.
0 + 0 = 0
1 + 0 = 1
0 + 1 = 1
1 + 1 = 10
As with the addition we're used to, we carry the 1 whenever we add 1 + 1.
If the sum does not fit in 7 bits, we have a problem! If we try to carry into the eighth bit position, the carry is simply dropped. So 107_10 + 46_10 = 25_10 when we add them in binary! We would need 8 bits to represent 153_10, which is 10011001_2.
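To see the dropped carry, here is a sketch that adds two bit strings column by column, the way we would on paper. The name add_bits is an assumption for these notes. It returns the carry separately so we can see what gets lost.

```python
def add_bits(a_bits, b_bits):
    """Add two equal-length bit strings; return (sum_bits, carry_out)."""
    if len(a_bits) != len(b_bits):
        raise ValueError("operands must have the same number of bits")
    carry = 0
    result = []
    # Work from the rightmost column to the leftmost, carrying as we go.
    for a, b in zip(reversed(a_bits), reversed(b_bits)):
        column = int(a) + int(b) + carry
        result.append(str(column % 2))
        carry = column // 2
    return "".join(reversed(result)), carry


# 107 and 46 as 7-bit magnitudes
total, carry = add_bits("1101011", "0101110")
print(total, carry)  # 0011001 1
```

The 7-bit result 0011001 is 25. The carry of 1 is the part of 153 that did not fit.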
Adding two negative numbers is just as easy. Say we have -46_10 and -25_10. Add the magnitudes like normal and give the result a negative sign.
Mixed signs give us a bit of trouble though. We subtract the smaller magnitude from the larger, and the result takes the sign of the operand with the larger magnitude. Slide 41 has a visual of how this is done.
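The addition rules above can be sketched like this. It is not how the hardware is built, only the decision logic. Each number is assumed to be a (sign, magnitude) pair with sign 0 for positive and 1 for negative, and the name signed_magnitude_add is made up for illustration.

```python
def signed_magnitude_add(a, b, magnitude_bits=7):
    """Add two (sign, magnitude) pairs using the signed magnitude rules."""
    (a_sign, a_mag), (b_sign, b_mag) = a, b
    mask = (1 << magnitude_bits) - 1
    if a_sign == b_sign:
        # Same sign: add magnitudes, keep the sign, drop any carry past the magnitude bits.
        return a_sign, (a_mag + b_mag) & mask
    # Mixed signs: subtract the smaller magnitude from the larger,
    # and take the sign of the operand with the larger magnitude.
    if a_mag >= b_mag:
        return a_sign, a_mag - b_mag
    return b_sign, b_mag - a_mag


print(signed_magnitude_add((1, 46), (1, 25)))  # (1, 71), i.e. -71
print(signed_magnitude_add((0, 25), (1, 46)))  # (1, 21), i.e. -21
```

With this sketch, (1, 5) plus (0, 5) comes out as (1, 0), which is a negative zero.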
Signed magnitude representation is easy for people to understand but requires complicated computer hardware. Another disadvantage is that we have two versions of 0: a positive one and a negative one. For these reasons, computer systems employ complement systems for numeric value representation.
In complement systems, negative values are represented by some difference between a number and its base.
The diminished radix complement of a non-zero number N in base r with d digits is (r^d - 1) - N. In the binary system, this gives us one's complement. It amounts to little more than flipping the bits of a binary number.
8-bit one's complement representation:
+3_10 = 00000011_2
-3_10 = 11111100_2
As with signed magnitude, notice how the negative number has a 1 in the high order bit.
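A minimal sketch of the encoding, using the diminished radix complement formula with r = 2 and d = bits. The name ones_complement is an assumption for these notes.

```python
def ones_complement(value, bits=8):
    """Return the one's complement bit string for value."""
    limit = (1 << (bits - 1)) - 1            # largest magnitude that fits
    if abs(value) > limit:
        raise ValueError(f"{value} does not fit in {bits}-bit one's complement")
    all_ones = (1 << bits) - 1               # 2^bits - 1
    pattern = value if value >= 0 else all_ones - abs(value)
    return format(pattern, f"0{bits}b")


print(ones_complement(3))   # 00000011
print(ones_complement(-3))  # 11111100
```

Subtracting from a pattern of all 1s is the same as flipping every bit, which is why the shortcut works.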
Complement systems are useful because they eliminate the need for subtraction. The difference of two values is found by adding the first value to the complement of the second.
Sadly, one's complement still has a +0 and -0. Two's complement solves this problem!
In two's complement, a positive number is just regular binary. For a negative number, find the one's complement (flip the bits of the positive value) and then add 1.
3_10 = 00000011_2
-3_10 in one's complement = 11111100_2
Adding 1 gives us -3 in two's complement: 11111101_2
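The two steps (flip, then add 1) can be sketched as follows. The name twos_complement is hypothetical.

```python
def twos_complement(value, bits=8):
    """Return the two's complement bit string for value."""
    if not -(1 << (bits - 1)) <= value < (1 << (bits - 1)):
        raise ValueError(f"{value} does not fit in {bits}-bit two's complement")
    if value >= 0:
        pattern = value
    else:
        all_ones = (1 << bits) - 1
        flipped = all_ones - abs(value)      # step 1: one's complement
        pattern = flipped + 1                # step 2: add 1
    return format(pattern, f"0{bits}b")


print(twos_complement(3))   # 00000011
print(twos_complement(-3))  # 11111101
```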
With two's complement arithmetic, all we do is add our two binary numbers. Just discard any carry out of the high-order bit.
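As a quick sketch, here is 5 - 3 computed as 5 + (-3) on 8-bit patterns. The name add_twos_complement is made up for these notes, and both inputs are assumed to be bit strings of the same width.

```python
def add_twos_complement(a_bits, b_bits):
    """Add two equal-width two's complement bit strings, discarding the final carry."""
    if len(a_bits) != len(b_bits):
        raise ValueError("operands must have the same number of bits")
    width = len(a_bits)
    total = int(a_bits, 2) + int(b_bits, 2)
    total &= (1 << width) - 1                # discard any carry out of the high-order bit
    return format(total, f"0{width}b")


# 00000101 is 5, 11111101 is -3
print(add_twos_complement("00000101", "11111101"))  # 00000010
```

The result 00000010 is 2, so the subtraction was done with nothing but an add.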
Excess-M representation (also known as offset binary representation) is another way for unsigned binary values to represent signed integers.
Our bias is determined by the following formula: 2^(n-1) - 1, where n is the number of bits we're using.
In excess-M, we use a bias M to offset the values: the stored unsigned pattern is the actual value plus the bias. For example, with 4 bits our bias could be 7, so 0_10 = 0111_2. We choose 7 as our bias because 2^(4-1) - 1 = 7.
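A small sketch of encoding and decoding with the bias formula above. The names to_excess and from_excess are assumptions for illustration.

```python
def excess_bias(bits):
    """Bias from the formula 2^(n-1) - 1."""
    return (1 << (bits - 1)) - 1


def to_excess(value, bits=4):
    """Store value as the unsigned pattern value + bias."""
    stored = value + excess_bias(bits)
    if not 0 <= stored < (1 << bits):
        raise ValueError(f"{value} does not fit in {bits}-bit excess-{excess_bias(bits)}")
    return format(stored, f"0{bits}b")


def from_excess(pattern):
    """Recover the actual value by subtracting the bias from the unsigned pattern."""
    return int(pattern, 2) - excess_bias(len(pattern))


print(to_excess(0))         # 0111
print(from_excess("0000"))  # -7
```

With 4 bits and a bias of 7, the representable values run from -7 (0000) to 8 (1111).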
Slide 54 of the chapter2.pptx has a nice chart comparing all our different representations.
When we use a finite number of bits to represent a number, there's always the risk of our calculations becoming too large or small. We can't always prevent overflow but we can always detect overflow. For example, complement arithmetic has an easy overflow condition to detect.
For example, in 8-bit two's complement, 107 + 46 = -103 when we add them in binary. The carry into the sign bit flips it. The bit pattern 10011001_2 is the correct sum as an unsigned value (153), but read as a signed value it becomes negative. However! A carry into the sign bit doesn't always mean that we have an error.
For two's complement, we know that overflow has occurred when the carry into the sign bit and the carry out of the sign bit differ. If the carry into the sign bit equals the carry out of the sign bit, no overflow has occurred.
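The carry rule can be sketched by computing the two carries separately. The name add_with_overflow is hypothetical, and the inputs are assumed to be Python ints that fit in the given number of bits.

```python
def add_with_overflow(a, b, bits=8):
    """Add two signed values in two's complement; return (result, overflowed)."""
    mask = (1 << bits) - 1
    low_mask = (1 << (bits - 1)) - 1         # every bit except the sign bit
    ua, ub = a & mask, b & mask              # two's complement bit patterns
    carry_in = ((ua & low_mask) + (ub & low_mask)) >> (bits - 1)
    carry_out = (ua + ub) >> bits
    raw = (ua + ub) & mask
    # Reinterpret the pattern as signed: a set sign bit means subtract 2^bits.
    result = raw - (1 << bits) if raw >> (bits - 1) else raw
    return result, carry_in != carry_out


print(add_with_overflow(107, 46))  # (-103, True)
print(add_with_overflow(5, -3))    # (2, False)
```

In the second call there is a carry into the sign bit and a carry out of it, so the two match and the answer is good.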
Signed and unsigned numbers are both useful. For example, memory addresses are always unsigned. With the same number of bits, unsigned ints can express twice as many positive values as signed ints. Trouble arises if an unsigned value wraps around though. In four bits: 1111 + 1 = 0000. Good programmers are wary of this!
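A short sketch of the ranges and the wraparound. The function names are made up for these notes, and the signed range assumes two's complement.

```python
def unsigned_range(bits):
    """Smallest and largest unsigned values that fit in the given number of bits."""
    return 0, (1 << bits) - 1


def signed_range(bits):
    """Smallest and largest two's complement values for the same number of bits."""
    return -(1 << (bits - 1)), (1 << (bits - 1)) - 1


def add_unsigned(a, b, bits=4):
    """Unsigned addition that wraps around when the sum needs more than bits bits."""
    return (a + b) & ((1 << bits) - 1)


print(unsigned_range(4))        # (0, 15)
print(signed_range(4))          # (-8, 7)
print(add_unsigned(0b1111, 1))  # 0
```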