2007-02-08

Graphics on the Web 1 - Binary Number System Primer

Have you ever wondered where those strange computer numbers come from: numbers like 8, 16, 256, 1024? Have you ever wondered why these same numbers keep coming up over and over again? Well, the answer is really quite simple.

Computers are binary machines. This means that, unlike humans, they work in a “base-2” number system. Humans generally do math in base-10, or decimal. Decimal means that each digit can take 1 of 10 possible values: 0-9. Note that there is no single digit for ten; although there are 10 possible values, they range from 0 up to 9. If we want to represent the value “ten” in decimal, we have to use two digits in sequence: 10, or a one and a zero. Binary works the same way, except that each digit has only 2 possible values: 0-1. Just as in decimal, because zero counts as one of the values, the biggest number we can represent with one digit is “one.” If we want to represent the value “two” then we have to use two digits in sequence: 10, or a one and a zero. Binary seems strange to humans only because we’re used to decimal. The two systems work exactly the same, except that one has two possibilities per digit, and the other ten.
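To make the "same rules, different base" idea concrete, here is a small Python sketch. It writes a number in a given base by dividing by the base over and over and collecting the remainders. The function name `to_base` is made up for this example and is not part of any library; it only handles bases 2 through 10.

```python
def to_base(value, base):
    """Return the digits of a non-negative integer in the given base (2 to 10)."""
    if value == 0:
        return "0"
    digits = []
    while value > 0:
        digits.append(str(value % base))  # the remainder is the next digit, right to left
        value //= base                    # drop that digit and carry on
    return "".join(reversed(digits))


print(to_base(10, 10))  # ten in decimal  -> 10
print(to_base(2, 2))    # two in binary   -> 10
print(to_base(9, 2))    # nine in binary  -> 1001
```

The digits "10" come out in both systems: in decimal they mean ten, in binary they mean two.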

A “bit” is a binary digit. Just like decimal, where a two digit number can represent 00-99 for 100 possible combinations, or three digit numbers can be 000-999 for 1000 combinations, bits can be strung together to form larger numbers. The following table outlines what strings of bits are capable of:

Number of Bits    Number of Possibilities    Largest Number
1                 2                          1
2                 4                          3
3                 8                          7
4                 16                         15
5                 32                         31
6                 64                         63
7                 128                        127
8                 256                        255
9                 512                        511
10                1024                       1023

Note that 4, 8, and 10 bit combinations are the most common. If you have done anything with computers, the first thing you will notice is that the numbers above seem very familiar. The number 256, while it seems arbitrary in decimal, is simply the number of combinations that 8 bits can form, and 255 is the largest number you can get with 8 bits. The arbitrariness of 256, 512, or 1024 is only an artifact of converting binary numbers to decimal numbers. Rest assured, converting decimal 100, 1000, or 10000 to binary would also produce seemingly arbitrary binary numbers.
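The numbers in the table above follow one rule: n bits give 2 to the power n combinations, and the largest value is one less than that because zero counts. A quick Python sketch of that rule follows; `bit_capacity` is just a name picked for this illustration.

```python
def bit_capacity(bits):
    """Return (number of possibilities, largest number) for a string of bits."""
    possibilities = 2 ** bits    # each extra bit doubles the count
    largest = possibilities - 1  # zero uses up one of the combinations
    return possibilities, largest


for bits in range(1, 11):
    possibilities, largest = bit_capacity(bits)
    print(bits, possibilities, largest)

# Going the other way, "round" decimal numbers look arbitrary in binary.
print(format(100, "b"))   # 1100100
print(format(1000, "b"))  # 1111101000
```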

In fact, binary does not convert well to decimal; we always wind up with these arbitrary numbers that are not intuitively obvious. When working with computer hardware or machine-level programming, this becomes such a problem that people dispense with decimal altogether and work in something called “hex.” Hex, or hexadecimal, is base-16 and follows the same rules as binary or decimal. One hex digit can represent 16 possible combinations: 0-F. People sometimes stumble over the ‘F’ part but it’s actually quite simple; it comes from exactly the same place ‘7’ comes from: somebody just said so, it’s an arbitrary choice. Hex requires 16 symbols to represent its possible combinations. Way back when, someone said “let’s use the decimal digits 0-9 and then add the letters A, B, C, D, E, and F to make up the extra 6 possibilities.”

To see how this works, another table is in order.
Four bits of binary can be strung together like this:

Binary    Hex    Decimal
0000      0      0
0001      1      1
0010      2      2
0011      3      3
0100      4      4
0101      5      5
0110      6      6
0111      7      7
1000      8      8
1001      9      9
1010      A      10
1011      B      11
1100      C      12
1101      D      13
1110      E      14
1111      F      15

So, now we have binary, decimal, and hexadecimal. To understand why programmers bother with hex, let’s look at some more binary numbers.

Binary                 Hex     Decimal
0011                   3       3
0011 0011              33      51
1111 1111              FF      255
1111 0000 1111 0000    F0F0    61680


Now, things start to clear up. Writing out long strings of binary ones and zeros gets tedious, but converting to decimal is not intuitive, so we use hex. Converting between hex and binary is quick and easy because each group of four bits maps to exactly one hex digit, so it only involves remembering the 16 combinations in the table above. This is why programmers use hex and this is why, when your computer bombs, you see crash-dump screens full of hexadecimal numbers that seem like gibberish. It’s not gibberish, it’s just programmer-speak for “here’s what went wrong.”
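Here is a rough Python sketch of that four-bits-at-a-time conversion. The names `HEX_DIGITS` and `binary_to_hex` are invented for this example; Python's built-in `int(text, 2)` does the lookup that you would otherwise do from the 16-row table.

```python
HEX_DIGITS = "0123456789ABCDEF"


def binary_to_hex(bits):
    """Convert a binary string such as '1111 0000' to hex, four bits at a time."""
    bits = bits.replace(" ", "")
    # Pad with leading zeros so the length is a multiple of four.
    bits = bits.zfill((len(bits) + 3) // 4 * 4)
    hex_digits = []
    for i in range(0, len(bits), 4):
        group = bits[i:i + 4]                         # one group of four bits
        hex_digits.append(HEX_DIGITS[int(group, 2)])  # maps to one hex digit
    return "".join(hex_digits)


print(binary_to_hex("0011 0011"))            # 33
print(binary_to_hex("1111 0000 1111 0000"))  # F0F0
print(int("F0F0", 16))                       # 61680
```

Notice that no arithmetic on the whole number is needed to get from binary to hex; each group is handled on its own. Getting to decimal is the step that takes real work.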

There are other times you will see hex numbers as well, and when you do it’s because the programmer didn't bother converting, or chose not to convert, the numbers before displaying them. Many high-end graphics programs will display colour values in hex for this reason. So, when you see letters in the middle of numbers, you’re probably looking at hex.

One other thing to note: when you see a number like 10, you automatically think “ten” but this could also be binary “two” or hex “sixteen” depending on the context. Because of this, programmers will often follow binary numbers with “b” and hex numbers with “h” to avoid confusion. Then there’s “o” for octal but you don’t need to know about that.
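A short Python illustration of the same ambiguity: the text "10" gives three different values depending on the base you tell `int` to read it in. Note that Python itself marks the base with a prefix (`0b`, `0x`, `0o`) rather than the trailing "b" or "h" described above.

```python
text = "10"

as_decimal = int(text, 10)  # read the digits as base-10
as_binary = int(text, 2)    # read the same digits as base-2
as_hex = int(text, 16)      # read the same digits as base-16

print(as_decimal, as_binary, as_hex)  # 10 2 16

# Python literals carry the base as a prefix instead of a trailing letter.
print(0b10, 0x10, 0o10)  # 2 16 8
```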

Another place these numbers come up is with kilobytes, megabytes, and gigabytes. First off, a byte is 8 bits, and that’s just because somebody said so. Just like the reasoning behind hex, memory and hard drives are organised in bytes rather than bits out of convenience for system engineers. Now, in the metric system kilo is a prefix for 1000, so a kilometre is 1000 metres, and mega is a prefix for 1,000,000, as in megawatts or 1 million watts. In computing, because engineers are a strange bunch, the metric system is perverted to deal with those arbitrary decimal conversions of binary numbers. Kilo stands for 1024, the number of combinations that 10 bits can form. Similarly, mega is 1024*1024, giga is 1024*1024*1024, and on it goes.

Why? Because it’s easier to say 128Mbytes than it is to say 134,217,728 bytes. Remember, it’s a binary system and it doesn’t convert well to decimal. However, by using 1024 as the multiplier, a “half-conversion” can happen that sort of makes sense to regular humans. 128Mbytes makes more sense than 8000000h to the average person. For the most part, the system makes for good shorthand as the extra 24’s don’t make too much difference. However, some manufacturers will cheat by saying their hard drive is 40Gbytes, which is true as it is over 40,000,000,000 bytes, but when it shows up on your system as only 37Gbytes you know one person’s “giga” is different from the other’s. One is 1000*1000*1000 in true metric fashion, the other is 1024*1024*1024 in the computer confabulated system.
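The arithmetic behind those figures can be checked with a few lines of Python. The variable names are only for illustration, and the 40Gbyte drive is assumed to hold exactly 40,000,000,000 bytes for the sake of the sum.

```python
METRIC_GIGA = 1000 ** 3    # giga the way the metric system means it
COMPUTER_GIGA = 1024 ** 3  # giga the way the operating system counts it
COMPUTER_MEGA = 1024 ** 2

# 128Mbytes written out in full, in decimal and in hex.
ram_bytes = 128 * COMPUTER_MEGA
print(ram_bytes)               # 134217728
print(format(ram_bytes, "X"))  # 8000000

# A drive sold as 40Gbytes (metric) measured in 1024-based gigabytes.
drive_bytes = 40 * METRIC_GIGA
reported_giga = drive_bytes / COMPUTER_GIGA
print(int(reported_giga))      # 37
```

The extra 24 in each 1024 does not matter much at the kilo level, but it is compounded three times at the giga level, which is where the missing 3Gbytes go.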

As you can see, the computer numbers 4, 8, 16, 32, 64, 128, 256, 512, and 1024 are around for a reason; they are just decimal conversions of binary numbers. There really is nothing mysterious about them; they are just the values that come up when you double two, and then double it again, and double it again… It’s the same reason we get 10, 100, 1000, 10000 etc. in decimal. One is base-2 so we double it each time, the other is base-10 so we multiply by 10 each time. If you don’t like the arbitrariness of these numbers, you could work entirely in hex and computers would make perfect sense. No one else will understand you, but computers will make sense.

Part 2 - Colour Systems

Part 3 - File Types
