Representing Numbers and Letters with Binary: Crash Course Computer Science #4


Hi I’m Carrie Anne, this is Crash Course
Computer Science and today we’re going to talk about how computers store and represent numerical data. Which means we’ve got to talk about Math! But don’t worry. Every single one of you already knows exactly
what you need to know to follow along. So, last episode we talked about how transistors can be used to build logic gates, which can evaluate boolean statements. And in boolean algebra, there are only two,
binary values: true and false. But if we only have two values, how in the
world do we represent information beyond just these two values? That’s where the Math comes in.

[INTRO]

So, as we mentioned last episode, a single
binary value can be used to represent a number. Instead of true and false, we can call these
two states 1 and 0 which is actually incredibly useful. And if we want to represent larger things we just need to add more binary digits. This works exactly the same way as the decimal
numbers that we’re all familiar with. With decimal numbers there are “only” 10 possible values a single digit can be: 0 through 9, and to get numbers larger than 9 we just start adding more digits to the front. We can do the same with binary. For example, let’s take the number two hundred and sixty-three. What does this number actually represent? Well, it means we’ve got 2 one-hundreds, 6 tens, and 3 ones. If you add those all together, we’ve got 263. Notice how each column has a different multiplier. In this case, it’s 100, 10, and 1. Each multiplier is ten times larger than the one to the right. That’s because each column has ten possible digits to work with, 0 through 9, after which you have to carry one to the next column. For this reason, it’s called base-ten notation, also called decimal since deci means ten.

And binary works exactly the same way, it’s just base-two. That’s because there are only two possible digits in binary – 1 and 0. This means that each multiplier has to be two times larger than the column to its right. Instead of hundreds, tens, and ones, we now have fours, twos, and ones. Take for example the binary number: 101. This means we have 1 four, 0 twos, and 1 one. Add those all together and we’ve got the number 5 in base ten.

But to represent larger numbers, binary needs a lot more digits. Take this number in binary: 10110111. We can convert it to decimal in the same way. We have 1 x 128, 0 x 64, 1 x 32, 1 x 16, 0 x 8, 1 x 4, 1 x 2, and 1 x 1, which all adds up to 183.
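If you want to poke at that place-value idea yourself, here’s a tiny sketch (in Python, my own illustration rather than anything from the video) that walks a binary string column by column, doubling the multiplier as it goes:

```python
def binary_to_decimal(bits):
    """Convert a string of binary digits to a base-ten integer."""
    total = 0
    multiplier = 1              # the rightmost column is the ones place
    for digit in reversed(bits):
        total += int(digit) * multiplier
        multiplier *= 2         # each column is twice the one to its right
    return total

print(binary_to_decimal("101"))       # 5
print(binary_to_decimal("10110111"))  # 183
```

Python’s built-in int("10110111", 2) does the same conversion in a single call.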
Math with binary numbers isn’t hard either. Take for example decimal addition of 183 plus 19. First we add 3 + 9, that’s 12, so we put 2 as the sum and carry 1 to the tens column. Now we add 8 plus 1 plus the 1 we carried, that’s 10, so the sum is 0, carry 1. Finally we add 1 plus the 1 we carried, which equals 2. So the total sum is 202.

Here’s the same sum, but in binary. Just as before, we start with the ones column. Adding 1 + 1 results in 2, even in binary. But there is no symbol “2”, so we use 10 and put 0 as our sum and carry the 1, just like in our decimal example. 1 plus 1, plus the 1 carried, equals 3, or 11 in binary, so we put the sum as 1 and we carry 1 again, and so on. We end up with 11001010, which is the same as the number 202 in base ten.
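Here’s that column-by-column, carry-the-one process as a short sketch (again Python, not from the episode), adding the same two numbers as 8-bit binary strings:

```python
def add_binary(a, b):
    """Add two binary strings the way you'd do it by hand, column by column."""
    a, b = a.zfill(len(b)), b.zfill(len(a))   # pad both to the same width
    result, carry = "", 0
    for x, y in zip(reversed(a), reversed(b)):
        total = int(x) + int(y) + carry       # 0, 1, 2, or 3 for this column
        result = str(total % 2) + result      # the digit that stays in this column
        carry = total // 2                    # the 1 (if any) carried leftward
    return ("1" + result) if carry else result

print(add_binary("10110111", "00010011"))  # 11001010
print(int("11001010", 2))                  # 202
```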
Each of these binary digits, 1 or 0, is called a “bit”. So in these last few examples, we were using 8-bit numbers, with a lowest value of zero and a highest value of 255, which requires all 8 bits to be set to 1. That’s 256 different values, or 2 to the 8th power. You might have heard of 8-bit computers, or 8-bit graphics or audio. These were computers that did most of their operations in chunks of 8 bits. But 256 different values isn’t a lot to work with, so it meant things like 8-bit games were limited to 256 different colors for their graphics.

And 8 bits is such a common size in computing, it has a special word: a byte. A byte is 8 bits. If you’ve got 10 bytes, it means you’ve really got 80 bits. You’ve heard of kilobytes, megabytes, gigabytes and so on. These prefixes denote different scales of data. Just like one kilogram is a thousand grams, 1 kilobyte is a thousand bytes… or really 8000 bits. Mega is a million bytes (MB), and giga is a billion bytes (GB). Today you might even have a hard drive that has 1 terabyte (TB) of storage. That’s 8 trillion ones and zeros.

But hold on! That’s not always true. In binary, a kilobyte can also mean two to the power of 10 bytes, or 1024. 1000 is also right when talking about kilobytes, but we should acknowledge it isn’t the only correct definition.
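The arithmetic behind those prefixes is easy to check (a quick Python sketch, not part of the video):

```python
BITS_PER_BYTE = 8

print(10 * BITS_PER_BYTE)        # 10 bytes really are 80 bits
print(2 ** 8)                    # 256 different values fit in one byte
print(1000, 2 ** 10)             # "kilobyte": 1,000 bytes, or 1,024 (also called a kibibyte)
print(10 ** 12 * BITS_PER_BYTE)  # a 1 TB drive holds 8 trillion bits
```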
You’ve probably also heard the term 32-bit or 64-bit computers – you’re almost certainly using one right now. What this means is that they operate in chunks of 32 or 64 bits. That’s a lot of bits! The largest number you can represent with 32 bits is just under 4.3 billion, which is thirty-two 1s in binary. This is why our Instagram photos are so smooth and pretty – they are composed of millions of colors, because computers today use 32-bit color graphics (typically 24 bits of red, green, and blue, with the other 8 often used for transparency).

Of course, not everything is a positive number – like my bank account in college. So we need a way to represent positive and negative numbers. Most computers use the first bit for the sign: 1 for negative, 0 for positive numbers, and then use the remaining 31 bits for the number itself. That gives us a range of roughly plus or minus two billion. While this is a pretty big range of numbers, it’s not enough for many tasks. There are 7 billion people on Earth, and the US national debt is almost 20 trillion dollars, after all. This is why 64-bit numbers are useful. The largest value a 64-bit number can represent is around 9.2 quintillion! That’s a lot of possible numbers, and will hopefully stay above the US national debt for a while!
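Those limits are all just powers of two, so they are easy to verify (Python sketch; note that real hardware stores negatives in two’s complement rather than a plain sign bit, but the ranges work out essentially the same):

```python
print(2 ** 32 - 1)   #  4,294,967,295: just under 4.3 billion, thirty-two 1s in binary
print(2 ** 31 - 1)   #  2,147,483,647: largest signed 32-bit value, roughly +2 billion
print(-(2 ** 31))    # -2,147,483,648: roughly -2 billion
print(2 ** 63 - 1)   #  9,223,372,036,854,775,807: about 9.2 quintillion
```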
Most importantly, as we’ll discuss in a later episode, computers must label locations in their memory, known as addresses, in order to store and retrieve values. As computer memory has grown to gigabytes and terabytes – that’s trillions of bytes – it was necessary to have 64-bit memory addresses as well.

In addition to negative and positive numbers, computers must deal with numbers that are not whole numbers, like 12.7 and 3.14, or maybe even stardate 43989.1. These are called “floating point” numbers, because the decimal point can float around in the middle of the number. Several methods have been developed to represent floating point numbers, the most common of which is the IEEE 754 standard. And you thought historians were the only people bad at naming things! In essence, this standard stores decimal values sort of like scientific notation. For example, 625.9 can be written as 0.6259 x 10^3. There are two important numbers here: the .6259 is called the significand, and 3 is the exponent. In a 32-bit floating point number, the first bit is used for the sign of the number – positive or negative. The next 8 bits are used to store the exponent, and the remaining 23 bits are used to store the significand.
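You can peek at those three fields directly. One caveat (and a common point of confusion): the actual standard works in base two, not base ten, and the 8 exponent bits are stored with an offset of 127, so the bits don’t literally spell out “6259” and “3”. A small sketch using Python’s struct module:

```python
import struct

# Pack 625.9 into 4 bytes as a 32-bit IEEE 754 float, then reinterpret
# those bytes as an unsigned integer so we can slice out the bit fields.
bits = struct.unpack(">I", struct.pack(">f", 625.9))[0]

sign     = bits >> 31            # 1 bit: 0 means positive
exponent = (bits >> 23) & 0xFF   # 8 bits, stored with a +127 bias
fraction = bits & 0x7FFFFF       # 23 bits of significand (a leading 1 is implied)

print(sign, exponent, fraction)  # the exponent prints as 136, i.e. 9 + 127,
                                 # because 625.9 ≈ 1.0011100011110011... x 2^9
```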
OK, we’ve talked a lot about numbers, but your name is probably composed of letters, so it’s really useful for computers to also have a way to represent text. However, rather than have a special form of storage for letters, computers simply use numbers to represent letters. The most straightforward approach might be to simply number the letters of the alphabet: A being 1, B being 2, C 3, and so on. In fact, Francis Bacon, the famous English writer, used five-bit sequences to encode all 26 letters of the English alphabet to send secret messages back in the 1600s. And five bits can store 32 possible values – so that’s enough for the 26 letters, but not enough for punctuation, digits, and upper and lower case letters.

Enter ASCII, the American Standard Code for Information Interchange. Invented in 1963, ASCII was a 7-bit code, enough to store 128 different values. With this expanded range, it could encode capital letters, lowercase letters, digits 0 through 9, and symbols like the @ sign and punctuation marks. For example, a lowercase ‘a’ is represented by the number 97, while a capital ‘A’ is 65. A colon is 58 and a closing parenthesis is 41. ASCII even had a selection of special command codes, such as a newline character to tell the computer where to wrap a line to the next row. In older computer systems, the line of text would literally continue off the edge of the screen if you didn’t include a newline character!
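You can check those code numbers yourself (Python’s ord() and chr() report Unicode values, but the first 128 of those are exactly the ASCII codes):

```python
print(ord("a"))   # 97
print(ord("A"))   # 65
print(ord(":"))   # 58
print(chr(41))    # ")" – code 41 is the closing parenthesis
print(ord("\n"))  # 10, the newline control code
```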
Because ASCII was such an early standard, it became widely used and, critically, allowed different computers built by different companies to exchange data. This ability to universally exchange information is called “interoperability”. However, it did have a major limitation: it was really only designed for English.

Fortunately, there are 8 bits in a byte, not 7, and it soon became popular to use codes 128 through 255, previously unused, for “national” characters. In the US, those extra numbers were largely used to encode additional symbols, like mathematical notation, graphical elements, and common accented characters. On the other hand, while the Latin characters of ASCII were used universally, Russian computers used the extra codes to encode Cyrillic characters, and Greek computers, Greek letters, and so on. And national character codes worked pretty well for most countries. The problem was, if you opened an email written in Latvian on a Turkish computer, the result was completely incomprehensible.

And things totally broke with the rise of computing in Asia, as languages like Chinese and Japanese have thousands of characters. There was no way to encode all those characters in 8 bits! In response, each country invented multi-byte encoding schemes, all of which were mutually incompatible. The Japanese were so familiar with this encoding problem that they had a special name for it: “mojibake”, which means “scrambled text”.
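Here’s roughly what that Latvian-email-on-a-Turkish-computer situation looks like (the sample text and the two code pages are my own picks for illustration): the same bytes come out as completely different characters depending on which national 8-bit code you decode them with.

```python
# Latvian text saved with the Baltic code page ISO 8859-4...
data = "āboli un kūkas".encode("iso8859_4")

print(data.decode("iso8859_4"))  # decoded with the right code page: readable
print(data.decode("iso8859_9"))  # decoded with the Turkish code page: mojibake
```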
And so it was born – Unicode – one format to rule them all. Devised in 1992 to finally do away with all of the different international schemes, it replaced them with one universal encoding scheme. Unicode has space for over a million codes – enough for every single character from every language ever used – more than 120,000 of them in over 100 types of script – plus space for mathematical symbols and even graphical characters like Emoji. The most common encodings, UTF-8 and UTF-16, store those codes as sequences of 8-bit or 16-bit chunks.
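Since Python 3 strings are already Unicode, a couple of lines show how those common encodings turn characters into 8-bit and 16-bit chunks (the sample string is mine, not from the video):

```python
text = "résumé 😀"

utf8  = text.encode("utf-8")      # plain letters take 1 byte, é takes 2, the emoji takes 4
utf16 = text.encode("utf-16-be")  # most characters take 2 bytes, the emoji takes 4 (a surrogate pair)

print(len(utf8), utf8)
print(len(utf16), utf16)
print(utf8.decode("utf-8") == text)   # decoding gives back the original string: True
```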
And in the same way that ASCII defines a scheme for encoding letters as binary numbers, other file formats – like MP3s or GIFs – use binary numbers to encode sounds or the colors of pixels in our photos, movies, and music. Most importantly, under the hood it all comes down to long sequences of bits. Text messages, this YouTube video, every webpage on the internet, and even your computer’s operating system are nothing but long sequences of 1s and 0s. So next week, we’ll start talking about how your computer starts manipulating those binary sequences, for our first true taste of computation.

Thanks for watching. See you next week.

100 thoughts on “Representing Numbers and Letters with Binary: Crash Course Computer Science #4”

  1. This is really helpful, but you move too fast. I have to download your videos and play them in slow motion to understand everything

  2. I love this course so much!!
    Each video I learn a looot & can combine infos that I've seen before.
    It's so very well displayed & explained, THANKS Carrie Anne! 😉

  3. Thank you so much, I took GCSE computer science and passed (on a grade 8) an A+. You made it so much easier. PS: I’m rewatching all of the cs vids

  4. Hi, let me clear one thing up.
    1 byte = 8 bits
    1 KiloByte = 1,000 bytes
    1 KibiByte = 1,024 bytes.

    KibiByte was introduced to remove the confusion of 1,000 vs 1,024 bytes,
    so, as of today, when ever someone says KiloByte, its 1,000 bytes, and KibiByte = 1,024 bytes.

  5. So interesting! I am a French student (15yo) and I loved this video! Thanks to the translator who allowed me to understand this video!

  6. Our computers use 32bit colour? I thought it was 255x255x255 (RGB) and that's 24 bit colour range? am I getting confused here

  7. Funny seeing Latvian being mentioned here, and really the text on the monitor seems to be Latvian, with all the " %¾.. characters marking ā, ē, ū and our other language-specific characters 🙂

  8. If the numbers don't possess anything, you shouldn't use apostrophes! It should be 8s, 4s, 2s and 1s. Ugh, math people, with your mysterious numbers and your bad spelling and grammar. 🙁

  9. Pretty sure we do not use true 32 bit color. We use r g b 8/8/8 in most cases, for 24 bit color. Some professional displays and apps use 10/10/10 for 30 bit color.

    Some image encoding can have an 8 bit alpha channel for 8/8/8/8 or 32 bits, but there are others as well… 8/8/8/1 is good for simple masks. Obviously there can be more or less bits used for encoding, but most displays and games work with 24 bit color (8/8/8) as it can represent the colors r g b as 0-255, and luminance. 255 0 0 would be bright red, 128 0 0 would be red at half luminance…

    Most devices have poor gamut though, and can only show 60-70% of those colors, but that's another thing.

  10. This is the bee's knees!! I finally get it! I love how you splash in history with it too I love learning the history of computers. It makes the learning experience so much more authentic!

  11. Been learning about computer science more in depth lately and now I feel like I need to rethink the meaning of life. This takes "woke" to a whole new level.

  12. 1+1 is 2 but there is no such number in base 2 binary. According to base 2 binary, if it is 2, it has to be 10 and if it is 3, it is 11.

  13. I'm sorry, but the part where she's describing converting decimal numbers to binary makes no sense. At 6:37, the "exponent" part doesn't seem to me like 3. It looks more like 136 to me. How do you get that?

  14. 6:40 (float) Won't this representation method create Redundancies ?

    like if i have 625 as significand and 1 as exponent i will get the number 6250

    but if i have 6250 as significand and 0 as exponent i will also get 6250 …. how to sort that out ?

  15. For the rest of my life i will have the maximum countable (signed) integer on a 32 bit system memorized

    Thanks Runescape

  16. 0100111001100101011101100110010101110010001000000110011101101111011011100110111001100001001000000110011101101001011101100110010100100000011110010110111101110101001000000111010101110000000011010000101001001110011001010111011001100101011100100010000001100111011011110110111001101110011000010010000001101100011001010111010000100000011110010110111101110101001000000110010001101111011101110110111000001101000010100100111001100101011101100110010101110010001000000110011101101111011011100110111001100001001000000111001001110101011011100010000001100001011100100110111101110101011011100110010000100000011000010110111001100100001000000110010001100101011100110110010101110010011101000010000001111001011011110111010100001101000010100100111001100101011101100110010101110010001000000110011101101111011011100110111001100001001000000110110101100001011010110110010100100000011110010110111101110101001000000110001101110010011110010000110100001010010011100110010101110110011001010111001000100000011001110110111101101110011011100110000100100000011100110110000101111001001000000110011101101111011011110110010001100010011110010110010100001101000010100100111001100101011101100110010101110010001000000110011101101111011011100110111001100001001000000111010001100101011011000110110000100000011000010010000001101100011010010110010100100000011000010110111001100100001000000110100001110101011100100111010000100000011110010110111101110101

  17. Just wait until alien languages become a thing, Unicode will no longer be sufficient for every language character.

  18. But how does a computer convert binary to decimal? And how does a computer translate binary to characters printed on screen?

  19. I am really struggling to understand the target audience for this crashcourse, very fast and overview-ish to learn something useful for a beginner, and very basic for someone studying the subject.

  20. Wow, amazing series, you really make me understand what's happening in a computer. I have one question though. How does the computer know if a sequence of bits is a number or just ascii code (a number to represent a letter or sth)? Same thing about floats.

  21. 09:22 Uh, oh. 16-bits is only spacious enough for 65,536 characters, not 'over one million'. But that's the first mistake I've spotted. LOVE these videos and your presentation!

  22. I get anxiety every time I see that stupid guy with his laptop in the bath tub. I wouldn't hold my laptop in one hand like this even if there was no water underneath.

  23. Every unicode encoding can encode all unicode characters. It's not like utf-16 has more characters than utf-8, it's just encoded differently.

  24. this series is simply amazing and unfolded the mystery of what goes on inside the machine when we program something. Thank you so much 🙂
