5 years ago

#3 What is a Byte?

We explain why ones and zeros matter.

Transcript

Speaker A:

It's one of the most fundamental units in a computer. But what really is a byte? Welcome to Kopec Explains Software, the podcast where we make computing intelligible.

Speaker B:

Well, Dave, today our question is going back to basics. What is a byte?

Speaker A:

Well, I think before we talk about bytes, we first should lay the groundwork by talking about bits. Have you heard the term bit before?

Speaker B:

I have.

Speaker A:

So what do you think a bit means?

Speaker B:

Something little?

Speaker A:

It is something little. It's something so little that a single bit can only represent a zero or a one. I think we really need to think about bits on two different levels, the hardware level and the software level. At the hardware level, all computers are really composed of machinery that looks at electrical signals, of course. And we look for a bit by looking whether a signal is on or off. We think about an on signal as a one and we think about an off signal as a zero. Just having different electrical signals that are on or off, composing them together, we can do all kinds of more interesting, more complex calculations. At the software level, we're always thinking in binary. Binary is a mathematical system that's base two. Are you familiar with what that means for a system of base two? This is going back to elementary school. I'm sure you saw it before, but do you remember it?

Speaker B:

So that would be that each value can only be two things, right?

Speaker A:

That's what binary means, right. Binary means two. And yeah, every single place is a one or a zero, representing a power of two. So every place in a binary number, every digit, represents some power of two. And we're either turning that power of two on or we're turning it off. And then when we combine those different on or off powers of two, we can actually represent any number. So let's think about what we can represent using bits. So if I just have one bit, how many different values can I represent?

Speaker B:

Two.

Speaker A:

Two, right. Zero or one. What if I have two bits?

Speaker B:

Four.

Speaker A:

Why is it four?

Speaker B:

Because you have two for the first place and then two for the next one.

Speaker A:

Well, not exactly. You have two for the first place, sure. And then when you have another one, you're multiplying the number that you can represent by two. Think about the different values I could represent with two ones or zeros. I could have one one, I could have zero one, I could have one zero, or I could have zero zero.

Speaker B:

Okay.
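
To make the two-bit enumeration concrete, here is a minimal Python sketch. The function name `bits_to_int` is made up for illustration and is not something mentioned in the episode; it simply adds up the powers of two whose bit is switched on.

```python
from itertools import product


def bits_to_int(bits):
    """Add up the powers of two whose bit is on (leftmost bit is the largest power)."""
    total = 0
    for position, bit in enumerate(reversed(bits)):
        total += bit * 2 ** position
    return total


# Every pattern that two bits can form, paired with the integer it stands for.
two_bit_values = {
    pattern: bits_to_int(pattern) for pattern in product((0, 1), repeat=2)
}

for pattern, value in two_bit_values.items():
    print(pattern, "->", value)
```

The four patterns map to the integers zero through three, which is the range described in the next turn.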

Speaker A:

So that if I was going to think about that as an integer, it could be the number zero through three. Let's say I have three bits. How many values can I represent with three bits?

Speaker B:

So it's two to the third, right? Which is eight.

Speaker A:

Right.

Speaker B:

Eight. Right.

Speaker A:

What if I have four bits?

Speaker B:

So it's two to the fourth, right? Which is 16, right?

Speaker A:

So I can represent 16 different ones. And I keep adding one more bit. I keep doubling how many different values I can represent. So I go to 16, then I go to 32, then 64, then 128, then 256, then 512, then 1024, then 2048, then 4096. And we keep going and going, going. As software developers, we get really, really familiar with the powers of two because we have to work with binary so much.
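
The doubling pattern just described can be checked with a short Python sketch. The helper name `values_for_bits` is an assumption chosen for this example.

```python
def values_for_bits(n_bits):
    """Number of distinct values that n_bits ones and zeros can represent."""
    return 2 ** n_bits


# One bit through twelve bits: each extra bit doubles the count.
doubling = [values_for_bits(n) for n in range(1, 13)]
print(doubling)
```

The list runs from 2 up to 4096, matching the sequence read out above.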

Speaker B:

So is it binary because it goes back to just being on or off?

Speaker A:

Yeah. So bits at the fundamental level are representing electrical signals that are either on or off. That's absolutely true. But the system is called binary because it's in base two. Yeah. So have you heard the term byte before?

Speaker B:

I have.

Speaker A:

Okay, so what do you think a byte is?

Speaker B:

A group of bits, right?

Speaker A:

It's a group of bits, and specifically, usually a group of eight bits.

Speaker B:

Okay.

Speaker A:

And for our purposes, we're going to say it's exactly a group of eight bits. In some historical contexts, a long time ago, there was debate about how many bits it should be, but today the standard is eight. So a byte is eight bits. So that's two to the 8th different values that a byte can represent. So a byte can represent 256 different possible values. So if we think about it as integers, and positive integers, it would be the numbers zero through 255. So if I have a computer that's limited to eight bit numbers, it literally cannot easily represent values higher than 255 without combining multiple different bytes together, which might be memory locations in the computer.
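
As a rough illustration of the 0 through 255 range, and of combining bytes to go past it, here is a small Python sketch. The names `fits_in_one_byte` and `combine_two_bytes` are hypothetical helpers, and putting the more significant byte first is just one possible convention.

```python
BITS_PER_BYTE = 8
values_per_byte = 2 ** BITS_PER_BYTE      # 256 distinct values
largest_byte_value = values_per_byte - 1  # 255, since counting starts at zero


def fits_in_one_byte(n):
    """True if n can be stored as a single unsigned byte."""
    return 0 <= n <= largest_byte_value


def combine_two_bytes(high, low):
    """Treat two bytes as one 16-bit number: the high byte counts groups of 256."""
    return high * values_per_byte + low


# 300 does not fit in one byte, but 1 * 256 + 44 does the job with two.
three_hundred = combine_two_bytes(1, 44)
print(fits_in_one_byte(255), fits_in_one_byte(256), three_hundred)
```

Python's built-in `int.from_bytes(bytes([1, 44]), "big")` gives the same 300, which is a handy cross-check.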

Speaker B:

It's kind of crazy though, that just eight bits or one byte gets you so much, right?

Speaker A:

That's what's really cool, right, is just having eight ones and zeros. I can represent 256 different kinds of values. And that's really the power of the binary system, is using something so fundamental of just whether or not an electrical signal is on or off. And doing that multiple times allows me to represent actually many different possibilities. And those possibilities in software can represent everything on the computer. So we use some number to represent some character. So the letter A is some number. Some position on your screen is some number, right? Some number is some number. So we can really represent anything with numbers. Everything, of course, in computers fundamentally is mathematics. And we just choose a different number to represent different kinds of values and then different types of those kinds of values.
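
The episode only says that the letter A is "some number". As an assumption for illustration, the sketch below uses the ASCII/Unicode code that Python exposes through `ord`, where A happens to be 65.

```python
letter = "A"

# ord() returns the number that stands for the character (65 for "A").
code = ord(letter)

# The same number written out as the eight ones and zeros of one byte.
bits = format(code, "08b")

# Going the other way: read the bits as a base-two number, then look up the character.
round_trip = chr(int(bits, 2))

print(letter, code, bits, round_trip)
```

Other encodings could assign different numbers; the point is only that a character is stored as a number, and a number is stored as bits.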

Speaker B:

So all software, when it gets really boiled down, is coming to bits and bytes, right?

Speaker A:

And so here's another term you've probably heard before. What's a kilobyte?

Speaker B:

Would that be just 1000 bytes?

Speaker A:

Yeah, a kilobyte is 1000 bytes. And that's using the international standard units. There are other types of systems where we consider a kilobyte to be 1024 bytes, but for today's purposes, we're going to use the standard units. So 1000 bytes. So how many bits is that?

Speaker B:

8000, right?

Speaker A:

Because eight times 1000 is 8000. So that's 8000 ones and zeros that you need to represent 1 KB. What about a megabyte?

Speaker B:

Is that a million?

Speaker A:

Right, that's a million bytes, 8 million bits. So you need 8 million ones and zeros to represent a megabyte. And I'll just give you a couple more that people have probably heard about. A gigabyte is a billion bytes, so that's going to be 8 billion ones and zeros, right? And then a terabyte is a trillion bytes, and so that is going to represent 8 trillion ones and zeros. And that's pretty typical for a hard disk today. A hard disk today will typically be around a terabyte, and so that hard disk can actually represent 8 trillion ones and zeros.
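
The unit conversions above can be written out as a short Python sketch. It assumes the decimal (SI) definitions used in the episode; the dictionary names are made up for this example.

```python
BITS_PER_BYTE = 8

# Decimal (SI) unit sizes in bytes, as used in the episode.
SI_UNITS = {
    "kilobyte": 10 ** 3,
    "megabyte": 10 ** 6,
    "gigabyte": 10 ** 9,
    "terabyte": 10 ** 12,
}

bits_per_unit = {name: size * BITS_PER_BYTE for name, size in SI_UNITS.items()}

for name, bits in bits_per_unit.items():
    print(f"1 {name} = {SI_UNITS[name]:,} bytes = {bits:,} bits")

# The other convention mentioned earlier treats a kilobyte as 2**10 bytes.
binary_kilobyte = 2 ** 10
```

Under the 1024-byte convention every figure comes out slightly larger, which is why the two systems are worth keeping apart.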

Speaker B:

Those are huge numbers, like you can't really even picture them, or at least I can't. Even a megabyte is actually a huge number.

Speaker A:

Yeah, I know it is a huge number. But let's use some context by talking about some types of files that people use on their computer all the time. For example, an MP3 file, typically at a 128 kilobit per second rate of encoding, will be about a megabyte per minute. So a three minute song will be about three megabytes as an MP3 file. So how many ones and zeros do you need to represent Baby One More Time by Britney Spears?

Speaker B:

Sorry, I got distracted by Britney Spears.

Speaker A:

So if Britney Spears's Baby One More Time is three megabytes, how many ones and zeros do I need to represent Britney Spears's Baby One More Time? Three times eight.

Speaker B:

So 24 times 1000.

Speaker A:

No, a million, right.

Speaker B:

Whoops.

Speaker A:

24 million ones and zeros to represent that song on a computer.
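
Here is a minimal Python sketch of where "about a megabyte per minute" comes from. It assumes a constant 128,000 bits per second and ignores file headers and tags, so the results are estimates; `mp3_size_bytes` is a hypothetical helper name.

```python
BITS_PER_BYTE = 8
BITRATE_BITS_PER_SECOND = 128_000  # 128 kilobits per second, decimal kilo


def mp3_size_bytes(seconds, bitrate=BITRATE_BITS_PER_SECOND):
    """Approximate audio size in bytes for a constant-bitrate encoding."""
    return seconds * bitrate // BITS_PER_BYTE


bytes_per_minute = mp3_size_bytes(60)          # a little under one megabyte
three_minute_song_bytes = mp3_size_bytes(180)  # a little under three megabytes
three_minute_song_bits = three_minute_song_bytes * BITS_PER_BYTE

print(f"{bytes_per_minute:,} bytes per minute")
print(f"{three_minute_song_bytes:,} bytes, {three_minute_song_bits:,} bits per song")
```

The exact figure is about 23 million bits, which the conversation rounds up to 24 million by treating the song as a full three megabytes.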

Speaker B:

So that's what's happening when I play a song on my computer.

Speaker A:

24... On anything, really.

Speaker B:

Not just a computer, my phone or anything.

Speaker A:

24 million ones and zeros are either being downloaded from the Internet or loaded from your hard drive, loaded into memory, and then being processed by the microprocessor to interpret as music. Yeah, and there's a lot more going on than that, but that's how it's actually being stored as 24 million ones and zeros. So let's give some other files that people are probably familiar with. So let's say an image you take with your camera. So an image you take with your camera, if it's compressed as a JPEG, it's going to be a few megabytes. Let's just say for speaking purposes, let's say it's five megabytes. Okay? It's going to depend on the resolution of the image and how compressed the image is, but let's just say it's about five megabytes. So five megabytes again, is 5 million bytes, and each one of those bytes is made up of eight bits. So 5 million times eight is 40 million, right? Yeah, it's a lot. 40 million ones and zeros to represent that image on your screen.

Speaker B:

This is making me just appreciate more what my phone, what my computer is doing, because I feel like nowadays we throw around a term like megabyte and it sounds so small, it doesn't seem like a lot of space. But when you break it down into the values, it's actually huge.

Speaker A:

Yeah, but let's scale down and let's scale up. So let's scale down to like a text file. So you're typing something out and each of those letters usually is represented by just one byte. So each of the individual letters in some file that you're typing in a text file is really not that much memory. It's eight ones and zeros. But let's say a movie. Now, I don't know if you've ever downloaded movies from the Internet, but how big do you think a movie is? Let's say a movie that's a couple of hours long.

Speaker B:

It's got to be like a billion bytes or something.

Speaker A:

Yeah, I mean, when we're talking movies, it depends on the file format and on the resolution, but we're typically in the several gigabytes when we're talking about a couple hour movie. For example, a DVD (the old style, not Blu-rays) holds, I think, 4.7 GB, a standard DVD. So that's about 5 GB for a couple hour movie.

Speaker B:

Right.

Speaker A:

Well, a gigabyte again is a billion bytes. So a billion times eight is 8 billion, times five is 40 billion, right? So a DVD holds about 40 billion ones and zeros.
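
The file examples from the last few minutes can be lined up side by side. The sizes below are the round numbers quoted in the conversation, not measurements, and the dictionary names are invented for this sketch.

```python
BITS_PER_BYTE = 8

# Approximate sizes in bytes, using the figures quoted in the episode.
example_files_bytes = {
    "one letter in a text file": 1,
    "3 MB MP3 song": 3 * 10 ** 6,
    "5 MB JPEG photo": 5 * 10 ** 6,
    "4.7 GB DVD": 4_700_000_000,
}

example_files_bits = {
    name: size * BITS_PER_BYTE for name, size in example_files_bytes.items()
}

for name, bits in example_files_bits.items():
    print(f"{name}: {bits:,} ones and zeros")
```

With the 4.7 GB figure the DVD works out to 37.6 billion bits; rounding the capacity up to 5 GB gives the 40 billion mentioned above.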

Speaker B:

Wow.

Speaker A:

Yeah, it's really quite a bit, no pun intended. So let's think about something that is related to this, which is how we transfer bytes over the Internet. Now, I really wanted to bring this up because oftentimes I've noticed people are confused by this. When you're shopping around for an Internet connection, typically it's measured in megabits. And people don't always realize that. They just assume that it's megabytes. But oftentimes when you're going and you're thinking, oh, should I get the super fast Internet connection? Should I get the kind of fast Internet connection? It's actually being measured in megabits. So, for example, I think our Internet connection here at our house is 250 megabits per second. So in megabytes, what do I have to do? Divide by eight. Right, divide by eight. So really, it's approximately 30 something megabytes per second that, in an ideal case, we would be getting as a download speed. Right. And there's differences between download and upload speed. But actually that's quite different than what you might have thought, because if you had a big file that you were downloading, right, and you thought you were downloading at 250 megabytes a second, well, that would be great. If I had to download a gigabyte file, then I'd do it in 4 seconds.

Speaker B:

Right.

Speaker A:

But really, it's usually something much, much smaller than that in terms of megabytes.
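
The megabit versus megabyte mix-up is easy to see in numbers. This Python sketch assumes an ideal link with no overhead and decimal units; both function names are made up for illustration.

```python
BITS_PER_BYTE = 8


def megabits_to_megabytes(megabits):
    """Convert a quantity in megabits to megabytes."""
    return megabits / BITS_PER_BYTE


def download_seconds(file_megabytes, link_megabits_per_second):
    """Ideal transfer time, ignoring protocol overhead and congestion."""
    return file_megabytes * BITS_PER_BYTE / link_megabits_per_second


link_speed_megabits = 250
gigabyte_in_megabytes = 1000

real_megabytes_per_second = megabits_to_megabytes(link_speed_megabits)
real_seconds = download_seconds(gigabyte_in_megabytes, link_speed_megabits)

# What you would expect if you misread the plan as 250 megabytes per second.
mistaken_seconds = gigabyte_in_megabytes / link_speed_megabits

print(real_megabytes_per_second, real_seconds, mistaken_seconds)
```

So the one gigabyte download takes roughly 32 seconds in the ideal case, eight times longer than the 4 seconds the mistaken reading suggests.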

Speaker B:

Is that just because of the work or energy it takes to send information? Why do they do it that way?

Speaker A:

Why do they use those units? Yeah, that's traditional in networking technology. So it used to be, when you'd get an Ethernet hookup to your computer (and most people are on wireless today, of course), there was 10BASE-T Ethernet, which was ten megabits per second. Then there was 100BASE-T Ethernet, which is 100 megabits. It kind of makes sense because on, let's say, fiber optic cable, right, that actually has data traveling at the speed of light. But each one of those little photons going through is one bit. So it kind of makes sense that we're measuring how many bits we can get in throughput in a certain amount of time, rather than thinking about it in larger units like bytes, which are just clusters of bits.

Speaker B:

Even Internet speed comes back to some kind of, like, almost physical component.

Speaker A:

Yeah, we're always talking about physical components, ultimately, when we want to get down into the nitty gritty. And what I like to talk about also, which really helps you visualize bits, is a DVD surface, or a CD-ROM, or an audio CD surface. So actually, if you put a microscope over your DVD, what you would see if you looked really close is little grooves, little indentations. And what happens is, when your DVD is in the DVD player, a laser goes over the surface of the DVD, and it's looking to either get a reflection back of the light it's shooting at the DVD, or no reflection back, depending on whether the light is hitting a groove or not hitting a groove. And so areas where it hits the groove, it might represent a zero. And where there's no groove, it might represent a one. Or I might have that in reverse. But the point is that, yes, at a fundamental level, somewhere, we actually need to physically be storing these ones and zeros. And when I say physically, I mean it might actually be that there's a very little bit of sand, or it might actually be that there's an electrical signal. So I also might mean electrons when I say physically. But at some level, yes, we need to actually have something that's actually there that represents the one or the zero.

Speaker B:

So a bit or a byte really is like the physical foundation of software.

Speaker A:

Yeah, it's a physical foundation of software, and it's also the physical foundation of all computing and hardware. And if we go back to, let's say, the 1980s and the 1990s, people really kind of fetishized, like, what bit computer is this? You'd hear that a lot. You'd hear, oh, that's an eight bit computer, or that's a 16 bit computer. In fact, it was really popular in the game console wars. So, for example, the original Nintendo Entertainment System and the Atari before it were eight bit systems. Why were they called eight bit systems? Well, the microprocessor that was in the original Atari 2600 and the Nintendo Entertainment System was the MOS Technology 6502. And the 6502 would typically manipulate numbers that were eight bits large, so one byte large. And so they were really kind of limited compared to the microprocessors we have today in terms of how large the numbers were that they were comfortable manipulating. Now, that doesn't mean that a Nintendo Entertainment System had no way of processing numbers larger than 255. Of course, you could take more than one byte, put it together to get larger numbers. But natively within the microprocessor itself, what was it really fast at and really capable at, and what was its instruction set really good at? Well, processing numbers that were eight bits large. So we actually evolved quite quickly in the 1980s and 1990s. I'll just do it with video game consoles because that's something that a lot of people are familiar with. But if you went just a little bit further into the future, you had the Super Nintendo and then Sega Genesis. Those were 16 bit consoles. Suddenly, with 16 bits, you can represent 65,000 or so different numbers.
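
As a sketch of how a machine that only handles one byte at a time can still add larger numbers, the Python below adds two 16-bit values byte by byte and passes a carry along. It illustrates the general idea only and is not code for any real processor; `add_8bit` and `add_16bit` are hypothetical names.

```python
BYTE_MASK = 0xFF  # keeps only the lowest eight bits


def add_8bit(a, b):
    """Add two bytes; return the eight-bit result and a carry of 0 or 1."""
    total = a + b
    return total & BYTE_MASK, total >> 8


def add_16bit(a_high, a_low, b_high, b_low):
    """Add two 16-bit numbers that are each stored as a (high, low) byte pair."""
    low, carry = add_8bit(a_low, b_low)
    # The carry out of the low bytes is fed into the high bytes.
    high = (a_high + b_high + carry) & BYTE_MASK  # overflow past 16 bits is dropped
    return high, low


# 200 + 100 does not fit in a byte: the result wraps to 44 with a carry of 1.
wrapped, carry_out = add_8bit(200, 100)

# 300 is stored as (1, 44) and 500 as (1, 244); their sum is 800.
sum_high, sum_low = add_16bit(1, 44, 1, 244)
sum_value = sum_high * 256 + sum_low

print(wrapped, carry_out, sum_high, sum_low, sum_value)
```

Each extra byte costs extra steps, which is one way to picture why an eight bit processor is fastest on eight bit numbers.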

Speaker B:

Right?

Speaker A:

Then we go a little bit further into the future. You got the Sony PlayStation, that was a 32 bit console. Each one of those 32 bit numbers could represent up to 4.2 billion different values. Then you have the Nintendo 64. And suddenly we're talking in such a large number of values because, remember, these are exponential increases because they're powers of two. So with a 64 bit number, the number is so large, you would never have that many different values that you actually have to deal with. Today, the microprocessors in our phones or in our laptops are 64 bit microprocessors. And so each little chunk that they process and do instructions on can represent such a large number of values that we're never really going to run out. But even when we were just at 32 bit computing, we were reaching real limits. For example, I told you earlier that 32 bits can represent around 4 billion different values. So actually, certain 32 bit computers were limited to 4 GB of memory. Now it's all kind of coming together, right? Because a gigabyte we said was 1000 bytes, right? Excuse me, a billion bytes, right? Sorry. And so 4 GB is 4 billion bytes, right? So when I think about it, that really can't even hold an entire DVD in memory live. Not that you necessarily want to put your whole DVD into RAM, but maybe you do. So this stuff really does matter. And there really were limitations. But now that we're at 64 bits, in terms of the amount that we can store, the amount that we can represent, or the amount of memory that we can address, there really are no practical limitations anymore.
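
The jump from 8 to 64 bits, and the 4 GB ceiling, can be checked with a few lines of Python. The sketch assumes one memory address per byte and that all 32 bits are usable as an address, which is a simplification; `distinct_values` is a made-up helper name.

```python
def distinct_values(bits):
    """How many different values a number of the given width can hold."""
    return 2 ** bits


for width in (8, 16, 32, 64):
    print(f"{width:>2} bits: {distinct_values(width):,} values")

# With one address per byte, 32-bit addresses reach this many bytes.
addressable_bytes_32 = distinct_values(32)
addressable_gigabytes_32 = addressable_bytes_32 / 10 ** 9  # decimal gigabytes

# A 4.7 GB DVD is larger than everything a 32-bit address can reach.
dvd_bytes = 4_700_000_000
dvd_fits_in_32_bit_memory = dvd_bytes <= addressable_bytes_32

print(addressable_gigabytes_32, dvd_fits_in_32_bit_memory)
```

The exact 32-bit count is 4,294,967,296, roughly the "around 4 billion" quoted above, and the 64-bit count is about 18 billion billion.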

Speaker B:

And this all evolved pretty, like you said before, pretty quickly to get to this point, to be able to get to this large of a software system.

Speaker A:

Yeah, absolutely. And it's not just at the microprocessor level. So if I think about it at the microprocessor level, we went from eight bit to 16 bit, 32 bit, 64 bit, and then we're not even talking about the speed of the microprocessor, the different instructions it can calculate. But if I think about actually just memory capacities, there's been incredible increases in that, too. So if you think about the original Macintosh, it came out in 1984. It had 128 kilobytes of RAM. So think about that. It's not really that much storage. You couldn't even put an MP3 file in that memory, right? And it would run on 400 kilobyte floppy disks. They later on got up to 800 kilobytes. But yeah, the original Macintosh had 400 kilobyte floppy disks and 128 kilobytes of RAM, and it had a 32 bit or 16 bit microprocessor, depending on how you look at it. So then here we are, like in the 1990s, and it would be normal by, let's say, the mid 1990s to have tens of megabytes of RAM and to have hundreds of megabytes of storage space. And then we look at today, about 20, 25 years later, and here we are with computers with terabytes of storage and many, many gigabytes of RAM. It's really been an incredible increase and it's enabled all kinds of new applications. But you can now see how, just because of fundamental limitations, we couldn't really do movies on computers in the 80s, we couldn't really even do MP3 files in the 80s. It wasn't invented yet, for one thing. But we also just didn't have enough memory.

Speaker B:

And we'll talk about what memory is in maybe a future episode, too.

Speaker A:

Yeah, we'll get into more depth about it. But I guess the takeaway for people today is that a byte is made up of eight bits, and each of those bits is a one or a zero. And the more ones and zeros that we have, the more different things that we can represent.

Speaker B:

It's a really important thing to understand about what our devices are able to do, and I think at least it makes me really appreciate the work they're doing more.

Speaker A:

Yeah, I mean, no industry has advanced as fast as the computing industry has over the past 50 years. If you read the history, it's absolutely incredible. And really, it goes back further than that. It goes all the way back to World War II. And if you just look at even ten years ago, the advances are amazing. Look forward to seeing everybody next week for another episode, but don't forget also to subscribe to us and to like us on your podcast player of choice, and we'll see you next time.

Speaker B:

Thanks for listening.

What is a Byte? In this episode we go down to the fundamentals and explain how data is represented in a computer. We discuss what a bit is, both at the hardware level and the software level. Then we discuss other terms like kilobyte, megabyte, gigabyte, and terabyte. We give various examples of real world files and their storage needs. Finally, we talk about the evolution of microprocessors from 8-bit to 64-bit.

David Kopec on Twitter

Find out more at http://kopec.live