#12 Open Source Software
It's about more than seeing the code.
Transcript
It's the biggest movement in software: open source. What does the term mean, and how has it changed the industry? Welcome to Kopec Explains Software, the podcast where we make computing intelligible.
Rebecca Kopec: All right, Dave, what does open source mean?
David Kopec: Open source is a term that refers to whether or not people have access to the source code behind the software that they run. But it's more than just having access. It actually comes with a whole philosophy and a set of requirements for a piece of software to actually meet what's called the open source definition. And we'll get into some of those requirements later today. Open source is about how software is distributed. Is it distributed just in its final binary form, the machine code that only the computer can understand? Or is the software also distributed with its source code, meaning the original programming code that actually went into the production of the binary? So what the humans used to create the software?
Rebecca Kopec: So with an open source piece of software, one programmer could create it, and when they distribute what they made, the next programmer who looks at it is going to be able to see what they wrote.
David Kopec: They're going to be able to see how it was written, and they might be able to improve on it. They might be able to make it run on other platforms that it wasn't originally built for. They might be able to submit patches that fix bugs or security vulnerabilities. They might be able to collaborate with other developers on the next version. So open source really is a way of enabling collaboration and also increasing transparency. By knowing how our software runs, we can be better assured that it's not doing things that we wouldn't really want it to do, which is something we'll get to talk about later today.
Rebecca Kopec: Could you give us some examples of open source software?
David Kopec: Well, open source software has become really ubiquitous. So if you think about the Firefox web browser, or you think about the Apache web server that powers most websites, or you think about infrastructure software, stuff that powers the actual networking behind the Internet, or you can think about whole operating systems like Android, the main operating system on smartphones. They all are open source pieces of software. Now, they all have different levels of open source versus proprietary code. For example, in most distributions of Android, the base layers of the operating system are fully open source. But most of the apps that you run and some of Google's services are proprietary and not open source. So it can really vary from app to app and platform to platform how much of the source code, as a percentage, is open source. But you can't really find a modern computer today that doesn't include open source components in its software stack. Open source software is completely ubiquitous, and it underlies most of the technologies that run the Internet on a daily basis.
Rebecca Kopec: That's really cool. So how did this movement even get started?
David Kopec: So the movement really goes back to the early days of computing. In early computing, people were much more cognizant of the cost of hardware than they were about software. In fact, all the early computing companies were hardware companies because there wasn't really a software industry yet. Once hardware became more commoditized, then there started to be companies involved in the production of software. One of the first companies in the production of personal computing software, for example, was actually Microsoft. But at the time that Microsoft rose in the mid to late 1970s, during the personal computer revolution, people were swapping source code with each other, not even really thinking about any legal ramifications or concerns with doing that, because it came out of this ethos of sharing that was common in academia, where computers were first invented and first really came into prominence. So there was this culture of sharing, and open source was kind of a natural outgrowth of that culture of sharing. Then when the first commercial software companies came about, they really tried to tamp down on these open exchanges of source code. And there's a famous letter, actually, that Bill Gates wrote. You can go look it up. It's kind of fun to read: an "Open Letter to Hobbyists," where he said, how is anyone ever going to make money on software if people don't pay for it? If you just copy it and give it to each other, how are all the people who actually want to invest the time in developing it going to make money? It turns out that there actually are a lot of ways to make money on open source software, and we'll talk more about them later today. But the movement had to actually get some kind of philosophical grounding. At first it was just people exchanging source code because they wanted to get more software. It wasn't because they had some grand philosophy behind wanting to keep the software open source.
What happened in the 1980s was the rise of the free software movement, and this was founded by a man named Richard Stallman. And he felt that if computers were going to become so pivotal to society, such a cornerstone of our everyday lives, it would be important that we have some kind of freedom with regards to how they operate. And he specifically delineated four important freedoms with regards to software. Freedom one (he actually calls it freedom zero) is the freedom to run the software however you want. Freedom two, or freedom one in his case (we start counting at zero a lot in computing), is the freedom to see and inspect the source code behind the software. Freedom three is the freedom to modify that source code. And freedom four is the freedom to distribute your modifications of that source code. In other words, you can basically do whatever you want with the software. If software meets those four freedoms, then it meets the criteria for being what's considered free software. So he started this movement in the mid 1980s, founded the Free Software Foundation, and founded what's called the GNU Project, G-N-U. It stands for "GNU's Not Unix." It's a recursive acronym. And their goal was to produce a Unix-like operating system (Unix being the gold standard of operating systems at the time, and it still is, really) that was compatible with Unix but was built completely out of free components. And they worked on that quite extensively up to today. They're still working on it. They built basically every component of a free operating system except for one. They were missing the kernel, which, if you listened to our second episode, "What Is an Operating System?", you know is the lowest level of the operating system, the most central layer, the layer that's closest to the hardware. They didn't have that component. They were working on one for a long time and it wasn't really coming to fruition.
But then a young man in Finland, Linus Torvalds, managed to actually create one on his own that worked together with all the other GNU components. And then when you combined all the GNU components with that Linux kernel, you got a completely free operating system that really evolved quite quickly during the 1990s. So Linus came out with the first version of the kernel in 1991, and by the late 90s it became one of the major server operating systems. And today it's not just one of the major server operating systems. The Linux kernel is also in all of our Android smartphones, all of our Chromebooks, many of our Internet of Things devices. The Linux kernel is really quite ubiquitous. As consumers, we often think about the major operating systems as being Windows and macOS and maybe to a lesser extent iOS. But when you think about all the devices in the world that have computing chips in them, the vast majority of them run a Linux kernel today. So this movement's really been pretty incredible. And when you think about the person who founded the movement, he was a great programmer, Richard Stallman, but he was also really a philosopher. And then Linus creating the Linux kernel, he actually started that in his bedroom while he was still in college. So it's a pretty incredible story. And now all of this software powers the largest corporations and all of our daily lives. So the open source movement really came out of nothing. But there was an important split in the late 1990s. So the free software movement (although open source predates the free software movement) was the main spearhead of development starting from when Richard Stallman founded the Free Software Foundation up to the late 1990s. Then there was a split.
What happened is the free software movement has always been very philosophical, very ideological, not super compromising, which is to their credit in many ways. But a lot of people were saying, you know what, we need to get more commercial. Like, it's great that we're building all this software, but we actually want people to use it. And so that's where a bunch of people split what's called the open source movement out of the free software movement. Now, they had their own definition of what is open source software, and it's very similar to those four freedoms that we went over earlier. It's so similar, in fact, that 99% of software that meets the open source definition also meets the free software definition. So we're really talking about the same software, whether we're talking about the free software movement or the open source movement. What really is different is their philosophies. The open source movement is much more about let's create great software and let's get as many people to use it as possible and let's create the best software. The free software movement is more purely about freedom. It's this philosophical idea that people should have control over the machines that they use every day because they're such a central component of our lives, and that we should not be compromising about the fact that all software should be free. And so the open source people often find the free software people too ideological, and the free software people often find the open source people a little too compromising. Open source now, as a movement, has been adopted by what used to be some of the biggest proprietary software companies. Companies like Microsoft and Oracle are extensive contributors to and producers of open source software. Now they still produce proprietary software as well. But this was actually unthinkable even 20 years ago.
It seemed like an us-versus-them world where it was the proprietary software companies against these collaborating upstarts in the open source movement. But quickly, even in the late 1990s and early 2000s, people saw this was such a better way of producing software in many ways that large corporations got on board. By the early 2000s, companies like IBM and HP had already signed up to do Linux and other kinds of open source development. So the open source movement really sprang originally out of academia into a philosophical movement founded by Richard Stallman, then into a more moderate movement in the late 1990s that actually got commercial support, and then into being completely ubiquitous today. But I think where we should go next is why. Why is this a better way of doing software? Why is it so important that we have these four freedoms? Why has it led to such wide adoption?
Rebecca Kopec: So Richard Stallman and Linus, were these kind of the leaders? Were there any other folks that you want to just make sure that our listeners know of?
David Kopec: Well, one other person I'll point out is Eric Raymond. Eric Raymond wrote a book called The Cathedral and the Bazaar, and it kind of laid out the case for open source software, which I'll get into in a minute. And it became the defining book, let's say, of the movement. And that came out also in the late 90s. So a lot of things were going on in the late 90s that kind of solidified the modern open source software movement, even though the free software movement started in the 1980s. But if you had to pick just three people who were most instrumental in the founding of the open source software movement, you'd have to go with Richard Stallman, for founding the free software movement, which later split; Linus Torvalds, for founding the most important project in open source, Linux; and Eric Raymond, for actually defining the movement with his book The Cathedral and the Bazaar.
Rebecca Kopec: Kind of shifting gears to that philosophy behind it. What I've heard you say before, or what is a saying in the open source or free software world, is that it's free as in freedom, not as in beer?
David Kopec: Yeah. So again, the free software movement is so concerned with the philosophy of it all and the ideology of it all. Their number one thing is that your software has those four freedoms. Again, the right to run the software however you want, the right to see the source code, the right to modify the source code, and the right to distribute your modifications. They think it's critical that all software has those freedoms, even if the software costs money. So you can actually be distributing the software and still be charging for it. And that can still be considered free software, which confuses a lot of people, and which is another reason that some people prefer the open source term rather than the free software term. Some people now use the libre term, libre as in liberty instead of free as in price, so it doesn't confuse people. But what they mean in the free software movement is that you have those four freedoms, not that the software doesn't cost any money. But why are those freedoms so important to them? Okay, number one is that people already saw in the 1980s that these computers were coming to manage our lives instead of us managing them. It's a computing system that tells you your schedule. It's a computing system that runs your car, that runs your public transportation, that runs the media that you see, that runs your household, that runs now as your AI assistant telling you what you need to do next and helping you get through the day. So it's really important that we understand how these systems work and that we have control over them. That is really at the core of the free software philosophy, because we don't know what large corporations might be doing in our software if we don't have access to the source code. We don't know if they're putting in backdoors that allow them to spy on us. We don't know if they have security vulnerabilities that they were too lazy to fix.
We don't know if they're spying on us because they want to collect our data and use our data against us or use our data to sell information about us to other companies. We don't know what they're doing if we don't have access to the source code and the freedom to use that source code. So they really saw it as a philosophical movement, and that's why they think it's a better way of producing software. But people in the open source movement tend to be more pragmatic than the free software people, and they see it much more as just a better way to produce software. Why is it a better way to produce software? There's something called Linus's Law. It goes like this: "With enough eyeballs, all bugs are shallow." Very simple idea. If more people can see the source code, then more people can catch bugs in the source code. And for Linus's Law to really work, the software actually needs to be looked at by a lot of people. So it has to be popular enough that enough people want to look at it. Like, I have some open source projects that are used by a lot of people, and I have some that I put out there and almost nobody but me ever looks at. So are there really a lot of people that are going to go catch bugs in those projects that nobody looks at? No. But something like the Linux kernel, that is open for anybody to look at and is used so widely and that so many people are looking at all the time? Yeah, maybe this actually leads to more bugs being caught and a better engineering process as a result. So the big benefits of open source: freedom, transparency, and better engineering in many cases. I left out maybe the most important one for software developers, which is collaboration instead of always starting from scratch. Because there's so many open source libraries and components today, when we start a new software project, we have a huge array of resources to pick from to get us started.
So it's like we're starting from the fifth floor of building our software building, rather than starting from the 0th floor, because we have all these open source components that we can use, modify and put into our own software.
Rebecca Kopec: So you mentioned that having software be free software or open source software, having access to the source code, would allow a user or a programmer to see what security vulnerabilities there are, or if someone built a backdoor. But if I can see the source code and I understood it all, wouldn't that lead to being more vulnerable? If someone was a bad actor, couldn't they get to your stuff, get to your data?
David Kopec: Yeah, that's always a concern for people when they first learn about open source. And I used to teach a class called Open Source Software Development, and we would have a debate about this in the class, like, okay, is it actually more secure or less secure? Well, yeah, if nobody's using it, like my example earlier, and Linus's Law doesn't really apply, and there's not enough eyeballs looking that will actually fix the security problems, then sure, yeah, being open source might actually open yourself up to more problems. But if you're just relying on the fact that your source code is secret for it not to have bugs and security vulnerabilities, then what you're really espousing is what's called security by obscurity, which is the idea that just because people don't know about the bugs, the software is basically invulnerable, which is, of course, false. Right? If I have a lock on my door and it only takes two digits, and I don't tell anyone those two digits, is it a secure door? Well, no, because with two digits, I could guess them in at most 100 tries, right? I could try all 100 combinations, 00 through 99, and get into your house. So just because it's obscure, because you didn't tell me what the two digits are, doesn't mean that that's actually a good lock. Right? A good system should be good when it's open to inspection. When I actually go in and dig into it and look at how it works, it should be so secure and so well built that there's nothing I'm going to find that actually is a vulnerability. So security by obscurity gives you a false sense of protection. And so what we really want is the opposite of that. We want things that are so transparent and so open that we can be pretty sure that there's no vulnerabilities, because we know that so many people have looked at it and are able to check for themselves. Another thing that it gives you is third-party verifiability. So let me give you an example of that.
Let's say that I'm not actually writing some software, but I need some software that's critical for my company. I ask another company to write it. If they write it as proprietary software and then they just give it to me, how can I verify, since I'm not a software developer myself, that they actually did a good job? If they write it as open source software, I can get another person, a third party, to go verify that they actually did a good job creating that software. So open source gives us third-party verifiability, which is another thing that can make us feel more secure and more safe in our decision. So for all these reasons, open source software, it can be argued, is actually more secure than proprietary software when it's popular enough. Again, if we're just putting it out there and nobody's really looking at it, then yeah, it's pretty likely somebody might, if they wanted to, be able to find some security vulnerabilities in it. Because then if I'm one person releasing the source code versus one person not releasing the source code, and no one else is helping me develop it, well, obviously if someone wanted to attack it, having access to the source code would make it easier for them to attack it. That's a totally separate issue from all the philosophical stuff we talked about earlier. But yeah, certainly if software is not popular, then making it open source might not actually help us.
Rebecca Kopec: With open source software, someone is creating it, right? It is like an act of creation. Are there some legal rules around it? How does it work legally to have something be just available to everybody?
David Kopec: Like most software, open source software is mostly protected by copyright law, meaning that the original source code that you write is an original work in the same way that if you wrote a novel, it would be an original work that you then own the rights to. In the United States and in most countries in the world, you get automatic copyright the moment that you write something. You don't even need to submit to the Library of Congress anymore. You had to do that before the 1970s, but today you can just write a piece of software, or a novel, and you instantly have the copyright to that work. So open source software is protected by copyright, and copyright is also what can be used to enforce these freedoms. So open source software has to come with some kind of license saying how it's allowed to be used by other people. There are different kinds of licenses. There's two main kinds of licenses. There are what are called laissez-faire licenses, which basically say this software can be used by basically anybody who wants to use it for any purpose. And then there's also what are called copyleft licenses, most famously the GPL, the GNU Public License, which comes again from Richard Stallman's movement, which he had a big hand in creating, and which is a viral license. Copyleft licenses are viral because they say, yes, you can use this software for any of those four freedoms that we talked about earlier, but if you modify it, then your modifications must also be covered under the same license. So people who look at it negatively say, oh, it's almost like it infects the software. You go and modify this open source software and now your software is open source too, under the same license.
Rebecca Kopec: Well, it's a clever way to...
David Kopec: Very clever. As Richard Stallman would say, it was a great hack. He uses the word hack in a positive sense, but yeah, it was really clever. And it is actually the license used for a lot of major pieces of open source software, including the Linux kernel itself, which is released under the GPL. So these are two different philosophies. The GPL is much more aligned with the ideological wing, the free software movement, whereas these laissez-faire licenses are more aligned with the modern open source movement, which includes a lot of corporations. Corporations, of course, don't want to buy into this viral license because they want to still write some proprietary software. And so they like using the laissez-faire licenses, which just say, here it is, open source, and you can do what you want with it, but if you use it and modify it, you don't actually have to release anything. You can go and make commercial software out of this open source software that's already out there without having to release the source code. So a lot of corporations don't like the GPL; they don't like viral licenses and copyleft licenses. On the other hand, a lot of people who are more on the philosophical side prefer the copyleft licenses. So they both exist. I would say that the laissez-faire licenses, as corporations have started to more and more embrace open source software, have become the dominant force over the last decade or so. So most new open source software that you see is under some kind of laissez-faire license rather than a copyleft license. And I would say 10, 20 years ago it might have been more the opposite, where the copyleft licenses were still very popular. That's not to say there's not still a ton of software coming out under the copyleft licenses, just that the laissez-faire licenses are even more popular.
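The difference between the two license families described above can be sketched as a toy rule in Python. This is only an illustration of the idea as stated in the conversation, not legal guidance: the names `LicenseKind` and `may_release_modified_version` are hypothetical, and real licenses have many conditions (for example, around when distribution happens) that this model ignores.

```python
from enum import Enum


class LicenseKind(Enum):
    PERMISSIVE = "permissive"    # the "laissez-faire" style described above
    COPYLEFT = "copyleft"        # the "viral" style, such as the GPL
    PROPRIETARY = "proprietary"  # closed source


def may_release_modified_version(original: LicenseKind, chosen: LicenseKind) -> bool:
    """Toy rule: may a modified version of `original`-licensed code be
    released under a license of kind `chosen`?"""
    if original is LicenseKind.COPYLEFT:
        # Copyleft: modifications must stay under the same kind of license.
        return chosen is LicenseKind.COPYLEFT
    if original is LicenseKind.PERMISSIVE:
        # Permissive: the modifier may choose any license, even a closed one.
        return True
    # In this toy model, proprietary code grants no right to release modifications.
    return False


# A company building a closed product on top of each kind of code:
print(may_release_modified_version(LicenseKind.PERMISSIVE, LicenseKind.PROPRIETARY))  # True
print(may_release_modified_version(LicenseKind.COPYLEFT, LicenseKind.PROPRIETARY))    # False
```

The second call is the "viral" property in miniature: once code is copyleft, this rule keeps every modified version copyleft as well.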
Rebecca Kopec: Why is it called copyleft?
David Kopec: Copyleft, because it's like a hack on copyright. That's, again, I think a term that Richard Stallman came up with, copyleft. So yeah, he's a really smart guy. I should also add that he also created himself some of the original, most important open source software projects, including the free software version of Emacs, which is a major text editor that he was the main proponent of and the creator of a lot of the important modifications to. And he also was the person who started the GNU Compiler Project, GCC, which is one of the main C and C++ compilers in the world. So he himself was an amazing programmer, a really talented hacker in the 1980s and 1990s, but then he transitioned to more of an activist kind of role. And I guess since we're talking so much about him, I should mention that he's had a lot of very serious controversies for some things that he said over the years that certainly I don't agree with. And so there's been some significant controversy around him. And I don't want to derail this episode and go into them, but if you're interested, you can look more into them. But he did an amazing job on the free software movement.
Rebecca Kopec: Open source and free software is really, in a lot of ways, like an embodiment of this culture of programming. And it does allow people, allow programmers, to collaborate. But are there specific tools that they're using to collaborate and create software together?
David Kopec: Well, one of the most basic things is that we couldn't really have open source software the way that we think about it today without the Internet. It was cool that we had people exchanging source code early on, on what they call a kind of sneakernet, because people would have to walk over to each other and give each other a floppy disk or a cassette tape to exchange programs back in the 70s. But then with the commercial Internet and getting Internet access to everybody, we're able to have a large distributed movement that everyone can work on independently and then come together online to actually make the final product. So the Internet has really enabled the modern open source movement. If we want to talk about specific tools, the most important tools of open source software development are what are called version control systems. And these are systems that allow people to keep track of what changes to the source code different people have made and then merge those changes together in a safe way that maintains the entire history. And there's older versions of this that were client-server models, like CVS and SVN. Modern version control systems tend to be distributed, meaning that every person who works on the project has an entire copy of the tree of changes and then they can merge those changes back together. The most popular modern version control system used for open source development is Git, which was also developed by Linus Torvalds. So he wasn't a one-hit wonder. He developed Linux and then he also developed the dominant version control system, Git. And Git has a very, very popular website called GitHub, now owned by Microsoft, just so that we see how much Microsoft has embraced open source. And GitHub has become the main place where people collaborate on open source software.
There are certainly many alternatives and many other sites, but GitHub is really where I would say the majority of action is happening today in the open source world. So we have a real online community in GitHub. It's actually almost like a social network in some ways. You have a profile, you kind of follow and unfollow other people and you see what changes they're making to various pieces of software and everyone can work together on it.
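The idea of merging different people's changes safely, mentioned above, can be sketched with a toy three-way merge in Python. This is a minimal illustration under strong assumptions, not how Git implements merging: the function name `three_way_merge` is hypothetical, and all three versions are assumed to have the same number of lines, whereas real tools also handle inserted and deleted lines.

```python
def three_way_merge(base, ours, theirs):
    """Merge two edited copies of `base`, line by line.

    Returns (merged_lines, conflict_indices), where conflict_indices lists
    the 0-based positions that both sides changed in different ways.
    """
    if not (len(base) == len(ours) == len(theirs)):
        raise ValueError("this toy merge needs versions of equal length")
    merged, conflicts = [], []
    for i, (b, o, t) in enumerate(zip(base, ours, theirs)):
        if o == t:        # both sides agree (same change, or no change)
            merged.append(o)
        elif o == b:      # only "theirs" changed this line
            merged.append(t)
        elif t == b:      # only "ours" changed this line
            merged.append(o)
        else:             # both changed it differently: keep ours, flag it
            merged.append(o)
            conflicts.append(i)
    return merged, conflicts


base = ["title", "intro", "body"]
ours = ["Title", "intro", "body"]    # one developer capitalized the title
theirs = ["title", "intro", "BODY"]  # another developer edited the body
print(three_way_merge(base, ours, theirs))  # (['Title', 'intro', 'BODY'], [])
```

Comparing both edited copies against their common ancestor is what lets the merge tell "only one person changed this line" apart from a genuine conflict that a human has to resolve.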
Rebecca Kopec: You've touched on this throughout the episode, but I think it's important to really hit this point home. But why is the open source software movement important?
David Kopec: Yeah, I want to reiterate the main reasons. Number one is that it enables collaboration between software developers. It allows software developers to build off of each other's work and also work together on new projects. Number two is the transparency that it provides. So it allows you to actually see how the computers in your life work. Think about, like, a voting machine, right? I would not want to use a proprietary voting machine, by the way. Unfortunately, a lot of them are proprietary. But there really should be a law, in my opinion, requiring voting machines to be open source so that we can see that they're actually working in a way that accurately counts the votes, right, instead of just trusting that some company's software actually does it well. We want to have third-party verifiability through open source to make sure that it really does work the way that they say it works. So transparency is a huge one. And then thirdly, maybe, is that for security and bugs and just software engineering in general, this is actually a better way to produce software. We talked earlier about how there might be some limitations if you don't actually have enough people working on an open source project. But certainly for large-scale software projects, you do get a lot of benefits from having so many different people looking at the source code and contributing. So for all of these reasons, open source has really become the dominant movement in software. Sure, there's still a ton of proprietary software companies, but open source is usually used in all these proprietary software companies today. And huge pieces of the software stack that we all use every day, like our web browsers, are written using open source components. So it's not surprising that it's become so popular, given all these benefits.
Rebecca Kopec: Is there anything else that we should know about open source software?
David Kopec: Yeah, a big misconception is that people don't make money working on open source software. Actually, a lot of big companies work on open source software and employ people to work on that open source software. For example, companies that use Linux on their computers, like IBM, HP, and Dell, employ people who work on the Linux kernel. They employ people who work on components of the Linux operating system that make the system work better on their machines. There are also open source software companies, companies that actually sell an open source product. And they have many different business models. One can be that they sell support for the product. Another can be that they sell a hosted version of the product. Another can be that they sell a proprietary version of the product: their main version that most people use is open source, but then if you want to upgrade to the better version, there's some proprietary bits built on top of the open source version. And this is not just a couple of companies. There are many, many major companies that are using these models, including companies that you might have heard of, like IBM, like JetBrains, like even Google to an extent. Of course, they build a lot of open source software at Google, and it powers some of their products that they then sell more premium versions of, or that they just build on top of. So basically every major software company today employs people who work on open source software. So yes, of course you can make a lot of money working in open source. That doesn't mean that somebody who submits a project on their own and just puts it out there is really getting any benefit other than feeling good about the fact that they've given other people this software and the freedom to do what they want with this software. So there's certainly people that are just working on open source software and not getting any money out of it.
But there's also a ton of people who are. Open source now has actually been something that's gone into other areas of our life. So it came out of software originally, but now we see open source in areas like bioinformatics, where people openly and transparently and freely license their discoveries. Maybe they discover a new gene, and instead of going and patenting it, they go and actually distribute all the information about it under a license that other people can use. You see this in Wikipedia. Wikipedia is basically an open source encyclopedia where all the content on Wikipedia is openly licensed. So you can go actually reuse it how you want to on your own site under certain terms, but at the same time, you can go and edit and modify the original version of Wikipedia and distribute those modifications as well. So we see open source creeping its way into other facets of society. As people have seen how well it's worked for software, there's many attempts to use it in even further applications, like media, like hardware, and we're seeing those start to come to fruition over the last few years, and it's exciting to see how they'll evolve. And will they also kind of unsettle and displace older proprietary standards and proprietary companies? We'll see. But certainly this whole philosophy of being transparent and building on each other's work wasn't invented by the open source software movement, but the movement is highlighting it as a very effective way to collaborate across the Internet in the modern world.
Rebecca Kopec: Not just an effective way to collaborate, but just an effective way to create something and create powerful products?
David Kopec: Absolutely. I mean, without its open source philosophy, I don't think something like Wikipedia could really exist. All right, well, it's been great having you with us this week. Don't forget to subscribe to us on your podcast player of choice, and also leave us a like. What's our Twitter handle?
Rebecca Kopec: @KopecExplains, K-O-P-E-C-E-X-P-L-A-I-N-S.
David Kopec: Great. And we'll see you next week.
Rebecca Kopec: Thanks for listening.
The open source movement has completely changed the software industry. In this episode we explain what it means for software to be open source. We dive into the origins of the movement, its split from the free software movement, and some of its key players. We explain the four freedoms, the legal model behind open source licenses, and some of the ethics. Most importantly, we explain the benefits of open source software, and why it has become so ubiquitous. At the end we dive into other areas of the world where the open source model is being introduced.
Follow us on Twitter @KopecExplains.
Theme “Place on Fire” Copyright 2019 Creo, CC BY 4.0
Find out more at http://kopec.live