Kopec Explains Software
Computing Concepts Simplified
1 year ago

#122 Open Source Licenses

Apache, MIT, GPL, what do they all mean?

Transcript
David Kopec

Understanding open source licenses is critical for any software developer today. What are you allowed to do with a piece of open source software? What are your rights? In this episode, we'll help you understand the different kinds of open source licenses.
Welcome to Kopec Explains Software, the podcast where we make computing intelligible. Today we're talking about open source licenses. But before we get into open source licenses, I think it's important that you know something about open source software more generally. We've done several prior episodes on open source software, including a broad introduction to open source software, so I'm going to put a link to those prior episodes in the show notes. The other thing I want to mention at the outset is that we're going to talk about a lot of legal terms today. I'm not a lawyer, so please take my advice with a grain of salt. And if you have anything serious you need to do with regard to software licensing, you should probably consult a lawyer.
But before we get into specific licenses, we need to understand the foundation of software licenses, and that's copyright. Copyright is a type of intellectual property protection, and the laws around it can vary from country to country. What I'm talking about today is somewhat US-centric, although a lot of the laws around copyright are similar in the US and other countries. But of course, consult your own country's laws. The purpose of copyright is to give the creator of a creative work exclusive rights to that work so that they can exploit it and make a profit. The idea is that this will encourage people to make more creative works. That's why governments originally instituted copyright laws. Now, folks disagree on how copyright should be enforced and how long copyright should last, and some people don't even believe copyright should exist at all. But the fact of the matter is that it does exist, and it exists for all written works.
And software is a type of written work: you write the source code for the software, and that's a written work. That's how courts have decided it, at least in the United States. Now, in the United States, when you write a written work, you immediately own the copyright. So if I go and write a novel, I immediately have the copyright to it. I can also officially register it with the US government to make it easier to enforce in a court of law. But as soon as you write it, the convention in the United States is that you immediately have the copyright, and that's true of source code too. So if you go and write a program right now, say a Python program in your favorite editor, you immediately own the copyright to that Python program, and that gives you the exclusive rights to that program. You can do whatever you want with that program. You decide who can use that program, who cannot use that program, and how that program is distributed. Now, this is all true, of course, unless you're under some specific agreement, like an employment contract or a freelancing contract, that says you're assigning the copyright to somebody else. But if you're writing something for yourself, you immediately own the copyright to it.
Okay, so you have that copyright, and that copyright gives you the exclusive rights to decide how that source code is going to be used. But oftentimes you actually want to allow other people to use that software or use that source code. And that is the purpose of a software license. A software license is a document created by a software developer that grants permissions to users with regard to a compiled program or with regard to some source code. Licenses can range quite a bit. At one extreme, a license might say, hey, all you can do is use this software for a very specific purpose.
That's being very restrictive. Or they can be extremely permissive, saying, hey, just basically do whatever you want with this, I really don't care. And open source licenses also have a range of how restrictive they are. There are licenses that are laissez-faire, which basically say do whatever you want, and there are very restrictive licenses that say, well, if you use the software, you have to abide by all of these terms and you need to do this very specific thing when you distribute your changes. Of course, there are also proprietary software licenses that give the users of the software almost no rights, or certainly keep the software closed source and say, we're not distributing the source code at all, and don't even think about distributing the source code. We're going to focus specifically on open source licenses, because that's the theme of the episode, but we could do a whole episode on the different restrictions and stipulations that are in proprietary software licenses.
But today we're talking about open source licenses, and there are really two broad categories of open source licenses. There are laissez-faire licenses that have very few restrictions and basically say you can do whatever you want with this source code. Often the one stipulation they'll have is that you have to give the original author of the source code some kind of credit somewhere in your software that incorporates that open source software. And then there are copyleft, or viral, licenses, which say, well, you can use this source code, but if you do, you may need to make your source code that incorporates it open source as well, under the same license. That's why these copyleft licenses are sometimes called viral: it's like they infect the software that they touch, forcing it to be under the same license.
So in the world of laissez-faire and copyleft licenses, there are some standard licenses. There are some very popular licenses that many, many projects use, and that you're very likely to use as well if you're creating a new piece of open source software, because they're well understood. They're kind of like a brand name. People know what it means, they know what restrictions come with those licenses, and they also know how the licenses may protect them. So people tend to adopt one of these popular open source licenses when they release a new piece of open source software.
The three most popular laissez-faire licenses are the MIT license, the BSD license, and the Apache license. The MIT and BSD licenses are very similar. There are different versions of the BSD license. You might have heard the terms BSD 2-Clause license and BSD 3-Clause license. But in their simplest form, the MIT and BSD licenses basically say, here's the software, I'm not responsible if anything goes wrong with it. So if you use my open source software that's released under the MIT license, I'm giving you no warranty. And if it breaks your computer or it causes you to lose data, that's not my problem. I put this out there for you for free to use the source code however you want to. It's your business if you use it, but don't blame me if anything goes wrong. It's your responsibility to audit it and make sure it does what you expect it to do. And they also say, if you use it, you need to give me credit. So if you're using it in a commercial piece of software, you don't necessarily need to release your source code and give me credit that way, by showing where you put my code inside of your code. But you do have to say, like in an about box or in some documentation that comes with the software, hey, we're using this open source library, let's say, released under the MIT license, and here is the original text of that license that it was released with.
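
One way to picture the credit requirement described above is a small script that assembles the text for an about box or a bundled notices file. This is only a sketch under assumptions: the `Dependency` class, the `build_notices` function, and the sample library names are hypothetical, and the exact wording to reproduce should always come from each library's own license file.

```python
from dataclasses import dataclass


@dataclass
class Dependency:
    """One bundled open source library (all fields are supplied by you)."""
    name: str
    copyright_line: str
    license_name: str
    license_text: str


def build_notices(dependencies: list[Dependency]) -> str:
    """Build the text for an about box or a notices file.

    Each section names the library, repeats its copyright line, and
    reproduces the full license text it was released with.
    """
    sections = []
    for dep in sorted(dependencies, key=lambda d: d.name.lower()):
        sections.append(
            f"{dep.name} ({dep.license_name})\n"
            f"{dep.copyright_line}\n\n"
            f"{dep.license_text.strip()}"
        )
    # A plain separator line keeps the individual credits easy to tell apart.
    return "\n\n----\n\n".join(sections)


if __name__ == "__main__":
    # Hypothetical libraries; real license text would be read from each
    # library's LICENSE file rather than typed in by hand.
    deps = [
        Dependency("zipper", "Copyright (c) 2020 Example Author", "MIT",
                   "Permission is hereby granted, free of charge, ..."),
        Dependency("Alpha", "Copyright (c) 2021 Another Author", "BSD-3-Clause",
                   "Redistribution and use in source and binary forms, ..."),
    ]
    print(build_notices(deps))
```

Generating this text from a single list of dependencies makes it harder to ship a library without its credit.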
The MIT license and the BSD license, which are both really popular (and again, there's more than one form of the BSD license), are really easy to read. So I recommend you go and actually check out the text of these licenses. And again, these licenses are a way of basically giving the user of your open source software as many rights as you can. They're saying, hey, do whatever you want with it, just give me a little credit somewhere and don't blame me if anything goes wrong.
Another popular license like that is the Apache license. The Apache license adds a couple more stipulations. It also tries to protect the original creator from patent lawsuits, patents being a different kind of intellectual property than copyrights. There's a patent clause here giving a little bit of protection, saying, hey, don't infringe on patents. Okay, I'm letting you use the source code, but that doesn't mean that you have the ability to infringe on a patent. And it also says something about contributors. Contributors, which we'll get back to in a couple of minutes, are folks who actually provide patches or new additions to an existing piece of open source software. The Apache license, on top of the stipulations of the MIT license, also says that if those contributions come in, they're going to be under the Apache license by default unless specified otherwise. So the MIT license, the various BSD licenses, and the Apache license are by far the most popular laissez-faire licenses that are out there.
The most popular viral licenses are the GPL and the LGPL. The GPL and the LGPL are in many ways quite similar to the Apache license, with one really important and huge twist. And that is, if you use a piece of GPL software and you incorporate it so that it interacts directly with your source code, in a way that they're kind of meshed together, then your source code also needs to be under the GPL license and therefore also open source.
Whereas with those laissez-faire licenses, you can create a piece of closed source software that incorporates those open source MIT- or Apache-licensed libraries, and it doesn't necessarily have to be open source software as well. So you can create proprietary closed source software and include MIT-licensed software within it, but you can't do that with GPL software. With GPL software, if you incorporate GPL source code directly in your proprietary product, well, your proprietary product now needs to be open source as well, also licensed under the GPL. So it kind of infects your proprietary project. So there are a lot of proprietary folks who are afraid of using GPL software, and we'll get back to that in a minute.
There are other forms of the GPL. There's something called the LGPL. By the way, GPL stands for GNU General Public License. It was created by the GNU project, which was started by Richard Stallman and is backed by the Free Software Foundation, and we've talked about those folks on prior episodes, but that's where the name comes from. So most of the software released by the Free Software Foundation is under the GPL license, but so is a lot of other popular software, including, for example, the Linux kernel. And there's another variant of the GPL, as I was mentioning before, called the LGPL, or the Lesser GPL. It says, well, yeah, you've got to release your source code, and release it under the LGPL too, if you're directly modifying this original library, but if you're just linking against it, you don't have to. So it's not quite as infectious. The LGPL is less infectious than the GPL because we can link against libraries that are LGPL and not have to then release our own code under the LGPL as well.
There are also different versions of the GPL and the LGPL. The Free Software Foundation, the GNU project, from time to time updates these licenses. They add new clauses as they discover new ways that they feel the software needs to be protected.
And so they often will put a clause in that says software released under GPL version blank can also be licensed under newer versions of this license. And just like software, these licenses have version numbers. There's GPL 1, GPL 2, GPL 3. The Apache license also has versions, so there's Apache 1, and the vast majority of software is under the current version, which is Apache 2. So software licenses actually can change over time and get updates and new versions, but it's pretty stable. The MIT license basically doesn't change. The Apache license has had a couple of versions. The GPL has had about three versions in about 30 years. So it's not like there are a million different versions of these popular licenses. But it's sometimes important to know the differences between the various versions and to understand that a lot of pieces of software say they can be relicensed under newer versions that might include new stipulations or new freedoms as well.
So we have these two broad categories. We have GPL-type licenses, viral-type licenses, and we also have laissez-faire-type licenses. And for obvious reasons, a lot of big proprietary software companies, like Apple, like Google, like Microsoft, are sometimes afraid of interacting with viral licenses like the GPL, because they don't want their proprietary software to have to then be released under the GPL as well just because it used a little GPL component in it. So they will often avoid viral licenses and prefer open source projects that are released under laissez-faire licenses, like the MIT license or the Apache license, because they want to be able to use open source libraries without being forced to contribute back or to release all their source code just because they want to use one library. But there is another alternative.
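
The two categories, and the GPL versus LGPL distinction around linking, can be sketched as a lookup that flags dependencies a closed source project would want to examine more closely. Everything here is an assumption made for illustration: the identifiers follow the SPDX short-name convention, the table covers only licenses named in this episode, and the function `needs_source_release_review` is hypothetical. It is a rough first-pass filter, and the license text itself remains the authority.

```python
from enum import Enum


class Category(Enum):
    LAISSEZ_FAIRE = "laissez-faire"      # MIT, BSD, Apache style
    LINKING_COPYLEFT = "lesser copyleft"  # LGPL style: linking is tolerated
    COPYLEFT = "copyleft"                 # GPL style: "viral"


# SPDX-style short identifiers (an assumed naming convention).
CATEGORIES = {
    "MIT": Category.LAISSEZ_FAIRE,
    "BSD-2-Clause": Category.LAISSEZ_FAIRE,
    "BSD-3-Clause": Category.LAISSEZ_FAIRE,
    "Apache-2.0": Category.LAISSEZ_FAIRE,
    "LGPL-3.0-only": Category.LINKING_COPYLEFT,
    "GPL-2.0-only": Category.COPYLEFT,
    "GPL-3.0-only": Category.COPYLEFT,
}


def needs_source_release_review(license_id: str, linked_only: bool) -> bool:
    """Return True if using this dependency in closed source software may
    oblige you to release your own source code.

    linked_only means you link against the library without modifying it.
    Licenses missing from the table are flagged so that someone reads them.
    """
    category = CATEGORIES.get(license_id)
    if category is None:
        return True
    if category is Category.COPYLEFT:
        return True
    if category is Category.LINKING_COPYLEFT:
        return not linked_only
    return False


if __name__ == "__main__":
    for license_id in ("MIT", "LGPL-3.0-only", "GPL-3.0-only"):
        print(license_id, needs_source_release_review(license_id, linked_only=True))
```

Flagging unknown licenses by default is a deliberate choice in this sketch: it errs toward reading the license instead of assuming it is permissive.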
So there are viral licenses, there are laissez-faire licenses, but there's actually another way of getting rid of your copyright and letting other people have rights to use your source code in the way that you want. And that's called the public domain. Releasing a creative work, which includes source code, into the public domain is basically a way of the original author relinquishing all of their rights to that piece of intellectual property, explicitly saying, I no longer have any rights to this. You can do absolutely anything you want. And there are some real pieces of software, like SQLite, that have been released into the public domain. There are popular libraries that have been released into the public domain. So this is a real strategy. You might say, why would anybody do this? Maybe because they're just releasing the source code for the good of the community and they don't want to get tied up in all of these licensing issues. They don't even worry, like the MIT license does, about giving credit and getting that little credit in that about box somewhere.
A public domain dedication is not always valid in every jurisdiction. So there's actually a Creative Commons license called CC0 that does the same thing as public domain, but it's still, quote unquote, a license, instead of just the author saying, here, I'm putting this in the public domain. So you might look into CC0 if you're thinking about releasing something into the public domain, so that you're on firmer legal ground across jurisdictions.
Also, old pieces of intellectual property, old things that were once copyrighted, eventually go into the public domain automatically. In the United States, that takes many, many decades. In fact, we're just now getting novels from the 1920s released into the public domain. And of course, most software has been written since the 1970s. So we're a long way from copyrighted software just aging out into the public domain.
So there's not a lot of software that's in the public domain just as a result of its age. In fact, I don't think there's anything that's in the public domain just as a result of its age. So putting something in the public domain is something that has to happen explicitly. But this is a common misconception amongst users of open source libraries. They think, just because something's open source, I can do whatever I want with it. No. Open source licenses like the MIT license or the GPL do come with all kinds of restrictions and stipulations, especially the more restrictive GPL, and you need to actually follow those stipulations when you use a piece of open source software. It's not just public domain. Just because you can see the source code doesn't mean you can do whatever you want. The only case where you can do whatever you want is if the software has explicitly been put into the public domain.
So let's talk about contributors. We mentioned at the beginning that, in the United States at least, as soon as you write something, it's automatically under copyright. What if you have an open source library out there and somebody else sends you a patch? They send you a pull request on GitHub; they want to contribute something new to your existing piece of open source software. They're then called contributors. And remember, just like you originally own whatever you write, they do too. So they actually own the copyright when they first write that patch. Now, some licenses, like the Apache license, explicitly say that contributors, unless specified otherwise, are implicitly agreeing that their contributions will be under the same license. But a lot of licenses don't say that. The MIT license doesn't say that, for example. And if that's the case and you're worried about the legal consequences of people contributing to your open source project, then what you need to have is a contributor agreement.
This is a document that the contributor signs saying that they're allowing their contribution to be licensed under a license that you specify, or even just assigning the copyright over to you or whoever you want to have ownership of that open source library or piece of software. So you need to be really careful about contributors, because contributors by default own the copyright to their written source code, just like you own the copyright to your written source code.
So let's talk about some best practices, whether you're a software developer or even just a user. The first thing is, if you are somebody making an open source project, make sure that you include a license with it. If you don't include a license, by default you're retaining all of the copyright. And that leaves a lot of ambiguity, because people see that you posted the source code online, so they think they can use it in their own projects. But actually you have all that copyright, because you never relinquished it. You never created a license saying other people can use it however they want to. So every piece of released open source software should have a license file included within it. And even better than that is to include a header in all of your source code files saying what license it's under, either including the whole license there if it's short enough, like the MIT license, or linking to it, so that people really know, this is what I can do with this piece of software. So always include a license. The next thing: if you're getting contributions and you're worried about the ambiguity around the copyright of those contributions, you should have a contributor agreement, especially if you're not using an open source license that already has a contributor clause like the Apache license does.
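
The advice about putting a license header in every source file can be checked mechanically. The sketch below assumes each file carries a one-line `SPDX-License-Identifier:` tag near the top, which is a common convention but not something this episode prescribes; the function names and the five-line window are hypothetical choices.

```python
from pathlib import Path

# Assumed header convention: a comment line such as
#   # SPDX-License-Identifier: MIT
SPDX_TAG = "SPDX-License-Identifier:"


def has_license_header(source_text: str, max_lines: int = 5) -> bool:
    """Return True if one of the first max_lines lines carries the tag."""
    first_lines = source_text.splitlines()[:max_lines]
    return any(SPDX_TAG in line for line in first_lines)


def files_missing_header(root: str, pattern: str = "*.py") -> list[str]:
    """List files under root (searched recursively) that lack the header."""
    missing = []
    for path in sorted(Path(root).rglob(pattern)):
        if not has_license_header(path.read_text(encoding="utf-8")):
            missing.append(str(path))
    return missing


if __name__ == "__main__":
    # Check the current directory and report any file without a header.
    for name in files_missing_header("."):
        print("missing license header:", name)
```

A check like this could run before each release, alongside confirming that the license file itself is present at the top of the project.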
And if you're taking other people's pieces of open source software and incorporating them in your own software, make sure you read those licenses and follow them to the letter, because it's really nice that they're letting you use their software for free in your software, right? So the least you can do, let's say it's released under the MIT license, is follow the MIT license and give them credit in your about box or in the documentation that comes with your software. You have that responsibility. That is the agreement between you and the creator of that original open source library. And I hate to say it, but I see folks violate this all the time. Let's say I download an app on the App Store and I know it incorporates an open source library, and I can't find credit to that library anywhere in its settings screen or its about screen. That is actually a violation of the license that the open source library was released under. That always annoys me, because I'm a releaser of open source libraries myself, and I release them not in the public domain but under licenses where I do expect to get credit. Why do I want that credit? Well, one, maybe it feels good, but two, maybe it's good for the reputation of the library. Maybe it's going to help people get some notoriety by having that credit. It probably won't do a lot, but it's really the minimum you can do. So those are some best practices.
The last thing I want to talk about is what makes a license open source. I saved this for last because it's something we've actually talked about on prior episodes, but it's worth repeating. The open source movement splintered off from the free software movement, and the free software movement was very clear about what makes something free software.
There were four freedoms that any piece of free software must have: the right to run the software however you want to, the right to distribute the software, the right to inspect the software's source code, and the right to distribute your modifications of the software. Those are the freedoms that all free software licenses must grant. But we're talking about open source licenses today, not free software licenses. It's generally accepted that the Open Source Initiative is the organization that determines what is or is not an open source license, although not everyone agrees that they should have that power. They have published a definition called the Open Source Definition, and it has ten criteria that specify what is and is not a valid open source license. I'll link to it in the show notes.
But recently, in the last few years, there have been more and more businesses that have decided they don't want to go all the way. They want to have software where people can see the source code and make contributions to the source code, but they don't necessarily want to give the users all of the rights that the Open Source Definition specifies. And therefore that software is not actually classified as open source software by most practitioners. I'll give you one very specific example. There's a bunch of server-side software where the companies who created it have decided, you know what, I don't want my competitors just running this open source software on their own servers, because I want to make all the money hosting the software and charging people for the privilege of using me as a host. I don't want other hosts being able to just take my server-side software, duplicate the experience, and undercut me on price. And so they've specifically said, well, this software can only be used for these purposes under this license. That violates a couple of the clauses in the Open Source Definition.
And therefore, according to the Open Source Initiative, the licenses that this software has been published under are not classified as open source. So open source software is generally expected to have certain freedoms. Just having the source available does not make a piece of software open source. It just means exactly that: the source code is available and people can look at it. But if you don't have the right to use that source code basically however you want to (you can read the letter of the definition from the Open Source Initiative), then it's not truly open source software, so be mindful of that. And it's just another reminder to always read the license, because you might think that just because the source code is available, this is definitely open source software, but it might not be. Even some very proprietary companies sometimes make some of their source code available, but not necessarily under a truly open source license.
All right, thanks for listening to us this week. I want to remind everybody to rate us on your podcast player of choice. Give us that five-star review. If you're really enjoying the podcast, leave us a written review on Apple Podcasts. That always helps. And we'll be back in a couple of weeks with another interesting software topic to explain to you. Have a great day, and don't forget to follow us or subscribe on your podcast player of choice.

Understanding open source licenses is critical if you're a software developer. What are your rights and responsibilities when you incorporate an open source library in your program? In this episode we explain why we have licenses, the different types of open source licenses, and best practices for an open source practitioner.

Note that the licenses we refer to as laissez-faire licenses in this episode are also widely known as permissive licenses.

Show Notes

Follow us on Twitter @KopecExplains.

Theme “Place on Fire” Copyright 2019 Creo, CC BY 4.0

Find out more at http://kopec.live