Kopec Explains Software
Computing Concepts Simplified
1 year ago

#116 The Smallest Possible Programming Language

Brainf*** is a really small Turing-complete programming language.

Transcript
David Kopec

What if a programming language were so small that you could explain the entirety of it in less than five minutes? We're going to talk about such a programming language today. Welcome to COPEC Explain Software, the podcast where we make computing intelligible. In this episode, we're going to talk about a very small programming language. And when I say small, I mean that it doesn't have a lot of features. I don't necessarily mean simple because actually writing programs in the programming language we're going to talk about is quite hard. But the programming language we're going to talk about today has just eight commands and each of those commands is represented by a single character. You might be wondering, how is that possible? And how is it possible that we can write the same programs in this programming language as we might in a sophisticated language like C or Python?

Rebecca Kopec

Before we even get to that, how do we know that it's even a programming language?

David Kopec

Right? What does it mean for something to be a programming language? We talked about that in a prior episode on what is a programming language? And I'm going to link to it in the show notes. But I want you to know a little bit about what it means for a programming language to be touring complete. Because any programming language that's Turing complete, it's been proven can compute some of the same kinds of problems as any other programming language that's Turing complete. For a programming language to be Turing complete, it means that it can simulate a Turing machine. A Turing machine, which was a concept that was an abstract model of machine invented by Alan Turing, can implement computer algorithms. Let's describe the abstract model. Imagine a tape of unlimited length split into cells with characters on them. Imagine a head that can read or write the character in a cell. Imagine that the head can move left or right one cell. Imagine that it can branch move to a different cell depending on the value that the head reads. That's it. That head moving up and down the tape going from cell to cell and making some decisions occasionally. That's enough to be able to implement any computing algorithm. That might be hard to believe, but the proof of this was worked out in the 1930s and extended in the 1940s and it's been well known for a long time. To put that into practical terms, I'm going to quote from a book. This is from Programming Languages, Principles and Paradigms, first edition, published in 2002 by Tucker and Noonan. A programming language is said to be Turing complete if it contains integer variables, values and operations and has assignment statements and the control constructs of statement sequencing, conditionals and branching statements all of other statement forms while in for loops, case selections, procedure declarations and calls and data types, strings, floating point values, et cetera. Are provided in modern languages only to enhance the ease of programming various complex applications. What that quote is saying in the latter half is that we don't need all those things. They're just enhancements that make it easier to program more complex programs. But we can actually implement any program with just the simple constructs that were described at the beginning integer variables, ways of moving around, in other words, control constructs and ways of assigning values to particular variables.

Rebecca Kopec

And someone has gone ahead and used the turn complete model to develop a programming language that is only eight commands.

David Kopec

That's right. And that programming language has a bad word in it. So we're going to refer to it as Brain F for the rest of this episode. But I will link to the full name in the show notes. We don't want to have our podcast marked as explicit, so that's why we're using Brain F instead of the full name, which I'm sure you can imagine what it is.

Rebecca Kopec

So who created Brain F, when and why?

David Kopec

Urban Mueller created the language in 1993 as a way of seeing how small he could build a programming language with, and his implementations of the actual thing that takes the programs and does something with them. His initial compiler was just 296 bytes of assembly language. That's pretty incredible. So not only is the language itself super small, but the first implementation was also super small.

Rebecca Kopec

Why is that even interesting?

David Kopec

It's interesting, really, mostly from, oh, gee, look what we can do vantage. But also from an educational standpoint, developing an interpreter or a compiler for this language is often done in the classroom as a way of learning about compilers and interpreters and what it means for something to be Turing complete. So really this was more of an educational exercise or OG exercise than it was something that was going to be.

Rebecca Kopec

Practical, just kind of interesting, almost like creating a puzzle.

David Kopec

Yeah, absolutely.

Rebecca Kopec

All right, well, if it's only eight commands, let's describe the whole language.

David Kopec

Yeah. And let's start with the data model. So in the original implementation, you have 30,000 cells. Does that sound familiar? That's just like the cells in a Turing machine and each of those cells can hold one eight bit value. In other words, it's 30,000 bytes of cells, 30,001 byte cells. And you also have an instruction pointer which is pointing to what's? The current instruction in the program that's being executed. And you have a data pointer that's pointing to one of those cells.

Rebecca Kopec

Then what are the eight commands?

David Kopec

Yeah, it's amazing because there's only eight. I can explain all of them to you in just a couple of minutes. Now, keep in mind our data pointer and our instruction pointer. We're going to start with the greater than symbol and the less than symbol. The greater than symbol says we're going to move one data cell to the right, moving the data pointer basically up by one, or move one data cell to the left. Move the data pointer back by one, subtract one from the current index of the data pointer. Then we have the plus and minus symbols. The plus sign says whatever cell the data pointer currently points to increase the value in that cell by one. And the minus sign says decrement the value in the current cell that the data pointer is pointing to by one. Okay, you just learned half the language, so half the language is just moving forward and backwards amongst the cells and increasing or decreasing the value by one in a particular cell. Then we have the period character. The period character says print out the current value in the cell that the data pointer is pointing to. We have the comma character. The comma character says input a value, usually from the keyboard and put it into the current cell that the data pointer is pointing to. Last and most complicated are two brackets, an opening square bracket and a closing square bracket. If we hit an opening square bracket, we check is the value at the current cell that the data pointer is pointing to equal to zero? If it is, we're going to move to a matching closing bracket that's going to be ahead of that opening bracket. If we hit the closing bracket, we're going to check is the value at the current data pointer not equal to zero? If it's not, we're going to move backwards to the opening bracket that matches that closing bracket. And that's it. That's all eight of the symbols. That is the entire programming language. So you write programs in brain F by writing out these symbols many, many, many times.

Rebecca Kopec

While it's only a few symbols, only those eight symbols, it's still actually pretty hard to read.

David Kopec

Yeah, that's right. Some people think about programming languages like this as write once, meaning that the original person who's thinking through what they're writing can write out the code, write out these eight symbols and whatever interesting long configurations they need to to do something. But then for a person reading it, it's like really hard to see what all those symbols mean together. Let me give you an example. Suppose I was writing in C and I wanted to increase the value in some variable. I would write plus equals 25, let's say, because I wanted to increase that value by 25. In brain F, assuming that I had the data pointer pointing to the right cell and I wanted to then increase that value by 25, I would have to write a plus sign 25 times. When I'm reading, it's going to be hard to see, oh, how many times exactly did I write that plus sign? Not to mention that I would have first had to get the data pointer pointing to the right cell. In C, I would just use a variable name in brain F. I would have to know what index in my cell array is the right cell, and first move to that right cell, maybe by using a lot of greater than and less than signs. Or maybe I do it a little bit more intelligently using some brackets to kind of repeat doing the greater than or the less than sign. And by the way, I could use the brackets also to repeat doing the plus sign to increase by 25, but it would still be very hard to parse. So this language was not designed to be something that would really be used in production or for anything real. It's totally just educational. Just to prove that with so few symbols that I just explained to you the actual symbols, in less than three minutes, we can have something that is a quote unquote real programming language.

Rebecca Kopec

So how would you use comments in Brain F?

David Kopec

Yeah, the interesting thing is, because there's only eight symbols, anything that's not one of those eight symbols can just be ignored. So comments in Brain F can just be any text you want to write that's not those eight symbols, and an interpreter or compiler will just ignore them. And that's the only way that you can write programs that people can actually read back because it's so hard to parse, is you can just write text amongst all the symbols as much as you want. The lesson here is not that this is a great programming language. It's not that this is something that anyone would want to use in the real world. It's how simple it can be to actually write something that can be used for solving the same kinds of problems that any other programming language can be used to solve. And it really illustrates well the concept of a Turing machine and therefore what it means for a programming language to be Turing complete. So this is a great educational language and there have been other languages that have tried to be really small, and Brain F is probably the most famous of them. If you're a programmer out there and you want to take this to the next level, I highly recommend that you take the time to try to implement a Brain F interpreter. It's something that you can do in just a few hours. It's actually going to be one of the chapters in my next book and going into a lot more depth about the language and also about how to implement an interpreter for it. But it's something that you could totally do on your own just by reading the description of each of the eight commands. Okay, thanks for listening to us this week and learning about the world's smallest possible programming language. Rebecca. How can people get in touch with us on Twitter?

Rebecca Kopec

We're at Copeck explains K-O-P-E-C-E-X-P-L-A-I-N-S. We'll be.

David Kopec

Back in two weeks with another interesting software topic. Don't forget to subscribe to us on your podcast player of choice. Whether it's Apple podcasts or spotify and leave us a review that always helps with the show. And we'll see you in two weeks. Bye.

How small can a programming language be and still be a programming language? In order for a programming language to be able to compute the same sorts of problems as any other language it must be Turing-complete. Amazingly, there is a programming language that has just eight commands, represented by eight single symbols, that is Turing-complete. In this episode we describe what it means to be Turing-complete and how this tiny language does it.

Show Notes

Follow us on Twitter @KopecExplains.

Theme “Place on Fire” Copyright 2019 Creo, CC BY 4.0

Find out more at http://kopec.live