Definition: Buffer Overflow

Kee Hinckley, November 21, 2000

If you read any press on computer security problems, at some point you are likely to come across the phrase “buffer overflow”; it’s by far the most common security error that programmers make. It’s common for several reasons. It has nothing to do (by itself) with security. It’s an easy error to make, and a hard one to detect. And it’s human nature not to expect the unexpected.

So what is a buffer overflow? I’ll start off extremely non-technical here, and gradually bump up the level until the final section, at which point if you don’t understand programming and call stacks you may want to stop reading, and if you do understand them, you may decide to start reading.

First, here’s the non-technical explanation. You need to tell a co-worker something important, so you go to their office, expecting a conversation something like this:

“Hello.”
“Hi.”
“I thought you should know about this new thing.”
“Oh? What is it?”

You tell them the important thing. Instead the conversation goes like this:

“Hello.”
“Hey! Just the person I wanted to see! Did you hear about this crazy election thing,”

…followed by five minutes of political diatribe. By the end of the conversation, not only have you forgotten what you came in to say, you’re on the way out the door with a poster to protest something. Your buffer just overflowed, and you were hijacked for a purpose other than your original intent. You had an expectation of how the conversation would go (the protocol) and it was violated, with the result that you ended up doing something different. That’s exactly what happens to a program when someone exploits a buffer-overflow problem.

Now a slightly more technical explanation. When a program is designed, it is designed with an interface to the outside world. That interface is not just what you see on the screen, but also how the program communicates with other programs and the operating system. The interface is typically defined in terms of either an API (a set of programming conventions for direct communication with another piece of code) or a protocol (a definition of a set of data and commands to be passed between programs). Think of the API as how your brain tells your arm to pick something up, and the protocol as how you ask someone to pass the salt. Of course, protocols are not always executed directly. Your brain tends to use the mouth API to tell someone to pass the salt, rather than using telepathy directly, and many programs use standard sets of code provided by the operating system when they want to use a protocol.

Now, these APIs and protocols specify the form of the information to be passed back and forth. For instance, a specification might say that the correct response to an initial communication is no more than five letters long (e.g. “Hello”). In the days before people had to worry about hostile programs, code was written assuming that the program you were talking to was going to follow the rules of the protocol. If the protocol said “five letters”, then there wasn’t a lot of point in leaving room for six. Sure, your program might crash if there were six, but it wasn’t your bug, it was a bug in the program talking to you; it should have sent five letters. So that’s a buffer overflow. You expect one thing, and somebody sends you something much bigger.
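To make that concrete, here is a minimal C sketch of the “five letters” assumption. The names here are mine, not from any real protocol: the buffer is sized for exactly the reply the specification promises, and nothing checks whether the sender kept its promise.

    #include <stdio.h>
    #include <string.h>

    /* Hypothetical handler for the protocol's opening reply. */
    void read_reply(const char *reply)
    {
        char buf[6];        /* room for "Hello" plus the trailing '\0' */
        strcpy(buf, reply); /* trusts the peer: no length check at all */
        printf("got: %s\n", buf);
    }

    int main(void)
    {
        read_reply("Hello");                                 /* plays by the rules */
        read_reply("Hey! Just the person I wanted to see!"); /* writes far past the end of buf */
        return 0;
    }

The first call behaves; the second overwrites whatever happened to be stored after buf.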
The “buffer” that you had set aside to store that information doesn’t have room for what you get, and you end up writing those six (or six hundred) letters on top of other things that you were trying to remember. Obviously that’s not going to be a good thing for the continued functioning of your program, but it turns out it’s also a major security problem.

And still a bit more technical. Computers tend to think in terms of two things: code and data. Code consists of the instructions for the computer, telling it what to do. Data is what it does it to and with. When you run a program, it loads into memory both the code and the data that code needs. When that program communicates with some other program, it is receiving data, and it will then use the code that it already has to figure out what to do next. This makes remote communication relatively safe. The remote program can only tell the local program what to do within the constraints of the original code. Assuming nobody has done anything stupid (which is not generally a good assumption), the remote program cannot tell the local program to do anything that wasn’t originally intended.

Modern computer architectures have an unfortunate design, however. They don’t really know the difference between data and code. If somebody can convince your program to try running the data that it has in memory, it will do so quite happily. So a malicious program has two goals. First it wants to get some code to your machine, and then it wants to persuade somebody to run it. This is, of course, no different from an email virus writer’s goal. In that case, they expect you to run it; in the case of a buffer overflow, they expect the broken program to run it. Email viruses are so successful because users often don’t know the difference between data and code either (and some operating systems helpfully try to hide the difference so as not to confuse them).

It turns out that if a malicious programmer can find a target program that didn’t check for a buffer overflow, it can be very trivial to get that program to execute code provided by the remote program. So easy, in fact, that there are standard packages out there that provide the entire payload for the overflow. All the script kiddie (we’ll define that sometime, but suffice to say it isn’t a compliment to someone’s hacking prowess) has to do is find the right length for the buffer overflow and bang, they have control of your computer. Before you panic, remember that doing this requires that they have remote access to a program on your computer already, and that that program have a buffer overflow problem. That means (for an internet exploit) that your computer has to have some program that is listening for external connections (e.g. a print server, file sharing…) or that you have a malicious user at your computer (or you helpfully downloaded and ran their software).

Now let’s get completely technical. How does a buffer overflow exploit work from a programmer’s perspective? First you find some place in the program where it’s reading data and assuming that it’s going to be reading something rational. E.g.:

    char buf[4]; /* Store 4 characters */
    gets(buf);   /* Read any number of characters from the input and put them in buf */

where the input turns out to be more than 4 characters long. Now the question is: where is the data stored in “buf” located?
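The answer matters, and it’s worth making concrete. Here’s a small sketch of the two cases discussed next (the names are made up for illustration):

    #include <stdio.h>

    char global_buf[4];    /* a global: allocated in a data segment */

    void handle_input(void)
    {
        char local_buf[4]; /* a local: allocated in this function's stack
                              frame, near the saved return address */

        /* Just show where each one ended up in memory. */
        printf("global_buf at %p\n", (void *)global_buf);
        printf("local_buf  at %p\n", (void *)local_buf);
    }

    int main(void)
    {
        handle_input();
        return 0;
    }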
If “buf” is a global variable, then that data is probably allocated in a data segment somewhere, and you’re going to try to overwrite some other piece of data in a way that results in something useful (e.g. a place where the program was going to execute one program now executes another). That’s tricky, and hard to do without source code. However, “buf” is probably a local variable, allocated on the stack. So instead of overwriting data, your goal is to overwrite the stack itself. You are going to put in buf some amount of padding (which will overwrite the rest of the data stored on the stack), followed by some machine code of your own, followed by a new value for the function’s saved return address that points back at that code. You’ll set things up so that your code will be executed (typically when this particular function returns) instead of the code that normally would have been. Now you’re home free. Since there are plenty of examples of sample exploit machine code, all you need to do when you find a new buffer overflow is figure out the appropriate offset; the rest of the work has been done already. You don’t need to transfer very much data, just enough to run something that connects you to the remote machine. From there you can transfer the rest of the software you want to install remotely.

This is where security-by-obscurity comes in handy. Want to lessen the chance of buffer-overflow attacks? Just run some obscure piece of hardware. Run a Mac, or even Linux on the PowerPC. (Of course, with Apple switching to an Intel platform, some of that obscurity goes away, but exploits still have to vary from operating system to operating system, even if the underlying processor is the same.) It’s not that there aren’t buffer-overflow problems, but there are fewer handy examples of how to exploit them running around. Fewer examples, fewer successful attacks. It’s not a solution, of course (especially if everyone does it :-), but it is one way to slightly increase your odds of remaining secure.

There are machine/OS architectures that would make buffer overflows much harder to exploit. Disable dynamic creation and execution of code on the stack, for one. Or keep a separate data stack. And there are tools out there which will put watchdog data on the stack, and then watch it to make sure it doesn’t get overwritten (effective, but rather painful from a performance standpoint). But fundamentally, where there are bugs, there are exploits. And modern software, with its layers and layers of abstraction that no one person can fully grok, has a hell of a lot of bugs.
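As a closing illustration, here is a toy version of that “watchdog data” idea. Real tools (and later compiler stack-protector features) place a secret, randomized value between the local buffers and the saved return address and verify it before the function returns; this hand-rolled sketch, with made-up names and a fixed value, just shows the principle.

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    #define CANARY 0xDEADBEEFu /* a real implementation randomizes this */

    struct guarded {
        char data[8];        /* the buffer we actually want */
        unsigned int canary; /* watchdog value placed just past it */
    };

    static void copy_checked(const char *input)
    {
        struct guarded g;
        g.canary = CANARY;
        strcpy(g.data, input);    /* the bug: no bounds check, just like gets() */
        if (g.canary != CANARY) { /* was the watchdog trampled? */
            fprintf(stderr, "overflow detected, aborting\n");
            abort();
        }
        printf("ok: %s\n", g.data);
    }

    int main(void)
    {
        copy_checked("hi");          /* fits in data[]: the watchdog survives */
        copy_checked("0123456789A"); /* 12 bytes with the '\0': spills into the watchdog */
        return 0;
    }

Note that the check only fires after the damage is done, and only if the attacker can’t guess the value, which is why real implementations randomize it, and why it costs a little performance on every guarded call.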