6. Understanding Assembly

Since the output of all of these tools is in AT&T syntax, those of you who know Intel/MASM syntax have a bit of re-learning to do.

Assembly language is one step closer to the hardware than high-level languages like C and C++. So to understand assembly, you have to understand how the hardware works. Let's start with a set of storage locations inside the CPU known as the registers.

6.1. Registers

Registers are like the local variables of the CPU, except there are a fixed number of them. For the ix86 CPU, there are only 4 main registers for doing integer calculations: A, B, C, and D. Each of these 4 registers can be accessed 4 different ways: as a 32 bit value (%eax), as a 16 bit value (%ax), and as a low and a high 8 bit value (%al and %ah). There are four more registers that you will see used occasionally, namely SI, DI, SP and BP. SI and DI are left over from the DOS days when people used 64k segmented addressing, and as it turns out, they may now be used as integer registers like the others. SP and BP are two special registers used to handle an area of memory called the stack. There is one last register, the instruction pointer IP, which you may not modify directly, but which is changed through jmps and calls. Its value is the address of the next instruction to be executed.

Note: If gcc was called with the -fomit-frame-pointer option, the BP register is freed up to be used as an extra integer register.

6.2. The stack

6.3. Two's complement

6.3.4. Byte Ordering

Why this section? One simple reason: different platforms use different byte ordering. There are two different orderings, little endian and big endian. Some of you may be wondering what byte ordering actually is. Byte ordering refers to the physical layout of data in memory. When a data structure or data type is represented by more than one byte, the ordering of the bytes matters. For example, if we consider a long (4 bytes), let's label the least significant byte 0 and the most significant one 3. If we are on a little endian machine, the long will be represented in memory like this (yeah, some machines are not byte addressable, but let's forget about this):

Address         Byte
--------------------
0x040           0
0x041           1
0x042           2
0x043           3

On a big endian machine, on the other hand, the long will be laid out like this:

Address         Byte
--------------------
0x040           3
0x041           2
0x042           1
0x043           0

Now let's look at an example. An easy way to see the difference in byte ordering is to look at how the bytes of a string appear when memory is examined on different architectures. Here is an example program that will demonstrate it.


#include <stdio.h>

int main(void) {
        /* String literals are read-only, so point at them through const. */
        const char *test = "this is a string";

        printf("%s\n", test);
        return 0;
}
We compiled it, and here is the output of two different ways of examining it, first on a SPARC machine (Linux xxxxxx 2.4.16 #1 Tue Dec 11 01:57:19 EST 2001 sparc64 unknown):

objdump

     11850:       74 68 69 73     call  d1a2be1c <_end+0xd1a0a394>
     11854:       20 69 73 20     unknown
     11858:       61 20 73 74     call  8482e628 <_end+0x8480cba0>
     1185c:       72 69 6e 67     call  c9a6d1f8 <_end+0xc9a4b770>
     11860:       00 00 00 00     unimp  0
gdb

     0x11850 <_IO_stdin_used+8>:     0x74686973      0x20697320      0x61207374  0x72696e67
Now let's look at how the memory itself is organized and how the string is represented:
     
Address         Code    Letter
--------------------------
0x11850         74      t
0x11851         68      h
0x11852         69      i
0x11853         73      s

0x11854         20
0x11855         69      i
0x11856         73      s
0x11857         20

0x11858         61      a
0x11859         20
0x1185a         73      s
0x1185b         74      t

0x1185c         72      r
0x1185d         69      i
0x1185e         6e      n
0x1185f         67      g

0x11860         00
And if we do the same on an Intel machine (Linux xxxxxx 2.4.17 #17 Thu Jan 31 23:34:35 CST 2002 i686 unknown), looking at the memory one 32 bit word at a time and reading each word from its most significant byte down, this is what we get:

Address         Code    Letter
--------------------------
0x8048423       73         s
0x8048422       69         i
0x8048421       68         h
0x8048420       74         t

0x8048427       20
0x8048426       73         s
0x8048425       69         i
0x8048424       20

0x804842b       74         t
0x804842a       73         s
0x8048429       20
0x8048428       61         a

0x804842f       67         g
0x804842e       6e         n
0x804842d       69         i
0x804842c       72         r

At first glance at the x86 output you may miss that this actually is the string we are looking for. The characters themselves sit at ascending addresses on both machines. What differs is which end of a multi-byte word is stored at the lowest address, so a word-sized dump on the little endian x86 shows every group of four characters reversed. This is the difference in byte ordering. In order for different hosts on the same network to be able to communicate and for the exchanged data to make sense, they agree on a common byte ordering. In modern networking, data is transmitted in big endian byte ordering, i.e. the most significant byte comes first. On the i80x86 the host byte order is least significant byte first, whereas the network byte order, as used on the Internet, is most significant byte first.

6.4. Reading Assembly

6.5. Know Your Compiler

In order to learn to read assembly effectively, you really have to know what type of code your compiler likes to generate in certain situations. If you learn to recognize what a while loop, a for loop, and an if-else statement look like in assembly, you can get a general feel for code more quickly. There are also a few tricks that GCC performs that may seem unintuitive at first to the neophyte reverse engineer, even if they already know how to forward-engineer in assembly.

6.5.1. Basic Control Structures

In assembly, the only flow control mechanisms are branching and calling. So every control structure is built up from a combination of gotos and conditional branches. Let's look at some specific examples.

6.5.2. Arrays

6.5.2.1. Arrays on the stack

Arrays on the stack are just memory regions that we access with variations on the disp(%base, %index, scale) idea presented earlier. So let's start with a warm-up consisting of a simple char array where we let libc do all the work.

Example .c file and gcc output with no optimization, with -O2, and with -O3 -fomit-frame-pointer

So let's do another example where we do all the work. One dimensional arrays are the easiest, as they are simply a chunk of memory whose size is the number of elements times the size of each element.

Example .c file and gcc output with no optimization, with -O2, and with -O3 -fomit-frame-pointer

Two dimensional arrays are actually just an abstraction that makes working with memory easier in C. A 2D array on the stack is just one long 1D array that the C compiler divides for us to make it manageable. To parameterize things, an array declared as: type array[dim2][dim1]; is really a 1D array of length dim2*dim1*type bytes, where type stands for the size of one element. The C compiler handles array indexing as follows: array[i][j] is the memory location array + i*dim1*type + j*type. So it divides our 1D array into dim2 sections, each dim1*type bytes long.

FIXME: Graphics to illustrate this.

Example .c file and gcc output with no optimization, with -O2, and with -O3 -fomit-frame-pointer

As I tell my introductory computer science students, the best way to think of higher dimensional arrays is to think of a set of arrays of the next lower dimension. So the best way to think about how a 3D array can be jammed into a 1D array is to think about how a set of 2D arrays would be jammed into a 1D array: one right after another. So for an array declared as type array[dim3][dim2][dim1];, array[i][j][k] means array + i*dim2*dim1*type + j*dim1*type + k*type. This means that just by looking at the multiplications applied to the indexing variables in the assembly, we should be able to determine n-1 dimensions of any n dimensional array. The remaining dimension can be determined from the total size, or from the bounds of some initialization loop.

FIXME: Diagram/graphics to show this

Example .c file and gcc output with no optimization, with -O2, and with -O3 -fomit-frame-pointer

6.5.3. Structs