Now the fun stuff begins. The first step to figuring out what is going on in our target program is to gather as much information as we can. Several tools on Linux allow us to do this. Let's take a look at them.
ldd is a basic utility that shows us which libraries a program is linked against, or tells us that it is statically linked. It also gives us the addresses at which these libraries are mapped into the program's address space, which can be handy for following function calls in disassembled output (which we will get to shortly).
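If we want to use those load addresses in a script, the output of ldd is easy to pick apart. What follows is only a minimal Python sketch: the names parse_ldd, ldd, and SAMPLE are made up for illustration, and SAMPLE is a hypothetical listing, not output captured from a real run. It assumes the common "name => path (0xaddress)" line format, which may differ on your system.

```python
import re
import subprocess

# Assumed line shapes:
#   "libc.so.6 => /lib/libc.so.6 (0x40021000)"
#   "/lib/ld-linux.so.2 (0x40000000)"
LDD_LINE = re.compile(r"^\s*(\S+)(?:\s+=>\s+(\S+))?\s+\((0x[0-9a-fA-F]+)\)\s*$")


def parse_ldd(text):
    """Return (name, path, load_address) tuples found in ldd output."""
    libs = []
    for line in text.splitlines():
        match = LDD_LINE.match(line)
        if match:
            name, path, addr = match.groups()
            libs.append((name, path, int(addr, 16)))
    return libs  # empty for output such as "not a dynamic executable"


def ldd(binary):
    """Run ldd and parse its output.

    Caution: some versions of ldd may execute code from the target to
    resolve its dependencies, so only point this at binaries you trust.
    """
    result = subprocess.run(["ldd", binary], capture_output=True, text=True)
    return parse_ldd(result.stdout)


# Hypothetical sample, for illustration only.
SAMPLE = """\
\tlibtermcap.so.2 => /lib/libtermcap.so.2 (0x40019000)
\tlibc.so.6 => /lib/libc.so.6 (0x40021000)
\t/lib/ld-linux.so.2 (0x40000000)
"""

for name, path, addr in parse_ldd(SAMPLE):
    print(f"{addr:#010x}  {name}  {path or ''}")
```

Sorting the result by address gives a quick lookup table for deciding which library an address in a disassembly listing belongs to.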
nm lists all of the local and library functions, global variables, and their addresses in the binary. However, it will not work on binaries that have been stripped with strip.
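To separate the functions defined in the binary from the ones it imports, we can sort nm's output by symbol type. The sketch below is an illustration under assumptions: it expects the default "address type name" layout, Symbol, parse_nm, and SAMPLE are invented names, and SAMPLE is a hypothetical listing. In that layout T and t mark code in the text section (global and local), and U marks an undefined symbol that a library must supply.

```python
from collections import namedtuple

Symbol = namedtuple("Symbol", "name kind address")


def parse_nm(text):
    """Parse default nm output into Symbol tuples."""
    symbols = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3:
            addr, kind, name = fields
            try:
                value = int(addr, 16)
            except ValueError:
                continue  # not a symbol line
        elif len(fields) == 2:
            # Undefined symbols carry no address.
            kind, name = fields
            value = None
        else:
            continue
        symbols.append(Symbol(name, kind, value))
    return symbols


# Hypothetical sample, for illustration only.
SAMPLE = """\
08049f20 d _DYNAMIC
0804a01c B counter
08048430 T main
08048400 t helper
         U printf
"""

symbols = parse_nm(SAMPLE)
local_functions = [s.name for s in symbols if s.kind in "Tt"]
library_functions = [s.name for s in symbols if s.kind == "U"]
print("defined here:", local_functions)
print("imported:", library_functions)
```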
The Linux /proc filesystem contains all sorts of interesting information, from where libraries and other sections of the code are mapped to which files and sockets a process has open. It contains a directory for each currently running process. So, if you started a process whose pid was 3137, you could enter the directory /proc/3137/ to find out almost anything about it. Most of this information is readable only for processes which you own, unless you are root.
The files in this directory vary from one OS to another. The interesting ones in Linux are:

    cmdline -- lists the command line parameters passed to the process
    cwd -- a link to the current working directory of the process
    environ -- a list of the environment variables for the process
    exe -- the link to the process executable
    fd -- a list of the file descriptors being used by the process
    maps -- VERY USEFUL. Lists the memory locations in use by this process. These can be viewed directly with gdb to find out various useful things.
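The entries listed above can also be read from a program. Here is a rough Python sketch of how that might look; the function names (split_nul, parse_maps_line, describe) are invented for this example, and it assumes the usual Linux layout in which cmdline and environ are NUL-separated and each line of maps starts with an address range followed by permissions, offset, device, inode, and an optional path.

```python
import os


def split_nul(data):
    """cmdline and environ are byte strings with NUL-separated entries."""
    return [part.decode(errors="replace") for part in data.split(b"\0") if part]


def parse_maps_line(line):
    """Parse one line of /proc/<pid>/maps into a small dict."""
    fields = line.split(None, 5)
    start, end = (int(x, 16) for x in fields[0].split("-"))
    # Anonymous mappings have no path column.
    path = fields[5].strip() if len(fields) > 5 else ""
    return {"start": start, "end": end, "perms": fields[1], "path": path}


def describe(pid="self"):
    """Collect the interesting /proc entries for one process (Linux only)."""
    base = f"/proc/{pid}"
    with open(f"{base}/cmdline", "rb") as f:
        cmdline = split_nul(f.read())
    with open(f"{base}/environ", "rb") as f:
        environ = split_nul(f.read())
    with open(f"{base}/maps") as f:
        maps = [parse_maps_line(line) for line in f if line.strip()]
    return {
        "cmdline": cmdline,
        "environ": environ,
        "cwd": os.readlink(f"{base}/cwd"),
        "exe": os.readlink(f"{base}/exe"),
        "fds": sorted(os.listdir(f"{base}/fd"), key=int),
        "maps": maps,
    }
```

Calling describe(3137) would inspect the process from the example above, provided you own it; describe() with no argument inspects the Python interpreter itself.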
netstat is a handy little tool that is present on all modern operating systems. It is used to display network connections, routing tables, interface statistics, and more.
How can netstat be useful? Let's say we are trying to reverse engineer a program that uses some network communication. A quick look at what netstat displays can give us clues about where the program connects and, after some investigation, perhaps why it connects to that host. netstat shows not only TCP/IP connections but also UNIX domain socket connections, which many programs use for interprocess communication. Here is some example output:
    Active Internet connections (w/o servers)
    Proto Recv-Q Send-Q Local Address         Foreign Address          State
    tcp        0      0 slack.localnet:58705  egon.acm.uiuc.edu:ssh    ESTABLISHED
    tcp        0      0 slack.localnet:51766  gw.localnet:ssh          ESTABLISHED
    tcp        0      0 slack.localnet:51765  gw.localnet:ssh          ESTABLISHED
    tcp        0      0 slack.localnet:38980  clortho.acm.uiuc.ed:ssh  ESTABLISHED
    tcp        0      0 slack.localnet:58510  students-slb.cso.ui:ssh  ESTABLISHED
    Active UNIX domain sockets (w/o servers)
    Proto RefCnt Flags  Type    State      I-Node  Path
    unix  5      [ ]    DGRAM              68      /dev/log
    unix  3      [ ]    STREAM  CONNECTED  572608  /tmp/.ICE-unix/794
    unix  3      [ ]    STREAM  CONNECTED  572607
    unix  3      [ ]    STREAM  CONNECTED  572604  /tmp/.X11-unix/X0
    unix  3      [ ]    STREAM  CONNECTED  572603
    unix  2      [ ]    STREAM             572488

As you can see, netstat shows a great deal of information. But what does it all mean? The output is divided into two parts, Internet connections and UNIX domain sockets, as mentioned above. Here is briefly what the Internet portion of the netstat output means. The first column shows the protocol being used (tcp, udp, unix) by the particular connection. Its receive and send queues are displayed in the next two columns, followed by the information identifying the connection: source host and port, then destination host and port. The last column of the output shows the state of the connection. Since there are several stages in opening and closing TCP connections, this field shows whether the connection is ESTABLISHED or in one of the other available states. SYN_SENT, TIME_WAIT, and LISTEN are the ones seen most often. To see the complete list of available states, look in the man page for netstat.

FIXME: Describe these states.
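The column layout just described maps naturally onto a small record type. The following Python sketch is one possible way to read the Internet portion of the output; Conn, split_endpoint, parse_inet, and count_states are names chosen for this example, and the parser assumes whitespace-separated columns in the order shown above. The sample text is taken from the listing above, trimmed to a few lines.

```python
from collections import Counter, namedtuple

Conn = namedtuple("Conn", "proto recv_q send_q local foreign state")


def split_endpoint(addr):
    """Split 'host:port' at the last colon."""
    host, _, port = addr.rpartition(":")
    return host, port


def parse_inet(text):
    """Parse the tcp/udp lines of netstat output into Conn records."""
    conns = []
    for line in text.splitlines():
        f = line.split()
        # Skip headers, blank lines, and the UNIX domain socket section.
        if len(f) < 5 or not f[0].startswith(("tcp", "udp")):
            continue
        state = f[5] if len(f) > 5 else ""  # udp lines may have no state
        conns.append(Conn(f[0], int(f[1]), int(f[2]),
                          split_endpoint(f[3]), split_endpoint(f[4]), state))
    return conns


def count_states(conns):
    """Count how many connections are in each state."""
    return dict(Counter(c.state for c in conns))


SAMPLE = """\
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 slack.localnet:58705 egon.acm.uiuc.edu:ssh ESTABLISHED
tcp 0 0 slack.localnet:51766 gw.localnet:ssh ESTABLISHED
Active UNIX domain sockets (w/o servers)
unix 5 [ ] DGRAM 68 /dev/log
"""

for conn in parse_inet(SAMPLE):
    print(conn.proto, conn.local, "->", conn.foreign, conn.state)
print(count_states(parse_inet(SAMPLE)))
```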
Depending on the options passed to netstat, it can display more information. Particularly interesting for us is the -p option (not available on all UNIX systems). It shows the program that owns each connection, which may help us determine the behaviour of our target. Another use of this option is tracking down spyware programs that may be installed on your system. Listing all network connections and looking for unknown entries is an invaluable way of discovering programs you are unaware of that send information to the network. -p can be combined with the -a option to show all connections. By default, netstat does not display listening sockets; with -a we force all sockets to be shown. -n shows numerical IP addresses instead of hostnames.
netstat -p as normal user:

    (Not all processes could be identified, non-owned process info
     will not be shown, you would have to be root to see it all.)
    Active Internet connections (w/o servers)
    Proto Recv-Q Send-Q Local Address         Foreign Address          State        PID/Program name
    tcp        0      0 slack.localnet:58705  egon.acm.uiuc.edu:ssh    ESTABLISHED  -
    tcp        0      0 slack.localnet:58766  winston.acm.uiuc.ed:www  ESTABLISHED  5587/mozilla-bin
netstat -npa as root user:

    Active Internet connections (servers and established)
    Proto Recv-Q Send-Q Local Address   Foreign Address     State        PID/Program name
    tcp        0      0 0.0.0.0:139     0.0.0.0:*           LISTEN       390/smbd
    tcp        0      0 0.0.0.0:6000    0.0.0.0:*           LISTEN       737/X
    tcp        0      0 0.0.0.0:22      0.0.0.0:*           LISTEN       78/sshd
    tcp        0      0 10.0.0.3:58705  128.174.252.100:22  ESTABLISHED  13761/ssh
    tcp        0      0 10.0.0.3:51766  10.0.0.1:22         ESTABLISHED  897/ssh
    tcp        0      0 10.0.0.3:51765  10.0.0.1:22         ESTABLISHED  896/ssh
    tcp        0      0 10.0.0.3:38980  128.174.252.105:22  ESTABLISHED  8272/ssh
    tcp        0      0 10.0.0.3:58510  128.174.5.39:22     ESTABLISHED  13716/ssh

The first output shows that mozilla has established a connection with winston.acm.uiuc.edu for HTTP traffic (since the port is www, that is, 80). In the second output we see that the SMB daemon, X server, and ssh daemon are listening for incoming connections.
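The idea of looking for unknown entries can be automated along the following lines. This is only a sketch under assumptions: tcp_owners, unexpected, and the KNOWN set are hypothetical names and values picked for the example, only tcp lines with all seven columns are handled, and program names containing spaces are not. The sample lines are copied from the two outputs above.

```python
def tcp_owners(text):
    """Extract address, state, and owning program from `netstat -np` style tcp lines."""
    rows = []
    for line in text.splitlines():
        f = line.split()
        if len(f) < 7 or not f[0].startswith("tcp"):
            continue
        # Seventh column is "PID/Program name", or "-" when it is hidden from us.
        pid, _, program = f[6].partition("/")
        rows.append({
            "local": f[3],
            "foreign": f[4],
            "state": f[5],
            "pid": int(pid) if pid.isdigit() else None,
            "program": program or None,
        })
    return rows


def unexpected(rows, known):
    """Return rows whose program is not in the set of programs we expect.

    Rows with no visible owner (program is None) are reported as well.
    """
    return [r for r in rows if r["program"] not in known]


SAMPLE = """\
tcp 0 0 0.0.0.0:139 0.0.0.0:* LISTEN 390/smbd
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 78/sshd
tcp 0 0 10.0.0.3:58705 128.174.252.100:22 ESTABLISHED 13761/ssh
tcp 0 0 slack.localnet:58705 egon.acm.uiuc.edu:ssh ESTABLISHED -
"""

KNOWN = {"sshd", "ssh"}  # hypothetical allowlist, adjust to your own system

for row in unexpected(tcp_owners(SAMPLE), KNOWN):
    print(row["state"], row["local"], row["foreign"], row["pid"], row["program"])
```

Running the real command as root and feeding its output to tcp_owners avoids the "-" entries, since root can see the owner of every socket.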
lsof is a program that lists all files opened by the processes running on a system. An open file may be a regular file, a directory, a block special file, a character special file, an executing text reference, a library, a stream or a network file (Internet socket, NFS file or UNIX domain socket). It has plenty of options, but in its default mode it gives an extensive listing of the open files. lsof does not come installed by default with most flavors of Linux/UNIX, so you may need to install it yourself. On some distributions lsof installs in /usr/sbin, which by default is not in your path, so you will have to add it. An example output would be:
    COMMAND     PID USER    FD TYPE     DEVICE    SIZE    NODE NAME
    bash        101 nasko  cwd DIR         3,2    4096 1172699 /home/nasko
    bash        101 nasko  rtd DIR         3,2    4096       2 /
    bash        101 nasko  txt REG         3,2  518140 1204132 /bin/bash
    bash        101 nasko  mem REG         3,2  432647  748736 /lib/ld-2.2.3.so
    bash        101 nasko  mem REG         3,2   14831 1399832 /lib/libtermcap.so.2.0.8
    bash        101 nasko  mem REG         3,2   72701  748743 /lib/libdl-2.2.3.so
    bash        101 nasko  mem REG         3,2 4783716  748741 /lib/libc-2.2.3.so
    bash        101 nasko  mem REG         3,2  249120  748742 /lib/libnss_compat-2.2.3.so
    bash        101 nasko  mem REG         3,2  357644  748746 /lib/libnsl-2.2.3.so
    bash        101 nasko   0u CHR         4,5          260596 /dev/tty5
    bash        101 nasko   1u CHR         4,5          260596 /dev/tty5
    bash        101 nasko   2u CHR         4,5          260596 /dev/tty5
    bash        101 nasko 255u CHR         4,5          260596 /dev/tty5
    screen      379 nasko  cwd DIR         3,2    4096 1172699 /home/nasko
    screen      379 nasko  rtd DIR         3,2    4096       2 /
    screen      379 nasko  txt REG         3,2  250336  358394 /usr/bin/screen-3.9.9
    screen      379 nasko  mem REG         3,2  432647  748736 /lib/ld-2.2.3.so
    screen      379 nasko  mem REG         3,2  357644  748746 /lib/libnsl-2.2.3.so
    screen      379 nasko   0r CHR         1,3          260468 /dev/null
    screen      379 nasko   1w CHR         1,3          260468 /dev/null
    screen      379 nasko   2w CHR         1,3          260468 /dev/null
    screen      379 nasko   3r FIFO        3,2         1334324 /home/nasko/.screen/379.pts-6.slack
    startx      729 nasko  cwd DIR         3,2    4096 1172699 /home/nasko
    startx      729 nasko  rtd DIR         3,2    4096       2 /
    startx      729 nasko  txt REG         3,2  518140 1204132 /bin/bash
    ksmserver   794 nasko   3u unix 0xc8d36580          346900 socket
    ksmserver   794 nasko   4r FIFO        0,6          346902 pipe
    ksmserver   794 nasko   5w FIFO        0,6          346902 pipe
    ksmserver   794 nasko   6u unix 0xd4c83200          346903 socket
    ksmserver   794 nasko   7u unix 0xd4c83540          346905 /tmp/.ICE-unix/794
    mozilla-b  5594 nasko 144u sock        0,0          639105 can't identify protocol
    mozilla-b  5594 nasko 146u unix 0xd18ec3e0          639134 socket
    mozilla-b  5594 nasko 147u sock        0,0          639135 can't identify protocol
    mozilla-b  5594 nasko 150u unix 0xd18ed420          639151 socket

Here is a brief explanation of some of the abbreviations lsof uses in its output:
    cwd   current working directory
    mem   memory-mapped file
    pd    parent directory
    rtd   root directory
    txt   program text (code and data)
    CHR   for a character special file
    sock  for a socket of unknown domain
    unix  for a UNIX domain socket
    DIR   for a directory
    FIFO  for a FIFO special file
lsof is a pretty handy tool when it comes to investigating program behavior. It reveals plenty of information about what the process is doing under the surface.
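When lsof is not installed, the numbered descriptors it reports for a single process (the 0u, 1u, 3r style entries) can be approximated on Linux by reading the fd directory under /proc. The sketch below is not a replacement for lsof and makes no claim about how lsof itself works; open_files and its proc_root parameter are names invented for this example, and it only applies to processes whose fd directory you are allowed to read.

```python
import os


def open_files(pid="self", proc_root="/proc"):
    """Map each open file descriptor number of a process to its target.

    Every entry in <proc_root>/<pid>/fd is a symbolic link named after the
    descriptor number; its target is a path, or a label such as
    "socket:[...]" or "pipe:[...]" for descriptors without a path.
    """
    fd_dir = os.path.join(proc_root, str(pid), "fd")
    result = {}
    for name in os.listdir(fd_dir):
        try:
            result[int(name)] = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
    return result


if __name__ == "__main__":
    for fd, target in sorted(open_files().items()):
        print(f"{fd:>4}  {target}")
```

Passing a pid, for example open_files(379), lists the descriptors of that process instead of the script's own.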