"Evil does seek to maintain power by suppressing the truth." "Or by misleading the innocent." | |
Spock and McCoy, "And The Children Shall Lead", stardate 5029.5.
The fancy output format of "The address of main" was chosen for a reason: it is valid input for /bin/sh. Let's see whether the program from "In the language of mortals" has main at the same offset.
Command: src/magic_elf/ndisasm.sh
#!/bin/sh
. ${OUT}/magic_elf/addr_of_main
ndisasm -e ${ofs} -o ${main} -U ${TMP}/magic_elf/magic_elf \
| sed -e '/ret/q'
Output: out/redhat-linux-i386/magic_elf/ndisasm
08048460 55 push ebp
08048461 89E5 mov ebp,esp
08048463 83EC0C sub esp,byte +0xc
08048466 6A03 push byte +0x3
08048468 6801800408 push dword 0x8048001
0804846D 6A01 push byte +0x1
0804846F E8A4FEFFFF call 0x8048318
08048474 31C0 xor eax,eax
08048476 89EC mov esp,ebp
08048478 5D pop ebp
08048479 C3 ret
Both programs have main at the same file offset. Unfortunately a brief look through /bin proves this to be pure chance.
Instead of a real system call for write we see a call to a strange negative address (check the opcode). ndisasm(1) resolves this to the absolute address 0x8048318, which stands for a location in glibc. However, during development I encountered a configuration of my system where ndisasm(1) failed to do so. The rest of the story is still interesting, though. Here is yet another way to do it.
Command: src/magic_elf/gdb.sh
#!/bin/sh
file=${1:-${TMP}/magic_elf/magic_elf}
func=${2:-main}
gdb ${file} -q <<EOT | sed -n -e '/:/p' -e '/ret *$/q' -e '/hlt *$/q'
set disassembly-flavor intel
disassemble ${func}
EOT
Output: out/redhat-linux-i386/magic_elf/gdb
(gdb) (gdb) Dump of assembler code for function main:
0x8048460 <main>: push ebp
0x8048461 <main+1>: mov ebp,esp
0x8048463 <main+3>: sub esp,0xc
0x8048466 <main+6>: push 0x3
0x8048468 <main+8>: push 0x8048001
0x804846d <main+13>: push 0x1
0x804846f <main+15>: call 0x8048318 <write>
0x8048474 <main+20>: xor eax,eax
0x8048476 <main+22>: mov esp,ebp
0x8048478 <main+24>: pop ebp
0x8048479 <main+25>: ret
That strange negative address resolves to a function in a shared library. Not shown is a pathetic attempt to single-step to the actual code of write.
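The "strange negative address" is the four-byte operand of opcode E8, a signed little-endian displacement that is added to the address of the instruction following the call. The arithmetic can be sketched as below. The helper name call_target is made up for illustration, and the sketch assumes a shell whose arithmetic is at least 64 bits wide.

```sh
#!/bin/sh
# Hypothetical helper, not part of the original scripts.
# $1 = address of the instruction following the call
# $2 = 32-bit displacement from the opcode, bytes already swapped
call_target() {
    next=$(( $1 ))
    rel=$(( $2 ))
    # Values with the top bit set are negative in two's complement.
    if [ "$rel" -ge 2147483648 ]; then
        rel=$(( rel - 4294967296 ))
    fi
    printf '0x%08x\n' $(( next + rel ))
}

# E8 A4FEFFFF at 0804846F: displacement 0xFFFFFEA4, next instruction 08048474
call_target 0x08048474 0xFFFFFEA4   # prints 0x08048318
```

The displacement 0xFFFFFEA4 is -348, so the call goes 348 bytes backwards from 0x08048474.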
We can now search for a fine manual explaining how to debug shared libraries. Or just compile the bugger static.
Command: src/magic_elf/cc_static.sh
#!/bin/sh
gcc ${CFLAGS} -static ${OUT}/${arch}/magic_elf/magic_elf.c \
-o ${TMP}/magic_elf/magic_elf_static \
&& ls -l ${TMP}/magic_elf \
&& ${TMP}/magic_elf/magic_elf_static
Output: out/redhat-linux-i386/magic_elf/magic_elf_static
total 1668
-rwxr-xr-x 1 alba anonymou 13711 Jun 30 00:06 magic_elf
-rwxr-xr-x 1 alba anonymou 1687693 Jun 30 00:06 magic_elf_static
ELF
Seems we found an easy way to fill up the hard disk. Anyway, what does gdb(1) have to say about it?
Output: out/redhat-linux-i386/magic_elf/static_main.gdb
(gdb) (gdb) Dump of assembler code for function main:
0x80481e0 <main>: push ebp
0x80481e1 <main+1>: mov ebp,esp
0x80481e3 <main+3>: sub esp,0xc
0x80481e6 <main+6>: push 0x3
0x80481e8 <main+8>: push 0x8048001
0x80481ed <main+13>: push 0x1
0x80481ef <main+15>: call 0x804cc60 <__libc_write>
0x80481f4 <main+20>: xor eax,eax
0x80481f6 <main+22>: mov esp,ebp
0x80481f8 <main+24>: pop ebp
0x80481f9 <main+25>: ret
The name of the function changed for no apparent reason. But it is reachable for disassembly now.
Output: out/redhat-linux-i386/magic_elf/static_write.gdb
(gdb) (gdb) Dump of assembler code for function __libc_write:
0x804cc60 <__libc_write>: push ebx
0x804cc61 <__libc_write+1>: mov edx,DWORD PTR [esp+16]
0x804cc65 <__libc_write+5>: mov ecx,DWORD PTR [esp+12]
0x804cc69 <__libc_write+9>: mov ebx,DWORD PTR [esp+8]
0x804cc6d <__libc_write+13>: mov eax,0x4
0x804cc72 <__libc_write+18>: int 0x80
0x804cc74 <__libc_write+20>: pop ebx
0x804cc75 <__libc_write+21>: cmp eax,0xfffff001
0x804cc7a <__libc_write+26>: jae 0x8052bb0 <__syscall_error>
0x804cc80 <__libc_write+32>: ret
There are two man pages giving some overview of system calls, intro(2) and syscalls(2). The statement mov eax,4 corresponds to the value of __NR_write in /usr/include/asm/unistd.h.
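The number can also be looked up mechanically. The following is a minimal sketch with a made-up helper name, syscall_nr. It assumes the header defines the constant on a single line of the form "#define __NR_write 4"; on other systems the i386 table may live in a different file than the one named in the text.

```sh
#!/bin/sh
# Hypothetical helper: look up "#define __NR_<name> <number>" in a header.
# $1 = system call name, $2 = header file to search
syscall_nr() {
    awk -v sym="__NR_$1" '$1 == "#define" && $2 == sym { print $3; exit }' "$2"
}

# Header path as given in the text; it may differ on other systems.
hdr=${1:-/usr/include/asm/unistd.h}
if [ -r "$hdr" ]; then
    syscall_nr write "$hdr"
fi
```

Comparing the symbol exactly, instead of using a pattern match, keeps __NR_writev from being mistaken for __NR_write.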
The code generated by gcc(1) is not suitable for a virus. So here comes hand-crafted code optimized for size (twenty-three is the perfect number of bytes [1]). I prefer nasm [2] to GNU as.
Source: src/evil_magic/evil_magic.asm
global _start
_start: push byte 4
pop eax ; eax = 4 = write(2)
xor ebx,ebx
inc ebx ; ebx = 1 = stdout
mov ecx,0x08048001 ; ecx = magic address
push byte 3
pop edx ; edx = 3 = three characters
int 0x80
xor eax,eax
inc eax ; eax = 1 = exit(2)
xor ebx,ebx ; ebx = 0 = return code
int 0x80
Command: src/evil_magic/nasm.sh
#!/bin/sh
nasm -f elf -o ${TMP}/evil_magic/nasm.o \
src/evil_magic/evil_magic.asm \
&& ld -o ${TMP}/evil_magic/nasm ${TMP}/evil_magic/nasm.o \
&& ${TMP}/evil_magic/nasm
Output: out/redhat-linux-i386/evil_magic/nasm
ELF
Output is good. But how do we get the resulting machine code? We can't just add a call to printf(3) to the assembly code. The above example is not linked with glibc; it does not even have a function called main.
On the other hand, things became a lot easier. There is no initialization code that gets executed before _start, so the address of _start really is the ELF entry point of the executable. A look into /usr/include/elf.h shows that Elf32_Ehdr::e_entry is at file offset 24.
Command: src/evil_magic/od.sh
#!/bin/sh
od -j24 -An -tx4 -N4 ${TMP}/evil_magic/nasm \
| sed 's/^[[:space:]]/0x/'
Output: out/redhat-linux-i386/evil_magic/od
0x08048080
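Note that od -tx4 interprets the four bytes in the byte order of the machine it runs on. That is fine here, since the file is built and examined on the same i386 system. A sketch that reads the bytes one at a time, and so does not depend on the host, could look like this. The helper name elf32_entry is made up, and the file is assumed to be a 32-bit little-endian ELF.

```sh
#!/bin/sh
# Hypothetical helper: print e_entry of a 32-bit little-endian ELF file.
# e_ident (16 bytes) + e_type (2) + e_machine (2) + e_version (4) = offset 24
elf32_entry() {
    # Split the four bytes into $1..$4, lowest file offset first.
    set -- $(od -An -tx1 -j24 -N4 "$1")
    # Little-endian: the byte at the highest offset is the most significant.
    echo "0x$4$3$2$1"
}

if [ -n "${1:-}" ]; then
    elf32_entry "$1"
fi
```

The comment in the function also spells out where the offset 24 comes from: it is the sum of the sizes of the fields that precede e_entry.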
The entry point is specified as a virtual address in memory. By subtracting the base address we get the file offset:
0x8048080 - 0x8048000 = 0x80 = 128
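The same subtraction as a shell sketch. The helper name vaddr_to_offset is made up. It assumes that the segment holding the address is mapped from file offset 0 at base address 0x08048000, as is the case in this example; for other segments the file offset of the segment would have to be added.

```sh
#!/bin/sh
# Hypothetical helper: convert a virtual address to a file offset.
# Assumes the segment is mapped from file offset 0.
# $1 = virtual address, $2 = base address (defaults to 0x08048000)
vaddr_to_offset() {
    echo $(( $1 - ${2:-0x08048000} ))
}

vaddr_to_offset 0x08048080   # prints 128
```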
Command: out/redhat-linux-i386/evil_magic/ndisasm.sh
#!/bin/sh
ndisasm -e 128 -o 0x08048080 -U tmp/redhat-linux-i386/evil_magic/nasm | head -12
Output: out/redhat-linux-i386/evil_magic/evil_magic.asm
08048080 6A04 push byte +0x4
08048082 58 pop eax
08048083 31DB xor ebx,ebx
08048085 43 inc ebx
08048086 B901800408 mov ecx,0x8048001
0804808B 6A03 push byte +0x3
0804808D 5A pop edx
0804808E CD80 int 0x80
08048090 31C0 xor eax,eax
08048092 40 inc eax
08048093 31DB xor ebx,ebx
08048095 CD80 int 0x80
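The size of the code can be checked against the listing. A minimal sketch with a made-up helper name, code_size, which sums the lengths of the hex dump column; it assumes the three-column layout printed by ndisasm above.

```sh
#!/bin/sh
# Hypothetical helper: sum the instruction lengths of an ndisasm listing.
# Column 2 of each line is the hex dump, two hex digits per byte.
code_size() {
    awk '{ total += length($2) / 2 } END { print total }'
}

code_size <<'EOT'
08048080 6A04 push byte +0x4
08048082 58 pop eax
08048083 31DB xor ebx,ebx
08048085 43 inc ebx
08048086 B901800408 mov ecx,0x8048001
0804808B 6A03 push byte +0x3
0804808D 5A pop edx
0804808E CD80 int 0x80
08048090 31C0 xor eax,eax
08048092 40 inc eax
08048093 31DB xor ebx,ebx
08048095 CD80 int 0x80
EOT
```

For the twelve lines above this prints 23.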
One thing is still left: dressing up the hex dump as C source. A small filter written in perl(1) would do. However, because this tool is used throughout the document, it provides quite a few features.
The __attribute__ clause is explained in "A section called .text". It is not required at this point.
Initializing the array with string literals (looking like \xDE\xAD\xBE\xEF) would be easier. The terminating zero of a string literal would not work with "Doing it in C", however. Using a list of hexadecimal numbers instead introduces separating commas, which require special treatment of the last line.
If the command line option -last_line_is_ofs is passed to the program, then the last line of the disassembly is meant to specify an offset into the code. Actually, only the last byte of that line is used. You are free to use any dummy operation, like push byte 1. See Target::infection for an example.
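The two transformations described above can be tried in isolation. The following shell sketch is not the Perl filter itself. Both helper names are made up, and it handles a single hex dump rather than a whole listing.

```sh
#!/bin/sh
# Hypothetical helpers mirroring two steps of the filter.

# Turn a hex dump like B901800408 into 0xB9,0x01,0x80,0x04,0x08.
# Every byte gets a trailing comma first; the last comma is then removed.
hex_to_c_list() {
    printf '%s\n' "$1" | sed -e 's/../0x&,/g' -e 's/,$//'
}

# Take the last byte of a hex dump as an offset, e.g. 6A01 -> 0x1.
last_byte_ofs() {
    byte=$(printf '%s\n' "$1" | sed -e 's/.*\(..\)$/\1/')
    printf '0x%x\n' "0x$byte"
}

hex_to_c_list B901800408   # prints 0xB9,0x01,0x80,0x04,0x08
last_byte_ofs 6A01         # push byte 1, prints 0x1
```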
Source: src/evil_magic/ndisasm.pl
#!/usr/bin/perl -sw
use strict;
my $LINE = " %-30s /* %-30s */\n";
$::identfier = 'main' if (!defined($::identfier));
$::size = '' if (!defined($::size));
$::align = '8' if (!defined($::align));
printf "const unsigned char %s[%s]\n", $::identfier, $::size;
print "__attribute__ (( aligned($::align), section(\".text\") )) =\n";
print "{\n";
my @line;
while(<>)
{
s/\s+$//;
my $code = (split())[1];
my $dump = '0x' . substr($code, 0, 2);
for(my $i = 2; $i < length($code); $i += 2)
{
$dump .= ',0x' . substr($code, $i, 2);
}
s/\s+[^\s]*\s+/: /;
push @line, [ $_, $code, $dump ]
}
my $nr = 0;
my $max = $#line;
$max -= 1 if (defined($::last_line_is_ofs));
while($nr < $max)
{
printf $LINE, $line[$nr][2] . ',', $line[$nr][0];
$nr++;
}
printf($LINE . "};\n", $line[$nr][2], $line[$nr][0]);
if (defined($::last_line_is_ofs))
{
my $ofs = substr($line[$nr + 1][1], -2, 2);
printf "enum { ENTRY_POINT_OFS = 0x%x };\n", hex($ofs);
}
Output: out/redhat-linux-i386/evil_magic/evil_magic.c
const unsigned char main[]
__attribute__ (( aligned(8), section(".text") )) =
{
0x6A,0x04, /* 08048080: push byte +0x4 */
0x58, /* 08048082: pop eax */
0x31,0xDB, /* 08048083: xor ebx,ebx */
0x43, /* 08048085: inc ebx */
0xB9,0x01,0x80,0x04,0x08, /* 08048086: mov ecx,0x8048001 */
0x6A,0x03, /* 0804808B: push byte +0x3 */
0x5A, /* 0804808D: pop edx */
0xCD,0x80, /* 0804808E: int 0x80 */
0x31,0xC0, /* 08048090: xor eax,eax */
0x40, /* 08048092: inc eax */
0x31,0xDB, /* 08048093: xor ebx,ebx */
0xCD,0x80 /* 08048095: int 0x80 */
};
Calling the constant array main is not a mistake. The above output is a complete and valid C program.
Command: src/evil_magic/cc.sh
#!/bin/sh
gcc -Wall -O2 ${OUT}/evil_magic/evil_magic.c \
-o ${TMP}/evil_magic/cc \
&& ${TMP}/evil_magic/cc
Output: out/redhat-linux-i386/evil_magic/cc
out/redhat-linux-i386/evil_magic/evil_magic.c:2: warning: `main' is usually a function
ELF
[1]
[2]