UNIX ELF PARASITES AND VIRUS - Silvio Cesare - October 1998 INTRODUCTION This paper documents the algorithms and implementation of UNIX parasite and virus code using ELF objects. Brief introductions on UNIX virus detection and evading such detection are given. An implementation of the ELF parasite infector for UNIX is provided, and an ELF virus for Linux on x86 architecture is also supplied. Elementary programming and UNIX knoledge is assumed, and an understanding of Linux x86 archtitecture is assumed for the Linux implementation. ELF understanding is not required but may be of help. The ELF infection method uses is based on utilizing the page padding on the end of the text segment which provides suitable hosting for parasite code. This paper does not document any significant virus programming techniques except those that are only applicable to the UNIX environment. Nor does it try to replicate the ELF specifications. The interested reader is advised to read the ELF documentation if this paper is unclear in ELF specifics. ELF INFECTION A process image consists of a 'text segment' and a 'data segment'. The text segment is given the memory protection r-x (from this its obvious that self modifying code cannot be used in the text segment). The data segment is given the protection rw-. The segment as seen from the process image is typically not all in use as memory used by the process rarely lies on a page border (or we can say, not congruent to modulo the page size). Padding completes the segment, and in practice looks like this. key: [...] A complete page M Memory used in this segment P Padding Page Nr #1 [PPPPMMMMMMMMMMMM] \ #2 [MMMMMMMMMMMMMMMM] |- A segment #3 [MMMMMMMMMMMMPPPP] / Segments are not bound to use multiple pages, so a single page segment is quite possible. Page Nr #1 [PPPPMMMMMMMMPPPP] <- A segment Typically, the data segment directly proceeds the text segment which always starts on a page, but the data segment may not. The memory layout for a process image is thus. key: [...] A complete page T Text D Data P Padding Page Nr #1 [TTTTTTTTTTTTTTTT] <- Part of the text segment #2 [TTTTTTTTTTTTTTTT] <- Part of the text segment #3 [TTTTTTTTTTTTPPPP] <- Part of the text segment #4 [PPPPDDDDDDDDDDDD] <- Part of the data segment #5 [DDDDDDDDDDDDDDDD] <- Part of the data segment #6 [DDDDDDDDDDDDPPPP] <- Part of the data segment pages 1, 2, 3 constitute the text segment pages 4, 5, 6 constitute the data segment From here on, the segment diagrams may use single pages for simplicity. eg Page Nr #1 [TTTTTTTTTTTTPPPP] <- The text segment #2 [PPPPDDDDDDDDPPPP] <- The data segment For completeness, on x86, the stack segment is located after the data segment giving the data segment enough room for growth. Thus the stack is located at the top of memory (remembering that it grows down). In an ELF file, loadable segments are present physically in the file, which completely describe the text and data segments for process image loading. A simplified ELF format for an executable object relevant in this instance is. ELF Header . . Segment 1 <- Text Segment 2 <- Data . . Each segment has a virtual address associated with its starting location. Absolute code that references within each segment is permissible and very probable. To insert parasite code means that the process image must load it so that the original code and data is still intact. This means, that inserting a parasite requires the memory used in the segments to be increased. The text segment compromises not only code, but also the ELF headers including such things as dynamic linking information. If the parasite code is to be inserted by extending the text segment backwards and using this extra memory, problems can arise because these ELF headers may have to move in memory and thus cause problems with absolute referencing. It may be possible to keep the text segment as is, and create another segment consisting of the parasite code, however introducing an extra segment is certainly questionable and easy to detect. Extending the text segment forward or extending the data segment backward will probably overlap the segments. Relocating a segment in memory will cause problems with any code that absolutely references memory. It may be possible to extend the data segment, however this isn't preferred, as its not UNIX portable that properly implement execute memory protection. Page padding at segment borders however provides a practical location for parasite code given that its size is able. This space will not interfere with the original segments, requiring no relocation. Following the guidline just given of preferencing the text segment, we can see that the padding at the end of the text segment is a viable solution. The resulting segments after parasite insertion into text segment padding looks like this. key: [...] A complete page V Parasite code T Text D Data P Padding Page Nr #1 [TTTTTTTTTTTTVVPP] <- Text segment #2 [PPPPDDDDDDDDPPPP] <- Data segment ... A more complete ELF executable layout is (ignoring section content - see below). ELF Header Program header table Segment 1 Segment 2 Section header table optional In practice, this is what is normally seen. ELF Header Program header table Segment 1 Segment 2 Section header table Section 1 . . Section n Typically, the extra sections (those not associated with a segment) are such things as debugging information, symbol tables etc. From the ELF specifications: "An ELF header resides at the beginning and holds a ``road map'' describing the file's organization. Sections hold the bulk of object file information for the linking view: instructions, data, symbol table, relocation information, and so on. ... ... A program header table, if present, tells the system how to create a process image. Files used to build a process image (execute a program) must have a program header table; relocatable files do not need one. A section header table contains information describing the file's sections. Every section has an entry in the table; each entry gives information such as the section name, the section size, etc. Files used during linking must have a section header table; other object files may or may not have one. ... ... Executable and shared object files statically represent programs. To execute such programs, the system uses the files to create dynamic program representations, or process images. A process image has segments that hold its text, data, stack, and so on. The major sections in this part discuss the following. Program header. This section complements Part 1, describing object file structures that relate directly to program execution. The primary data structure, a program header table, locates segment images within the file and contains other information necessary to create the memory image for the program." After insertion of parasite code, the layout of the ELF file will look like this. ELF Header Program header table Segment 1 - The text segment of the host - The parasite Segment 2 Section header table Section 1 . . Section n Thus the parasite code must be physically inserted into the file, and the text segment extended to see the new code. An ELF object may also specify an entry point of the program, that is, the virtual memory location that assumes control of the program. Thus to activate parasite code, the program flow must include the new parasite. This can be done by patching the entry point in the ELF object to point (jump) directly to the parasite. It is then the parasite's responsibility that the host code be executed - typically, by transferring control back to the host once the parasite has completed its execution. From /usr/include/elf.h typedef struct { unsigned char e_ident[EI_NIDENT]; /* Magic number and other info */ Elf32_Half e_type; /* Object file type */ Elf32_Half e_machine; /* Architecture */ Elf32_Word e_version; /* Object file version */ Elf32_Addr e_entry; /* Entry point virtual address */ Elf32_Off e_phoff; /* Program header table file offset */ Elf32_Off e_shoff; /* Section header table file offset */ Elf32_Word e_flags; /* Processor-specific flags */ Elf32_Half e_ehsize; /* ELF header size in bytes */ Elf32_Half e_phentsize; /* Program header table entry size */ Elf32_Half e_phnum; /* Program header table entry count */ Elf32_Half e_shentsize; /* Section header table entry size */ Elf32_Half e_shnum; /* Section header table entry count */ Elf32_Half e_shstrndx; /* Section header string table index */ } Elf32_Ehdr; e_entry is the entry point of the program given as a virtual address. For knowledge of the memory layout of the process image and the segments that compromise it stored in the ELF object see the Program Header information below. e_phoff gives use the file offset for the start of the program header table. Thus to read the header table (and the associated loadable segments), you may lseek to that position and read e_phnum*sizeof(Elf32_Pdr) bytes associated with the program header table. It can also be seen, that the section header table file offset is also given. It was previously mentioned that the section table resides at the end of the file, so after inserting of data at the end of the segment on file, the offset must be updated to reflect the new position. /* Program segment header. */ typedef struct { Elf32_Word p_type; /* Segment type */ Elf32_Off p_offset; /* Segment file offset */ Elf32_Addr p_vaddr; /* Segment virtual address */ Elf32_Addr p_paddr; /* Segment physical address */ Elf32_Word p_filesz; /* Segment size in file */ Elf32_Word p_memsz; /* Segment size in memory */ Elf32_Word p_flags; /* Segment flags */ Elf32_Word p_align; /* Segment alignment */ } Elf32_Phdr; Loadable program segments (text/data) are identified in a program header by a p_type of PT_LOAD (1). Again as with the e_shoff in the ELF header, the file offset (p_offset) must be updated in later phdr's to reflect their new position in the file. p_vaddr identifies the virtual address of the start of the segment. As mentioned above regarding the entry point. It is now possible to identify where program flow begins, by using p_vaddr as the base index and calculating the offset to e_entry. p_filesz and p_memsz are the file sizes and memory sizes respectively that the segment occupies. The use of this scheme of using file and memory sizes, is that where its not necessary to load memory in the process from disk, you may still be able to say that you want the process image to occupy its memory. The .bss section (see below for section definitions), which is for uninitialized data in the data segment is one such case. It is not desirable that uninitialized data be stored in the file, but the process image must allocated enough memory. The .bss section resides at the end of the segment and any memory size past the end of the file size is assumed to be part of this section. /* Section header. */ typedef struct { Elf32_Word sh_name; /* Section name (string tbl index) */ Elf32_Word sh_type; /* Section type */ Elf32_Word sh_flags; /* Section flags */ Elf32_Addr sh_addr; /* Section virtual addr at execution */ Elf32_Off sh_offset; /* Section file offset */ Elf32_Word sh_size; /* Section size in bytes */ Elf32_Word sh_link; /* Link to another section */ Elf32_Word sh_info; /* Additional section information */ Elf32_Word sh_addralign; /* Section alignment */ Elf32_Word sh_entsize; /* Entry size if section holds table */ } Elf32_Shdr; The sh_offset is the file offset that points to the actual section. The shdr should correlate to the segment its located it. It is highly suspicious if the vaddr of the section is different to what is in from the segments view. To insert code at the end of the text segment thus leaves us with the following to do so far. * Increase p_shoff to account for the new code in the ELF header * Locate the text segment program header * Increase p_filesz to account for the new code * Increase p_memsz to account for the new code * For each phdr who's segment is after the insertion (text segment) * increase p_offset to reflect the new position after insertion * For each shdr who's section resides after the insertion * Increase sh_offset to account for the new code * Physically insert the new code into the file - text segment p_offset + p_filesz (original) There is one hitch however. Following the ELF specifications, p_vaddr and p_offset in the Phdr must be congruent together, to modulo the page size. key: ~= is denoting congruency. p_vaddr (mod PAGE_SIZE) ~= p_offset (mod PAGE_SIZE) This means, that any insertion of data at the end of the text segment on the file must be congruent modulo the page size. This does not mean, the text segment must be increased by such a number, only that the physical file be increased so. This also has an interesting side effect in that often a complete page must be used as padding because the required vaddr isn't available. The following may thus happen. key: [...] A complete page T Text D Data P Padding Page Nr #1 [TTTTTTTTTTTTPPPP] <- Text segment #2 [PPPPPPPPPPPPPPPP] <- Padding #3 [PPPPDDDDDDDDPPPP] <- Data segment This can be taken advantage off in that it gives the parasite code more space, such a spare page cannot be guaranteed. To take into account of the congruency of p_vaddr and p_offset, our algorithm is modified to appear as this. * Increase p_shoff by PAGE_SIZE in the ELF header * Locate the text segment program header * Increase p_filesz by account for the new code * Increase p_memsz to account for the new code * For each phdr who's segment is after the insertion (text segment) * increase p_offset by PAGE_SIZE * For each shdr who's section resides after the insertion * Increase sh_offset by PAGE_SIZE * Physically insert the new code and pad to PAGE_SIZE, into the file - text segment p_offset + p_filesz (original) Now that the process image loads the new code into being, to run the new code before the host code is a simple matter of patching the ELF entry point and the virus jump to host code point. The new entry point is determined by the text segment v_addr + p_filesz (original) since all that is being done, is the new code is directly prepending the original host segment. For complete infection code then. * Increase p_shoff by PAGE_SIZE in the ELF header * Patch the insertion code (parasite) to jump to the entry point (original) * Locate the text segment program header * Modify the entry point of the ELF header to point to the new code (p_vaddr + p_filesz) * Increase p_filesz by account for the new code (parasite) * Increase p_memsz to account for the new code (parasite) * For each phdr who's segment is after the insertion (text segment) * increase p_offset by PAGE_SIZE * For each shdr who's section resides after the insertion * Increase sh_offset by PAGE_SIZE * Physically insert the new code (parasite) and pad to PAGE_SIZE, into the file - text segment p_offset + p_filesz (original) This, while perfectly functional, can arouse suspicion because the the new code at the end of the text segment isn't accounted for by any sections. Its an easy matter to associate the entry point with a section however by extending its size, but the last section in the text segment is going to look suspicious. Associating the new code to a section must be done however as programs such as 'strip' use the section header tables and not the program headers. The final algorithm is using this information is. * Increase p_shoff by PAGE_SIZE in the ELF header * Patch the insertion code (parasite) to jump to the entry point (original) * Locate the text segment program header * Modify the entry point of the ELF header to point to the new code (p_vaddr + p_filesz) * Increase p_filesz by account for the new code (parasite) * Increase p_memsz to account for the new code (parasite) * For each phdr who's segment is after the insertion (text segment) * increase p_offset by PAGE_SIZE * For the last shdr in the text segment * increase sh_len by the parasite length * For each shdr who's section resides after the insertion * Increase sh_offset by PAGE_SIZE * Physically insert the new code (parasite) and pad to PAGE_SIZE, into the file - text segment p_offset + p_filesz (original) infect-elf-p is the supplied program (complete with source) that implements the elf infection using text segment padding as described. INFECTING INFECTIONS In the parasite described, infecting infections isn't a problem at all. By skipping executables that don't have enough padding for the parasite, this is solved implicitly. Multiple parasites may exist in the host, but their is a limit of how many depending on the size of the parasite code. NON (NOT AS) TRIVIAL PARASITE CODE Parasite code that requires memory access requires the stack to be used manually naturally. No bss section can be used from within the virus code, because it can only use part of the text segment. It is strongly suggested that rodata not be used, in-fact, it is strongly suggested that no location specific data be used at all that resides outside the parasite at infection time. Thus, if initialized data is to be used, it is best to place it in the text segment, ie at the end of the parasite code - see below on calculating address locations of initialized data that is not known at compile/infection time. If the heap is to be used, then it will be operating system dependent. In Linux, this is done via the 'brk' syscall. The use of any shared library calls from within the parasite should be removed, to avoid any linking problems and to maintain a portable parasite in files that use varying libraries. It is thus naturally recommended to avoid using libc. Most importantly, the parasite code must be relocatable. It is possible to patch the parasite code before inserting it, however the cleanest approach is to write code that doesn't need to be patched. In x86 Linux, some syscalls require the use of an absolute address pointing to initialized data. This can be made relocatable by using a common trick used in buffer overflow code. jmp A B: pop %eax ; %eax now has the address of the string . ; continue as usual . . A: call B .string \"hello\" By making a call directly proceeding the string of interest, the address of the string is pushed onto the stack as the return address. BEYOND ELF PARASITES AND ENTER VIRUS IN UNIX In a UNIX environment the most probably method for a typical garden variety virus to spread is through infecting files that it has legal permission to do so. A simple method of locating new files possible to infect, is by scanning the current directory for writable files. This has the advantage of being relatively fast (in comparison to large tree walks) but finds only a small percentage of infect-able files. Directory searches are however very slow irrespectively, even without large tree walks. If parasite code does not fork, its very quickly noticed what is happening. In the sample virus supplied, only a small random set of files in the current directory are searched. Forking, as mentioned, easily solves the problem of slowing the startup to the host code, however new processes on the system can be spotted as abnormal if careful observation is used. The parasite code as mentioned, must be completely written in machine code, this does not however mean that development must be done like this. Development can easily be done in a high level language such as C and then compiled to asm to be used as parasite code. A bootstrap process can be used for initial infection of the virus into a host program that can then be distributed. That is, the ELF infector code is used, with the virus as the parasite code to be inserted. THE LINUX PARASITE VIRUS This virus implements the ELF infection described by utilizing the padding at the end of the text segment. In this padding, the virus in its entirety is copied, and the appropriate entry points patched. At the end of the parasite code, are the instructions. movl %ebp, $XXXX jmp *%ebp XXXX is patched when the virus replicates to the host entry point. This approach does have the side effect of trashing the ebp register which may or may not be destructive to programs who's entry points depend on ebp being set on entry. In practice, I have not seen this happen (the implemented Linux virus uses the ebp approach), but extensive replicating has not been performed. On execution of an infected host, the virus will copy the parasite (virus) code contained in itself (the file) into memory. The virus will then scan randomly (random enough for this instance) through the current directory, looking for ELF files of type ET_EXEC or ET_DYN to infect. It will infect up to Y_INFECT files, and scan up to N_INFECT files in total. If a file can be infected, ie, its of the correct ELF type, and the padding can sustain the virus, a a modified copy of the file incorporating the virus is made. It then renames the copy to the file its infecting, and thus it is infected. Due to the rather large size of the virus in comparison to the page size (approx 2.3k) not all files are able to be infected, in fact only near half on average. DEVELOPMENT OF THE LINUX VIRUS The Linux virus was completely written in C, and strongly based around the ELF infector code. The C code is supplied as elf-p-virus.c The code requires the use of no libraries, and avoids libc by using a similar scheme to the _syscall declarations Linux employs modified not to use errno. Heap memory was used for dynamic allocation of the phdr and shdr tables using 'brk'. Linux has some syscalls which require the address of initialized strings to be passed to it, notably, open, rename, and unlink. This requires initialized data storage. As stated before, rodata cannot be used, so this data was placed at the end of the code. Making it relocatable required the use of the above mentioned algorithm of using call to push the address (return value) onto the stack. To assist in the asm conversion, extra variables were declared so to leave room on the stack to store the addresses as in some cases the address was used more than once. The C code form of the virus allowed for a debugging version which produces verbose output, and allows argv[0] to be given as argv[1]. This is advantageous because you can setup a pseudo infected host which is non replicating. Then run the virus making argv[0] the name of the pseudo infected host. It would replicate the parasite from that host. Thus it was possible to test without having a binary version of a replicating virus. The C code was converted to asm using the c compiler gcc, with the -S flag to produce assembler. Modifications were made so that use of rodata for initialized data (strings for open, unlink, and rename), was replaced with the relocatable data using the call address methodology. Most of the registers were saved on virus startup and restored on exit (transference of control to host). The asm version of the virus, can be improved tremendously in regards to efficiency, which will in turn improve the expected life time and replication of the virus (a smaller virus can infect more objects, where previously the padding would dictate the larger virus couldn't infect it). The asm virus was written with development time the primary concern and hence almost zero time was spent on hand optimization of the code gcc generated from the C version. In actual fact, less than 5 minutes were spent in asm editing - this is indicative that extensive asm specific skills are not required for a non optmised virus. The edited asm code was compiled (elf-p-virus-egg.c), and then using objdump with the -D flag, the addresses of the parasite start, the required offsets for patching were recorded. The asm was then edited again using the new information. The executeable produced was then patched manually for any bytes needed. elf-text2egg was used to extract hex-codes for the complete length of the parasite code usable in a C program, ala the ELF infector code. The ELF infector was then recompiled using the virus parasite. # objdump -D elf-p-virus-egg . . 08048143