Chapter 3. Analysis

Table of Contents

Introduction
The Tools
IDA Pro
FLAIR
Rpat UNIX Libraries Preprocessor for IDA Pro
Analysis
Fun with strings
Generating Libc5 signatures for IDA
Working with IDA
Some hints
Socket Functions
Example: Identifying a Libc Function
Example: Identifying a Global Libc Variable
Results
the-binary Disassembly
the-binary IDA Database
the-binary Server
the-binary Traffic Decoder
the-binary Client

Reverse engineering undocumented, untrusted and possibly hostile code is a demanding task. The security community needs more knowledge of the tools and techniques for reverse engineering, and I hope that this challenge will be a good way to introduce more people to this skill set.

Before I start my analysis, I would like to point out a great resource for reverse engineers. Ironically, most of the publicly available information about reverse engineering is found on sites dedicated to software piracy and cracking software protection schemes. For years the best resource for crackers was Fravia's website. While most of the articles on the site deal with cracking registration codes for shareware programs, its collection of tutorials and well-documented techniques for reverse engineering makes it invaluable for the aspiring reverse engineer.

The two main approaches to program analysis are the active approach and the passive approach. The active approach involves running the program and monitoring its interaction with the environment. An excellent tool for doing this under Linux is Fenris by Michal Zalewski. The passive approach involves disassembling the program and figuring out the entire program logic before running it. This approach is slower, but the analysis is more thorough. Which approach you choose depends on personal preference and the nature of the program. Since the Honeynet binary is a complex program and we have absolutely no idea what it does, I decided to go with the passive approach.

The leading commercially available disassembler is IDA Pro. Unfortunately none of the open source tools come even close to its functionality (but the Bastard Disassembly Environment project is worth keeping an eye on). IDA Pro supports many different CPU architectures and file formats. Its analysis engine identifies subroutines, local and global variables, Linux system calls, and, via fingerprinting, library routines. This last feature is very important for the analysis of the Honeynet binary.

An evaluation version of IDA Pro is available for download on the Datarescue website. It supports x86 and ELF files. If you are new to IDA, read the Getting Started Manual and Gij's IDA Tutorial. A lot of information about IDA is also available at Fravia's website. IDA Pro does not have a Linux version, but it runs fine under Wine.

We'll start the analysis by running objdump -t on the binary.

$ objdump -t the-binary

the-binary:     file format elf32-i386

objdump: the-binary: no symbols

The binary has been stripped of its symbol table, and it is statically linked. The output of strings contains the line

@(#) The Linux C library 5.3.12

which indicates the version of libc the program was linked with. A quick search on Google shows that Red Hat 4.x used this version of libc. Download all of the libc-devel packages from Red Hat 4.x; they will be useful later.
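
Picking the version banner out of the strings output by eye works, but the same check is easy to script. The following is a minimal sketch, not part of the original toolset: the function name find_libc_versions is hypothetical, and the pattern assumes the banner always has the form quoted above followed by a dotted numeric version.

```python
import re

# "@(#)" is the SCCS what(1) marker that precedes the banner quoted above.
# Assumption: the version is a dotted run of digits such as 5.3.12.
BANNER = re.compile(rb"@\(#\) The Linux C library ([0-9]+(?:\.[0-9]+)*)")


def find_libc_versions(data: bytes) -> list[str]:
    """Return each distinct libc version banner found in a binary image."""
    versions = []
    for match in BANNER.finditer(data):
        version = match.group(1).decode("ascii")
        if version not in versions:
            versions.append(version)
    return versions
```

Reading the-binary with open(path, "rb").read() and passing the bytes to find_libc_versions should report the same version that strings shows.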

Because the binary is statically linked, the output of strings is cluttered with strings from libc code. libstrings.pl is a quick and dirty Perl script that runs strings on all .o files from libc.a and stores the strings in a hash. The script then prints all strings found in the binary, prefixing those that also occur in libc.a with the name of the .o file where they were found. We can filter the output of the script to see only the strings that are part of the program code, not the library functions. There are very few of them, and some look very promising:

                    [mingetty]
                    /tmp/.hj237349
                    /bin/csh -f -c "%s" 1> %s 2>&1
                    TfOjG
                    /sbin:/bin:/usr/sbin:/usr/bin:/usr/local/bin/:.
                    HISTFILE
                    linux
                    TERM
                    /bin/csh -f -c "%s" 
                    %c%s
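
The idea behind libstrings.pl can be sketched in a few lines of Python. This is an illustration of the technique described above, not a port of the actual script: all function names are hypothetical, the string extraction only approximates strings(1), and the object files are assumed to have been unpacked from libc.a already (for example with ar x) and read into memory.

```python
import re


def printable_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Roughly mimic strings(1): runs of at least min_len printable bytes."""
    pattern = rb"[\x20-\x7e\t]{%d,}" % min_len
    return [run.decode("ascii") for run in re.findall(pattern, data)]


def index_library_strings(objects: dict[str, bytes]) -> dict[str, str]:
    """Map each string to the first object file (.o name) that contains it."""
    owner: dict[str, str] = {}
    for name, data in objects.items():
        for text in printable_strings(data):
            owner.setdefault(text, name)
    return owner


def classify(binary: bytes, owner: dict[str, str]) -> list[tuple[str, str]]:
    """Pair every string in the binary with its libc object, or '' if none."""
    return [(owner.get(text, ""), text) for text in printable_strings(binary)]


def program_only(binary: bytes, owner: dict[str, str]) -> list[str]:
    """Keep only the strings that do not occur in any library object."""
    return [text for tag, text in classify(binary, owner) if not tag]
```

A dictionary lookup per string keeps the filtering step fast even though libc.a contributes a very large number of strings. Exact matching is the simplest choice here; it will miss strings that the linker merged or truncated.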

sub_8049174 is a function called by main(). It opens a raw socket and sends some data, so we can be certain that it is not a libc function. At address 08049278 we have a call to sub_80556CC. This function sits between other libc functions in the binary and calls many internal libc functions such as __libc_sigprocmask, __sigaction, __libc_alarm and sigsuspend. These two observations lead us to believe that it is also a libc function, so we need to identify it. We open the cross-references window and look at each of the functions calling sub_80556CC. Unfortunately none of them are libc functions. If they were, we could figure out what sub_80556CC is just by looking at the source of the function calling it.

Instead, we'll look for a call with distinctive arguments inside sub_80556CC and then grep the libc source for it. A good example is the call at .text:08055728:

        push    0Eh
        call    __sigaction

The value 0Eh is 14 decimal. A quick consultation with the sigaction manpage and the headers under /usr/include/linux shows that signal 14 is SIGALRM, which turns this into a call of sigaction(SIGALRM, ...).
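
The mapping from the pushed immediate to a signal name can also be checked with a short script instead of reading the headers. This is only a convenience sketch with a hypothetical helper name; it assumes the script runs on a Linux x86 host, because signal numbers are platform specific and the lookup reflects the host's numbering, not necessarily the target's.

```python
import signal


def signal_name(number: int) -> str:
    """Translate a numeric signal argument into its symbolic name on this host."""
    return signal.Signals(number).name


# 0Eh is the immediate pushed before the call to __sigaction.
print(signal_name(0x0E))
```

On a Linux x86 host this should print SIGALRM, in agreement with the headers.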

$ grep -r sigaction * | grep SIGALRM
libc/posix/sleep.c:  if (__sigaction (SIGALRM, &action, &oldaction) < 0)
libc/posix/sleep.c:    (void) __sigaction (SIGALRM, &oldaction, (struct sigaction *) NULL);
libc/posix/sleep.c:    (void) __sigaction (SIGALRM, &oldaction, (struct sigaction *) NULL);
libc/pwd/lckpwdf.c:	if (sigaction(SIGALRM, &act, &oldact) == -1)
libc/pwd/lckpwdf.c:	sigaction(SIGALRM, &oldact, NULL);
libc/pwd/lckpwdf.c:	sigaction(SIGALRM, &oldact, NULL);

We have to look at these two source files and try to identify the function by its sequence of subroutine calls. The first function I checked was sleep(), and its source matched the disassembled code perfectly. We can rename sub_80556CC to 'sleep' and go back to sub_8049174, the function that called sleep().
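
When grep returns more candidates than can comfortably be read by hand, the comparison of subroutine calls can be roughly automated. The sketch below is an assumption-laden illustration, not the method used in this analysis: rank_by_callees is a hypothetical helper that scores each source file by how many of the callee names seen in the disassembly appear in it as identifiers. It ignores call order and will undercount when the source uses a different alias for a function (sigaction versus __sigaction, for instance).

```python
import re


def rank_by_callees(sources: dict[str, str], callees: list[str]) -> list[tuple[str, int]]:
    """Rank source files by how many observed callee names they mention."""
    scores = []
    for path, text in sources.items():
        # Collect whole identifiers so that "sigaction" does not match "__sigaction".
        identifiers = set(re.findall(r"[A-Za-z_][A-Za-z0-9_]*", text))
        hits = sum(1 for name in callees if name in identifiers)
        scores.append((path, hits))
    # Highest score first; ties are broken alphabetically by path.
    return sorted(scores, key=lambda item: (-item[1], item[0]))
```

The sources argument maps a file path to its text, and callees would be the names IDA shows inside the unknown function, such as __sigaction and sigsuspend. The top-ranked file is only a starting point; the final identification still requires reading the source against the disassembly.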

The following files and tools were produced during the analysis of the binary:

the-binary.asm

This is the binary disassembly, produced by IDA Pro.

the-binary.idb

This is the IDA Pro disassembly database.

the-binary.c

This is the decompiled source of the binary. It compiles but has not been tested. It is a good reference for the program's architecture and features.

decoder.c

The traffic decoder can be run on a live network interface or on a tcpdump file.

$ ./decoder -h
./decoder: invalid option -- h
the-binary Traffic Decoder
Syntax: the-binary [options]
  -i <iface>     Listens on a interface
  -r <dumpfile>  Reads in a tcpdump file

client.c

The client can send commands to the backdoor. It supports all functions of the backdoor and has been tested with the real binary.

$ ./client 
the-binary Client
Syntax: the-binary-client <command> [options]
To change the IP addresses of the client and the backdoor, edit the source

Commands:
  init: initializes the client address list
    --type <type>             type of address list
    --ip <a.b.c.d>            client ip (if type=2, spiecify 10 addresses)
  status: returns status information
  kill: kills the DoS or shell process
    no parameters
  exec: execute a command and discard the output
    --cmd <string>            command line
  exec_output: execute a command and return the output
    --cmd  <string>           command line
  bind_shell: bind a shell on port 23281
    no parameters
  udp_flood: launch udp flood attack
    --src <a.b.c.d>           source ip address
    --dst <a.b.c.d>           destination ip address
    --hostname <hostname>     destination hostname
    --d_port <port>           destination port for the packet
  icmp_flood: launch icmp ping flood/smurf attack
    --src <a.b.c.d>           source ip address
    --dst <a.b.c.d>           destination ip address
    --hostname <hostname>     destination hostname
  syn_flood: launch syn flood attack
    --src <a.b.c.d>           source ip address (if not supplied, use a random ip)
    --dst <a.b.c.d>           destination ip address
    --hostname <hostname>     destination hostname
    --d_port <port>           destination port for the SYN packet
    --sleep_after <number>    sleep after number packets have been sent (optional)
  dns_flood: launch a dns query flood attack
    --src <a.b.c.d>           source ip address (if not supplied, use a random ip)
    --dst <a.b.c.d>           destination ip address
    --hostname <hostname>     destination hostname
    --s_port <port>           source port for the queries (optional)
    --sleep_after <number>    sleep after number packets have been sent (optional)
  dns_smurf: launch dns smurf attack
    --ip <a.b.c.d>            victim ip address
    --hostname <hostname>     victim hostname
    --s_port <port>           source port for the queries (optional)
    --sleep_after <number>    sleep after number packets have been sent (optional)

The client does not display the responses from the backdoor. Use the decoder in sniffing mode to see them.