November 30, 2021

Malware Analysis

  • Why we analyze malware:
    — make virus signatures
    — know how to remediate

Malware is categorized using a number of factors including:

  • the delivery method or type of attack,
  • the goal of the attack,
  • and the target and technique of the attack

| Types of Malware |

The types commonly discussed include:
1 – virus,
2 – worm,
3 – ransomware,
4 – botnet,
5 – dropper,
6 – Trojan,
7 – rootkit,
8 – spyware


[1] Virus

  • A virus is a piece of software that infects an existing file on a user’s system. It may do this by replacing an existing file or by inserting code. For example, a virus may take a running process and inject code directly into that process. It may introduce a library that a program calls, or it may use a technique called process hollowing.
    – Process hollowing is a technique malware authors use to get malicious software into a system. A process is put into a suspended state. The process memory is then unmapped, leaving it essentially empty. Once that’s been done, the virus code is inserted into the memory space of the process, and the process is resumed, allowing the virus to run.
  • requires a human to execute

[2] WORM

  • self-propagating virus (no human required)
  • Morris worm
  • Code Red


[3] Ransomware

  • could be delivered as a virus or a worm
  • holds the user’s data hostage so that the victim has to pay to get it back

[4] Botnet

  • A botnet is a network of infected machines that are connected together through the use of a command and control network
  • botnet client software usually sends connections out, because outbound connections are almost always allowed
  • This process of connecting out to another device somewhere on the Internet is called beaconing because the device is sending a beacon signal, letting the attacker know that it is there, waiting to have orders sent.
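A minimal Python sketch of how beaconing might be spotted in connection logs. It assumes you already have a list of outbound connection timestamps (in seconds) for one host and destination. The function name, the jitter threshold, and the sample data are hypothetical, and interval regularity alone is only a hint, not proof.

```python
from statistics import mean, pstdev


def looks_like_beacon(timestamps, max_jitter=0.1, min_events=4):
    """Return True if connections are spaced at near-constant intervals.

    timestamps: connection times in seconds (hypothetical input,
                for example pulled from a firewall log).
    max_jitter: allowed standard deviation of the gaps, expressed as a
                fraction of the mean gap (an arbitrary assumption).
    """
    if len(timestamps) < min_events:
        return False
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    average_gap = mean(gaps)
    if average_gap <= 0:
        return False
    return pstdev(gaps) / average_gap <= max_jitter


# Made-up samples: one check-in roughly every 60 seconds, one irregular.
regular = [0, 60, 121, 180, 241, 300]
irregular = [0, 5, 90, 97, 400, 410]

print(looks_like_beacon(regular))
print(looks_like_beacon(irregular))
```

Real botnet clients often add random jitter to their check-ins, so a threshold like this would need tuning against actual traffic.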

[5] Dropper

  • A dropper is a small piece of malware that acts as an initial infection. It gains an initial foothold for an attacker, who can then bring in larger, more capable malware


[6] Trojan

  • A Trojan is a specific type of virus or worm, most likely a virus. A Trojan is so named because it masquerades as something else


[7] Rootkit

  • The purpose of a rootkit is to hide the existence of itself and other software
  • It may do this by replacing common system utilities that normally look at running processes or file listings.
  • In addition to hiding the existence of software on a system, a rootkit may install a piece of software called a backdoor


[8] Spyware

  • collects information about the user, such as:
    — browsing activities
    — credit cards
    — banking information
  • With the far more direct means of gaining money from victims that exist today, spyware is less prevalent than it once was.


  • infects a system’s in-memory database, meaning it uses techniques that attack running processes

| Delivery Mechanisms |

  • email
  • web-based
    — drive-by
    — watering hole attack

| Evasion |

  • polymorphism: the malware changes its appearance so that existing signatures no longer match
  • minor modifications to the code
  • a change in the compression ratio, or even the compression scheme
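To see why even a minor modification defeats a simple signature, here is a small Python sketch. It assumes the "signature" is just a SHA-256 hash of the whole file, which is a simplification of what antivirus products do; the sample bytes are made up.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


# Made-up stand-in for a malware file, plus a variant with one extra byte.
original = b"MZ" + b"\x90" * 64 + b"payload"
variant = original + b"\x00"

# A hypothetical signature database that only knows the original hash.
known_bad = {sha256_hex(original)}

print(sha256_hex(original) in known_bad)  # the original sample is matched
print(sha256_hex(variant) in known_bad)   # the one-byte variant is missed
```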


  • Module Overview
  • 2 types of malware analysis:
    [1] static analysis (far safer)
    [2] dynamic analysis

[1] Static

  • Safer
  • focuses on looking at the malware file – how it’s constructed, and the metadata associated with the file
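One common static step is pulling printable strings out of the file without ever running it. The Python sketch below does roughly what the Unix strings utility does for ASCII text. The function name, the minimum length of 4, and the sample bytes are assumptions for illustration.

```python
import re


def printable_strings(data: bytes, min_length: int = 4):
    """Return runs of printable ASCII at least min_length characters long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


# Made-up bytes standing in for a suspicious file.
sample = b"\x00\x01http://example.test/c2\x00\xff\xfeabc\x00cmd.exe /c whoami\x00"

for text in printable_strings(sample):
    print(text)
```

URLs, file names, and command lines found this way are leads to follow up on, not conclusions.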

[2] Dynamic

  • Dynamic analysis looks at the execution,
  • looking at what happens when the malware runs
    – the output and the changes to the overall system as a result of the malware execution.
  • Executable formats
    — vary from one operating system to another.
    — Each operating system defines something called an application binary interface (ABI), which is how the operating system expects to interact with programs that are run in that operating system.
    — Each operating system defines a different ABI
  • To analyze malware
    — consider how executable files are constructed.
    — Executable files have a particular format and contain more than just operation codes.
    — The file has to indicate where execution starts, for instance.
    — An executable file also has multiple sections, including:
    —- a section for the executable code
    —- a section for all of the data that is known at compile time.
  • How executable files are created
    [1] – Source code
    [2] –
    [3] –

[1] Source Code

  • Human readable
  • provides high-level instructions to the computer on what to do to accomplish the tasks of the program
  • This creation process applies to a compiled language, rather than an interpreted or intermediate language
  • Compiler
  • 1st, the source code is run through a preprocessor
    — The preprocessor expands macros and other directives, producing output that looks more like plain C than the original source did
  • 2nd, once the preprocessor is done, the source code is run through the actual compiler
    — The compiler takes our source code, written in C, and converts it to another language, called assembly
    — Assembly language is a human-readable representation of the operations the compiler has generated while compiling the code
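The two stages above can be observed separately with GCC, whose -E flag stops after preprocessing and whose -S flag stops after producing assembly. The Python sketch below only builds the commands and runs them if gcc is installed. The source file name hello.c and the helper names are hypothetical.

```python
import shutil
import subprocess


def build_stage_commands(source="hello.c"):
    """Return the gcc commands for the preprocess and compile-to-assembly stages."""
    stem = source.rsplit(".", 1)[0]
    return [
        ["gcc", "-E", source, "-o", stem + ".i"],       # stage 1: preprocess only
        ["gcc", "-S", stem + ".i", "-o", stem + ".s"],  # stage 2: emit assembly
    ]


def run_stages(source="hello.c"):
    """Run both stages; return False without doing anything if gcc is missing."""
    if shutil.which("gcc") is None:
        return False
    for command in build_stage_commands(source):
        subprocess.run(command, check=True)
    return True


for command in build_stage_commands():
    print(" ".join(command))
```

Opening the resulting .i and .s files side by side shows how far the code has moved from the original source at each step.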
  • Windows Executable Structure
  • Portable Executable
    — This format covers both executables in the sense of programs you run,
    — but also dynamic link libraries, which are collections of executable functions that can be pulled in by a program at runtime rather than compile time
  • 1st part of PE = DOS Header
  • 2nd part of PE = DOS Stub. The stub program indicates that the executable can only be run in Windows and not in DOS
  • Other parts of PE
    — the sizes of certain areas within the program, such as the size of the initialized data
    — The most important piece of data sits at offset 0x0028 (40 in decimal) from the start of the PE header
    — It gives the address of the entry point to the program.
    —- The entry point is a function.
    —- Conceptually it points to the start of the main() function (in practice, often to runtime startup code that then calls main())
    —- The address of this function is not the same in every program, which is why it is called out specifically in the PE header
    — PE Sections
    — at least 2 sections; common ones are:
    —- .text section, where the executable code for the program resides; the entry point lies here
    —- .data section, where all initialized data is. Initialized data is data that is known at compile time and has a value at program start
    —- .rdata section, which is used for read-only data
    —- .bss section, where uninitialized data is located
    —- .rsrc section, the “resource” section, usually holding images and strings
  • Viewing Portable Executable (PE)
  • EXE Explorer (by MiTeC)
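If no GUI viewer is at hand, the entry point can be read with a few lines of Python. This sketch assumes the standard PE layout: the DOS header starts with "MZ", the 4-byte value at offset 0x3C gives the position of the PE header, and the entry point address sits 0x28 bytes into that header. The function name and the synthetic test image are hypothetical; a real tool would validate far more.

```python
import struct


def pe_entry_point(data: bytes) -> int:
    """Return the entry point address (relative to the image base) of a PE file."""
    if data[:2] != b"MZ":
        raise ValueError("missing DOS header magic 'MZ'")
    # The DOS header stores the file offset of the PE header at 0x3C.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    # The entry point field is 0x28 (40) bytes into the PE header.
    (entry_point,) = struct.unpack_from("<I", data, pe_offset + 0x28)
    return entry_point


# Synthetic, non-executable image built only to exercise the parser.
image = bytearray(0x200)
image[0:2] = b"MZ"
struct.pack_into("<I", image, 0x3C, 0x80)        # PE header at 0x80
image[0x80:0x84] = b"PE\x00\x00"
struct.pack_into("<I", image, 0x80 + 0x28, 0x1000)

print(hex(pe_entry_point(bytes(image))))
```

To inspect a real file, read it with open(path, "rb").read() and pass the bytes in. Do this only on an analysis machine.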
  • Linux Executable Structure
  • Linux uses the Executable and Linkable Format (ELF) for its binary files
    — ELF supplants the much older a.out format, which you may still see from time to time
    — Because ELF supports multiple operating systems and processor architectures, some information about the operating system and processor has to be conveyed in the file header
    — The ELF header includes a flag indicating whether the file is stored in big-endian or little-endian byte order
    — The file header also has to indicate which operating system the executable supports, since the ABIs for some Unix-like operating systems differ
    — ELF has 3 different headers
    — 1) File header (described above)
    — 2) Program header, which tells the operating system how to construct the process in memory
    — 3) Section header, which describes the sections in the program. Similar to a PE program in Windows, these include:
    —- .text section, where the executable code for the program is stored
    —- .data section, for initialized data
    —- .rodata section, for read-only data
  • Viewing Linux Executable Structure
  • readelf -e /usr/bin/readelf
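The first few bytes of the ELF file header carry the flags mentioned above. The Python sketch below decodes them, assuming the standard layout of the identification bytes: magic at 0 to 3, class at 4, byte order at 5, OS ABI at 7. The function name is hypothetical and the OS ABI table is deliberately incomplete.

```python
def describe_elf_ident(data: bytes) -> dict:
    """Decode the class, byte order, and OS ABI from an ELF file header."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    classes = {1: "32-bit", 2: "64-bit"}
    byte_orders = {1: "little-endian", 2: "big-endian"}
    os_abis = {0: "System V", 3: "Linux", 9: "FreeBSD"}  # partial list only
    return {
        "class": classes.get(data[4], "unknown"),
        "byte_order": byte_orders.get(data[5], "unknown"),
        "os_abi": os_abis.get(data[7], "other ({})".format(data[7])),
    }


# Synthetic header bytes: 64-bit, little-endian, System V ABI.
ident = b"\x7fELF\x02\x01\x01\x00" + bytes(8)

print(describe_elf_ident(ident))
```

Running it on the first 16 bytes of /usr/bin/readelf should agree with the top of the readelf -e output on the same machine.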

DEBUGGERS (to Assembly)

  • Using a debugger may be really useful if you have a piece of malware that has been packed, meaning the .text section is really just a stub program whose only purpose is to decompress the actual program from the .data section
  • Windows = OllyDbg
  • Linux = gdb, or ddd (a graphical front end for gdb)
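Before reaching for a debugger, it helps to know whether a sample is likely packed. One common heuristic, which is not part of these notes and is added here as an assumption, is Shannon entropy: compressed or encrypted data scores close to 8 bits per byte, while plain code and text score lower. The function name and samples are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Made-up samples: highly repetitive data versus evenly spread byte values.
repetitive = b"A" * 1024
evenly_spread = bytes(range(256)) * 4

print(shannon_entropy(repetitive))
print(shannon_entropy(evenly_spread))
```

High entropy only suggests packing; it does not confirm it, and any cutoff value would need checking against known samples.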

DECOMPILER (to source code)

  • NSA released Ghidra
  • look for DLL imports that are network related
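A rough first pass at spotting network-related imports is to search the file for the names of well-known Windows networking DLLs. The Python sketch below is a plain byte search, not a real import table parser, so it can miss imports in packed samples. The function name, the DLL list, and the sample bytes are assumptions for illustration.

```python
# Well-known Windows networking libraries (an illustrative, incomplete list).
NETWORK_DLLS = (
    b"ws2_32.dll",
    b"wsock32.dll",
    b"wininet.dll",
    b"winhttp.dll",
    b"urlmon.dll",
)


def network_dll_names(data: bytes):
    """Return the networking DLL names that appear anywhere in data."""
    lowered = data.lower()
    return [name.decode("ascii") for name in NETWORK_DLLS if name in lowered]


# Made-up bytes standing in for part of an executable's import data.
sample = b"\x00KERNEL32.dll\x00WS2_32.dll\x00WININET.dll\x00"

print(network_dll_names(sample))
```

Ghidra shows the same information properly resolved in its imports view; this sketch is only a quick triage step.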

Virtual Machine
1 – clean install
2 – take snapshot
3 – dynamic analysis
4 – check what changed
— Ultimately, analyzing the effects of malware involves looking for changes in three places on the system.
—- The first is the FILE SYSTEM. Files may be added to the file system because a dropper is running or because the malware is infecting files already on the system. No matter what the reason, you should watch for any changes in the file system.
—- The second is MEMORY. A snapshot of your virtual machine will give you a memory capture you can analyze. You can also see whether any other processes have been spawned by the malware. If the malware program was packed or encrypted on the disk, a memory capture will give you the ability to dump the process from memory to get the unpacked or unencrypted version.
—- The third is the REGISTRY. You should capture the registry before and after running the malware. One tool that’s freely available is RegShot. It will dump the registry as a baseline and then, after you’ve run the malware, take a second sample. Once you have both samples, you can compare them. Another tool is the Microsoft Attack Surface Analyzer, which also reports directly on the network connections in place, unlike the other techniques mentioned.
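The before-and-after comparison can be sketched in Python for the file system case. The sketch hashes every file under a directory, then reports what was added, removed, or changed between two snapshots. The function names and the temporary directory contents are hypothetical; this is not how RegShot or Attack Surface Analyzer work internally.

```python
import hashlib
import tempfile
from pathlib import Path


def snapshot_tree(root):
    """Map each file's relative path under root to its SHA-256 digest."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def diff_snapshots(before, after):
    """Return the keys that were added, removed, or changed between snapshots."""
    shared = before.keys() & after.keys()
    return {
        "added": sorted(after.keys() - before.keys()),
        "removed": sorted(before.keys() - after.keys()),
        "changed": sorted(key for key in shared if before[key] != after[key]),
    }


# Simulate a baseline, some "malware" activity, and a second snapshot.
with tempfile.TemporaryDirectory() as sandbox:
    root = Path(sandbox)
    (root / "keep.txt").write_text("same")
    (root / "edit.txt").write_text("v1")
    (root / "gone.txt").write_text("x")
    before = snapshot_tree(root)

    (root / "edit.txt").write_text("v2")
    (root / "gone.txt").unlink()
    (root / "dropped.exe").write_bytes(b"MZ")
    after = snapshot_tree(root)

    changes = diff_snapshots(before, after)

print(changes)
```

Because diff_snapshots only compares two dictionaries, the same function would work on registry keys and values exported to a dictionary.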


  • Cuckoo Sandbox
  • Cuckoo Sandbox isn’t particular about the operating system used as a host
  • Because Cuckoo is essentially a collection of Python scripts layered over existing software, what you really need is a system that you are comfortable working with and that can run Python
  • You can select Windows, macOS, or Linux.
  • As one prerequisite is a hypervisor, you’ll need to choose which hypervisor to use:
    — With Linux as the base operating system, you can use QEMU/KVM as your hypervisor
    — On any of the hosts, including Linux, VirtualBox is available and provides a programmatic interface that Cuckoo can use to manage the virtual machine instances.
    — Installing Cuckoo involves a number of prerequisites:
    —- Python interpreter
    —- python-pip – The preferred installation method for Cuckoo Sandbox is the Python package manager, pip
    —- python-dev – Any -dev package on Linux includes the libraries and headers necessary to write programs against that package. These contents are different from the libraries that are necessary at runtime
    —- libffi-dev – libffi is a foreign function interface library, which is used for getting details about function calls made in one language from another language
    —- libssl-dev – Programs that need to implement the Secure Sockets Layer (SSL) or Transport Layer Security (TLS) will make use of this library. Basically, it implements encryption for developers
    —- python-virtualenv – This package allows the user to create self-contained, virtual Python instances. Think of it as Python’s version of a container, allowing all of the necessary prerequisites, with the right version numbers, to be installed into a directory structure without impacting the operating system packages or Python installation
    —- python-setuptools – This library provides a build environment for Python packages, allowing packages to be installed, upgraded, and removed from your system easily
    — In addition to those packages, you will probably need some others. If you want to use the web interface, you will need to install MongoDB. The Cuckoo developers also recommend using Postgresql for a database, so you will need not only that package but the libpq-dev package as well, to be able to interface with the database programmatically, as Cuckoo will. Finally, you will need a hypervisor. You can install the VirtualBox package or, if you want to use the native Linux hypervisor, you can install the following packages:

—- qemu-kvm
—- libvirt-bin
—- ubuntu-vm-builder
—- bridge-utils
—- python-libvirt

— All you need to do is run pip install cuckoo (or sudo pip install cuckoo) and pip will take care of all the downloading and installation for you
— First, you should create a user to run Cuckoo as. This can be done using something like the following: sudo useradd -d /home/cuckoo -m cuckoo
— You may want to place the home directory for the user in a different location. (The purpose of the home directory is to have a place to create the virtual environment)
— Once you have created the user, you need to add the user to the group that KVM is running from. Do this using sudo usermod -a -G libvirt cuckoo. Your system may use a different group name, so make sure you find the correct group to add your Cuckoo user to.
— The first thing to do is switch to your Cuckoo user (sudo su - cuckoo)
— You will create the sandbox using Python’s virtual environment. Create a Python virtual environment named venv using the command virtualenv venv. That will create a directory structure with all the Python files necessary to operate exclusively within that environment
— In order to switch to the environment, use the activate script that virtualenv has put into place, like this: . venv/bin/activate. Note the . (dot) at the beginning of that command. It tells your shell to read and execute venv/bin/activate in the current shell, rather than in a new process.
— Once you have created your user, built the virtual environment, and activated it, you can install Cuckoo. Install everything into the virtual environment, rather than into the operating system globally. This way, if something really bad happens during the analysis, the malware will be contained to the Cuckoo user and not have any excess permissions.
— To install Cuckoo, do the following:
—- Make sure pip is updated to the latest version and that the setuptools package, necessary for doing Python package installs, is in place and up to date: pip install -U pip setuptools
—- After that, install Cuckoo: pip install -U cuckoo
— Once you have Cuckoo installed, run it with cuckoo -d. This will create the configuration files that you need to edit, to run in your environment. Configuration is described below. Aside from some configuration, the host is ready.

— You still need a guest operating system to run the malware inside of. If you think you will be looking at Windows malware, Windows 7 makes a good choice. (Given its enormous user base, Windows is still the predominant platform for malware, because that’s where the victims are.)

— Windows 7 is older and doesn’t have all the newer features, but malware will generally run nicely under it. You can create the virtual guest image in any way you like as long as you can get it into the hypervisor. You can create the image inside the hypervisor, or you can create it somewhere else and then import it.

— You must disable Windows Firewall and Windows Defender, if they are in place. Also, disable all automatic updates. Once that is done, install Python, since it is needed to get Cuckoo working inside the guest for monitoring and deployment. Once Python is installed, install the Cuckoo agent. The agent is located in the agent/ directory in the Cuckoo deployment. The file is named agent.py, and it needs to be copied into the guest system. It doesn’t matter where you place it. Perhaps the easiest location is the Startup folder, so that it runs every time the system boots. Otherwise, you will need to find another mechanism to make sure the agent runs at boot. The agent will create a console window when it runs, just as most running programs do. If you’d rather hide that console window, change the name of the file from agent.py to agent.pyw.

— At this point, you should have a guest that’s ready to take malware from Cuckoo.
— Once configuration is complete, you are ready to run Cuckoo for real. Before starting up the Cuckoo daemon, you may want to install the community signatures, which will help with analysis. To install them, run cuckoo community inside your virtual environment. Once the signatures are downloaded, run cuckoo -d again to get Cuckoo running. This starts the Cuckoo service, which waits for malware samples to analyze. Submitting samples is straightforward. From another command line, enter the virtual environment again (. venv/bin/activate). Once there, with your malware sample handy, run cuckoo submit filename. Replace filename with the actual name of the malware sample file.

— In the other session where the Cuckoo daemon is running, you will see the status of the submission, where you can monitor the progress. For each analysis, Cuckoo will create a directory of artifacts. These will be stored in ~/.cuckoo/storage/analyses. The one that is running will be in latest, which is a symlink to the latest analysis directory. If you have installed all the supporting modules correctly, you should have a packet capture file and a memory dump as well from the analysis machine. The file to pay particular attention to is in the reports directory and is named report.json. This is a JavaScript Object Notation formatted file providing the results of the analysis. This file contains details, including system calls made, libraries called, network connections, files added, and all of the other details about the analysis session.
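Since report.json can be very large, a quick overview of its top-level sections helps before digging in. The Python sketch below loads a report and summarizes each top-level key without assuming any particular key names. The function name is hypothetical, and the demo report with its keys is made up rather than real Cuckoo output.

```python
import json
import tempfile
from pathlib import Path


def summarize_report(report_path):
    """Return a short description of each top-level entry in a JSON report."""
    report = json.loads(Path(report_path).read_text(encoding="utf-8"))
    summary = {}
    for key, value in report.items():
        if isinstance(value, (dict, list)):
            summary[key] = "{} with {} entries".format(type(value).__name__, len(value))
        else:
            summary[key] = repr(value)
    return summary


# Demo on a made-up report; these key names are illustrative only.
with tempfile.TemporaryDirectory() as workdir:
    demo_path = Path(workdir) / "report.json"
    demo_path.write_text(
        json.dumps({"info": {"id": 1}, "network": {"hosts": [], "dns": []}, "signatures": []}),
        encoding="utf-8",
    )
    overview = summarize_report(demo_path)

for key, description in overview.items():
    print(key, "->", description)
```

For a real run, the path would be the report.json file in the reports directory of the latest analysis under ~/.cuckoo/storage/analyses.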

— Additionally, Cuckoo takes screen captures of the running analysis system. You will find those in the shots directory as JPG files. Depending on how long the analysis runs and the behavior of the system, multiple files may be created. These screen captures show what happens on the desktop, in case programs are started that create a window on the desktop.
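To review the screen captures in order, a short Python helper can list them. It assumes only what is stated above: an analysis directory containing a shots directory with JPG files. The function name and the demo file names are hypothetical.

```python
import tempfile
from pathlib import Path


def list_screenshots(analysis_dir):
    """Return the sorted names of the .jpg files in analysis_dir/shots."""
    shots = Path(analysis_dir) / "shots"
    if not shots.is_dir():
        return []
    return sorted(path.name for path in shots.glob("*.jpg"))


# Demo on a temporary directory with made-up file names.
with tempfile.TemporaryDirectory() as workdir:
    shots_dir = Path(workdir) / "shots"
    shots_dir.mkdir()
    for name in ("0002.jpg", "0001.jpg", "notes.txt"):
        (shots_dir / name).write_bytes(b"")
    found = list_screenshots(workdir)
    missing = list_screenshots(Path(workdir) / "no-such-analysis")

print(found)
```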