Dynamic analysis of firmware components in IoT devices
Among the various offensive security techniques, vulnerability assessment takes priority when it comes to analyzing the security of IoT/IIoT devices. In most cases, such devices are analyzed using the black box testing approach, in which the researcher has virtually no knowledge about the object of research. As a rule, this means that the source code of the device’s firmware is unavailable and all the researcher can use is the user manual and a few threads on some user forum discussing the device’s operation.
The vulnerability assessment of IoT/IIoT devices is based on analyzing their firmware. It is performed in several stages: preparing the firmware (extracting and unpacking it), searching for components that are of interest from the researcher’s viewpoint, running the firmware or its parts in an emulator and, finally, searching for vulnerabilities. A variety of techniques are used at this last stage, including static and dynamic analysis and fuzzing.
The conventional approach to analyzing device firmware is to use the QEMU emulator in combination with the GNU Debugger. We decided to discuss other, less obvious tools for working with firmware, including Renode and Qiling. Each of those tools has its own features, advantages, and limitations that make it effective for certain types of tasks.
Renode is a tool designed to emulate the entire system, including memory chips, sensors, displays, and other peripherals. It can also emulate the interactions between multiple processors (on multiprocessor devices), each of which can have its own architecture and firmware. Renode can also interlink emulated hardware with real hardware implemented as a programmable logic device (an FPGA chip).
Qiling is an advanced multi-platform framework for emulating executable files. It can emulate a multitude of operating systems and environments, with varying degrees of maturity, including Windows, macOS, Linux, QNX, BSD, UEFI, DOS, MBR, and Ethereum Virtual Machine. It supports x86, x86_64, ARM, ARM64, MIPS, and 8086 architectures and various executable file formats. It can also emulate the MBR loading process.
We selected a real-world device, a network video recorder by a major manufacturer, as an object of our research. The device is based on the HiSilicon platform and runs Linux.
The firmware downloaded from the manufacturer’s website consists of a single file in which the binwalk tool detected a CramFS file system. After unpacking the file system, we found uImage, a combined image of the Linux kernel and initramfs, as well as several encrypted scripts and TAR archives. Running binwalk on uImage produces the following output:
DECIMAL       HEXADECIMAL     DESCRIPTION
--------------------------------------------------------------------------------
0             0x0             uImage header, header size: 64 bytes, header CRC: 0xCA9A1902, created: 2019-08-23 07:16:16, image size: 4414954 bytes, Data Address: 0x40008000, Entry Point: 0x40008000, data CRC: 0xDE0F30AC, OS: Linux, CPU: ARM, image type: OS Kernel Image, compression type: none, image name: "Linux-3.18.20"
64            0x40            Linux kernel ARM boot executable zImage (little-endian)
2464          0x9A0           device tree image (dtb)
16560         0x40B0          LZMA compressed data, properties: 0x5D, dictionary size: 33554432 bytes, uncompressed size: -1 bytes
4401848       0x432AB8        device tree image (dtb)
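The first row of the binwalk output can be cross-checked by hand. The sketch below is not part of the original research. It assumes the file starts with a legacy U-Boot image header (64 bytes, big-endian fields) and reads the load address, entry point and image name from it. The names parse_uimage_header and UImageHeader are illustrative.

```python
import struct
import zlib
from typing import NamedTuple

UIMAGE_MAGIC = 0x27051956      # magic number of the legacy U-Boot image format
HEADER_FORMAT = ">7I4B32s"     # seven 32-bit words, four bytes, 32-byte name
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 64 bytes


class UImageHeader(NamedTuple):
    header_crc: int
    timestamp: int
    data_size: int
    load_address: int
    entry_point: int
    data_crc: int
    os_id: int
    arch_id: int
    image_type: int
    compression: int
    name: str


def parse_uimage_header(blob: bytes) -> UImageHeader:
    """Parse and validate the uImage header at the start of blob."""
    if len(blob) < HEADER_SIZE:
        raise ValueError("blob is shorter than a uImage header")
    (magic, hcrc, timestamp, size, load, entry, dcrc,
     os_id, arch, img_type, comp, raw_name) = struct.unpack(
        HEADER_FORMAT, blob[:HEADER_SIZE])
    if magic != UIMAGE_MAGIC:
        raise ValueError("not a legacy uImage (bad magic)")
    # The header CRC is calculated with the CRC field itself set to zero.
    zeroed = blob[:4] + b"\x00" * 4 + blob[8:HEADER_SIZE]
    if zlib.crc32(zeroed) & 0xFFFFFFFF != hcrc:
        raise ValueError("header CRC mismatch")
    name = raw_name.split(b"\x00", 1)[0].decode("ascii", "replace")
    return UImageHeader(hcrc, timestamp, size, load, entry, dcrc,
                        os_id, arch, img_type, comp, name)
```

Calling parse_uimage_header on the first 64 bytes of the unpacked uImage should return the same Data Address and Entry Point values that binwalk reports, provided the assumption about the header format holds.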
Below, we look at the operation of Renode and Qiling at the system level.
For information on using these tools at the application level (using the video recorder’s firmware as an example), see the full version of the article.
System-level emulation using Renode
Renode is a full system emulation utility that is positioned by its developers primarily as a tool designed to make embedded software development, debugging, and automated testing easier. However, it can also be used as a dynamic analysis tool to analyze the behavior of systems during vulnerability assessment. Renode can be used to run both small embedded real-time operating systems and full-fledged operating systems such as Linux or QNX. The emulator is mostly written in C#, so its functionality can be adapted to the researcher’s needs relatively quickly.
Describing the emulated platform
Peripheral devices that are part of single-chip systems are normally available via Memory Mapped I/O (MMIO) – physical memory regions to which registers of the corresponding peripheral modules are mapped. Renode provides the capability to build an on-chip system from building blocks using a configuration file with the .repl (REnode PLatform) extension that describes which devices should be mapped to which memory addresses.
Information about available peripheral devices and the memory map for the platform employed can be found in SoC documentation (if publicly available). If the documentation is not available, this information can be found, for example, by analyzing the contents of the Device Tree Blob (DTB), a data block describing the platform that is required by the Linux kernel to run Linux on embedded devices.
In the firmware being analyzed, the DTB block is attached to the end of the uImage file (according to information from the binwalk tool). After converting the DTB into a readable format (DTS) using the dtc tool, we can use it to create a platform description for Renode.
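Extracting the DTB can also be scripted. The following is a minimal sketch, not the procedure used in the original research. It assumes a DTB starts with the flattened device tree magic 0xD00DFEED followed by a big-endian total size field, and it cuts every plausible tree out of a binary blob. The function name find_dtbs is illustrative.

```python
import struct
from typing import Iterator, Tuple

FDT_MAGIC = b"\xd0\x0d\xfe\xed"  # flattened device tree magic (big-endian)
FDT_HEADER_SIZE = 40             # ten 32-bit header fields


def find_dtbs(blob: bytes) -> Iterator[Tuple[int, bytes]]:
    """Yield (offset, dtb_bytes) for every plausible DTB embedded in blob."""
    pos = blob.find(FDT_MAGIC)
    while pos != -1:
        if pos + 8 <= len(blob):
            # The second header word holds the total size of the tree.
            (total_size,) = struct.unpack_from(">I", blob, pos + 4)
            # Skip false positives: the tree must fit into the blob and
            # must be at least as large as its own header.
            if FDT_HEADER_SIZE <= total_size <= len(blob) - pos:
                yield pos, blob[pos:pos + total_size]
        pos = blob.find(FDT_MAGIC, pos + 1)
```

Each extracted blob can be written to a file and converted with dtc, for example dtc -I dtb -O dts -o platform.dts carved.dtb (both file names are placeholders). For the uImage analyzed here, the offsets found should correspond to the two dtb entries in the binwalk output.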
Starting emulation
An initialization script has to be prepared to run something useful on the platform described in a REPL file. The script normally loads executable code into virtual memory, configures processor registers, sets additional event handlers, configures the output of debug messages (if necessary), etc.
:name: HiSilicon
:description: To run Linux on HiSilicon

using sysbus
$name?="HiSilicon"
mach create $name

machine LoadPlatformDescription @platforms/cpus/hisilicon.repl
logLevel 0

### create externals ###
showAnalyzer sysbus.uart0

### redirect memory for Linux ###
sysbus Redirect 0xC0000000 0x40000000 0x8000000

### load binaries ###
sysbus LoadBinary "/home/research/digicap.out/uImage" 0x40008000
sysbus LoadAtags "console=ttyS0,115200 mem=128M@0x40000000 nosmp maxcpus=0" 0x8000000 0x40000100

### set registers ###
cpu SetRegisterUnsafe 2 0x40000100 # atags
cpu PC 0x40008040
The script loads the uImage file into the platform’s memory at the address taken from the binwalk output, configures kernel arguments, and passes control to address 0x40008040 because the first 0x40 bytes are taken by the uImage header.
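The address arithmetic behind the script can be written out explicitly. The sketch below only restates the constants from the script and the binwalk output. Reading the Redirect command as a window that maps 0xC0000000 onto 0x40000000 is our interpretation, and the function names are illustrative.

```python
UIMAGE_HEADER_SIZE = 0x40    # 64-byte uImage header reported by binwalk
LOAD_ADDRESS = 0x40008000    # "Data Address" reported by binwalk

# Arguments of `sysbus Redirect 0xC0000000 0x40000000 0x8000000`
REDIRECT_FROM = 0xC0000000
REDIRECT_TO = 0x40000000
REDIRECT_SIZE = 0x8000000    # 128 MiB, the same amount as mem=128M


def start_pc(load_address: int, header_size: int = UIMAGE_HEADER_SIZE) -> int:
    """First zImage instruction when the whole uImage sits at load_address."""
    return load_address + header_size


def redirect(address: int) -> int:
    """Map an address inside the redirected window onto its target."""
    if REDIRECT_FROM <= address < REDIRECT_FROM + REDIRECT_SIZE:
        return address - REDIRECT_FROM + REDIRECT_TO
    return address  # addresses outside the window are left untouched


if __name__ == "__main__":
    print(hex(start_pc(LOAD_ADDRESS)))
    print(hex(redirect(0xC0008000)))
```

Under this reading, an access to 0xC0008000 ends up in the RAM that holds the loaded image, and the initial program counter skips the uImage header.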
After starting emulation, we get a fully functional terminal, with which we can interact just as we would with a terminal on any Linux system.
The Renode emulator provides enough capabilities to quickly start the dynamic analysis of the firmware being studied. As a hands-on example, we were able to partially run the firmware of the network video recorder without actually having the recorder on hand. In the next steps, we can use the tools available in the emulated file system to decrypt the encrypted firmware files, extract kernel modules that provide the recorder functionality and analyze their logic, etc.
As the Renode emulator provides sufficiently extensive support for peripherals that are commonly used in on-chip systems based on the ARM architecture, it is not necessary to write any additional code to see a fully functional Linux terminal. At the same time, where necessary, the modular architecture of the emulator and its scripting and plugin-writing capabilities make it relatively easy to implement support for any lacking functionality at a level that is sufficient to conduct research.
One of the distinguishing features of the tool is its use of system-level emulation. As a result of this, it can be difficult to use it to fuzz-test or debug a user-space application that runs in an emulated operating system.
The tool’s shortcomings include the lack of detailed documentation, with existing documentation describing only the most basic usage scenarios. When implementing something more complicated, such as a new peripheral device, or when trying to understand how a specific built-in command works, you have to repeatedly refer to the project repository on GitHub and study the source code of both the emulator itself and bundled peripheral devices.
Fuzzing using the Qiling Framework
The Qiling Framework was written in Python, which makes adapting its functionality to the researcher’s specific needs sufficiently easy. The Qiling Framework has the Unicorn engine under the hood, which is simply an emulator of machine instructions, while Qiling provides numerous high-level functions such as reading files from the file system, loading dynamic libraries, etc.
Compared to QEMU, the Qiling Framework can emulate more platforms and provides flexible configuration of the emulation process, including the capability to modify executing code on-the-fly. In addition, it is a cross-platform framework, which means it can be used to emulate Windows or QNX executables on Linux, and vice versa.
As a demonstration, we will use Qiling together with AFL++ to fuzz-test hrsaverify, a utility from the firmware being analyzed that validates encrypted files and takes the path to the file to be validated as an argument. The Qiling Framework repository already contains several examples of running the AFL++ fuzzer in its examples/fuzzing directory. We will adapt the example named linux_x8664 to run hrsaverify. The modified script for running the fuzzer is shown below:
import unicornafl as UcAfl
UcAfl.monkeypatch()

import os, sys
from typing import Any, Optional

sys.path.append("../../..")
from qiling import Qiling
from qiling.const import QL_VERBOSE
from qiling.extensions import pipe


def main(input_file: str):
    ql = Qiling(["../../rootfs/hikroot/usr/bin/hrsaverify", "/test"], "../../rootfs/hikroot",
                verbose=QL_VERBOSE.OFF,  # keep qiling logging off
                console=False,           # thwart program output
                stdin=None, stdout=None, stderr=None)  # don't care about stdin/stdout

    def place_input_callback(uc: UcAfl.Uc, input: bytes, persistent_round: int, data: Any) -> Optional[bool]:
        """Called with every newly generated input."""
        with open("../../rootfs/hikroot/test", "wb") as f:
            f.write(input)

    def start_afl(_ql: Qiling):
        """Callback from inside."""
        # We start our AFL forkserver or run once if AFL is not available.
        # This will only return after the fuzzing stopped.
        try:
            if not _ql.uc.afl_fuzz(input_file=input_file,
                                   place_input_callback=place_input_callback,
                                   exits=[ql.os.exit_point]):
                _ql.log.warning("Ran once without AFL attached")
                os._exit(0)
        except UcAfl.UcAflError as ex:
            if ex.errno != UcAfl.UC_AFL_RET_CALLED_TWICE:
                raise

    # Image base address
    ba = 0x10000
    # Set a hook on main() to let unicorn fork and start instrumentation
    ql.hook_address(callback=start_afl, address=ba + 0x8d8)
    # Okay, ready to roll
    ql.run()


if __name__ == "__main__":
    if len(sys.argv) == 1:
        raise ValueError("No input file provided.")
    main(sys.argv[1])
The first things we need to find are the base address of the executable file (in our case, 0x10000) and the address of the main function (offset 0x8d8 from the base in the script above). Sometimes it is also necessary to set hooks on other addresses that the fuzzer should treat as a crash when they are reached. For example, when running AFL in a QNX environment (in the qnx_arm directory), this type of additional handler is set for the address of the SignalKill function in libc. In the case of hrsaverify, no additional handlers are needed. It should also be kept in mind that all files that must be available to the running application should be put into sysroot, and their relative paths should be passed (in this case, ../../rootfs/hikroot/).
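One way to double-check a base address such as 0x10000 is to read it from the ELF program headers. The sketch below is an illustration, not the method used in the original research. It assumes a 32-bit little-endian, non-position-independent ELF and returns the page-aligned lowest virtual address of its loadable segments. The function name elf32_load_base is illustrative.

```python
import struct

PT_LOAD = 1          # program header type of a loadable segment
PAGE_SIZE = 0x1000


def elf32_load_base(image: bytes) -> int:
    """Return the page-aligned lowest PT_LOAD address of a 32-bit LE ELF."""
    if image[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if image[4] != 1 or image[5] != 1:
        raise ValueError("only 32-bit little-endian ELF is handled here")
    (e_phoff,) = struct.unpack_from("<I", image, 0x1C)
    e_phentsize, e_phnum = struct.unpack_from("<HH", image, 0x2A)
    vaddrs = []
    for index in range(e_phnum):
        # Each 32-bit program header starts with p_type, p_offset, p_vaddr.
        p_type, _p_offset, p_vaddr = struct.unpack_from(
            "<III", image, e_phoff + index * e_phentsize)
        if p_type == PT_LOAD:
            vaddrs.append(p_vaddr)
    if not vaddrs:
        raise ValueError("no PT_LOAD segments found")
    return min(vaddrs) & ~(PAGE_SIZE - 1)
```

The hook address is then the returned base plus the offset of main, for example elf32_load_base(data) + 0x8d8. For position-independent executables the emulator chooses the base at load time, so a value computed statically in this way does not apply.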
AFL++ is started with the following command:
AFL_AUTORESUME=1 AFL_PATH="$(realpath ./AFLplusplus)" PATH="$AFL_PATH:$PATH" afl-fuzz -i afl_inputs -o afl_outputs -U -- python ./fuzz_arm_linux.py @@
The AFL fuzzer will start, and after some time we will see some crashes.
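Saved crash inputs can be given a first pass with a short script. The sketch below rests on two assumptions that are not stated in the original text: that AFL++ stores crashes in a directory such as afl_outputs/default/crashes, and that it uses its default file names of the form id:000000,sig:11,src:000000,... The function name summarize_crashes is illustrative.

```python
import re
from collections import Counter
from pathlib import Path

# Assumed default AFL++ crash file name: "id:000000,sig:11,src:000000,..."
CRASH_NAME = re.compile(r"^id:(\d+),sig:(\d+)")


def summarize_crashes(crash_dir: Path) -> Counter:
    """Count saved crash inputs per reported signal number."""
    per_signal: Counter = Counter()
    for entry in sorted(crash_dir.iterdir()):
        match = CRASH_NAME.match(entry.name)
        # Files such as README.txt do not match the pattern and are skipped.
        if entry.is_file() and match:
            per_signal[int(match.group(2))] += 1
    return per_signal


if __name__ == "__main__":
    # The path is an assumption about the AFL++ output layout.
    crashes = Path("afl_outputs/default/crashes")
    if crashes.is_dir():
        for signal_number, count in sorted(summarize_crashes(crashes).items()):
            print(f"signal {signal_number}: {count} input(s)")
```

Each reported input can then be replayed through the Qiling script outside the fuzzer to confirm that the crash reproduces.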
Qiling is a promising tool whose main advantages are its high flexibility, extensibility, and support for a broad variety of architectures and environments. The framework can serve as a substitute for QEMU in cases where using the latter is not possible (for example, when the target OS is unsupported or additional capabilities are required, such as setting arbitrary handlers for any memory addresses, special handling of interrupts, etc.). However, the use of Python, which gives the framework its flexibility and shallow learning curve, also results in relatively low emulation and fuzzing speed.
The full version of the article is published on the Kaspersky ICS CERT website.