ARM Linux Kernel early startup code debugging

This post shows how to debug the early (pre-decompression/pre-relocation) initialization code of an ARM (AArch32) Linux kernel. Debugging kernel code is often not needed and is rather hard anyway, due to the interaction with real hardware and the concurrency in play. However, to watch, read and learn about early ARM initialization code, a debugger can be really useful. Early initialization runs without concurrency, so that part is not a problem here.

Before starting, I assume you have a working ARM cross compile environment, a compiled kernel and Qemu at hand. Make sure to compile the kernel with debug symbols (CONFIG_DEBUG_INFO=y, which in turn requires CONFIG_DEBUG_KERNEL=y). I use the following arguments to start Qemu:

$ /usr/bin/qemu-system-arm -s -S -M virt -smp 1 \
  -nographic -monitor none -serial stdio \
  -kernel arch/arm/boot/zImage \
  -initrd core-image-minimal-qemuarm.cpio_.gz \
  -append "console=ttyAMA0 earlycon earlyprintk"

The arguments -s and -S are especially notable here: the former makes Qemu’s built-in gdb server available on TCP port 1234, and the latter stops the machine before it executes its first instruction. This allows us to connect to Qemu using gdb. I use the gdb from my ARM cross compiler toolchain. Once I have a gdb prompt, let’s immediately enable gdb’s automatic disassembly of the next line before connecting:

$ arm-buildroot-linux-gnueabihf-gdb
(gdb) set disassemble-next-line on
(gdb) show disassemble-next-line
Debugger's willingness to use disassemble-next-line is on.
(gdb) target remote :1234
Remote debugging using :1234
0x40000000 in ?? ()
=> 0x40000000: 00 00 a0 e3 mov r0, #0

The debugger disassembled the first instruction that is going to be executed. The instruction sets r0 to the immediate value 0. Is this the first instruction of the kernel?

In order to follow the code it is important to understand how the ARM kernel actually boots. The kernel image starts with a decompressor, which first decompresses the kernel and then jumps to the kernel’s actual entry point. The decompressor is position independent code (PIC) and hence can run from any address in RAM. The address at which the decompressor runs depends on where the kernel has been loaded, which is typically decided by the bootloader. The decompressor’s entry point is at the start label in arch/arm/boot/compressed/head.S. Depending on whether the EFI stub is enabled, the first instructions might look different. To be sure which instructions to expect, we can disassemble the beginning of the kernel zImage:

$ arm-buildroot-linux-gnueabihf-objdump -D -marm \
  -b binary arch/arm/boot/zImage | head -n 20

arch/arm/boot/zImage:     file format binary

Disassembly of section .data:

00000000 <.data>:
       0:       13105a4d        tstne   r0, #315392     ; 0x4d000
       4:       13105a4d        tstne   r0, #315392     ; 0x4d000
       8:       13105a4d        tstne   r0, #315392     ; 0x4d000
       c:       13105a4d        tstne   r0, #315392     ; 0x4d000
      10:       13105a4d        tstne   r0, #315392     ; 0x4d000
      14:       13105a4d        tstne   r0, #315392     ; 0x4d000
      18:       13105a4d        tstne   r0, #315392     ; 0x4d000
      1c:       e1a00000        nop                     ; (mov r0, r0)
      20:       ea0003f6        b       0x1000
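As a side note, the branch at offset 0x20 can be checked by hand. The following is a minimal sketch in Python (the helper name arm_branch_target is my own, not something from the kernel or binutils) that decodes an A32 B/BL instruction, assuming the usual encoding: a signed 24-bit word offset, relative to the instruction address plus 8. Other instruction classes and the BLX special case are deliberately not handled.

```python
def arm_branch_target(insn_addr: int, word: int) -> int:
    """Return the target of an A32 B/BL instruction located at insn_addr."""
    # Bits 27..25 are 0b101 for B and BL.
    if (word >> 25) & 0b111 != 0b101:
        raise ValueError("not a B/BL instruction")
    imm24 = word & 0x00FFFFFF
    # Sign-extend the 24-bit offset field.
    if imm24 & 0x00800000:
        imm24 -= 1 << 24
    # The offset counts words, and in ARM state the PC reads as the
    # address of the current instruction plus 8.
    return (insn_addr + 8 + (imm24 << 2)) & 0xFFFFFFFF


# The "b 0x1000" from the objdump listing above.
print(hex(arm_branch_target(0x20, 0xEA0003F6)))
```

For the zImage above this reproduces the 0x1000 that objdump prints, so the decompressor code proper starts 0x1000 bytes into the image.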

So the mov r0, #0 that gdb showed is not the first instruction of the kernel’s decompressor code! It belongs to Qemu’s built-in bootloader. The AArch32 bootloader code can be found in Qemu’s source tree in hw/arm/boot.c:

static const ARMInsnFixup bootloader[] = { 
	{ 0xe28fe004 }, /* add lr, pc, #4 */ 
	{ 0xe51ff004 }, /* ldr pc, [pc, #-4] */
	{ 0xe3a00000 }, /* mov r0, #0 */ 
	{ 0xe59f1004 }, /* ldr r1, [pc, #4] */ 
	{ 0xe59f2004 }, /* ldr r2, [pc, #4] */ 
	{ 0xe59ff004 }, /* ldr pc, [pc, #4] */
	/* ... excerpt: the remaining entries of the table are omitted ... */
};

The code initializes r0, r1 and r2 before jumping to the kernel’s load address, which is the board specific loader start address plus KERNEL_LOAD_ADDR (0x00010000). Use stepi to step through Qemu’s bootloader code. After the last instruction the machine jumps to the kernel’s first instruction, in my case at 0x40010000:

(gdb) stepi
0x40010000 in ?? ()
=> 0x40010000:  4d 5a 10 13     tstne   r0, #315392     ; 0x4d000

After a few stepi commands the expected instruction appears. In my case, the first instruction is a rather weird nop, which has been introduced to make the zImage also a valid PE/COFF binary for EFI (see arch/arm/boot/compressed/efi-header.S).
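To see why this nop looks so odd, it helps to look at the raw bytes. gdb prints them in memory order, while objdump and Qemu’s source show little-endian 32-bit words. Here is a small sketch in Python (the helper name word_from_gdb_bytes is made up for illustration) that converts between the two and shows that the first two bytes are the ASCII characters "MZ", the magic that a PE/COFF file starts with:

```python
def word_from_gdb_bytes(dump: str) -> int:
    """Turn gdb's byte dump (memory order) into a little-endian 32-bit word."""
    return int.from_bytes(bytes.fromhex(dump), "little")


first_kernel_insn = "4d 5a 10 13"   # bytes gdb showed at 0x40010000
qemu_loader_insn = "00 00 a0 e3"    # bytes gdb showed at 0x40000000

print(hex(word_from_gdb_bytes(first_kernel_insn)))   # word as listed by objdump
print(hex(word_from_gdb_bytes(qemu_loader_insn)))    # word as listed in hw/arm/boot.c
print(bytes.fromhex(first_kernel_insn)[:2])          # the PE/COFF magic bytes

# The top four bits of an A32 instruction hold the condition code; 0x1 is NE.
print(hex(word_from_gdb_bytes(first_kernel_insn) >> 28))
```

The word 0x13105a4d decodes to tstne, which at most updates the condition flags, so it is harmless to execute as the first instruction.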

At this point we can load the symbols for the decompressor. Since the decompressor is PIC, we need to tell the debugger the address to which the decompressor has been loaded:

(gdb) add-symbol-file arch/arm/boot/compressed/vmlinux 0x40010000
add symbol table from file "arch/arm/boot/compressed/vmlinux" at
        .text_addr = 0x40010000
(y or n) y
Reading symbols from arch/arm/boot/compressed/vmlinux...done.

Symbols are not that useful since we are dealing with assembly, but at least we can use labels to specify breakpoints. Stepping through the code instruction by instruction is interesting, but at times larger jumps are more useful.

What is important to know is where the kernel will get uncompressed to. This is determined pretty early in arch/arm/boot/compressed/head.S. For most ARM kernel configurations this is calculated dynamically (CONFIG_AUTO_ZRELADDR). It depends on the load address and the platform specific TEXT_OFFSET (see arch/arm/Makefile). In my case the text offset is `0x00208000` and the load address is 0x40010000, which then works out as (0x40010000 & 0xf8000000) + 0x00208000 = 0x40208000. Let’s jump to the label restart, where the relocation address should be in r4:

(gdb) info address restart
Symbol "restart" is at 0x40011088 in a file compiled without debugging.
(gdb) break restart
Breakpoint 1 at 0x40011088: file arch/arm/boot/compressed/head.S, line 257.
(gdb) continue

Breakpoint 1, restart () at arch/arm/boot/compressed/head.S:257
257     restart:        adr     r0, LC0
=> 0x40011088 <restart+0>:      73 0f 8f e2     add     r0, pc, #460    ; 0x1cc
(gdb) info reg r4
r4             0x40208001       1075871745

So the kernel will be decompressed to 0x40208000 (bit 0 of r4 is not part of the address; as far as I can tell from head.S it is a flag recording that cache_on was skipped). Debugging further revealed that the decompressor first had to relocate the compressed kernel, since the compressed kernel overlaps that target address. Loading the kernel to a higher address avoids that, but as far as I know the kernel’s load address cannot be influenced in Qemu.
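The address arithmetic can be reproduced with a few lines of Python. This is only a sketch of the CONFIG_AUTO_ZRELADDR calculation as described above, using the values from my setup; the function names are my own, and treating bit 0 of r4 as a flag is my reading of head.S rather than documented behaviour.

```python
TEXT_OFFSET = 0x00208000   # platform specific, value from my configuration
LOAD_ADDR = 0x40010000     # where Qemu placed the zImage


def auto_zreladdr(pc: int, text_offset: int = TEXT_OFFSET) -> int:
    """Mask the current address down to a 128 MiB boundary, add TEXT_OFFSET."""
    # The parentheses matter: in Python (and C) '+' binds tighter than '&'.
    return ((pc & 0xF8000000) + text_offset) & 0xFFFFFFFF


def split_r4(r4: int) -> tuple[int, bool]:
    """Separate the target address from the flag kept in bit 0 of r4."""
    return r4 & ~1, bool(r4 & 1)


print(hex(auto_zreladdr(LOAD_ADDR)))
addr, flag = split_r4(0x40208001)   # the r4 value gdb printed at restart
print(hex(addr), flag)
```

Both the computed value and the masked r4 give the same decompression target, which is the address used for the next breakpoint.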

To skip the whole decompression phase we can create a breakpoint at that address and continue debugging:

(gdb) break *0x40208000
Breakpoint 2 at 0x40208000
(gdb) continue

Breakpoint 2, 0x40208000 in ?? ()
=> 0x40208000:  3e 1e 04 eb     bl      0x4030f900

This is now the first instruction in arch/arm/kernel/head.S. Typically the kernel is linked to a different address than that. In my case the symbols in vmlinux have an entry for stext (the entry point defined in head.S) at 0xc0208000. The uncompressed kernel’s initialization code is still position independent, and the symbols will only match once the MMU has been set up. Unfortunately I did not find a way to load the symbols relocated to the kernel’s current address. Using add-symbol-file with a text offset does not seem to work for the kernel binary (hints welcome!). As a workaround, addresses for breakpoints of interest can be calculated from the linked symbol addresses and the known relocation address.
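The workaround boils down to shifting each linked address by the difference between where stext was linked and where it actually sits in memory. A minimal sketch in Python, using the two stext addresses from my setup (the helper names and the example address 0xc0208040 are purely hypothetical; real symbol addresses could be looked up with the toolchain’s nm on vmlinux):

```python
LINKED_STEXT = 0xC0208000   # stext according to the vmlinux symbols
LOADED_STEXT = 0x40208000   # where the kernel was actually decompressed to


def runtime_address(linked_addr: int) -> int:
    """Translate a linked (virtual) address to its pre-MMU location."""
    return (linked_addr - LINKED_STEXT + LOADED_STEXT) & 0xFFFFFFFF


def gdb_break_command(linked_addr: int) -> str:
    """Format a gdb breakpoint command for the translated address."""
    return f"break *{runtime_address(linked_addr):#010x}"


print(gdb_break_command(LINKED_STEXT))
print(gdb_break_command(0xC0208040))   # hypothetical symbol address
```

This translation only holds while the MMU is still off; once the kernel runs at its linked addresses, the symbols from vmlinux can be used directly.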
