ARM Linux Kernel early startup code debugging

This post shows how to debug early (pre-decompression/pre-relocation) initialization code of an ARM (Aarch32) Linux kernel. Debugging kernel code is often not needed and anyway rather hard due to the interaction with real hardware and concurrency in play.  However, to watch, read and learn about early ARM initialization code, debugging can be really useful. Early Initialization is running without concurrency anyway, so this is not a problem in this case.

Before starting, I assume you have a working ARM cross compile environment, a compiled kernel and Qemu at hand. Make sure to compile the kernel with debug symbols (CONFIG_DEBUG_KERNEL=y and CONFIG_DEBUG_INFO=y). I use the following arguments to start Qemu:

$ /usr/bin/qemu-system-arm -s -S -M virt -smp 1 \
  -nographic -monitor none -serial stdio \
  -kernel arch/arm/boot/zImage \
  -initrd core-image-minimal-qemuarm.cpio_.gz \
  -append "console=ttyAMA0 earlycon earlyprintk"

Especially the arguments -s -S are notable here, since the former makes sure Qemu’s built-in debugger is available at port 1234 and the latter stops the machine. This now allows to connect to Qemu using gdb. I use the gdb from my ARM cross compiler toolchain. Once I have a gdb prompt, lets immediately enable gdb’s automatic disassembler on next line before connecting:

$ arm-buildroot-linux-gnueabihf-gdb
...
(gdb) set disassemble-next-line on
(gdb) show disassemble-next-line
Debugger's willingness to use disassemble-next-line is on.
(gdb) target remote :1234
Remote debugging using :1234
0x40000000 in ?? ()
=> 0x40000000: 00 00 a0 e3 mov r0, #0

Continue reading “ARM Linux Kernel early startup code debugging”

U-Boot/Linux and HYP mode on ARMv7

The newer ARMv7 Cortex-A class cores such Cortex-A7, A15 and A17 come with a virtualization extensions which allow to use KVM (kernel virtual machine). The NXP i.MX 7Dual SoC which I worked with lately includes the ARM Cortex-A7 CPU. I went ahead and tried to bring up KVM on i.MX 7. I was not really familiar with the ARMv7 virtualization architecture, so I had to read up on some concepts. This post summarizes what I learned and gives a big picture of software support.

The Hypervisor mode

To provide hardware support for full CPU virtualization an additional privilege level is required. User-space (PL0) uses the SVC (Supervisor) instruction to switch to kernel-space (PL1, SVC mode). A similar separation between Kernel and hypervisor is required. The ARMv7 architecture with virtualization extension calls this privilege level PL2 or HYP mode.

Linux with KVM for ARM uses this mode to provide CPU virtualization. The CPU needs to be in HYP mode when Linux is booting so KVM can make use of the extension. How KVM uses the HYP mode in detail is explained in this excellent LWN article. After building a kernel with KVM support, I encountered this problem first: By default, the system did boot in SVC mode.

...
Brought up 2 CPUs
CPU: All CPU(s) started in SVC mode.
...
kvm [1]: HYP mode not available
...

Secure and Non-Secure world

ARMv7 privilege levels
ARMv7 privilege levels

To understand how to switch into Hypervisor mode, one needs to understand the whole privilege level architecture first. Notable here is that on ARMv7 CPU’s the HYP mode is only available in non-secure mode, by design. Any hypervisor needs to operate in non-secure mode, there is no virtualization extension in secure mode. Continue reading “U-Boot/Linux and HYP mode on ARMv7”

Using the perf utility on ARM

Given two systems, both with a Cortex-A5 CPU, one clocked at 396MHz without L2 cache and one clocked at 500 MHz with 512kB L2 cache. How big is the impact of the L2 cache? Since the clock frequency is different, a simple CPU time comparison of a given program does not answer the question… I tried to answer this question using perf. perf is often used to profile software, but in this case it also proved to be useful to compare two different hardware implementations.

Most CPU’s nowadays have internal counters which count various events (e.g. executed instructions, cache misses, executed branches and branch misses etc…). Other hardware, e.g. cache controllers, might expose performance counters too, but this article focuses on the hardware counters exposed by the CPU. Continue reading “Using the perf utility on ARM”

Linux earlyprintk/earlycon support on ARM

The serial console is a very helpful debugging tool for kernel development. However, when a crash occurs early in the boot process, one is left in the dark. There are two early console implementations available with different merits. This post will throw some lights on them.

earlyprintk

The kernel supports earlyprintk since… probably ever. At least 2.6.12, where the new age (git) started. After enabling “Kernel low-level debugging”, “Early printk” under “Kernel hacking” and selecting an appropriate low-level debugging port, you are ready to get early serial console output. Continue reading “Linux earlyprintk/earlycon support on ARM”