Hibernate Debian running on a Google Compute Engine preemptible VM

Google's Compute Engine VMs which are configured as preemptible are massively cheaper than regular VMs, typically a fourth or even a fifth of the price of a regular machine. That makes them attractive for everything which is not mission critical.

However, it can be quite annoying when all state gets lost on preemption. Luckily, Google does not just turn off the machine but first sends an ACPI G2 Soft-Off signal. On Debian 9 (stretch), the ACPI daemon (acpid) processes ACPI signals and by default shuts down the machine. This post shows how to hibernate instead.

Note: Since Google might start the machine on different (virtual) hardware, resuming the machine might not succeed or, even worse, lead to adverse effects. In practice, it seems to work quite well for me 🙂

First, swap space is required for hibernate to work. By default, Debian 9 images created using Google Compute Engine have no swap space provisioned. Repartitioning the disk with the root filesystem on it is a hassle. However, Linux also allows using a swap file instead. Such a file must be created, and Linux must be told where to find it. Note that Linux is able to hibernate even if the file is smaller than the available memory: unused or easily recoverable data such as the disk cache will not be part of the memory image created during suspend.
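When choosing a size for the swap file, it can help to know how much memory the instance has. A minimal sketch, assuming the usual /proc/meminfo layout; the function name mem_total_kib is made up for illustration:

```shell
# Print the MemTotal value (in KiB) from meminfo-formatted text on stdin.
# mem_total_kib is a hypothetical helper name, not a standard tool.
mem_total_kib() {
    awk '$1 == "MemTotal:" { print $2; exit }'
}
```

On the VM it would be called as mem_total_kib < /proc/meminfo.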

A file of arbitrary size can be allocated using:

# fallocate -l 2G /swapfile
# chmod 600 /swapfile
# mkswap /swapfile

Make sure the file is used as swap space on system boot by adding an appropriate line to /etc/fstab:

/swapfile none swap defaults 0 0
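If the entry is added from a provisioning script rather than by hand, it is worth avoiding duplicates on repeated runs. A minimal sketch; add_swap_entry is a hypothetical helper name, and the file is assumed to end with a newline:

```shell
# Append the swap entry to an fstab-style file unless that exact line exists.
# add_swap_entry is a hypothetical helper; pass /etc/fstab (as root) to apply it.
add_swap_entry() {
    fstab=$1
    entry='/swapfile none swap defaults 0 0'
    # -q: quiet, -x: match whole lines only, -F: treat the entry as a fixed string
    grep -qxF "$entry" "$fstab" 2>/dev/null || printf '%s\n' "$entry" >> "$fstab"
}
```

Running add_swap_entry /etc/fstab twice leaves a single entry in the file.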

The kernel needs to know the first physical block of the file. It is the start of the physical_offset range of the first extent (360448 in this example):

# filefrag -v /swapfile | head
Filesystem type is: ef53
File size of /swapfile is 2147483648 (524288 blocks of 4096 bytes)
 ext:   logical_offset:    physical_offset:    length: expected: flags:
   0:        0.. 30719:    360448.. 391167:     30720:        unwritten
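The offset can also be picked out of that output with a script. A minimal sketch, assuming the filefrag -v layout shown above; first_extent_offset is a hypothetical helper name:

```shell
# Read `filefrag -v` output on stdin and print the first physical block
# of extent 0. first_extent_offset is a hypothetical helper name.
first_extent_offset() {
    # Turn ".." and ":" into spaces so the columns split cleanly, then take
    # the fourth column (physical start) of the row for extent 0.
    awk '{ gsub(/\.\.|:/, " ") } $1 == "0" { print $4; exit }'
}
```

On the VM it would be used as filefrag -v /swapfile | first_extent_offset, run as root.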

Add the root device (which holds the swap file) as well as the offset of the swap file to the kernel arguments by editing /etc/default/grub and appending to the GRUB_CMDLINE_LINUX variable:

GRUB_CMDLINE_LINUX="... resume=/dev/sda1 resume_offset=360448"

Now update GRUB:

# update-grub
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-4.9.0-4-amd64
Found initrd image: /boot/initrd.img-4.9.0-4-amd64

At this point, a reboot is required. After the reboot, the swap space should be ready to be used:

# swapon -s
Filename                   Type       Size Used Priority
/swapfile                  file    2097148 0    -1
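It is also worth confirming that the kernel was actually booted with the new arguments. A minimal sketch that checks a command line string for both parameters; has_resume_args is a hypothetical helper name:

```shell
# Succeed if the given kernel command line carries both resume parameters.
# has_resume_args is a hypothetical helper name.
has_resume_args() {
    # Pad with spaces so each parameter is matched as a whole word prefix.
    case " $1 " in
        *" resume="*) ;;
        *) return 1 ;;
    esac
    case " $1 " in
        *" resume_offset="*) return 0 ;;
        *) return 1 ;;
    esac
}
```

On the VM: has_resume_args "$(cat /proc/cmdline)" && echo "resume arguments present".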

Let’s test whether a user-initiated hibernate works. The system can be instructed to hibernate using systemctl:

# systemctl hibernate

After a few seconds the instance should automatically appear as “stopped” in the Compute Engine VM instances view. Restarting the VM will restore the system to its previous state. SSH sessions often do not survive a suspend/resume cycle, not least because Google typically assigns a new external IP to the instance. Tools such as screen or tmux allow creating sessions that are independent of the SSH connection and can be reattached using screen -r and tmux at, respectively.

To make sure this also happens when Google preempts the virtual machine, the machine needs to respond to ACPI events. In Debian 9, acpid handles ACPI events, and the Soft-Off signal generates an ACPI power button event. The default Debian 9 script /etc/acpi/powerbtn-acpi-support.sh executes shutdown; however, when /etc/acpi/powerbtn.sh is present, an alternative action can be scripted there. Using systemctl in that script, the machine can be instructed to hibernate instead:

systemctl hibernate

Make sure the script is executable:

# chmod +x /etc/acpi/powerbtn.sh
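Both steps can be combined when provisioning a machine. A minimal sketch; install_powerbtn is a hypothetical helper name, and the #!/bin/sh line and the comment inside the generated script are additions that are not part of the original one-line script:

```shell
# Write a minimal power button handler to the given path and mark it
# executable. install_powerbtn is a hypothetical helper; on the VM the
# target would be /etc/acpi/powerbtn.sh (needs root).
install_powerbtn() {
    target=$1
    cat > "$target" <<'EOF'
#!/bin/sh
# Hibernate instead of running the default shutdown.
systemctl hibernate
EOF
    chmod +x "$target"
}
```

On the VM: install_powerbtn /etc/acpi/powerbtn.sh.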

Testing showed that, for some reason, two events are generated (this can be seen by passing “%e” to powerbtn-acpi-support.sh and printing the first argument):

button/power PBTN 00000080 00000000
button/power LNXPWRBN:00 00000080 00000004

Due to those two events, the power button script /etc/acpi/powerbtn.sh gets called twice: once when the machine is being terminated, and a second time right after the machine resumes.

By using a stricter event filter in /etc/acpi/events/powerbtn-acpi-support, acpid can make sure that only one of the two events gets processed:

event=button[ /]power LNXPWRBN

After that a restart of acpid is required:

# systemctl restart acpid
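The effect of the event filter can be checked offline against the two event strings shown earlier. A minimal sketch, assuming that grep -E is a close enough stand-in for the way acpid matches the event= pattern; matches_powerbtn_filter is a hypothetical helper name:

```shell
# Succeed if an ACPI event string matches the filter used in the events file.
# matches_powerbtn_filter is a hypothetical helper; grep -E only approximates
# how acpid applies the event= pattern.
matches_powerbtn_filter() {
    printf '%s\n' "$1" | grep -Eq 'button[ /]power LNXPWRBN'
}
```

With this filter, the LNXPWRBN:00 event matches and the PBTN event does not, so the script runs only once per button press.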

A preemption can be simulated by simply pressing the Stop button in the Compute Engine VM instances view. It causes the same ACPI signal.
