Google Compute Engine VMs that are configured as preemptible are massively cheaper than regular VMs, typically a fourth or even a fifth of the price of a regular machine. That makes them attractive for everything which is not mission critical.
However, it can be quite annoying when all state gets lost. Luckily, Google does not just turn off the machine but sends an ACPI G2 Soft-Off signal. On Debian 9 (stretch), the ACPI daemon (acpid) processes ACPI signals and by default shuts down the machine. This post shows how to use hibernate instead.
Note: Since Google might start the machine on different (virtual) hardware, resuming the machine might not succeed or, even worse, might lead to adverse effects. In practice, it seems to work quite well for me.
First, swap space is required in order for hibernate to work. By default, Debian 9 images created using Google Compute Engine have no swap space provisioned. Repartitioning the disk with the root filesystem on it is a hassle. However, Linux also allows using a swap file instead. Such a file must be created, and Linux must be told where to find it. Note that Linux can hibernate even if the file is smaller than the available memory: unused or easily recoverable data such as the disk cache will not be part of the memory image created during suspend.
A file of arbitrary size can be allocated using:
# fallocate -l 2G /swapfile
# chmod 600 /swapfile
# mkswap /swapfile
Make sure the file is used as swap space on system boot by adding an appropriate line to /etc/fstab:
/swapfile none swap defaults 0 0
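Editing /etc/fstab by hand works fine. If the step is scripted, it helps to avoid adding the line twice. The following is a minimal sketch of that idea; the function name add_swap_entry and its optional path argument are made up for illustration and are not part of any Debian tooling.

```shell
#!/bin/sh
# Append the swap entry to an fstab-style file unless it is already present.
# The function name and the optional path argument are illustrative only.
add_swap_entry() {
    fstab="${1:-/etc/fstab}"
    if ! grep -q '^/swapfile[[:space:]]' "$fstab"; then
        printf '%s\n' '/swapfile none swap defaults 0 0' >> "$fstab"
    fi
}

# Usage (as root): add_swap_entry
```

Calling the function a second time leaves the file unchanged.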
The kernel needs to know the first physical block of the file:
# filefrag -v /swapfile | head
Filesystem type is: ef53
File size of /swapfile is 2147483648 (524288 blocks of 4096 bytes)
 ext:     logical_offset:        physical_offset: length:   expected: flags:
   0:        0..   30719:     360448..    391167:  30720:             unwritten
...
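The value of interest is the start of the physical_offset range of extent 0 (360448 in the output above). Instead of reading it off by eye, it can be extracted with awk. This is a small sketch that assumes the column layout shown above; the function name resume_offset_from_filefrag is hypothetical.

```shell
#!/bin/sh
# Read `filefrag -v` output on stdin and print the first physical block
# of the file, i.e. the start of the physical_offset range of extent 0.
resume_offset_from_filefrag() {
    awk '$1 == "0:" { sub(/\.\.$/, "", $4); print $4; exit }'
}

# Usage (as root): filefrag -v /swapfile | resume_offset_from_filefrag
```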
Pass the root device (which holds the swap file) and the offset of the swap file to the kernel by editing /etc/default/grub and appending both arguments to the GRUB_CMDLINE_LINUX variable:
GRUB_CMDLINE_LINUX="... resume=/dev/sda1 resume_offset=360448"
Now update Grub:
# update-grub
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-4.9.0-4-amd64
Found initrd image: /boot/initrd.img-4.9.0-4-amd64
done
At this point, a reboot is required. After the reboot swap space should be ready to be used:
# swapon -s
Filename                                Type            Size    Used    Priority
/swapfile                               file            2097148 0       -1
Let’s test whether a user-initiated hibernate works. The system can be instructed to hibernate using systemctl:
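After the reboot it is also worth checking that the kernel was actually booted with the resume arguments. The following sketch inspects a kernel command line string for both of them; the function name has_resume_args is made up for this example.

```shell
#!/bin/sh
# Succeed only if a kernel command line contains both resume arguments.
has_resume_args() {
    case " $1 " in
        *" resume="*) ;;
        *) return 1 ;;
    esac
    case " $1 " in
        *" resume_offset="*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage: has_resume_args "$(cat /proc/cmdline)" && echo "resume arguments present"
```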
# systemctl hibernate
After a few seconds the instance should automatically appear as “stopped” in the Compute Engine VM instances view. Restarting the VM will restore the system to its previous state. SSH sessions often do not survive a suspend/resume cycle, partly because Google typically assigns a new external IP to the instance. Tools such as screen or tmux provide sessions that are independent of the SSH connection and can be reattached using screen -r and tmux at, respectively.
To make sure this also happens when Google preempts the virtual machine, the machine needs to respond to ACPI events. In Debian 9, acpid handles ACPI events, and the Soft-Off signal generates an ACPI power button event. The default Debian 9 script /etc/acpi/powerbtn-acpi-support.sh executes shutdown. However, when /etc/acpi/powerbtn.sh is present, an alternative action can be scripted. With the following content in /etc/acpi/powerbtn.sh, the machine is instructed to hibernate instead:
#!/bin/sh
systemctl hibernate
Make sure the script is executable:
# chmod +x /etc/acpi/powerbtn.sh
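The two-line script above is all that is needed. A slightly more defensive variant could fall back to a regular shutdown when hibernation cannot be started. The sketch below assumes that systemctl hibernate exits with a non-zero status in that case. HIBERNATE_CMD and SHUTDOWN_CMD are hypothetical variables introduced only so the function can be tried out without affecting the machine.

```shell
#!/bin/sh
# Try to hibernate; if that cannot be initiated, fall back to a shutdown.
# HIBERNATE_CMD and SHUTDOWN_CMD are hypothetical override variables that
# exist only to make the function testable without touching the machine.
hibernate_or_shutdown() {
    ${HIBERNATE_CMD:-systemctl hibernate} || ${SHUTDOWN_CMD:-shutdown -h now}
}

# In /etc/acpi/powerbtn.sh the last line would be: hibernate_or_shutdown
```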
Testing showed that, for some reason, two events are generated (this can be seen by passing “%e” to powerbtn-acpi-support.sh and printing the first argument):
button/power PBTN 00000080 00000000
button/power LNXPWRBN:00 00000080 00000004
Because of those two events, the power button script /etc/acpi/powerbtn.sh gets called twice: once when the machine is terminated, and a second time right after the machine resumes the next time…
By using an event filter in /etc/acpi/events/powerbtn-acpi-support acpid can make sure that only one event gets processed:
event=button[ /]power LNXPWRBN
action=/etc/acpi/powerbtn-acpi-support.sh
After that a restart of acpid is required:
# systemctl restart acpid
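The effect of the event filter can be checked offline against the two event strings observed earlier. In this sketch grep -E stands in for acpid's own regular-expression matching, which is an approximation; the function name filter_events is made up.

```shell
#!/bin/sh
# Print only those event lines that the filter pattern would select.
# grep -E stands in for acpid's own regular-expression matching here.
filter_events() {
    grep -E 'button[ /]power LNXPWRBN'
}

# Usage: printf '%s\n' 'button/power PBTN 00000080 00000000' | filter_events
```

Only the LNXPWRBN event passes the filter, so the PBTN event no longer triggers the script.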
A preemption can be simulated by simply pressing the Stop button in the Compute Engine VM instances view, which causes the same ACPI signal.