Hi, I am getting a weird libguestfs failure from time to time: 2020-08-27 08:48:40 T libguestfs - 0 - appliance - SeaBIOS (version 1.12.0-1) 2020-08-27 08:48:40 T libguestfs - 0 - appliance - Booting from ROM... 2020-08-27 08:48:40 T libguestfs - 0 - library - child_cleanup: 0xd5a840: child process died Let me elaborate more about my environment I am running libguestfs 1.42.0 on GCP - that's nested virtualization. Using python bindings. Inside a docker. I wrote an output hook in python to emit logs more nicely. The supermin image is debian 10. The failure seems to happen from time to time just as I launch 16 "libguestfs"s on my 16 core machine. (This is a testing framework, so we are using libguestfs here to inspect and change different workloads, very similar to virt-v2v) Here is the full log of the sporadic failure: 2020-08-27 08:48:38 T libguestfs - 0 - enter - set_memsize 2020-08-27 08:48:38 T libguestfs - 0 - enter - add_drive 2020-08-27 08:48:38 T libguestfs - 0 - enter - launch 2020-08-27 08:48:38 T libguestfs - 0 - enter - max_disks 2020-08-27 08:48:38 T libguestfs - 0 - enter - get_tmpdir 2020-08-27 08:48:38 T libguestfs - 0 - enter - version 2020-08-27 08:48:38 T libguestfs - 0 - enter - get_backend 2020-08-27 08:48:38 T libguestfs - 0 - library - launch: program=my-v2v 2020-08-27 08:48:38 T libguestfs - 0 - library - launch: version=1.42.0 2020-08-27 08:48:38 T libguestfs - 0 - library - launch: backend registered: unix 2020-08-27 08:48:38 T libguestfs - 0 - library - launch: backend registered: uml 2020-08-27 08:48:38 T libguestfs - 0 - library - launch: backend registered: libvirt 2020-08-27 08:48:38 T libguestfs - 0 - library - launch: backend registered: direct 2020-08-27 08:48:38 T libguestfs - 0 - library - launch: backend=direct 2020-08-27 08:48:38 T libguestfs - 0 - library - launch: tmpdir=/tmp/libguestfszNoLuV 2020-08-27 08:48:38 T libguestfs - 0 - library - launch: umask=0022 2020-08-27 08:48:38 T libguestfs - 0 - library - launch: euid=0 2020-08-27 08:48:38 T libguestfs - 0 - library - begin testing qemu features 2020-08-27 08:48:38 T libguestfs - 0 - enter - get_cachedir 2020-08-27 08:48:38 T libguestfs - 0 - library - checking for previously cached test results of /usr/bin/kvm, in /var/tmp/.guestfs-0 2020-08-27 08:48:38 T libguestfs - 0 - library - loading previously cached test results 2020-08-27 08:48:38 T libguestfs - 0 - library - qemu version: 3.1 2020-08-27 08:48:38 T libguestfs - 0 - library - qemu mandatory locking: yes 2020-08-27 08:48:38 T libguestfs - 0 - library - qemu KVM: disabled 2020-08-27 08:48:38 T libguestfs - 0 - enter - get_backend_setting 2020-08-27 08:48:38 T libguestfs - 0 - enter - get_sockdir 2020-08-27 08:48:38 T libguestfs - 0 - library - finished testing qemu features 2020-08-27 08:48:38 T libguestfs - 0 - enter - get_backend_setting 2020-08-27 08:48:38 T libguestfs - 0 - library - command: run: dmesg | grep -Eoh 'lpj=[[:digit:]]+' 2020-08-27 08:48:38 T libguestfs - 0 - library - read_lpj_from_dmesg: calculated lpj=8800000 2020-08-27 08:48:38 T libguestfs - 0 - library - command: run: file 2020-08-27 08:48:38 T libguestfs - 0 - library - command: run: \ -- /appliance/root 2020-08-27 08:48:38 T libguestfs - 0 - appliance - /usr/bin/kvm \ 2020-08-27 08:48:38 T libguestfs - 0 - appliance - -global virtio-blk-pci.scsi=off \ 2020-08-27 08:48:38 T libguestfs - 0 - appliance - -no-user-config \ 2020-08-27 08:48:38 T libguestfs - 0 - appliance - -enable-fips \ 2020-08-27 08:48:38 T libguestfs - 0 - appliance - -nodefaults \ 2020-08-27 08:48:38 T libguestfs - 0 - appliance - -display none \ 2020-08-27 08:48:38 T libguestfs - 0 - appliance - -machine accel=kvm:tcg \ 2020-08-27 08:48:38 T libguestfs - 0 - appliance - -m 2048 \ 2020-08-27 08:48:38 T libguestfs - 0 - appliance - -no-reboot \ 2020-08-27 08:48:38 T libguestfs - 0 - appliance - -rtc driftfix=slew \ 2020-08-27 08:48:38 T libguestfs - 0 - appliance - -no-hpet \ 2020-08-27 08:48:38 T libguestfs - 0 - appliance - -global kvm-pit.lost_tick_policy=discard \ 2020-08-27 08:48:38 T libguestfs - 0 - appliance - -kernel /appliance/kernel \ 2020-08-27 08:48:38 T libguestfs - 0 - appliance - -initrd /appliance/initrd \ 2020-08-27 08:48:38 T libguestfs - 0 - appliance - -object rng-random,filename=/dev/urandom,id=rng0 \ 2020-08-27 08:48:38 T libguestfs - 0 - appliance - -device virtio-rng-pci,rng=rng0 \ 2020-08-27 08:48:38 T libguestfs - 0 - appliance - -device virtio-scsi-pci,id=scsi \ 2020-08-27 08:48:38 T libguestfs - 0 - appliance - -drive file=/tmp/xww-a1dtzg8i.qcow2,cache=writeback,format=qcow2,id=hd0,if=none \ 2020-08-27 08:48:38 T libguestfs - 0 - appliance - -device scsi-hd,drive=hd0 \ 2020-08-27 08:48:38 T libguestfs - 0 - appliance - -drive file=/appliance/root,snapshot=on,id=appliance,cache=unsafe,if=none,format=raw \ 2020-08-27 08:48:38 T libguestfs - 0 - appliance - -device scsi-hd,drive=appliance \ 2020-08-27 08:48:38 T libguestfs - 0 - appliance - -device virtio-serial-pci \ 2020-08-27 08:48:38 T libguestfs - 0 - appliance - -serial stdio \ 2020-08-27 08:48:38 T libguestfs - 0 - appliance - -device sga \ 2020-08-27 08:48:38 T libguestfs - 0 - appliance - -chardev socket,path=/tmp/libguestfsISN6Xx/guestfsd.sock,id=channel0 \ 2020-08-27 08:48:38 T libguestfs - 0 - appliance - -device virtserialport,chardev=channel0,name=org.libguestfs.channel.0 \ 2020-08-27 08:48:38 T libguestfs - 0 - appliance - -append "panic=1 console=ttyS0 edd=off udevtimeout=6000 udev.event-timeout=6000 no_timer_check lpj=8800000 printk.time=1 cgroup_disable=memory usbcore.nousb cryptomgr.notests tsc=reliable 8250.nr_uarts=1 root=UUID=dd6872eb-d621-499d-be42-64d78754c255 selinux=0 guestfs_verbose=1 TERM=linux vsyscall=emulate" 2020-08-27 08:48:40 T libguestfs - 0 - library - responding to serial console Device Status Report 2020-08-27 08:48:40 T libguestfs - 0 - appliance - ^[[1;256r^[[256;256H^[[6n 2020-08-27 08:48:40 T libguestfs - 0 - appliance - Google, Inc. 2020-08-27 08:48:40 T libguestfs - 0 - appliance - Serial Graphics Adapter 03/05/17 2020-08-27 08:48:40 T libguestfs - 0 - appliance - SGABIOS $Id: sgabios.S 8 2010-04-22 00:03:40Z nlaredo $ (generic@generic) Sun, 05 Mar 2017 16:09:17 +0100 2020-08-27 08:48:40 T libguestfs - 0 - appliance - Term: 80x24 2020-08-27 08:48:40 T libguestfs - 0 - appliance - 4 0 2020-08-27 08:48:40 T libguestfs - 0 - appliance - ^[[2J 2020-08-27 08:48:40 T libguestfs - 0 - appliance - SeaBIOS (version 1.12.0-1) 2020-08-27 08:48:40 T libguestfs - 0 - appliance - Booting from ROM... 2020-08-27 08:48:40 T libguestfs - 0 - library - child_cleanup: 0xd5a840: child process died 2020-08-27 08:48:40 T libguestfs - 0 - library - sending SIGTERM to process 459 2020-08-27 08:48:40 T libguestfs - 0 - library - qemu maxrss 60136K 2020-08-27 08:48:40 T libguestfs - 0 - subprocess_quit 2020-08-27 08:48:40 T libguestfs - 0 - library - closing guestfs handle 0xd5a840 (state 0) 2020-08-27 08:48:40 T libguestfs - 0 - close 2020-08-27 08:48:40 T libguestfs - 0 - library - command: run: rm 2020-08-27 08:48:40 T libguestfs - 0 - library - command: run: \ -rf /tmp/libguestfszNoLuV 2020-08-27 08:48:40 T libguestfs - 0 - library - command: run: rm 2020-08-27 08:48:40 T libguestfs - 0 - library - command: run: \ -rf /tmp/libguestfsISN6Xx Usually everything is working and the output is: 2020-08-27 07:41:10 T libguestfs - 0 - appliance - SeaBIOS (version 1.12.0-1) 2020-08-27 07:41:11 T libguestfs - 0 - appliance - Booting from ROM... 2020-08-27 07:41:12 T libguestfs - 0 - appliance - ^[[2J[ 0.000000] Linux version 4.19.0-10-amd64 (debian-kernel@lists.debian.org) (gcc version 8.3.0 (Debian 8.3.0-6)) #1 SMP Debian 4.19.132-1 (2020-07-24) 2020-08-27 07:41:12 T libguestfs - 0 - appliance - [ 0.000000] Command line: panic=1 console=ttyS0 edd=off udevtimeout=6000 udev.event-timeout=6000 no_timer_check lpj=8800000 printk.time=1 cgroup_disable=memory usbcore.nousb cryptomgr.notests tsc=reliable 8250.nr_uarts=1 root=UUID=dd6872eb-d621-499d-be42-64d78754c255 selinux=0 guestfs_verbose=1 TERM=linux vsyscall=emulate It seems that the qemu process has died somewhere between the ROM boot and the kernel loading. Any ideas how to debug this? Thanks! Sam
Richard W.M. Jones
2020-Aug-27 17:07 UTC
Re: [Libguestfs] Unknown libguestfs failure / race
On Thu, Aug 27, 2020 at 03:59:30PM +0300, Sam Eiderman wrote:> 2020-08-27 08:48:40 T libguestfs - 0 - appliance - Term: 80x24 > 2020-08-27 08:48:40 T libguestfs - 0 - appliance - 4 0 > 2020-08-27 08:48:40 T libguestfs - 0 - appliance - ^[[2J > 2020-08-27 08:48:40 T libguestfs - 0 - appliance - SeaBIOS (version 1.12.0-1) > 2020-08-27 08:48:40 T libguestfs - 0 - appliance - Booting from ROM... > 2020-08-27 08:48:40 T libguestfs - 0 - library - child_cleanup: > 0xd5a840: child process died...> It seems that the qemu process has died somewhere between the ROM boot > and the kernel loading. > > Any ideas how to debug this?qemu failures are the worst ... First of all, enable core dumps. Maybe qemu is dumping core, and if it is then (a) it's a qemu bug and (b) the core dump will tell you (or the qemu developers) exactly what's wrong. Also look at the host dmesg. Since you're starting many instances of qemu it might be that the host is running out of memory, or there's another kernel resource being exhausted. The kernel will print something if that happens. It's also probably best to try to get a reproducer using libguestfs-test-tool in a loop (or more likely, 16 loops all running libguestfs-test-tool as fast as they can). Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com Fedora Windows cross-compiler. Compile Windows programs, test, and build Windows installers. Over 100 libraries supported. http://fedoraproject.org/wiki/MinGW