Richard W.M. Jones
2023-Jun-15 18:08 UTC
[Libguestfs] libldm crashes in a linux-sandbox context
On Thu, Jun 15, 2023 at 09:18:38PM +0900, Vincent Mailhol wrote:
> Hello,
>
> I am using libguestfs in a Bazel's linux-sandbox environment[1].
>
> When executing in that sandbox environment, I got frequent crashes.
>
> Please find attached below the results of libguestfs-test-tool when
> run into that linux-sandbox environment. The most relevant part seems
> to be:
>
> [ 0.797233] ldmtool[164]: segfault at 0 ip 0000564a892506a5 sp 00007fff8ee5b900 error 4 in ldmtool[564a8924e000+3000]
> [ 0.798117] Code: 18 64 48 33 1c 25 28 00 00 00 75 5e 48 83 c4 28 5b 5d 41 5c 41 5d 41 5e 41 5f c3 66 2e 0f 1f 84 00 00 00 00 00 e8 db fd ff ff <4c> 8b 20 48 89 44 24 08 4c 89 e7 e8 0b e1 ff ff 45 31 c0 4c 89 e1
> /init: line 154: 164 Segmentation fault ldmtool create all
>
> So the root cause seems to be around libldm. This mailing list seems
> to cover both libguestfs and libldm, so hopefully, I am at the right
> place to ask :)
>
> Needless to say, when run outside of the sandbox environment, no crash
> were observed.
>
> [1] linux-sandbox.cc
> Link: https://github.com/bazelbuild/bazel/blob/master/src/main/tools/linux-sandbox.cc
>
> ---
...
> supermin: picked /sys/block/sdb/dev (8:16) as root device
> supermin: creating /dev/root as block special 8:16
> supermin: mounting new root on /root
> [ 0.678248] EXT4-fs (sdb): mounting ext2 file system using the ext4 subsystem
> [ 0.679832] EXT4-fs (sdb): mounted filesystem without journal. Opts: . Quota mode: none.
> supermin: deleting initramfs files
> supermin: chroot
> Starting /init script ...
> mount: only root can use "--types" option (effective UID is 65534)
> /init: line 38: /proc/cmdline: No such file or directory
> mount: only root can use "--types" option (effective UID is 65534)
> mount: only root can use "--options" option (effective UID is 65534)
> mount: only root can use "--types" option (effective UID is 65534)
> mount: only root can use "--types" option (effective UID is 65534)
> mount: only root can use "--options" option (effective UID is 65534)

It really goes wrong from here, where apparently it's not running as
root (instead UID 65534), even though we're supposed to be running
inside a Linux appliance virtual machine.

Any idea why that would be?

I looked at the sandbox and that would run the qemu process as UID
"nobody" (which might be 65534). However I don't understand why that
would affect anything running on the new kernel inside the appliance.

Rich.

--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
libguestfs lets you edit virtual machines. Supports shell scripting,
bindings from many languages. http://libguestfs.org
Vincent MAILHOL
2023-Jun-16 02:17 UTC
[Libguestfs] libldm crashes in a linux-sandbox context
Hi Richard,

On Fri. 16 Jun. 2023 at 03:08, Richard W.M. Jones <rjones at redhat.com> wrote:
> On Thu, Jun 15, 2023 at 09:18:38PM +0900, Vincent Mailhol wrote:
> > Hello,
> >
> > I am using libguestfs in a Bazel's linux-sandbox environment[1].
> >
> > When executing in that sandbox environment, I got frequent crashes.
> >
> > Please find attached below the results of libguestfs-test-tool when
> > run into that linux-sandbox environment. The most relevant part seems
> > to be:
> >
> > [ 0.797233] ldmtool[164]: segfault at 0 ip 0000564a892506a5 sp 00007fff8ee5b900 error 4 in ldmtool[564a8924e000+3000]
> > [ 0.798117] Code: 18 64 48 33 1c 25 28 00 00 00 75 5e 48 83 c4 28 5b 5d 41 5c 41 5d 41 5e 41 5f c3 66 2e 0f 1f 84 00 00 00 00 00 e8 db fd ff ff <4c> 8b 20 48 89 44 24 08 4c 89 e7 e8 0b e1 ff ff 45 31 c0 4c 89 e1
> > /init: line 154: 164 Segmentation fault ldmtool create all
> >
> > So the root cause seems to be around libldm. This mailing list seems
> > to cover both libguestfs and libldm, so hopefully, I am at the right
> > place to ask :)
> >
> > Needless to say, when run outside of the sandbox environment, no crash
> > were observed.
> >
> > [1] linux-sandbox.cc
> > Link: https://github.com/bazelbuild/bazel/blob/master/src/main/tools/linux-sandbox.cc
> >
> > ---
> ...
> > supermin: picked /sys/block/sdb/dev (8:16) as root device
> > supermin: creating /dev/root as block special 8:16
> > supermin: mounting new root on /root
> > [ 0.678248] EXT4-fs (sdb): mounting ext2 file system using the ext4 subsystem
> > [ 0.679832] EXT4-fs (sdb): mounted filesystem without journal. Opts: . Quota mode: none.
> > supermin: deleting initramfs files
> > supermin: chroot
> > Starting /init script ...
> > mount: only root can use "--types" option (effective UID is 65534)
> > /init: line 38: /proc/cmdline: No such file or directory
> > mount: only root can use "--types" option (effective UID is 65534)
> > mount: only root can use "--options" option (effective UID is 65534)
> > mount: only root can use "--types" option (effective UID is 65534)
> > mount: only root can use "--types" option (effective UID is 65534)
> > mount: only root can use "--options" option (effective UID is 65534)
>
> It really goes wrong from here, where apparently it's not running as
> root (instead UID 65534), even though we're supposed to be running
> inside a Linux appliance virtual machine.
>
> Any idea why that would be?
>
> I looked at the sandbox and that would run the qemu process as UID
> "nobody" (which might be 65534). However I don't understand why that
> would affect anything running on the new kernel inside the appliance.

And you were right. I got a crash inside the sandbox but not outside
of it, so I jumped to the conclusion that the root cause was linked to
the sandbox.

I continued the analysis and went through all the differences between
a successful libguestfs-test-tool log and the failed one. It turned
out that the sandbox was not the cause. The culprit is the first line
of the log:

  TMPDIR=/tmp

If I force TMPDIR=/var/tmp, the problem disappears!

This gave me a minimal reproducer:

  TMPDIR=/tmp/ libguestfs-test-tool

That one crashed outside the sandbox as well. Next, my attention went
to this line:

  libguestfs: checking for previously cached test results of /usr/bin/qemu-system-x86_64, in /tmp/.guestfs-1001

I did a:

  rm -rf /tmp/.guestfs-1001

and that solved my issue \o/

I still do not understand how I ended up running as UID 65534 instead
of root in the first place. I did other qemu experiments, so I am not
sure how, but I somehow got a corrupted environment under
/tmp/.guestfs-1001.
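
For anyone hitting the same symptom, here is a minimal sketch of how to
find which cache directory a given environment would point at before
deleting anything. It assumes that libguestfs looks at
LIBGUESTFS_CACHEDIR first, then TMPDIR, then falls back to /var/tmp,
and that the per-user directory is named .guestfs-<effective UID>, as
the /tmp/.guestfs-1001 path in the log suggests. The function name
guestfs_cache_dir is made up for this example and is not part of
libguestfs.

```python
import os


def guestfs_cache_dir(env, uid):
    """Return the directory where the appliance cache is assumed to live.

    Assumption (not taken from the thread): the lookup order is
    LIBGUESTFS_CACHEDIR, then TMPDIR, then /var/tmp.
    """
    base = env.get("LIBGUESTFS_CACHEDIR") or env.get("TMPDIR") or "/var/tmp"
    # The log in this thread shows /tmp/.guestfs-1001 for TMPDIR=/tmp,
    # so the per-user suffix is taken to be the effective UID.
    return os.path.join(base, ".guestfs-%d" % uid)


if __name__ == "__main__":
    # Only report the path; removing it is left to the user.
    path = guestfs_cache_dir(os.environ, os.geteuid())
    state = "exists" if os.path.isdir(path) else "does not exist"
    print("%s (%s)" % (path, state))
```

Running it with TMPDIR=/tmp/ and with TMPDIR unset shows whether the
two runs share a cache directory, which is the difference that mattered
here.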

One last thing: the segfault in ldmtool [1] still seems to be a valid
issue. Even though I now have a workaround for my problem, that
segfault might be worth a bit more investigation.

Regardless, thanks a lot for your quick answer; it helped me continue
the troubleshooting.

[1] ldmtool line 164
Link: https://github.com/mdbooth/libldm/blob/master/src/ldmtool.c#L164
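
As a starting point for that investigation, the kernel line quoted in
the report can be decoded mechanically. The sketch below assumes the
usual x86 page-fault error code layout (bit 0: page was present, bit 1:
write access, bit 2: fault taken in user mode) and that the number in
"ldmtool[164]" is the process ID rather than a source line. The names
decode_segfault and SEGFAULT_RE are made up for this example.

```python
import re

# Kernel log line copied from the report in this thread.
SAMPLE = ("[ 0.797233] ldmtool[164]: segfault at 0 ip 0000564a892506a5 "
          "sp 00007fff8ee5b900 error 4 in ldmtool[564a8924e000+3000]")

SEGFAULT_RE = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<obj>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Split an x86 'segfault at ...' kernel message into its fields."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        raise ValueError("not an x86 segfault line")
    err = int(m.group("err"), 16)
    return {
        "comm": m.group("comm"),
        "pid": int(m.group("pid")),
        "fault_addr": int(m.group("addr"), 16),
        # Offset of the faulting instruction inside the mapping shown
        # in brackets at the end of the message.
        "offset": int(m.group("ip"), 16) - int(m.group("base"), 16),
        "page_present": bool(err & 1),
        "access": "write" if err & 2 else "read",
        "user_mode": bool(err & 4),
    }


if __name__ == "__main__":
    for key, value in decode_segfault(SAMPLE).items():
        print(key, hex(value) if key in ("fault_addr", "offset") else value)
```

For the line above this gives a user-mode read of address 0 on a page
that is not mapped, which is what a NULL pointer dereference looks
like, at offset 0x26a5 into the mapping. That offset is relative to the
mapped region and is not necessarily the offset from the start of the
ldmtool binary.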