Vincent MAILHOL
2023-Jun-19 11:18 UTC
[Libguestfs] libldm crashes in a linux-sandbox context
On Fri. 16 June 2023 at 16:34, Richard W.M. Jones <rjones at redhat.com> wrote:
(...)
> > Last thing, the segfault on ldmtool [1] still seems a valid issue.
> > Even if I now do have a workaround for my problem, that segfault might
> > be worth a bit more investigation.
>
> Yes that does look like a real problem.  Does it crash if you just run
> ldmtool as a normal command, nothing to do with libguestfs?  Might be
> a good idea to try to get a stack trace of the crash.

The fact is that it only crashes with the UID 65534 in the qemu VM. I
am not sure what command line is passed to ldmtool for this crash to
occur.

I can help to gather information, but my biggest issue is that I do
not know how to interact with the VM under /tmp/.guestfs-1001/

[ 0.777352] ldmtool[164]: segfault at 0 ip 0000563a225cd6a5 sp 00007ffe54965a60 error 4 in ldmtool[563a225cb000+3000]
                                   ^^^^ ^^^^^^^^^^^^^^^^^^^
This smells like a NULL pointer dereference. The instruction pointer
being 563a225cd6a5, I installed libguestfs-tools-dbgsym and tried a:

  addr2line -e /usr/bin/ldmtool 564a892506a5

Results:

  ??:0

Without conviction, I also tried in GDB:

  $ gdb /usr/bin/ldmtool
  (...)
  Reading symbols from /usr/bin/ldmtool...
  Reading symbols from /usr/lib/debug/.build-id/21/37b4a64903ebe427c242be08b8d496ba570583.debug...
  (gdb) info line *0x564a892506a5
  No line number information available for address 0x564a892506a5

Debug symbols are correctly installed, but it is impossible to convert
that instruction pointer into a line number. It is as if the ldmtool on
my host and the ldmtool in the qemu VM were from a different build. I
tried to mount /tmp/.guestfs-1001/appliance.d/root but that disk image
did not contain ldmtool.

I am not sure how to generate a stack trace or a core dump within that
qemu VM. If you can tell me how to get an interactive prompt (or any
other guidance) I can try to collect more information.

Yours sincerely,
Vincent Mailhol
On 6/19/23 13:18, Vincent MAILHOL wrote:
> On Fri. 16 June 2023 at 16:34, Richard W.M. Jones <rjones at redhat.com> wrote:
> (...)
>>> Last thing, the segfault on ldmtool [1] still seems a valid issue.
>>> Even if I now do have a workaround for my problem, that segfault might
>>> be worth a bit more investigation.
>>
>> Yes that does look like a real problem.  Does it crash if you just run
>> ldmtool as a normal command, nothing to do with libguestfs?  Might be
>> a good idea to try to get a stack trace of the crash.
>
> The fact is that it only crashes with the UID 65534 in the qemu VM. I
> am not sure what command line is passed to ldmtool for this crash to
> occur.
>
> I can help to gather information, but my biggest issue is that I do
> not know how to interact with the VM under /tmp/.guestfs-1001/
>
> [ 0.777352] ldmtool[164]: segfault at 0 ip 0000563a225cd6a5 sp 00007ffe54965a60 error 4 in ldmtool[563a225cb000+3000]
>                                    ^^^^ ^^^^^^^^^^^^^^^^^^^
> This smells like a NULL pointer dereference.

... Hey this is actually my line from an email I started writing
earlier today :) , but I then decided not to send it.

It certainly looks like a null pointer dereference, and if you
disassemble the instruction byte stream dump (the "Code:" line from the
kernel log) with (e.g.) ndisasm, that confirms it. You get something
like

  00000025  E8DBFDFFFF        call 0xfffffffffffffe05
  0000002A  4C8B20            mov r12,[rax]            <---- crash
  0000002D  4889442408        mov [rsp+0x8],rax
  00000032  4C89E7            mov rdi,r12
  00000035  E80BE1FFFF        call 0xffffffffffffe145

with the "mov r12,[rax]" instruction faulting (with the previously
called function presumably having returned 0 in rax). See the
"<4c> 8b 20" substring in the "Code:" line -- the angle brackets point
at the first byte of the crashing instruction.

I didn't send the email ultimately because your email included a link
[1] pointing at a particular line number:

  https://github.com/mdbooth/libldm/blob/master/src/ldmtool.c#L164

and so I assumed you actually traced the crash to that line. Is that
the case? Or did you perhaps mistake *PID* 164 (from the kernel log)
for the line number?

> The instruction pointer
> being 563a225cd6a5, I installed libguestfs-tools-dbgsym and tried a:
>
>   addr2line -e /usr/bin/ldmtool 564a892506a5
>
> Results:
>
>   ??:0
>
> Without conviction, I also tried in GDB:
>
>   $ gdb /usr/bin/ldmtool
>   (...)
>   Reading symbols from /usr/bin/ldmtool...
>   Reading symbols from /usr/lib/debug/.build-id/21/37b4a64903ebe427c242be08b8d496ba570583.debug...
>   (gdb) info line *0x564a892506a5
>   No line number information available for address 0x564a892506a5
>
> Debug symbols are correctly installed but impossible to convert that
> instruction pointer into a line number. It is as if the ldmtool on my
> host and the ldmtool in the qemu VM were from a different build. I
> tried to mount /tmp/.guestfs-1001/appliance.d/root but that disk image
> did not contain ldmtool.
>
> I am not sure how to generate a stack trace or a core dump within that
> qemu VM. If you can tell me how to get an interactive prompt (or any
> other guidance) I can try to collect more information.

The IP where the crash occurs is 0000563a225cd6a5. The ldmtool binary
(as opposed to a shared object / library) is mapped into the process's
address space at 563a225cb000, for a length of 0x3000 bytes. So the
offending instruction is supposed to be at offset

  0000563a225cd6a5 - 563a225cb000 = 26A5

within that mapping.

With the debug symbols installed, can you attach the output of

  objdump --headers --wide -S /usr/bin/ldmtool

? Can you try

  addr2line -p -i -f -e /usr/bin/ldmtool 26A5

? (This still may not be good enough; we might have to offset the
difference 0x26A5 with some address related to the .text section... The
objdump output should help us experiment.)

Laszlo
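
The subtraction above can be reproduced mechanically. The following is
only a minimal sketch in Python, not part of libguestfs or libldm: the
regular expression and the helper name mapping_offset are made up for
illustration, and the pattern assumes the kernel message has exactly the
layout quoted in this thread. As noted above, the resulting offset is
relative to the start of the mapping and may still need adjusting before
addr2line accepts it.

```python
import re

# Hypothetical pattern for the kernel's user-space segfault message,
# modelled only on the single log line quoted in this thread.
SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) "
    r"sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<name>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def mapping_offset(log_line):
    """Return (object name, offset of the faulting IP inside its mapping)."""
    match = SEGFAULT_RE.search(log_line)
    if match is None:
        raise ValueError("not a recognised segfault line")
    ip = int(match["ip"], 16)
    base = int(match["base"], 16)
    size = int(match["size"], 16)
    # Sanity check: the IP must lie inside the mapping the kernel reported.
    if not base <= ip < base + size:
        raise ValueError("instruction pointer is outside the reported mapping")
    return match["name"], ip - base


LINE = (
    "[ 0.777352] ldmtool[164]: segfault at 0 ip 0000563a225cd6a5 "
    "sp 00007ffe54965a60 error 4 in ldmtool[563a225cb000+3000]"
)

name, offset = mapping_offset(LINE)
print(f"{name}: offset {offset:#x}")
```

The printed offset is the value to try with addr2line; the 164 in
"ldmtool[164]" is deliberately not captured, since it is the PID and
not a source line number.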
Richard W.M. Jones
2023-Jun-20 08:10 UTC
[Libguestfs] libldm crashes in a linux-sandbox context
On Mon, Jun 19, 2023 at 08:18:20PM +0900, Vincent MAILHOL wrote:
> On Fri. 16 June 2023 at 16:34, Richard W.M. Jones <rjones at redhat.com> wrote:
> (...)
> > > Last thing, the segfault on ldmtool [1] still seems a valid issue.
> > > Even if I now do have a workaround for my problem, that segfault might
> > > be worth a bit more investigation.
> >
> > Yes that does look like a real problem.  Does it crash if you just run
> > ldmtool as a normal command, nothing to do with libguestfs?  Might be
> > a good idea to try to get a stack trace of the crash.
>
> The fact is that it only crashes with the UID 65534 in the qemu VM. I
> am not sure what command line is passed to ldmtool for this crash to
> occur.
>
> I can help to gather information, but my biggest issue is that I do
> not know how to interact with the VM under /tmp/.guestfs-1001/

I think you've solved the problem now, but for future reference you
can run:

  $ virt-rescue

(there are various options, see the manual). This will create a
virtual machine with the appliance and drop you into a shell.

Rich.

> [ 0.777352] ldmtool[164]: segfault at 0 ip 0000563a225cd6a5 sp 00007ffe54965a60 error 4 in ldmtool[563a225cb000+3000]
>                                    ^^^^ ^^^^^^^^^^^^^^^^^^^
> This smells like a NULL pointer dereference. The instruction pointer
> being 563a225cd6a5, I installed libguestfs-tools-dbgsym and tried a:
>
>   addr2line -e /usr/bin/ldmtool 564a892506a5
>
> Results:
>
>   ??:0
>
> Without conviction, I also tried in GDB:
>
>   $ gdb /usr/bin/ldmtool
>   (...)
>   Reading symbols from /usr/bin/ldmtool...
>   Reading symbols from /usr/lib/debug/.build-id/21/37b4a64903ebe427c242be08b8d496ba570583.debug...
>   (gdb) info line *0x564a892506a5
>   No line number information available for address 0x564a892506a5
>
> Debug symbols are correctly installed but impossible to convert that
> instruction pointer into a line number. It is as if the ldmtool on my
> host and the ldmtool in the qemu VM were from a different build. I
> tried to mount /tmp/.guestfs-1001/appliance.d/root but that disk image
> did not contain ldmtool.
>
> I am not sure how to generate a stack trace or a core dump within that
> qemu VM. If you can tell me how to get an interactive prompt (or any
> other guidance) I can try to collect more information.
>
>
> Yours sincerely,
> Vincent Mailhol

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-builder quickly builds VMs from scratch
http://libguestfs.org/virt-builder.1.html