Laszlo Ersek
2023-Mar-20 05:31 UTC
[Libguestfs] Libguestfs Failure on latest Ubuntu 22.04 LTS
On 3/17/23 16:10, Justin Churchey wrote:
> Hello Everyone,
>
> I was having some difficulties converting OVA images yesterday. At
> first, I thought it may have been a compatibility issue with
> VirtualBox 7.0. However, when I went to run libguestfs-test-tool, it
> began failing with the exact same error as the conversions, which
> leads me to believe the issue may lie with libguestfs and not the
> images themselves.
>
> To test further, I created a fresh install of Ubuntu 22.04, and the
> libguestfs-test-tool seems to fail with the same error, even on a
> fresh install. I am attaching the libguestfs-test-tool output for
> reference.
>
> Ubuntu 22.04 is running libguestfs-tools 1.46.2-10ubuntu3
>
> If anybody has any insight into the issue, or if you feel a bug report
> needs to be filed, please let me know.

Your appliance kernel crashes.

Here's my theory on why this might happen, based on your log.

The guestfish appliance runs with KVM acceleration.

The crash happens after/while inserting the modules crc32-pclmul.ko and
crct10dif-pclmul.ko.

The "pclmul" in the names of those modules indicates that these modules
calculate various (crc32) checksums with the PCLMULQDQ instruction. I
believe that PCLMULQDQ is an advanced / accelerated instruction and not
all CPUs may support it.

Your appliance guest is started with "-cpu max" on the QEMU command line
(from libguestfs commit 30f74f38bd6e, "appliance: Use -cpu max.",
2021-01-28). This is probably why the appliance kernel thinks PCLMULQDQ
is available.

I think the PCLMULQDQ instruction may cause an issue here. I don't know
why it misbehaves under KVM, but that's my suspicion anyway.
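
Whether a given CPU advertises the instruction can be checked from the
"flags" line of /proc/cpuinfo, where the feature shows up as "pclmulqdq".
A minimal sketch follows; the helper name has_cpu_flag and its optional
file argument are made up for illustration and are not part of any tool
mentioned in this thread.

```shell
#!/bin/sh
# has_cpu_flag FLAG [CPUINFO_FILE]
# Hypothetical helper: succeeds if FLAG is listed on a "flags" line of
# the given cpuinfo-style file (default: /proc/cpuinfo).
has_cpu_flag() {
    flag=$1
    file=${2:-/proc/cpuinfo}
    # -F treats the flag as a fixed string; -w matches whole words only,
    # so "pclmulqdq" does not match the separate "vpclmulqdq" flag.
    grep '^flags' "$file" | grep -qwF -- "$flag"
}
```

Running "has_cpu_flag pclmulqdq && echo advertised" on the host, and
again inside a guest, would show whether the feature is visible in each.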

Note that the kernel crash log provides the following instruction
(assembly binary) dump:

  46 70 48 8b 56 68 48 03 97 90 01 00 00 48 c1 e0 06 48 03 46 20 48 89 97
  08 02 00 00 48 be ab aa aa aa aa aa aa aa 48 8b 48 10 <48> 89 0a 48 8b
  50 20 48 8b 8f 08 02 00 00 48 89 d0 48 f7 e6 48 c1

with the instruction starting at <48> causing the page fault, as the
direct symptom. Now, we can disassemble this:

  $ printf \
      '%b' \
      '\x46\x70\x48\x8b\x56\x68\x48\x03\x97\x90\x01\x00\x00\x48\xc1\xe0\x06\x48\x03\x46\x20\x48\x89\x97\x08\x02\x00\x00\x48\xbe\xab\xaa\xaa\xaa\xaa\xaa\xaa\xaa\x48\x8b\x48\x10\x48\x89\x0a\x48\x8b\x50\x20\x48\x8b\x8f\x08\x02\x00\x00\x48\x89\xd0\x48\xf7\xe6\x48\xc1' \
      > bin

  $ ndisasm -b64 bin

  00000000  467048            jo 0x4b
  00000003  8B5668            mov edx,[rsi+0x68]
  00000006  48039790010000    add rdx,[rdi+0x190]
  0000000D  48C1E006          shl rax,byte 0x6
  00000011  48034620          add rax,[rsi+0x20]
  00000015  48899708020000    mov [rdi+0x208],rdx
  0000001C  48BEABAAAAAAAAAA  mov rsi,0xaaaaaaaaaaaaaaab
           -AAAA
  00000026  488B4810          mov rcx,[rax+0x10]
  0000002A  48890A            mov [rdx],rcx  <----------- crash
  0000002D  488B5020          mov rdx,[rax+0x20]
  00000031  488B8F08020000    mov rcx,[rdi+0x208]
  00000038  4889D0            mov rax,rdx
  0000003B  48F7E6            mul rsi
  0000003E  48                rex.w
  0000003F  C1                db 0xc1

Note the constant 0xaaaaaaaaaaaaaaab; that seems very special. We can
search the kernel tree for it (I'm not bothering about checking out the
particular ubuntu kernel version for now):

  $ git grep -i aaaaaaaaaaaaaaab
  arch/x86/math-emu/poly_atan.c:/* 0xaaaaaaaaaaaaaaabLL, transferred to fixedpterm[] */
  arch/x86/math-emu/poly_sin.c: 0xaaaaaaaaaaaaaaabLL,
  arch/x86/math-emu/poly_tan.c:static const unsigned long long twothirds = 0xaaaaaaaaaaaaaaabLL;

In particular, the last file (poly_tan.c) contains a snippet like

  mul64_Xsig(&accum, &twothirds);

which seems vaguely related to

  0000001C  48BEABAAAAAAAAAA  mov rsi,0xaaaaaaaaaaaaaaab
           -AAAA
  ...
  0000003B  48F7E6            mul rsi

Now this does not seem connected to PCLMULQDQ, but it does somehow look
connected to multiplication.

I don't really know where to go with this, except for asking KVM experts.

For now, can you try:

  export LIBGUESTFS_BACKEND_SETTINGS=force_tcg

from <https://libguestfs.org/guestfs.3.html#backend-settings>, and see
if that makes a difference?

Laszlo
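
The hand-written printf string used for the disassembly above can also
be generated from the kernel's byte dump. A rough sketch, assuming the
dump has the format shown in this thread (space-separated hex byte
pairs, with <> around the faulting byte); the function name
code_dump_to_bin is invented here, and it uses octal escapes so that it
does not depend on a shell whose printf understands \x.

```shell
#!/bin/sh
# code_dump_to_bin (hypothetical helper): read a kernel code byte dump
# on stdin and write the corresponding raw bytes to stdout.
code_dump_to_bin() {
    # Drop the <> markers, put every token on its own line, and keep
    # only tokens that are exactly two hex digits (this also discards a
    # leading "Code:" label if one is present).
    tr -d '<>' | tr -s ' \t\n' '\n' | grep -E '^[0-9A-Fa-f]{2}$' |
    while read -r h; do
        # The inner printf renders the byte as a \0ooo octal escape;
        # the outer printf's %b expands that escape into the raw byte.
        printf '%b' "$(printf '\\0%03o' "0x$h")"
    done
}
```

With the dump saved to a file (say dump.txt, a placeholder name),
"code_dump_to_bin < dump.txt > bin" would produce the input for
"ndisasm -b64 bin".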
Justin Churchey
2023-Mar-20 15:47 UTC
[Libguestfs] Libguestfs Failure on latest Ubuntu 22.04 LTS
Hello Laszlo,

Thank you for the rundown. I enabled the additional
LIBGUESTFS_BACKEND_SETTINGS, and I have attached a follow-up to the
libguestfs-test-tool output.

I also checked my CPU settings (cat /proc/cpuinfo output attached), and
the host does appear to support PCLMULQDQ (AMD Ryzen 7 5700X). I also
checked the cpuinfo in one of the guests I have created (Ubuntu 18.04,
unstable due to intermittent kernel panics), and it indicates that this
feature is passed down to my guest as well.

I noticed that the libguestfs-test-tool didn't seem to like the qemu
settings it tried to boot with. So, I went back to basics and built a
disk using qemu-img (qcow2) and used qemu-system-x86_64 to do the base
install (Ubuntu 18.04). The resulting image boots, and I imported it
with virt-install. However, the GUI/console seems to lock up shortly
after boot if I am using virt-tools. The guest seems more stable when I
boot it directly with `qemu-system`, and this may be my workaround for
now.

In virt-tools, I can consistently get a panic on the guest by trying to
enable the qemu-guest-agent: `systemctl enable qemu-guest-agent`.
Unfortunately, I cannot get the full output from that panic (attached).

It would seem that this problem is more than just libguestfs-tools. Is
there a KVM listserv that this might be more appropriate for?

Sincerely,

-------------- next part --------------
A non-text attachment was scrubbed...
Name: cpuinfo.out
Type: application/octet-stream
Size: 4044 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/libguestfs/attachments/20230320/5dbc2e66/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: libguestfs-test-tool.out
Type: application/octet-stream
Size: 39221 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/libguestfs/attachments/20230320/5dbc2e66/attachment-0001.obj>