Rom Freiman
2014-Jan-16 21:25 UTC
Re: [Libguestfs] Double fault panic in L2 upon v2v conversion
Thanks, Richard, for the fast reply.

Yes, indeed, I'm working in a nested environment. I run v2v inside a VM (L1), and the conversion process creates an L2. This is on Intel. As I wrote, it fails once every few runs, mainly when there is memory pressure on L0.

Kashyap, can you please share your experience? Why should it crash during a nested conversion? I'm not too familiar with the libguestfs logic; maybe you can point out for me, according to the logs, at what stage of the conversion the failure happens.

Thanks,
Rom

On Thu, Jan 16, 2014 at 5:23 PM, Richard W.M. Jones <rjones@redhat.com> wrote:
> On Wed, Jan 15, 2014 at 04:35:29PM +0200, Rom Freiman wrote:
> > Hi everybody,
> >
> > I wanted to hear your opinion and get some advice.
> >
> > I'm trying to use virt-v2v to convert an OVA image (exported from
> > vCenter) to run on libvirt/KVM, all of this inside a Fedora VM.
> > The converted image is also Fedora.
> > During the conversion, at some point in the libguestfs activity, I get
> > a double fault panic from L2 (printed as part of the libguestfs output)
> > and the conversion fails. No errors appear in either the L0 or the L1
> > message logs.
>
> Are you using nested KVM?
>
> Kashyap (CC'd) has done a lot of testing on nested KVM on *Intel*,
> never with satisfactory results. It just doesn't work very well.
>
> On AMD it is a different story -- nested KVM just works.
>
> Rich.
>
> --
> Richard Jones, Virtualization Group, Red Hat
> http://people.redhat.com/~rjones
> Fedora Windows cross-compiler. Compile Windows programs, test, and
> build Windows installers. Over 100 libraries supported.
> http://fedoraproject.org/wiki/MinGW
Richard W.M. Jones
2014-Jan-16 21:45 UTC
Re: [Libguestfs] Double fault panic in L2 upon v2v conversion
On Thu, Jan 16, 2014 at 11:25:10PM +0200, Rom Freiman wrote:
> Thanks, Richard, for the fast reply.
>
> Yes, indeed, I'm working in a nested environment. I run v2v inside a
> VM (L1), and the conversion process creates an L2. This is on Intel. As I
> wrote, it fails once every few runs, mainly when there is memory pressure
> on L0.

Just to confirm, the 'nested' flag is enabled on the L0 kvm_intel.ko module?

If not enabled, libguestfs will use TCG which -- while not exactly perfect -- will usually run the appliance OK, albeit slowly.

Rich.

--
Richard Jones, Virtualization Group, Red Hat
http://people.redhat.com/~rjones
virt-top is 'top' for virtual machines. Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://people.redhat.com/~rjones/virt-top
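For readers who want to check the 'nested' flag programmatically, here is a minimal sketch. It assumes the usual sysfs location of module parameters (/sys/module/kvm_intel/parameters/nested) and that the value reads as "Y"/"N" or "1"/"0" depending on the kernel. The function names are hypothetical and are not part of libguestfs or KVM tooling.

```python
from pathlib import Path

# Assumed sysfs location of the kvm_intel 'nested' module parameter.
NESTED_PARAM = Path("/sys/module/kvm_intel/parameters/nested")


def parse_nested_flag(raw):
    """Interpret the raw parameter text; kernels report 'Y'/'N' or '1'/'0'."""
    return raw.strip().lower() in ("y", "1")


def nested_kvm_enabled(param_path=NESTED_PARAM):
    """Return True/False for the flag, or None if it cannot be read
    (for example, kvm_intel is not loaded or this is not an Intel host)."""
    try:
        return parse_nested_flag(param_path.read_text())
    except OSError:
        return None


if __name__ == "__main__":
    print("nested KVM enabled on this host:", nested_kvm_enabled())
```

Run it on L0. A result of None only means the parameter file could not be read; it says nothing about whether nesting is supported.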
Rom Freiman
2014-Jan-16 21:52 UTC
Re: [Libguestfs] Double fault panic in L2 upon v2v conversion
Enabled. I'm playing the real game ;)

On Thu, Jan 16, 2014 at 11:45 PM, Richard W.M. Jones <rjones@redhat.com> wrote:
> Just to confirm, the 'nested' flag is enabled on the L0 kvm_intel.ko
> module?
>
> If not enabled, libguestfs will use TCG which -- while not exactly
> perfect -- will usually run the appliance OK, albeit slowly.
Richard W.M. Jones
2014-Jan-17 13:58 UTC
Re: [Libguestfs] Double fault panic in L2 upon v2v conversion
On Thu, Jan 16, 2014 at 11:25:10PM +0200, Rom Freiman wrote:
> Thanks, Richard, for the fast reply.
>
> Yes, indeed, I'm working in a nested environment. I run v2v inside a
> VM (L1), and the conversion process creates an L2. This is on Intel. As I
> wrote, it fails once every few runs, mainly when there is memory pressure
> on L0.
>
> Kashyap, can you please share your experience? Why should it crash during
> a nested conversion? I'm not too familiar with the libguestfs logic; maybe
> you can point out for me, according to the logs, at what stage of the
> conversion the failure happens.

I can make a general comment here: libguestfs relies on KVM/qemu/etc being reliable. If KVM/qemu/etc is broken, then libguestfs isn't going to work.

If you can't disable nested KVM and don't want to fix it, then you have two other options:

(1) Patch libguestfs to force qemu to use TCG:

In src/launch-direct.c:

   if (qemu_supports (g, data, "-machine")) {
     ADD_CMDLINE ("-machine");
-    ADD_CMDLINE ("accel=kvm:tcg");
+    ADD_CMDLINE ("accel=tcg");
   } else {

(2) Take a look at the UML backend for libguestfs:

http://libguestfs.org/guestfs.3.html#user-mode-linux-backend

Rich.

--
Richard Jones, Virtualization Group, Red Hat
http://people.redhat.com/~rjones
Fedora Windows cross-compiler. Compile Windows programs, test, and
build Windows installers. Over 100 libraries supported.
http://fedoraproject.org/wiki/MinGW
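As a possible alternative to patching the source, later libguestfs releases document a backend setting called force_tcg, passed through the colon-separated LIBGUESTFS_BACKEND_SETTINGS environment variable. Whether the libguestfs version discussed in this thread already supports it is an assumption, so treat the following as a sketch only. The helper name with_forced_tcg is hypothetical.

```python
import os


def with_forced_tcg(env=None):
    """Return a copy of the environment with 'force_tcg' added to
    LIBGUESTFS_BACKEND_SETTINGS (a colon-separated list of settings).

    Assumption: the installed libguestfs understands 'force_tcg'.
    """
    new_env = dict(os.environ if env is None else env)
    current = new_env.get("LIBGUESTFS_BACKEND_SETTINGS", "")
    settings = [s for s in current.split(":") if s]
    if "force_tcg" not in settings:
        settings.append("force_tcg")
    new_env["LIBGUESTFS_BACKEND_SETTINGS"] = ":".join(settings)
    return new_env


if __name__ == "__main__":
    # Show the value a child process (e.g. a libguestfs tool) would inherit.
    print(with_forced_tcg()["LIBGUESTFS_BACKEND_SETTINGS"])
```

The returned dictionary can be passed as the env argument of subprocess.run when launching a libguestfs-based tool from L1. The appliance then runs under software emulation, which is slow but avoids the nested KVM code path.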
Richard W.M. Jones
2014-Jan-17 14:39 UTC
Re: [Libguestfs] Double fault panic in L2 upon v2v conversion
[Please keep libguestfs mailing list in CC]

On Fri, Jan 17, 2014 at 04:14:03PM +0200, Rom Freiman wrote:
> How do you know that the problem is with KVM/QEMU and not with libguestfs?

The guestfsd daemon is simply running the regular 'mount' command. The mount command causes the kernel to panic. There should be no circumstances where running an ordinary command like that, albeit as root, should cause the kernel to panic. Unless the kernel (or in this case, something underneath the kernel) is broken.

mount -o ro /dev/sdb /sysroot/
[ 12.645305] PANIC: double fault, error_code: 0x0
[ 12.645305] CPU: 0 PID: 141 Comm: mount Not tainted 3.11.8-200.strato0002.fc19.strato.c3850ae03e9d.x86_64 #1
[ 12.645305] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[ 12.645305] task: ffff88001cc816e0 ti: ffff88001cde6000 task.ti: ffff88001cde6000
[ 12.645305] RIP: 0033:[<00007fa602c5b99b>] [<00007fa602c5b99b>] 0x7fa602c5b99a
[ 12.645305] RSP: 002b:00007fff4f5884a0 EFLAGS: 00010216
[ 12.645305] RAX: 00007fa602008ff8 RBX: 00007fa601ff0000 RCX: 00007fa601ff0000
[ 12.645305] RDX: 00000000003b7068 RSI: 00007fff4f588560 RDI: 00007fa601ff3d18
[ 12.645305] RBP: 00007fff4f5885d0 R08: 00007fa60200f310 R09: 0000000000000000
[ 12.645305] R10: 0000000000000022 R11: 00007fa60200f310 R12: 00007fa60200e9b0
[ 12.645305] R13: 0000000000000000 R14: 0000000000000000 R15: 00007fa602e6e990
[ 12.645305] FS: 00007fa602e69880(0000) GS:ffff88001f000000(0000) knlGS:0000000000000000
[ 12.645305] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 12.645305] CR2: 0000000000000000 CR3: 000000001d7fb000 CR4: 00000000000006f0
[ 12.645305]
[ 12.645305] Kernel panic - not syncing: Machine halted.
[ 12.645305] CPU: 0 PID: 141 Comm: mount Not tainted 3.11.8-200.strato0002.fc19.strato.c3850ae03e9d.x86_64 #1
[ 12.645305] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[ 12.645305] ffff88001f005f58 ffff88001f005e90 ffffffff8164024b ffffffff819e89dc
[ 12.645305] ffff88001f005f08 ffffffff8163c272 0000000000000008 ffff88001f005f18
[ 12.645305] ffff88001f005eb8 ffffffff8163c8e5 0000000000000046 00000000000000b1
[ 12.645305] Call Trace:
[ 12.645305] <#DF> [<ffffffff8164024b>] dump_stack+0x45/0x56
[ 12.645305] [<ffffffff8163c272>] panic+0xc8/0x1d7
[ 12.645305] [<ffffffff8163c8e5>] ? printk+0x67/0x69
[ 12.645305] [<ffffffff81048ae1>] df_debug+0x31/0x40
[ 12.645305] [<ffffffff810132ed>] do_double_fault+0x5d/0x80
[ 12.645305] [<ffffffff81650b88>] double_fault+0x28/0x30
[ 12.645305] <<EOE>>

Rich.

--
Richard Jones, Virtualization Group, Red Hat
http://people.redhat.com/~rjones
virt-p2v converts physical machines to virtual machines. Boot with a
live CD or over the network (PXE) and turn machines into KVM guests.
http://libguestfs.org/virt-v2v
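To make the point about the trace concrete, the following is a small sketch that pulls the key facts out of such a log: whether it is a double fault, which command was running, and which functions appear in the call trace. The regular expressions are an assumption based only on the log format shown in this thread, and summarize_panic is a hypothetical helper, not an existing tool. Frames prefixed with "?" are skipped because the kernel marks those as unreliable.

```python
import re

# Matches e.g. "Comm: mount" in the CPU/PID line of the panic report.
COMM_RE = re.compile(r"Comm:\s+(\S+)")
# Matches e.g. "[<ffffffff8163c272>] panic+0xc8/0x1d7", optionally with "? ".
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def summarize_panic(log_text):
    """Extract the panicking command and the reliable call-trace symbols."""
    comm = COMM_RE.search(log_text)
    frames = [m.group(2) for m in FRAME_RE.finditer(log_text) if not m.group(1)]
    return {
        "double_fault": "PANIC: double fault" in log_text,
        "comm": comm.group(1) if comm else None,
        "frames": frames,
    }


# Excerpt of the log quoted in this thread.
SAMPLE = """\
[ 12.645305] PANIC: double fault, error_code: 0x0
[ 12.645305] CPU: 0 PID: 141 Comm: mount Not tainted 3.11.8-200.strato0002.fc19.strato.c3850ae03e9d.x86_64 #1
[ 12.645305] Call Trace:
[ 12.645305] <#DF> [<ffffffff8164024b>] dump_stack+0x45/0x56
[ 12.645305] [<ffffffff8163c272>] panic+0xc8/0x1d7
[ 12.645305] [<ffffffff8163c8e5>] ? printk+0x67/0x69
[ 12.645305] [<ffffffff81048ae1>] df_debug+0x31/0x40
[ 12.645305] [<ffffffff810132ed>] do_double_fault+0x5d/0x80
[ 12.645305] [<ffffffff81650b88>] double_fault+0x28/0x30
"""

if __name__ == "__main__":
    print(summarize_panic(SAMPLE))
```

On this excerpt the trace contains only the double-fault handling path itself (double_fault, do_double_fault, df_debug, panic, dump_stack). No filesystem or libguestfs-related frames appear in it, and the only link to the guest workload is the command name, mount.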
Rom Freiman
2014-Jan-17 15:06 UTC
Re: [Libguestfs] Double fault panic in L2 upon v2v conversion
Kashyap, just to be sure: does it happen to you during the v2v conversion? On L2? And L1 and L0 work fine afterwards, right?

Thanks

On Fri, Jan 17, 2014 at 4:45 PM, Kashyap Chamarthy <kchamart@redhat.com> wrote:
> On 01/17/2014 03:38 PM, Richard W.M. Jones wrote:
>> On Fri, Jan 17, 2014 at 04:14:03PM +0200, Rom Freiman wrote:
>>> How do you know that the problem is with KVM/QEMU and not with libguestfs?
>>
>> The guestfsd daemon is simply running the regular 'mount' command.
>> The mount command causes the kernel to panic. There should be no
>> circumstances where running an ordinary command like that, albeit as
>> root, should cause the kernel to panic. Unless the kernel (or in this
>> case, something underneath the kernel) is broken.
>
> Correct.
>
> I encountered this same double_fault panic a week ago:
>
>   http://kashyapc.fedorapeople.org/temp/double-fault-panic-nested-kvm-environment.txt
>
> With these versions:
>
>   $ uname -r ; rpm -q libvirt qemu-system-x86
>   3.11.10-301.fc20.x86_64
>   libvirt-1.1.3.1-2.fc20.x86_64
>   qemu-system-x86-1.6.1-2.fc20.x86_64
>
> When I briefly discussed this double fault panic with Paolo Bonzini (KVM
> maintainer), he mentioned it is probably a host hypervisor bug. But this
> needs more investigation (ftrace for nested guest, x86info in L1 and L2
> - if possible).
>
> Answering Rom's earlier question ("Kashyap, can you please share your
> experience?"): Yes, nested virtualization with KVM and Intel is not
> *really* the most stable, but there's ongoing work upstream to improve
> this and fix bugs.
>
> Refer to these recent bugs I filed while in a nested KVM environment:
>
>   https://bugzilla.kernel.org/show_bug.cgi?id=67761
>   https://bugzilla.kernel.org/show_bug.cgi?id=68051
>   https://bugzilla.kernel.org/show_bug.cgi?id=67751
>
> --
> /kashyap
Rom Freiman
2014-Jan-18 22:01 UTC
Re: [Libguestfs] Double fault panic in L2 upon v2v conversion
Hey everybody,

Richard, you were right. I managed to reproduce the same crash without involving v2v (or libguestfs) at all.

Actually, it is really easy to reproduce: I write a big file to /tmp on L0 (until it is 100% full) and then run an L2 VM. Almost every time, it crashes with a double fault.

Debugging, debugging and more debugging.

Marcelo/Paolo, if you have any clue, I would like to hear from you.

Thanks,
Rom

On Fri, Jan 17, 2014 at 5:06 PM, Rom Freiman <rom@stratoscale.com> wrote:
> Kashyap, just to be sure: does it happen to you during the v2v
> conversion? On L2? And L1 and L0 work fine afterwards, right?
Kashyap Chamarthy
2014-Jan-21 15:38 UTC
Re: [Libguestfs] Double fault panic in L2 upon v2v conversion
On 01/17/2014 04:06 PM, Rom Freiman wrote:
> Kashyap, just to be sure - it happens to you during the v2v
> conversion? on L2?

I haven't done any v2v conversions in L2 (or at any other level).

PS: Sorry, I didn't notice that my previous two emails didn't go to the list; that wasn't intended. Rich, you can bounce them here if you prefer (instead of me clumsily forwarding them).

--
/kashyap