On Wed, Oct 15, 2014 at 12:23:17PM +0000, VONDRA Alain wrote:
> Now, the conversion hangs on the first disk at 54,03/100%, here is the gdb result :

OK I just talked to Paolo about this, and it's fairly serious.

Can you get a core dump from this?  You will need to set up the ulimit etc as outlined in the previous email:

https://www.redhat.com/archives/libguestfs/2014-October/msg00102.html

then run the conversion until it hangs, then 'killall -BUS qemu-img' which should cause qemu-img to drop a core dump file.

You can send me (privately) the core dump and I will open a BZ about this and ensure that the necessary people see this.

Thanks,

Rich.

> (gdb) t a a bt
>
> Thread 2 (Thread 0x7f0f3cea9700 (LWP 9583)):
> #0 sem_timedwait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_timedwait.S:101
> #1 0x00007f0f45b765e7 in qemu_sem_timedwait ()
> #2 0x00007f0f45b0750c in worker_thread ()
> #3 0x00007f0f4203bdf3 in start_thread (arg=0x7f0f3cea9700) at pthread_create.c:308
> #4 0x00007f0f41d6901d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
>
> Thread 1 (Thread 0x7f0f45aa88c0 (LWP 9582)):
> #0 0x00007f0f41d5eb0f in __GI_ppoll (fds=0x7f0f470d7a00, nfds=1, timeout=<optimized out>, sigmask=0x0) at ../sysdeps/unix/sysv/linux/ppoll.c:56
> #1 0x00007f0f45b146db in qemu_poll_ns ()
> #2 0x00007f0f45b15430 in aio_poll ()
> #3 0x00007f0f45b0eedd in bdrv_prwv_co ()
> #4 0x00007f0f45b0efd3 in bdrv_rw_co ()
> #5 0x00007f0f45b05b35 in img_convert ()
> #6 0x00007f0f41c94af5 in __libc_start_main (main=0x7f0f45b01e80 <main>, argc=10, ubp_av=0x7fff19cb3ab8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fff19cb3aa8) at libc-start.c:274
> #7 0x00007f0f45b022ed in _start ()
>
> Alain
>
> -----Original Message-----
> From: Richard W.M. Jones [mailto:rjones@redhat.com]
> Sent: Tuesday, October 14, 2014 22:52
> To: VONDRA Alain
> Cc: libguestfs@redhat.com
> Subject: Re: [Libguestfs] Virt-v2v conversion issue
>
> On Tue, Oct 14, 2014 at 03:40:22PM +0000, VONDRA Alain wrote:
> > Rich,
> > I've followed your instructions to trace, but I am not very skilful with gdb, maybe I made a mistake :
> >
> > (1) As root do:
> >
> > echo core.%p > /proc/sys/kernel/core_pattern -> OK
> >
> > (2) Before running virt-v2v, do:
> >
> > ulimited -c unlimited -> I think it's ulimit -c unlimited -> -> OK
> >
> > and you should get a core.* file in the current directory when qemu-img segfaults. Attach that file to gdb to get a stack trace:
> >
> > gdb /usr/bin/qemu-img core.XYZ -> Do I need to wait for the crash, because I don't have any core ???
>
> Yes, you have to wait for qemu-img to crash before there will be a core dump. If it's not crashing, then connect to qemu-img directly, something like this:
>
> gdb /usr/bin/qemu-img `pidof qemu-img`
>
> and run this command:
>
> (gdb) t a a bt
>
> to show the stack trace in all threads.
>
> If qemu-img is consuming CPU then it's probably not hung.
>
> I'm still interested to find out why fstrim didn't work. Can you run:
>
> guestfish --ro -d unc-srv-qual03
> ><fs> run
> ><fs> part-list /dev/sda
> ><fs> part-list /dev/sdb
> ><fs> part-list /dev/sdc
> ><fs> part-list /dev/sdd
> (etc)
>
> I'd be interested to see if the partitions are unaligned, which is the only reason why fstrim should fail on NTFS.
>
> Rich.
>
> --
> Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
> Read my programming and virtualization blog: http://rwmj.wordpress.com
> virt-builder quickly builds VMs from scratch
> http://libguestfs.org/virt-builder.1.html

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
libguestfs lets you edit virtual machines.  Supports shell scripting,
bindings from many languages.  http://libguestfs.org
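
A note on the 'killall -BUS qemu-img' step: the default action of SIGBUS is to terminate the process and dump core, whereas SIGKILL (kill -9) only terminates it and never leaves a core file. The sketch below illustrates the difference using a throwaway 'sleep' process as a stand-in for the hung qemu-img. The helper names describe_exit and kill_and_report are made up for this illustration. Whether a core file is actually written still depends on the 'ulimit -c' and core_pattern settings described in the thread, and a process stuck in uninterruptible sleep may not react to either signal until it wakes up.

```shell
#!/bin/sh
# Sketch only: `sleep` stands in for the hung qemu-img process.

# Describe how a child ended, given the status returned by `wait`.
# A status above 128 means the child was killed by signal (status - 128).
describe_exit() {
    if [ "$1" -gt 128 ]; then
        # `kill -l <status>` maps the exit status back to a signal name.
        echo "killed by SIG$(kill -l "$1")"
    else
        echo "exited with status $1"
    fi
}

# Start a stand-in process, send it the named signal, and report the result.
kill_and_report() {
    sleep 300 &
    pid=$!
    kill -s "$1" "$pid"
    wait "$pid"
    describe_exit $?
}

# SIGBUS: default action is terminate + core dump (if `ulimit -c` allows it).
kill_and_report BUS
# SIGKILL: default action is terminate only; no core dump is ever produced.
kill_and_report KILL
```

Against the real process, the equivalent of the first call is the 'killall -BUS qemu-img' command quoted above, run after 'ulimit -c unlimited' has been set in the shell that starts virt-v2v.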
Killall does nothing; the process is still alive...
If I do kill -9, do I get a core dump?
Each time I get this hang, the only way I have found to force qemu-img to stop is umount -f /tmp/xxxx.

Alain

-----Original Message-----
From: Richard W.M. Jones [mailto:rjones@redhat.com]
Sent: Wednesday, October 15, 2014 14:41
To: VONDRA Alain
Cc: libguestfs@redhat.com
Subject: Re: [Libguestfs] Virt-v2v conversion issue

[...]
On Wed, Oct 15, 2014 at 01:49:55PM +0000, VONDRA Alain wrote:
> If I do kill -9, do I get a core dump ?

I'm not sure, but ...

> Killall does nothing the process is still alive...
> Each time I got this hang, I find only umount -f /tmp/xxxx to force
> qemu-img to stop.

Are you sure this couldn't be an NFS server problem?  Any message or errors in 'dmesg'?  Any kernel threads (see top) which are consuming time or stuck in 'D' state?

I've been playing around with various test scripts (see attached) to see if I could reproduce this problem, but I couldn't reproduce it yet on RHEL 7.1 or Fedora 21.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine.  Supports Linux and Windows.
http://people.redhat.com/~rjones/virt-df/
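
As an alternative to scanning top by eye for the 'D' state check suggested above, something like the following could be used. It is only a sketch: it assumes a Linux 'ps' that accepts the pid, stat and comm output columns, and filter_d_state is a made-up helper name. A process in 'D' (uninterruptible sleep) is usually waiting on I/O, which would fit an unresponsive NFS server and could explain why killall appears to do nothing.

```shell
#!/bin/sh
# Keep the header row plus every process whose state column (field 2)
# starts with "D", i.e. uninterruptible sleep, usually waiting on I/O.
filter_d_state() {
    awk 'NR == 1 || $2 ~ /^D/'
}

# List PID, process state and command name for all processes,
# then keep only the ones stuck in 'D' state.
ps -eo pid,stat,comm | filter_d_state
```

If qemu-img or an NFS-related kernel thread shows up in this list while the conversion is hung, 'dmesg' is the next place to look for NFS timeouts or errors.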