Daniel P. Berrange
2016-Jun-01 13:59 UTC
Re: [libvirt-users] Migration problem - takes 5 minutes to start moving the memory
On Wed, Jun 01, 2016 at 03:55:37PM +0200, Peter Krempa wrote:
> On Wed, Jun 01, 2016 at 11:59:29 +0200, Marc-Aurèle Brothier - Exoscale wrote:
> > Hi,
> >
> > I'm facing a strange issue while doing a migration from one hypervisor to another. The migration takes forever to start moving the memory.
> > The VM had no workload whatsoever, just a basic Ubuntu image. The versions on the hypervisors are: libvirt 1.2.21, qemu 1.2.3
> >
> > Command to launch the migration:
> > virsh migrate --verbose --live --abort-on-error --tunnelled --p2p --auto-converge --copy-storage-inc --xml vm-6160.xml 6160 qemu+tls://<destination_hypervisor>/system
> >
>
> You are copying storage too. It takes 5 minutes to copy the storage. The
> memory migration starts after the storage migration converges.

I don't think that's it - if you look at the logs provided, you can see
that the storage was apparently fully copied after 49 seconds. There was
then a 5 minute period where neither the disk nor the memory processed
numbers increased, before memory copying started. So there's something
fishy going on there, whether just bogus stats reporting by qemu or a
genuine delay/hang somewhere.

Regards,
Daniel
--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
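The observation above (storage done after 49 seconds, then a long period where neither counter moves) can be checked mechanically once the progress numbers are sampled over time. The following is only a minimal sketch: the sample tuple layout and the function name `longest_stall` are hypothetical, and how the samples are collected (for example by polling the job statistics) is left open.

```python
from typing import List, Optional, Tuple

# Hypothetical sample layout:
# (seconds since migration start, disk bytes processed, memory bytes processed)
Sample = Tuple[float, int, int]


def longest_stall(samples: List[Sample]) -> Optional[Tuple[float, float]]:
    """Return (start, end) of the longest period in which neither the disk
    nor the memory counter increased, or None if progress never stopped."""
    best: Optional[Tuple[float, float]] = None
    stall_start: Optional[float] = None
    for prev, cur in zip(samples, samples[1:]):
        progressed = cur[1] > prev[1] or cur[2] > prev[2]
        if progressed:
            # Any progress ends the current stall.
            stall_start = None
            continue
        if stall_start is None:
            stall_start = prev[0]
        if best is None or cur[0] - stall_start > best[1] - best[0]:
            best = (stall_start, cur[0])
    return best
```

A stall that begins right when the disk counter stops growing and ends when the memory counter starts would match the pattern described in this thread, though it would not by itself tell bogus stats apart from a real hang.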
Marc-Aurèle Brothier - Exoscale
2016-Jun-02 12:32 UTC
Re: [libvirt-users] Migration problem - takes 5 minutes to start moving the memory
> On 01 Jun 2016, at 15:59, Daniel P. Berrange <berrange@redhat.com> wrote:
>
> On Wed, Jun 01, 2016 at 03:55:37PM +0200, Peter Krempa wrote:
>> On Wed, Jun 01, 2016 at 11:59:29 +0200, Marc-Aurèle Brothier - Exoscale wrote:
>>> Hi,
>>>
>>> I'm facing a strange issue while doing a migration from one hypervisor to another. The migration takes forever to start moving the memory.
>>> The VM had no workload whatsoever, just a basic Ubuntu image. The versions on the hypervisors are: libvirt 1.2.21, qemu 1.2.3
>>>
>>> Command to launch the migration:
>>> virsh migrate --verbose --live --abort-on-error --tunnelled --p2p --auto-converge --copy-storage-inc --xml vm-6160.xml 6160 qemu+tls://<destination_hypervisor>/system
>>>
>>
>> You are copying storage too. It takes 5 minutes to copy the storage. The
>> memory migration starts after the storage migration converges.
>
> I don't think that's it - if you look at the logs provided, you can see
> that the storage was apparently fully copied after 49 seconds. There was
> then 5 minutes where neither the disk or memory processed numbers
> increased, before memory copying started. So there's something fishy
> going on there, whether just bogus stats reporting by qemu or a genuine
> delay/hang somewhere
>

That's correct, the disk was copied pretty quickly and I could see it growing at the destination during those 49 seconds. What would you do to try to figure out what's going on?

Correction: we are using QEMU 2.3 (not 1.2.3; the Ubuntu version syntax, 1:2.3, confused me).
Daniel P. Berrange
2016-Jun-02 12:38 UTC
Re: [libvirt-users] Migration problem - takes 5 minutes to start moving the memory
On Thu, Jun 02, 2016 at 02:32:47PM +0200, Marc-Aurèle Brothier - Exoscale wrote:
>
> > On 01 Jun 2016, at 15:59, Daniel P. Berrange <berrange@redhat.com> wrote:
> >
> > On Wed, Jun 01, 2016 at 03:55:37PM +0200, Peter Krempa wrote:
> >> On Wed, Jun 01, 2016 at 11:59:29 +0200, Marc-Aurèle Brothier - Exoscale wrote:
> >>> Hi,
> >>>
> >>> I'm facing a strange issue while doing a migration from one hypervisor to another. The migration takes forever to start moving the memory.
> >>> The VM had no workload whatsoever, just a basic Ubuntu image. The versions on the hypervisors are: libvirt 1.2.21, qemu 1.2.3
> >>>
> >>> Command to launch the migration:
> >>> virsh migrate --verbose --live --abort-on-error --tunnelled --p2p --auto-converge --copy-storage-inc --xml vm-6160.xml 6160 qemu+tls://<destination_hypervisor>/system
> >>>
> >>
> >> You are copying storage too. It takes 5 minutes to copy the storage. The
> >> memory migration starts after the storage migration converges.
> >
> > I don't think that's it - if you look at the logs provided, you can see
> > that the storage was apparently fully copied after 49 seconds. There was
> > then 5 minutes where neither the disk or memory processed numbers
> > increased, before memory copying started. So there's something fishy
> > going on there, whether just bogus stats reporting by qemu or a genuine
> > delay/hang somewhere
> >
>
> That's correct, the disk was copied pretty quickly and I could see
> it at the destination growing during those 49 seconds. What would
> you do to try to figure out what's going on?

You could try turning on debugging for the libvirt QEMU driver, so we can
see what QMP monitor traffic is going back and forth - there might be
something we can spot there that isn't visible in the API stats.

e.g. set log_filters="1:qemu" in the libvirtd.conf file and restart libvirtd.

Regards,
Daniel
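Once debug logging is on, one way to narrow down the 5 minute window is to look for long silences between consecutive monitor-related log lines. The sketch below rests on assumptions that should be checked against the real log: that each line starts with a timestamp of the form `2016-06-02 12:38:00.123+0000: `, and that monitor traffic can be picked out by the substring `QEMU_MONITOR`. The function name `find_gaps` and its parameters are hypothetical.

```python
import re
from datetime import datetime
from typing import Iterable, List, Tuple

# Assumed line prefix, e.g. "2016-06-02 12:38:00.123+0000: ..."
TS_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})\+0000: ")


def find_gaps(lines: Iterable[str],
              needle: str = "QEMU_MONITOR",  # assumed marker for monitor traffic
              min_gap: float = 60.0) -> List[Tuple[str, str, float]]:
    """Return (line before, line after, seconds) for every silence of at
    least min_gap seconds between consecutive lines containing needle."""
    gaps = []
    prev_ts = None
    prev_line = ""
    for line in lines:
        if needle not in line:
            continue
        match = TS_RE.match(line)
        if match is None:
            continue  # skip lines without the assumed timestamp prefix
        ts = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f")
        if prev_ts is not None:
            delta = (ts - prev_ts).total_seconds()
            if delta >= min_gap:
                gaps.append((prev_line.rstrip(), line.rstrip(), delta))
        prev_ts, prev_line = ts, line
    return gaps
```

It could be fed an open log file, e.g. `find_gaps(open(path))`, where `path` is wherever the daemon's log output is configured to go. The lines on either side of a reported gap are the ones worth reading first.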