Soeren Malchow
2015-May-31 23:35 UTC
Re: [libvirt-users] [ovirt-users] Bug in Snapshot Removing
Small addition again:

This error shows up in the log while removing snapshots WITHOUT rendering the VMs unresponsive:

Jun 01 01:33:45 mc-dc3ham-compute-02-live.mc.mcon.net libvirtd[1657]: Timed out during operation: cannot acquire state change lock
Jun 01 01:33:45 mc-dc3ham-compute-02-live.mc.mcon.net vdsm[6839]: vdsm vm.Vm ERROR vmId=`56848f4a-cd73-4eda-bf79-7eb80ae569a9`::Error getting block job info
Traceback (most recent call last):
  File "/usr/share/vdsm/virt/vm.py", line 5759, in queryBlockJobs…

From: Soeren Malchow <soeren.malchow@mcon.net>
Date: Monday 1 June 2015 00:56
To: "libvirt-users@redhat.com" <libvirt-users@redhat.com>, users <users@ovirt.org>
Subject: [ovirt-users] Bug in Snapshot Removing

Dear all,

I am not sure whether my earlier mail simply got lost among all the other mails, so this time it is also going to the libvirt mailing list.

I am experiencing a problem with VMs becoming unresponsive when removing snapshots (Live Merge), and I think there is a serious problem.

Here are the previous mails:

http://lists.ovirt.org/pipermail/users/2015-May/033083.html

The problem occurs on a system with everything on the latest version: CentOS 7.1 and oVirt 3.5.2.1, all upgrades applied.

This problem did NOT exist before upgrading to CentOS 7.1, in an environment running oVirt 3.5.0 and 3.5.1 on Fedora 20 with the libvirt-preview repo activated.

I think this is a bug in libvirt, not oVirt itself, but I am not sure. The actual file throwing the exception is in VDSM (/usr/share/vdsm/virt/vm.py, line 697).

We are very willing to help, test and supply log files in any way we can.

Regards
Soeren
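For anyone who wants to check their own hosts for the same symptom, the two log lines quoted in this message can be matched mechanically. The following is only a minimal sketch, assuming the journal output has already been captured as plain text lines. The names `summarize`, `LOCK_TIMEOUT`, `VDSM_ERROR` and `SAMPLE` are made up for illustration and are not part of libvirt, VDSM or oVirt.

```python
import re

# Matches the libvirtd timeout line quoted in the message above.
LOCK_TIMEOUT = re.compile(
    r"libvirtd\[(?P<pid>\d+)\]: Timed out during operation: "
    r"cannot acquire state change lock"
)
# Matches the VDSM error line; the VM id is wrapped in backticks in the log.
VDSM_ERROR = re.compile(r"vmId=`(?P<vm_id>[0-9a-f-]{36})`::(?P<msg>.+)$")


def summarize(lines):
    """Count lock timeouts and collect the VDSM error messages per VM id."""
    timeouts = 0
    vm_errors = {}
    for line in lines:
        if LOCK_TIMEOUT.search(line):
            timeouts += 1
        match = VDSM_ERROR.search(line)
        if match:
            message = match.group("msg").strip()
            vm_errors.setdefault(match.group("vm_id"), []).append(message)
    return timeouts, vm_errors


# The two log lines from the report, used here as sample input.
SAMPLE = [
    "Jun 01 01:33:45 mc-dc3ham-compute-02-live.mc.mcon.net libvirtd[1657]: "
    "Timed out during operation: cannot acquire state change lock",
    "Jun 01 01:33:45 mc-dc3ham-compute-02-live.mc.mcon.net vdsm[6839]: "
    "vdsm vm.Vm ERROR vmId=`56848f4a-cd73-4eda-bf79-7eb80ae569a9`::"
    "Error getting block job info",
]

if __name__ == "__main__":
    print(summarize(SAMPLE))
```

Counting the timeouts per host and listing the affected VM ids may help show whether the errors cluster around particular VMs or appear on every VM with an active merge.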
Soeren Malchow
2015-May-31 23:39 UTC
Re: [libvirt-users] [ovirt-users] Bug in Snapshot Removing
And sorry, another update: it does partly kill the VM. It was still pingable when I wrote the last mail, but neither SSH nor the SPICE console was possible.
Soeren Malchow
2015-Jun-01 07:23 UTC
[libvirt-users] FW: [ovirt-users] Bug in Snapshot Removing
And sorry, another update: it does partly kill the VM. It was still pingable when I wrote the last mail, but neither SSH nor the SPICE console was possible.
Allon Mureinik
2015-Jun-02 11:47 UTC
Re: [libvirt-users] [ovirt-users] Bug in Snapshot Removing
Adam, can you take a look at this please? Thanks!
Adam Litke
2015-Jun-02 16:53 UTC
Re: [libvirt-users] [ovirt-users] Bug in Snapshot Removing
Hello Soeren.

I've started to look at this issue and I'd agree that at first glance it looks like a libvirt issue. The 'cannot acquire state change lock' messages suggest a locking bug, or at least severe contention. To help me better understand the problem I have a few questions about your setup.

From your earlier report it appears that you have 15 VMs running on the failing host. Are you attempting to remove snapshots from all VMs at the same time? Have you tried with fewer concurrent operations? I'd be curious to understand whether the problem is connected to the number of VMs running or the number of active block jobs.

Have you tried RHEL-7.1 as a hypervisor host?

Rather than rebooting the host, does restarting libvirtd cause the VMs to become responsive again? Note that this operation may cause the host to move to Unresponsive state in the UI for a short period of time.

Thanks for your report.

--
Adam Litke