Lentes, Bernd
2019-Jun-13 11:08 UTC
Re: [libvirt-users] blockcommit of domain not successfull
----- On Jun 13, 2019, at 9:56 AM, Peter Krempa pkrempa@redhat.com wrote:

> Thanks for coming back to me with the information.
>
> Unfortunately this is not a full debug log but I can try to tell you
> what I see here:

I configured libvirtd this way:

ha-idg-1:~ # grep -Ev '^$|#' /etc/libvirt/libvirtd.conf
log_level = 1
log_filters="1:qemu 3:remote 4:event 3:util.json 3:rpc"
log_outputs="1:file:/var/log/libvirt/libvirtd.log"
keepalive_interval = -1

That is what I found on https://wiki.libvirt.org/page/DebugLogs. Isn't that correct? It should create informative log files. The other host has exactly the same configuration but produces much bigger log files!?
I have libvirt-daemon-4.0.0-8.12.1.x86_64.

>> 2019-06-07 20:30:57.170+0000: 30299: error : qemuMonitorIO:719 : internal error:
>> End of file from qemu monitor
>> 2019-06-08 03:59:17.690+0000: 30299: error : qemuMonitorIO:719 : internal error:
>> End of file from qemu monitor
>
> So this looks like qemu crashed. Or at least it's the usual symptom we
> get. Is there anything in /var/log/libvirt/qemu/$VMNAME.log?

That's all:

qemu-system-x86_64: block/mirror.c:864: mirror_run: Assertion `((&bs->tracked_requests)->lh_first == ((void *)0))' failed.

>> 2019-06-08 03:59:26.145+0000: 30300: warning : qemuGetProcessInfo:1461 : cannot
>> parse process status data
>> 2019-06-08 03:59:26.191+0000: 30303: warning : qemuGetProcessInfo:1461 : cannot
>> parse process status data
>> 2019-06-08 03:59:56.095+0000: 27956: warning :
>> qemuDomainObjBeginJobInternal:4865 : Cannot start job (destroy, none) for
>> domain severin; current job is (modify, none) owned by (13061
>> remoteDispatchDomainBlockJobAbort, 0 <null>) for (38s,
>> 0s)
>
> And this looks to me as if the Abort job can't be interrupted properly
> while waiting synchronously for the job to finish. This seems to be the
> problem. If the VM indeed crashed there's a problem in job waiting
> apparently.
>
> I'd still really like to have debug logs in this case to really see what
> happened.

I configured logging as I found on https://wiki.libvirt.org/page/DebugLogs. What else can I do?

Bernd

Helmholtz Zentrum Muenchen
Deutsches Forschungszentrum fuer Gesundheit und Umwelt (GmbH)
Ingolstaedter Landstr. 1
85764 Neuherberg
www.helmholtz-muenchen.de
Aufsichtsratsvorsitzende: MinDir'in Prof. Dr. Veronika von Messling
Geschaeftsfuehrung: Prof. Dr. med. Dr. h.c. Matthias Tschoep, Heinrich Bassler, Kerstin Guenther
Registergericht: Amtsgericht Muenchen HRB 6466
USt-IdNr: DE 129521671
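When comparing a small libvirtd.log against the much bigger one from the other host, it may help to pull out only the error and warning entries. The following is a minimal sketch, not a libvirt tool: the line layout in the regular expression is an assumption inferred only from the log lines quoted in this message, and the names `parse_line` and `problems` are made up for illustration.

```python
import re
from typing import Iterable, Iterator, NamedTuple, Optional

# Assumed layout, inferred only from the lines quoted in this thread:
#   <timestamp>: <thread id>: <level> : <function>:<line> : <message>
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+[+-]\d{4}): "
    r"(?P<tid>\d+): "
    r"(?P<level>\w+) : "
    r"(?P<func>\w+):(?P<line>\d+) : "
    r"(?P<msg>.*)$"
)


class Entry(NamedTuple):
    ts: str
    tid: int
    level: str
    func: str
    line: int
    msg: str


def parse_line(text: str) -> Optional[Entry]:
    """Return a parsed entry, or None if the line does not fit the assumed layout."""
    m = LOG_LINE.match(text.strip())
    if m is None:
        return None
    return Entry(m["ts"], int(m["tid"]), m["level"],
                 m["func"], int(m["line"]), m["msg"])


def problems(lines: Iterable[str]) -> Iterator[Entry]:
    """Yield only the entries logged at error or warning level."""
    for text in lines:
        entry = parse_line(text)
        if entry is not None and entry.level in ("error", "warning"):
            yield entry


# Two lines copied from the log excerpt above, rejoined onto one line each.
sample = [
    "2019-06-08 03:59:17.690+0000: 30299: error : qemuMonitorIO:719 : "
    "internal error: End of file from qemu monitor",
    "2019-06-08 03:59:26.145+0000: 30300: warning : qemuGetProcessInfo:1461 : "
    "cannot parse process status data",
]
found = list(problems(sample))
for e in found:
    print(e.ts, e.level, e.func, e.msg)
```

To run it against a real file, pass an open file object instead of `sample`, for example `problems(open("/var/log/libvirt/libvirtd.log"))`. Lines that do not fit the assumed layout are skipped silently, so a low match count is itself a hint that the layout assumption is wrong.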
Lentes, Bernd
2019-Jun-13 14:01 UTC
Re: [libvirt-users] blockcommit of domain not successfull
----- On Jun 13, 2019, at 1:08 PM, Bernd Lentes bernd.lentes@helmholtz-muenchen.de wrote:

I found further information in /var/log/messages for both occurrences:

2019-06-01T03:05:31.620725+02:00 ha-idg-2 systemd-coredump[14253]: Core Dumping has been disabled for process 30590 (qemu-system-x86).
2019-06-01T03:05:31.712673+02:00 ha-idg-2 systemd-coredump[14253]: Process 30590 (qemu-system-x86) of user 488 dumped core.
2019-06-01T03:05:32.173272+02:00 ha-idg-2 kernel: [294682.387828] br0: port 4(vnet2) entered disabled state
2019-06-01T03:05:32.177111+02:00 ha-idg-2 kernel: [294682.388384] device vnet2 left promiscuous mode
2019-06-01T03:05:32.177122+02:00 ha-idg-2 kernel: [294682.388391] br0: port 4(vnet2) entered disabled state
2019-06-01T03:05:32.208916+02:00 ha-idg-2 wickedd[2954]: error retrieving tap attribute from sysfs
2019-06-01T03:05:41.395685+02:00 ha-idg-2 systemd-machined[2824]: Machine qemu-31-severin terminated.

2019-06-08T05:59:17.502899+02:00 ha-idg-1 systemd-coredump[31089]: Core Dumping has been disabled for process 19489 (qemu-system-x86).
2019-06-08T05:59:17.523050+02:00 ha-idg-1 systemd-coredump[31089]: Process 19489 (qemu-system-x86) of user 489 dumped core.
2019-06-08T05:59:17.650334+02:00 ha-idg-1 kernel: [999258.577132] br0: port 9(vnet7) entered disabled state
2019-06-08T05:59:17.650354+02:00 ha-idg-1 kernel: [999258.578103] device vnet7 left promiscuous mode
2019-06-08T05:59:17.650355+02:00 ha-idg-1 kernel: [999258.578108] br0: port 9(vnet7) entered disabled state
2019-06-08T05:59:25.983702+02:00 ha-idg-1 systemd-machined[1383]: Machine qemu-205-severin terminated.

Core dumping is disabled, but nevertheless a core dump has been created?
Where could I find it?
Would it be useful to provide it?

Bernd
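The libvirtd log quoted earlier in the thread uses UTC (`+0000`) while /var/log/messages uses local time (`+02:00`), so it is easy to misjudge whether the two logs describe the same event. The sketch below converts one timestamp from each log to a common time base. The two values are copied from this thread; the helper names `parse_libvirt_ts` and `parse_syslog_ts` are made up for illustration, and the format strings are assumptions based only on the lines shown here.

```python
from datetime import datetime, timedelta, timezone


def parse_libvirt_ts(text: str) -> datetime:
    """Parse a libvirtd.log style timestamp such as '2019-06-08 03:59:17.690+0000'."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S.%f%z")


def parse_syslog_ts(text: str) -> datetime:
    """Parse an ISO 8601 timestamp as seen in /var/log/messages (Python 3.7+)."""
    return datetime.fromisoformat(text)


# "End of file from qemu monitor" as reported by libvirtd on ha-idg-1.
libvirt_eof = parse_libvirt_ts("2019-06-08 03:59:17.690+0000")
# First systemd-coredump message for process 19489 on ha-idg-1.
core_dump = parse_syslog_ts("2019-06-08T05:59:17.502899+02:00")

# Both values are timezone-aware, so the subtraction accounts for the offsets.
delta = libvirt_eof - core_dump
same_event = abs(delta) < timedelta(seconds=1)

print("core dump (UTC):", core_dump.astimezone(timezone.utc).isoformat())
print("monitor EOF (UTC):", libvirt_eof.astimezone(timezone.utc).isoformat())
print("difference in seconds:", delta.total_seconds())
print("within one second:", same_event)
```

For the June 8 occurrence the two entries are roughly 0.19 seconds apart, which is consistent with libvirtd seeing the monitor close right after the qemu process died. The one-second threshold is an arbitrary choice for this illustration.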
Peter Krempa
2019-Jun-14 07:14 UTC
Re: [libvirt-users] blockcommit of domain not successfull
On Thu, Jun 13, 2019 at 16:01:18 +0200, Lentes, Bernd wrote:

> ----- On Jun 13, 2019, at 1:08 PM, Bernd Lentes bernd.lentes@helmholtz-muenchen.de wrote:
>
> I found further information in /var/log/messages for both occurrences:
>
> 2019-06-01T03:05:31.620725+02:00 ha-idg-2 systemd-coredump[14253]: Core Dumping has been disabled for process 30590 (qemu-system-x86).
> 2019-06-01T03:05:31.712673+02:00 ha-idg-2 systemd-coredump[14253]: Process 30590 (qemu-system-x86) of user 488 dumped core.
> 2019-06-01T03:05:32.173272+02:00 ha-idg-2 kernel: [294682.387828] br0: port 4(vnet2) entered disabled state
> 2019-06-01T03:05:32.177111+02:00 ha-idg-2 kernel: [294682.388384] device vnet2 left promiscuous mode
> 2019-06-01T03:05:32.177122+02:00 ha-idg-2 kernel: [294682.388391] br0: port 4(vnet2) entered disabled state
> 2019-06-01T03:05:32.208916+02:00 ha-idg-2 wickedd[2954]: error retrieving tap attribute from sysfs
> 2019-06-01T03:05:41.395685+02:00 ha-idg-2 systemd-machined[2824]: Machine qemu-31-severin terminated.
>
> 2019-06-08T05:59:17.502899+02:00 ha-idg-1 systemd-coredump[31089]: Core Dumping has been disabled for process 19489 (qemu-system-x86).
> 2019-06-08T05:59:17.523050+02:00 ha-idg-1 systemd-coredump[31089]: Process 19489 (qemu-system-x86) of user 489 dumped core.
> 2019-06-08T05:59:17.650334+02:00 ha-idg-1 kernel: [999258.577132] br0: port 9(vnet7) entered disabled state
> 2019-06-08T05:59:17.650354+02:00 ha-idg-1 kernel: [999258.578103] device vnet7 left promiscuous mode
> 2019-06-08T05:59:17.650355+02:00 ha-idg-1 kernel: [999258.578108] br0: port 9(vnet7) entered disabled state
> 2019-06-08T05:59:25.983702+02:00 ha-idg-1 systemd-machined[1383]: Machine qemu-205-severin terminated.
>
> Core Dumping is disabled, but nevertheless a core dump has been created ?
> Where could i find it ?
> Would it be useful to provide it ?

So this really hints at qemu crashing.
It certainly will be beneficial to collect the backtrace, but you really should report this (including the error message from the vm log file) to the qemu team. They might have even fixed it by now, so a plain update might help.