Sander Eikelenboom
2013-Apr-16 19:23 UTC
xen-unstable qemu-upstream: libxl: error: libxl_qmp.c:440:qmp_next: timeout, libxl_qmp.c:705:libxl__qmp_initialize: Failed to connect to QMP
Hi Stefano, When trying to start a hvm quest with xen-unstable and qemu-upstream, it ends up stuck: xc: info: VIRTUAL MEMORY ARRANGEMENT: Loader: 0000000000100000->000000000019c864 Modules: 0000000000000000->0000000000000000 TOTAL: 0000000000000000->000000002f800000 ENTRY ADDRESS: 0000000000100000 xc: info: PHYSICAL MEMORY ALLOCATION: 4KB PAGES: 0x0000000000000200 2MB PAGES: 0x000000000000017b 1GB PAGES: 0x0000000000000000 libxl: error: libxl_qmp.c:440:qmp_next: timeout libxl: error: libxl_qmp.c:705:libxl__qmp_initialize: Failed to connect to QMP Bisecting pointed to this commit: commit 0ca5aa4f4c4a8bcc73988dd52a536241d35e5223 Author: Paolo Bonzini <pbonzini@redhat.com> Date: Wed Apr 10 15:23:27 2013 +0200 qemu-char: another io_add_watch_poll fix After attaching the source, we have to remove the reference we hold to it, because we do not hold anymore a pointer to the source. If we do not do this, removing the source will not finalize it and will not drop the "real" I/O watch source. This showed up when backporting the new flow control patches to older versions of QEMU that still used select. The whole select then failed with EBADF (poll instead will reporting POLLNVAL on a single pollfd) and QEMU froze. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 1365600207-21685-1-git-send-email-pbonzini@redhat.com Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> Which changed: --- a/qemu-char.c +++ b/qemu-char.c @@ -663,6 +663,7 @@ static guint io_add_watch_poll(GIOChannel *channel, gpointer user_data) { IOWatchPoll *iwp; + int tag; iwp = (IOWatchPoll *) g_source_new(&io_watch_poll_funcs, sizeof(IOWatchPoll)); iwp->fd_can_read = fd_can_read; @@ -671,7 +672,9 @@ static guint io_add_watch_poll(GIOChannel *channel, iwp->fd_read = (GSourceFunc) fd_read; iwp->src = NULL; - return g_source_attach(&iwp->parent, NULL); + tag = g_source_attach(&iwp->parent, NULL); + g_source_unref(&iwp->parent); + return tag; } #ifndef _WIN32 Any thing that springs to mind ? -- Sander
Stefano Stabellini
2013-Apr-18 13:27 UTC
Re: xen-unstable qemu-upstream: libxl: error: libxl_qmp.c:440:qmp_next: timeout, libxl_qmp.c:705:libxl__qmp_initialize: Failed to connect to QMP
On Tue, 16 Apr 2013, Sander Eikelenboom wrote:> Hi Stefano, > > When trying to start a hvm quest with xen-unstable and qemu-upstream, it ends up stuck: > > xc: info: VIRTUAL MEMORY ARRANGEMENT: > Loader: 0000000000100000->000000000019c864 > Modules: 0000000000000000->0000000000000000 > TOTAL: 0000000000000000->000000002f800000 > ENTRY ADDRESS: 0000000000100000 > xc: info: PHYSICAL MEMORY ALLOCATION: > 4KB PAGES: 0x0000000000000200 > 2MB PAGES: 0x000000000000017b > 1GB PAGES: 0x0000000000000000 > libxl: error: libxl_qmp.c:440:qmp_next: timeout > libxl: error: libxl_qmp.c:705:libxl__qmp_initialize: Failed to connect to QMP > > Bisecting pointed to this commit: > > commit 0ca5aa4f4c4a8bcc73988dd52a536241d35e5223 > Author: Paolo Bonzini <pbonzini@redhat.com> > Date: Wed Apr 10 15:23:27 2013 +0200 > > qemu-char: another io_add_watch_poll fix > > After attaching the source, we have to remove the reference we hold > to it, because we do not hold anymore a pointer to the source. > > If we do not do this, removing the source will not finalize it and > will not drop the "real" I/O watch source. > > This showed up when backporting the new flow control patches to older > versions of QEMU that still used select. The whole select then failed > with EBADF (poll instead will reporting POLLNVAL on a single pollfd) > and QEMU froze. > > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> > Message-id: 1365600207-21685-1-git-send-email-pbonzini@redhat.com > Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> > > > Which changed: > > --- a/qemu-char.c > +++ b/qemu-char.c > @@ -663,6 +663,7 @@ static guint io_add_watch_poll(GIOChannel *channel, > gpointer user_data) > { > IOWatchPoll *iwp; > + int tag; > > iwp = (IOWatchPoll *) g_source_new(&io_watch_poll_funcs, sizeof(IOWatchPoll)); > iwp->fd_can_read = fd_can_read; > @@ -671,7 +672,9 @@ static guint io_add_watch_poll(GIOChannel *channel, > iwp->fd_read = (GSourceFunc) fd_read; > iwp->src = NULL; > > - return g_source_attach(&iwp->parent, NULL); > + tag = g_source_attach(&iwp->parent, NULL); > + g_source_unref(&iwp->parent); > + return tag; > } > > #ifndef _WIN32 > > Any thing that springs to mind ?Nope, sorry. Given the commit that your bisection identified and the error message, it''s clear that it''s a QMP connection issue from libxl. I would start looking in libxl_qmp.c. Maybe Anthony (CC''ed) has some ideas.
Paolo Bonzini
2013-Apr-18 13:32 UTC
Re: xen-unstable qemu-upstream: libxl: error: libxl_qmp.c:440:qmp_next: timeout, libxl_qmp.c:705:libxl__qmp_initialize: Failed to connect to QMP
Il 18/04/2013 15:27, Stefano Stabellini ha scritto:>> > >> > Any thing that springs to mind ? > Nope, sorry. > Given the commit that your bisection identified and the error message, > it''s clear that it''s a QMP connection issue from libxl. I would start > looking in libxl_qmp.c. > Maybe Anthony (CC''ed) has some ideas.What host is this? It is likely a glib or qemu-char bug (hopefully the deadlock that Gerd found on RHEL6) that was latent up to this patch. Paolo
Sander Eikelenboom
2013-Apr-18 23:21 UTC
Re: xen-unstable qemu-upstream: libxl: error: libxl_qmp.c:440:qmp_next: timeout, libxl_qmp.c:705:libxl__qmp_initialize: Failed to connect to QMP
Thursday, April 18, 2013, 3:32:37 PM, you wrote:> Il 18/04/2013 15:27, Stefano Stabellini ha scritto: >>> > >>> > Any thing that springs to mind ? >> Nope, sorry. >> Given the commit that your bisection identified and the error message, >> it''s clear that it''s a QMP connection issue from libxl. I would start >> looking in libxl_qmp.c. >> Maybe Anthony (CC''ed) has some ideas.> What host is this? It is likely a glib or qemu-char bug (hopefully > the deadlock that Gerd found on RHEL6) that was latent up to this patch.The host/Dom0 is Debian squeeze (glibc ) on xen-unstable, using qemu-upstream. The guest is Debian wheezy. Previously (older xen-unstable and qemu-upstream (from yesterdays tree)), it worked without connecting the console (serial=pty and using "xl create -c") and only reverting 0ca5aa4f4c4a8bcc73988dd52a536241d35e5223 from qemu-upstream. With the latest xen-unstable (764012376a096a45bced88ee2ec1a6c17c6c22c7) and qemu-upstream (09dada400328d75daf79e3eca1e48e024fec148d): Without reverting anything, and using serial=pty and connecting the console at boot with "xl create -c", i get: xc: info: VIRTUAL MEMORY ARRANGEMENT: Loader: 0000000000100000->000000000019c864 Modules: 0000000000000000->0000000000000000 TOTAL: 0000000000000000->000000002f800000 ENTRY ADDRESS: 0000000000100000 xc: info: PHYSICAL MEMORY ALLOCATION: 4KB PAGES: 0x0000000000000200 2MB PAGES: 0x000000000000017b 1GB PAGES: 0x0000000000000000 libxl: error: libxl_qmp.c:440:qmp_next: timeout libxl: error: libxl_qmp.c:705:libxl__qmp_initialize: Failed to connect to QMP libxl: error: libxl_qmp.c:440:qmp_next: timeout libxl: error: libxl_qmp.c:705:libxl__qmp_initialize: Failed to connect to QMP libxl: error: libxl_qmp.c:690:libxl__qmp_initialize: Connection error: Resource temporarily unavailable Daemon running with PID 10174 xenconsole: Could not read tty from store: No such file or directory libxl: error: libxl_exec.c:118:libxl_report_child_exitstatus: console child [0] exited with error status 2 and qemu log only shows: char device redirected to /dev/pts/17 (label serial0) With reverting commit 0ca5aa4f4c4a8bcc73988dd52a536241d35e5223, and using serial=pty and connecting the console at boot with "xl create -c", i get: The Guest starts fine. and qemu log only shows: char device redirected to /dev/pts/16 (label serial0) main-loop: WARNING: I/O thread spun for 1000 iterations xen be: vkbd-0: initialise() failed xen be: vkbd-0: initialise() failed xen be: vkbd-0: initialise() failed With reverting commit 0ca5aa4f4c4a8bcc73988dd52a536241d35e5223, and using serial=pty and NOT connecting the console at boot with "xl create" (omitting the -c), i get: Parsing config from /etc/xen/domU/production/security.cfg xc: info: VIRTUAL MEMORY ARRANGEMENT: Loader: 0000000000100000->000000000019c864 Modules: 0000000000000000->0000000000000000 TOTAL: 0000000000000000->000000002f800000 ENTRY ADDRESS: 0000000000100000 xc: info: PHYSICAL MEMORY ALLOCATION: 4KB PAGES: 0x0000000000000200 2MB PAGES: 0x000000000000017b 1GB PAGES: 0x0000000000000000 Daemon running with PID 468 So the guest starts, but it runs very very very very slowly. When using "xl console", and after i hit a key, i see the kernel boot output being printed very slowly with about 1 character per second ... and qemu log only shows: char device redirected to /dev/pts/16 (label serial0) main-loop: WARNING: I/O thread spun for 1000 iterations So it looks like it''s a output buffer that''s filled to the max that makes it grind to a halt. -- Sander> Paolo
Paolo Bonzini
2013-Apr-19 07:27 UTC
Re: xen-unstable qemu-upstream: libxl: error: libxl_qmp.c:440:qmp_next: timeout, libxl_qmp.c:705:libxl__qmp_initialize: Failed to connect to QMP
Il 19/04/2013 01:21, Sander Eikelenboom ha scritto:>> Thursday, April 18, 2013, 3:32:37 PM, you wrote: > >> Il 18/04/2013 15:27, Stefano Stabellini ha scritto: >>>>> >>>>> Any thing that springs to mind ? >>> Nope, sorry. >>> Given the commit that your bisection identified and the error message, >>> it''s clear that it''s a QMP connection issue from libxl. I would start >>> looking in libxl_qmp.c. >>> Maybe Anthony (CC''ed) has some ideas. > >> What host is this? It is likely a glib or qemu-char bug (hopefully >> the deadlock that Gerd found on RHEL6) that was latent up to this patch. > > The host/Dom0 is Debian squeeze (glibc ) on xen-unstable, using qemu-upstream. > The guest is Debian wheezy.What version is glib on the host?> and qemu log only shows: > char device redirected to /dev/pts/16 (label serial0)Ok, there are patches on the list already for this issue. Paolo
Sander Eikelenboom
2013-Apr-19 07:52 UTC
Re: xen-unstable qemu-upstream: libxl: error: libxl_qmp.c:440:qmp_next: timeout, libxl_qmp.c:705:libxl__qmp_initialize: Failed to connect to QMP
Friday, April 19, 2013, 9:27:34 AM, you wrote:> Il 19/04/2013 01:21, Sander Eikelenboom ha scritto:> >> Thursday, April 18, 2013, 3:32:37 PM, you wrote: >> >>> Il 18/04/2013 15:27, Stefano Stabellini ha scritto: >>>>>> >>>>>> Any thing that springs to mind ? >>>> Nope, sorry. >>>> Given the commit that your bisection identified and the error message, >>>> it''s clear that it''s a QMP connection issue from libxl. I would start >>>> looking in libxl_qmp.c. >>>> Maybe Anthony (CC''ed) has some ideas. >> >>> What host is this? It is likely a glib or qemu-char bug (hopefully >>> the deadlock that Gerd found on RHEL6) that was latent up to this patch. >> >> The host/Dom0 is Debian squeeze (glibc ) on xen-unstable, using qemu-upstream. >> The guest is Debian wheezy.> What version is glib on the host?Ah i knew i forgot to look something up .. ii libglib2.0-0 2.24.2-1 The GLib library of C routines ii libglib2.0-data 2.24.2-1 Common files for GLib library ii libglib2.0-dev 2.24.2-1 Development files for the GLib library>> and qemu log only shows: >> char device redirected to /dev/pts/16 (label serial0)> Ok, there are patches on the list already for this issue.> Paolo
Sander Eikelenboom
2013-Apr-19 07:56 UTC
Re: xen-unstable qemu-upstream: libxl: error: libxl_qmp.c:440:qmp_next: timeout, libxl_qmp.c:705:libxl__qmp_initialize: Failed to connect to QMP
Friday, April 19, 2013, 9:27:34 AM, you wrote:> Il 19/04/2013 01:21, Sander Eikelenboom ha scritto:> >> Thursday, April 18, 2013, 3:32:37 PM, you wrote: >> >>> Il 18/04/2013 15:27, Stefano Stabellini ha scritto: >>>>>> >>>>>> Any thing that springs to mind ? >>>> Nope, sorry. >>>> Given the commit that your bisection identified and the error message, >>>> it''s clear that it''s a QMP connection issue from libxl. I would start >>>> looking in libxl_qmp.c. >>>> Maybe Anthony (CC''ed) has some ideas. >> >>> What host is this? It is likely a glib or qemu-char bug (hopefully >>> the deadlock that Gerd found on RHEL6) that was latent up to this patch. >> >> The host/Dom0 is Debian squeeze (glibc ) on xen-unstable, using qemu-upstream. >> The guest is Debian wheezy.> What version is glib on the host?Somewhat more precise: ~# /lib/libc.so.6 GNU C Library (Debian EGLIBC 2.11.3-4) stable release version 2.11.3, by Roland McGrath et al. Copyright (C) 2009 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. Compiled by GNU CC version 4.4.5. Compiled on a Linux 2.6.32 system on 2012-06-06. Available extensions: crypt add-on version 2.1 by Michael Glad and others GNU Libidn by Simon Josefsson Native POSIX Threads Library by Ulrich Drepper et al BIND-8.2.3-T5B For bug reporting instructions, please see: <http://www.debian.org/Bugs/>.>> and qemu log only shows: >> char device redirected to /dev/pts/16 (label serial0)> Ok, there are patches on the list already for this issue.> Paolo
Paolo Bonzini
2013-Apr-19 08:04 UTC
Re: xen-unstable qemu-upstream: libxl: error: libxl_qmp.c:440:qmp_next: timeout, libxl_qmp.c:705:libxl__qmp_initialize: Failed to connect to QMP
Il 19/04/2013 09:52, Sander Eikelenboom ha scritto:>> > What version is glib on the host? > Ah i knew i forgot to look something up .. > > ii libglib2.0-0 2.24.2-1 The GLib library of C routines > ii libglib2.0-data 2.24.2-1 Common files for GLib library > ii libglib2.0-dev 2.24.2-1 Development files for the GLib libraryThis very likely has the same deadlock bug that Gerd found on RHEL6. Paolo
Sander Eikelenboom
2013-Apr-19 12:56 UTC
Re: xen-unstable qemu-upstream: libxl: error: libxl_qmp.c:440:qmp_next: timeout, libxl_qmp.c:705:libxl__qmp_initialize: Failed to connect to QMP
Friday, April 19, 2013, 10:04:22 AM, you wrote:> Il 19/04/2013 09:52, Sander Eikelenboom ha scritto: >>> > What version is glib on the host? >> Ah i knew i forgot to look something up .. >> >> ii libglib2.0-0 2.24.2-1 The GLib library of C routines >> ii libglib2.0-data 2.24.2-1 Common files for GLib library >> ii libglib2.0-dev 2.24.2-1 Development files for the GLib library> This very likely has the same deadlock bug that Gerd found on RHEL6.Any ideas on how to get things fixed ?> Paolo
Paolo Bonzini
2013-Apr-19 13:00 UTC
Re: xen-unstable qemu-upstream: libxl: error: libxl_qmp.c:440:qmp_next: timeout, libxl_qmp.c:705:libxl__qmp_initialize: Failed to connect to QMP
Il 19/04/2013 14:56, Sander Eikelenboom ha scritto:> > Friday, April 19, 2013, 10:04:22 AM, you wrote: > >> Il 19/04/2013 09:52, Sander Eikelenboom ha scritto: >>>>> What version is glib on the host? >>> Ah i knew i forgot to look something up .. >>> >>> ii libglib2.0-0 2.24.2-1 The GLib library of C routines >>> ii libglib2.0-data 2.24.2-1 Common files for GLib library >>> ii libglib2.0-dev 2.24.2-1 Development files for the GLib library > >> This very likely has the same deadlock bug that Gerd found on RHEL6. > > Any ideas on how to get things fixed ?Either I or Gerd will test the prototype patch I posted yesterday, and it will be ok in -rc0. Paolo
Sander Eikelenboom
2013-Apr-19 16:53 UTC
Re: xen-unstable qemu-upstream: libxl: error: libxl_qmp.c:440:qmp_next: timeout, libxl_qmp.c:705:libxl__qmp_initialize: Failed to connect to QMP
Friday, April 19, 2013, 3:00:56 PM, you wrote:> Il 19/04/2013 14:56, Sander Eikelenboom ha scritto: >> >> Friday, April 19, 2013, 10:04:22 AM, you wrote: >> >>> Il 19/04/2013 09:52, Sander Eikelenboom ha scritto: >>>>>> What version is glib on the host? >>>> Ah i knew i forgot to look something up .. >>>> >>>> ii libglib2.0-0 2.24.2-1 The GLib library of C routines >>>> ii libglib2.0-data 2.24.2-1 Common files for GLib library >>>> ii libglib2.0-dev 2.24.2-1 Development files for the GLib library >> >>> This very likely has the same deadlock bug that Gerd found on RHEL6. >> >> Any ideas on how to get things fixed ?> Either I or Gerd will test the prototype patch I posted yesterday, and > it will be ok in -rc0.This prototype patch fixes the guest start issue for me. Thx, Sander> Paolo
Paolo Bonzini
2013-Apr-19 17:00 UTC
Re: xen-unstable qemu-upstream: libxl: error: libxl_qmp.c:440:qmp_next: timeout, libxl_qmp.c:705:libxl__qmp_initialize: Failed to connect to QMP
Il 19/04/2013 18:53, Sander Eikelenboom ha scritto:> > Friday, April 19, 2013, 3:00:56 PM, you wrote: > >> Il 19/04/2013 14:56, Sander Eikelenboom ha scritto: >>> >>> Friday, April 19, 2013, 10:04:22 AM, you wrote: >>> >>>> Il 19/04/2013 09:52, Sander Eikelenboom ha scritto: >>>>>>> What version is glib on the host? >>>>> Ah i knew i forgot to look something up .. >>>>> >>>>> ii libglib2.0-0 2.24.2-1 The GLib library of C routines >>>>> ii libglib2.0-data 2.24.2-1 Common files for GLib library >>>>> ii libglib2.0-dev 2.24.2-1 Development files for the GLib library >>> >>>> This very likely has the same deadlock bug that Gerd found on RHEL6. >>> >>> Any ideas on how to get things fixed ? > >> Either I or Gerd will test the prototype patch I posted yesterday, and >> it will be ok in -rc0. > > This prototype patch fixes the guest start issue for me.Cool, I just reposted the 4 patches together. Can you reply with the text "Tested-by: Sander Eikelenboom <linux@eikelenboom.it>"? Paolo
Sander Eikelenboom
2013-Apr-19 17:03 UTC
Re: xen-unstable qemu-upstream: libxl: error: libxl_qmp.c:440:qmp_next: timeout, libxl_qmp.c:705:libxl__qmp_initialize: Failed to connect to QMP
Friday, April 19, 2013, 7:00:11 PM, you wrote:> Il 19/04/2013 18:53, Sander Eikelenboom ha scritto: >> >> Friday, April 19, 2013, 3:00:56 PM, you wrote: >> >>> Il 19/04/2013 14:56, Sander Eikelenboom ha scritto: >>>> >>>> Friday, April 19, 2013, 10:04:22 AM, you wrote: >>>> >>>>> Il 19/04/2013 09:52, Sander Eikelenboom ha scritto: >>>>>>>> What version is glib on the host? >>>>>> Ah i knew i forgot to look something up .. >>>>>> >>>>>> ii libglib2.0-0 2.24.2-1 The GLib library of C routines >>>>>> ii libglib2.0-data 2.24.2-1 Common files for GLib library >>>>>> ii libglib2.0-dev 2.24.2-1 Development files for the GLib library >>>> >>>>> This very likely has the same deadlock bug that Gerd found on RHEL6. >>>> >>>> Any ideas on how to get things fixed ? >> >>> Either I or Gerd will test the prototype patch I posted yesterday, and >>> it will be ok in -rc0. >> >> This prototype patch fixes the guest start issue for me.> Cool, I just reposted the 4 patches together. Can you reply with the > text "Tested-by: Sander Eikelenboom <linux@eikelenboom.it>"?I''m not subscribed to the qemu-devel list, so unfortunately, i have no message-id to reply to. However i''m fine with you sticking the tested-by on it. -- Sander> Paolo
Sander Eikelenboom
2013-Apr-20 14:36 UTC
Re: xen-unstable qemu-upstream: libxl: error: libxl_qmp.c:440:qmp_next: timeout, libxl_qmp.c:705:libxl__qmp_initialize: Failed to connect to QMP
Friday, April 19, 2013, 7:15:59 PM, you wrote:> Il 19/04/2013 19:11, Sander Eikelenboom ha scritto: >> BTW, is there a relation with the "main-loop: WARNING: I/O thread spun for 1000 iterations" ? >> I''m still seeing that one popping up in the log.> If you want to try to up the count to 10000 or 100000 it would be nice.> If it doesn''t disappear with 100000 it''s definitely a bug.It''s somewhere between 10000 and 50000> Paolo