Andrew Cooper
2013-Mar-07 15:13 UTC
[PATCH] tools/xenconsoled: Initialise pointers before trying to use them
This is a regression introduced by "Switch from select() to poll() in xenconsoled''s IO loop." hg c/s 26405:7359c3122c5d git cc5434c933153c4b8812d1df901f8915c22830a8 which results in reliable segfaults during VM power operations. Signed-off-by: Marcus Granado <marcus.granado@citrix.com> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> --- These segfaults are 100% reproducible in a XenServer system when rebooting a PV domain. I am at a loss to explain why the OSS testing has not picked them up. diff -r 94ece33caae2 -r f66d74cbf492 tools/console/daemon/io.c --- a/tools/console/daemon/io.c +++ b/tools/console/daemon/io.c @@ -669,6 +669,7 @@ static struct domain *create_domain(int strcat(dom->conspath, "/console"); dom->master_fd = -1; + dom->master_pollfd = NULL; dom->slave_fd = -1; dom->log_fd = -1; @@ -687,6 +688,7 @@ static struct domain *create_domain(int dom->remote_port = -1; dom->interface = NULL; dom->xce_handle = NULL; + dom->xce_pollfd = NULL; if (!watch_domain(dom, true)) goto out;
Olaf Hering
2013-Mar-07 15:28 UTC
Re: [PATCH] tools/xenconsoled: Initialise pointers before trying to use them
On Thu, Mar 07, Andrew Cooper wrote:> This is a regression introduced by > > "Switch from select() to poll() in xenconsoled''s IO loop." > hg c/s 26405:7359c3122c5d > git cc5434c933153c4b8812d1df901f8915c22830a8 > > which results in reliable segfaults during VM power operations.What about switching to calloc(3), given that many members are zeroed anyway? Olaf
Andrew Cooper
2013-Mar-07 15:57 UTC
Re: [PATCH] tools/xenconsoled: Initialise pointers before trying to use them
On 07/03/13 15:28, Olaf Hering wrote:> On Thu, Mar 07, Andrew Cooper wrote: > >> This is a regression introduced by >> >> "Switch from select() to poll() in xenconsoled''s IO loop." >> hg c/s 26405:7359c3122c5d >> git cc5434c933153c4b8812d1df901f8915c22830a8 >> >> which results in reliable segfaults during VM power operations. > What about switching to calloc(3), given that many members are zeroed > anyway? > > OlafThat also works, and is better going forwards. V2 on its way.
Wei Liu
2013-Mar-07 16:20 UTC
Re: [PATCH] tools/xenconsoled: Initialise pointers before trying to use them
On Thu, 2013-03-07 at 15:13 +0000, Andrew Cooper wrote:> This is a regression introduced by > > "Switch from select() to poll() in xenconsoled''s IO loop." > hg c/s 26405:7359c3122c5d > git cc5434c933153c4b8812d1df901f8915c22830a8 > > which results in reliable segfaults during VM power operations. > > Signed-off-by: Marcus Granado <marcus.granado@citrix.com> > Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> >Good catch. Thanks. Wei.
Andrew Cooper
2013-Mar-07 16:22 UTC
Re: [PATCH] tools/xenconsoled: Initialise pointers before trying to use them
On 07/03/13 16:20, Wei Liu wrote:> On Thu, 2013-03-07 at 15:13 +0000, Andrew Cooper wrote: >> This is a regression introduced by >> >> "Switch from select() to poll() in xenconsoled''s IO loop." >> hg c/s 26405:7359c3122c5d >> git cc5434c933153c4b8812d1df901f8915c22830a8 >> >> which results in reliable segfaults during VM power operations. >> >> Signed-off-by: Marcus Granado <marcus.granado@citrix.com> >> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> >> > Good catch. Thanks. > > > Wei. >Sadly, after fixing these segfaults, the code as currently is will cause xenconsoled to exit gracefully as soon as you try and boot the 128th domain. We are currently investigating the issue. ~Andrew
Wei Liu
2013-Mar-07 17:15 UTC
Re: [PATCH] tools/xenconsoled: Initialise pointers before trying to use them
On Thu, 2013-03-07 at 16:22 +0000, Andrew Cooper wrote:> On 07/03/13 16:20, Wei Liu wrote: > > On Thu, 2013-03-07 at 15:13 +0000, Andrew Cooper wrote: > >> This is a regression introduced by > >> > >> "Switch from select() to poll() in xenconsoled''s IO loop." > >> hg c/s 26405:7359c3122c5d > >> git cc5434c933153c4b8812d1df901f8915c22830a8 > >> > >> which results in reliable segfaults during VM power operations. > >> > >> Signed-off-by: Marcus Granado <marcus.granado@citrix.com> > >> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> > >> > > Good catch. Thanks. > > > > > > Wei. > > > > Sadly, after fixing these segfaults, the code as currently is will cause > xenconsoled to exit gracefully as soon as you try and boot the 128th > domain. We are currently investigating the issue. >Odd. So you were implying if you didn''t fix this bug, you succeeded in booting up >128 guests? All the exit paths have "dolog" which outputs to stderr, so maybe run xenconsoled in foreground can help. Wei.> ~Andrew
Andrew Cooper
2013-Mar-07 17:46 UTC
Re: [PATCH] tools/xenconsoled: Initialise pointers before trying to use them
On 07/03/13 17:15, Wei Liu wrote:> On Thu, 2013-03-07 at 16:22 +0000, Andrew Cooper wrote: >> On 07/03/13 16:20, Wei Liu wrote: >>> On Thu, 2013-03-07 at 15:13 +0000, Andrew Cooper wrote: >>>> This is a regression introduced by >>>> >>>> "Switch from select() to poll() in xenconsoled''s IO loop." >>>> hg c/s 26405:7359c3122c5d >>>> git cc5434c933153c4b8812d1df901f8915c22830a8 >>>> >>>> which results in reliable segfaults during VM power operations. >>>> >>>> Signed-off-by: Marcus Granado <marcus.granado@citrix.com> >>>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> >>>> >>> Good catch. Thanks. >>> >>> >>> Wei. >>> >> Sadly, after fixing these segfaults, the code as currently is will cause >> xenconsoled to exit gracefully as soon as you try and boot the 128th >> domain. We are currently investigating the issue. >> > Odd. So you were implying if you didn''t fix this bug, you succeeded in > booting up >128 guests? > > All the exit paths have "dolog" which outputs to stderr, so maybe run > xenconsoled in foreground can help. > > > Wei.After fixing the segfault bugs, xenconsoled intermittently fails and exits with: Mar 7 16:45:49 localhost /usr/sbin/xenconsoled: Failure in poll xs_handle: 3 (No such process) The test case is attempting to sequentially boot 1000 PV VMs on the same host. So far, the common theme of failures appear to be at multiples of 128 VMs. This is usually at the 128th VM, but also seen at the 384th VM. ~Andrew
Ian Jackson
2013-Mar-07 18:07 UTC
Re: [PATCH] tools/xenconsoled: Initialise pointers before trying to use them
Andrew Cooper writes ("Re: [PATCH] tools/xenconsoled: Initialise pointers before trying to use them"):> On 07/03/13 16:20, Wei Liu wrote: > > On Thu, 2013-03-07 at 15:13 +0000, Andrew Cooper wrote: > >> This is a regression introduced by > >> > >> "Switch from select() to poll() in xenconsoled''s IO loop." > >> hg c/s 26405:7359c3122c5d > >> git cc5434c933153c4b8812d1df901f8915c22830a8 > >> > >> which results in reliable segfaults during VM power operations. > >> > >> Signed-off-by: Marcus Granado <marcus.granado@citrix.com> > >> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>...> Sadly, after fixing these segfaults, the code as currently is will cause > xenconsoled to exit gracefully as soon as you try and boot the 128th > domain. We are currently investigating the issue.Right. Nevertheless, I''ve applied your patch to be going on with. Ian.