thr3ads.net - Xen devel - [Xen-devel] SHUTDOWN_crash and vcpu deferrals [Feb 2009]


John Levon

2009-Feb-20 21:01 UTC

[Xen-devel] SHUTDOWN_crash and vcpu deferrals

If an HVM guest is waiting for an ioemu assist, when qemu isn't running, and
domain_shutdown(SHUTDOWN_crash) is called, then the domain isn't crashed
properly:

void domain_shutdown(struct domain *d, u8 reason)
{
...
    for_each_vcpu ( d, v )
    {
        if ( v->defer_shutdown )
            continue;

Nothing will ever end the deferral. I added code to bust through the
deferral if SHUTDOWN_crash was the reason, and it seemed to help, but
I'm not sure it's the right fix.
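
As a rough illustration of the behaviour described above, here is a toy Python model of the loop. It is not the Xen C code: the names `Vcpu`, `Domain`, `paused_for_shutdown` and `force_on_crash` are made up for the sketch, and the bypass is only one guess at what "busting through the deferral" could look like.

```python
from dataclasses import dataclass, field

# Illustrative reason codes; only the distinction between them matters here.
SHUTDOWN_poweroff = 0
SHUTDOWN_crash = 3


@dataclass
class Vcpu:
    defer_shutdown: bool = False       # set while the vcpu waits on the device model
    paused_for_shutdown: bool = False  # hypothetical flag for this sketch


@dataclass
class Domain:
    vcpus: list = field(default_factory=list)


def domain_shutdown(d, reason, force_on_crash=False):
    """Pause every vcpu that is not deferring; return True if all are paused."""
    for v in d.vcpus:
        bypass = force_on_crash and reason == SHUTDOWN_crash
        if v.defer_shutdown and not bypass:
            continue  # nobody will ever clear the deferral if qemu is gone
        v.paused_for_shutdown = True
    return all(v.paused_for_shutdown for v in d.vcpus)


# One vcpu is stuck waiting for an ioemu assist that will never arrive.
d = Domain([Vcpu(), Vcpu(defer_shutdown=True)])
stuck = domain_shutdown(d, SHUTDOWN_crash)                       # deferral respected
fixed = domain_shutdown(d, SHUTDOWN_crash, force_on_crash=True)  # deferral bypassed
```

In this model the first call leaves the deferring vcpu untouched, so the shutdown never completes; the second call ignores the deferral only when the reason is a crash.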

regards
john

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Keir Fraser

2009-Feb-20 21:35 UTC

Re: [Xen-devel] SHUTDOWN_crash and vcpu deferrals

On 20/02/2009 21:01, "John Levon" <levon@movementarian.org> wrote:
> If an HVM guest is waiting for an ioemu assist, when qemu isn't running, and
> domain_shutdown(SHUTDOWN_crash) is called, then the domain isn't crashed
> properly:
> 
> Nothing will ever end the deferral. I added code to bust through the
> deferral if SHUTDOWN_crash was the reason, and it seemed to help, but
> I'm not sure it's the right fix.
Hm. If qemu is down you're kind of screwed anyway. Even a non-crashed guest
will likely hang. If you care about that eventuality (i.e., you believe qemu
problems are possible/likely and need to detect them, defend against them,
or whatever), would it be better to have tools try to detect it through
keepalives or something, and basically tackle that class of problem head on?
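
The simplest form of such a check on the tools side might look like the sketch below. It assumes the toolstack already knows the pid of each guest's device model; `process_alive` and `dead_device_models` are hypothetical names, not xend functions. It only detects a process that has exited, so a real keepalive would also need to catch a qemu that is running but hung.

```python
import errno
import os


def process_alive(pid):
    """Return True if a process with this pid exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 performs the existence check without signalling
    except OSError as e:
        # EPERM means the process exists but belongs to another user.
        return e.errno == errno.EPERM
    return True


def dead_device_models(pids_by_domain):
    """Return the names of domains whose device-model process is gone."""
    return [name for name, pid in pids_by_domain.items()
            if not process_alive(pid)]


# Usage: our own pid is certainly alive, so nothing is reported as dead.
report = dead_device_models({"domu-224": os.getpid()})
```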

If you want the hack, I think what you're doing is probably about right. I'd
have to go back over that code again to be exactly sure though, since it's a
bit subtle.

Personally I think a dead qemu is pretty bad, and bugs leading to such
should simply be found and fixed (oh for a perfect world :-). That bad
things happen to a guest, like SHUTDOWN_crash hanging, after qemu is dead...
I'd just live with that -- a worse thing has *already* happened to that
guest's virtualisation environment.

 -- Keir

John Levon

2009-Feb-20 22:03 UTC

Re: [Xen-devel] SHUTDOWN_crash and vcpu deferrals

On Fri, Feb 20, 2009 at 09:35:16PM +0000, Keir Fraser wrote:
> Hm. If qemu is down you're kind of screwed anyway.
You're totally screwed. But what happens today is this: you get some
weird message about sentinels in xend.log (if you happen to read it),
and a domain state that looks like this:

domu-224                        2  1024     1     ------ 0.0

which is not exactly very useful. But we detect qemu failures now in
xend. So we turn on this code:

        # ideally we would like to forcibly crash the domain with
        # something like
        #    xc.domain_shutdown(self.vm.getDomid(), DOMAIN_CRASH)
        # but this can easily lead to very rapid restart loops against
        # which we currently have no protection

(The comment being completely incorrect), but then the crash doesn't
work because of the bug I pointed out.

All I want to do is mark a domain without a qemu process as crashed. Is
that clearer?

And yes, it's pretty trivial to make qemu break. Most typically by
passing bogus parameters (say, a broken kernel image, an incorrect NIC,
etc.)

regards,
john

Keir Fraser

2009-Feb-21 09:01 UTC

Re: [Xen-devel] SHUTDOWN_crash and vcpu deferrals

On 20/02/2009 22:03, "John Levon" <levon@movementarian.org> wrote:
> All I want to do is mark a domain without a qemu process as crashed. Is
> that clearer?
> 
> And yes, it's pretty trivial to make qemu break. Most typically by
> passing bogus parameters (say, a broken kernel image, an incorrect NIC,
> etc.)
Hmmmm.... Okay, I guess that is pretty reasonable. I'll sort out a patch
after the summit.

 -- Keir



Ian Jackson

2009-Feb-23 16:51 UTC

Re: [Xen-devel] SHUTDOWN_crash and vcpu deferrals

John Levon writes ("Re: [Xen-devel] SHUTDOWN_crash and vcpu deferrals"):
>         # ideally we would like to forcibly crash the domain with
>         # something like
>         #    xc.domain_shutdown(self.vm.getDomid(), DOMAIN_CRASH)
>         # but this can easily lead to very rapid restart loops against
>         # which we currently have no protection
> 
> (The comment being completely incorrect), but then the crash doesn't
> work because of the bug I pointed out.
I wrote that comment.  I haven't been following this bit of xend.  Do
you mean that nowadays if you say
   on_crash = 'restart'
and the domain immediately crashes on boot, you don't get an infinite
restart loop?  One of the most common causes of qemu `crashing' is
that it wasn't able to open the dom0 device corresponding to some
emulated device for the guest's benefit and that obviously happens at
startup.
> All I want to do is mark a domain without a qemu process as crashed. Is
> that clearer?
I think that would be good, provided that we can prevent it restarting
rapidly.
> And yes, it's pretty trivial to make qemu break. Most typically by
> passing bogus parameters (say, a broken kernel image, an incorrect NIC,
> etc.)
As you say.

Ian.

John Levon

2009-Feb-23 16:54 UTC

Re: [Xen-devel] SHUTDOWN_crash and vcpu deferrals

On Mon, Feb 23, 2009 at 04:51:10PM +0000, Ian Jackson wrote:
> > (The comment being completely incorrect), but then the crash doesn't
> > work because of the bug I pointed out.
> 
> I wrote that comment.  I haven't been following this bit of xend.  Do
> you mean that nowadays if you say
>    on_crash = 'restart'
> and the domain immediately crashes on boot, you don't get an infinite
> restart loop ?  One of the most common causes of qemu `crashing' is
AFAIK this has been the case since forever:

        rst = self._readVm('xend/previous_restart_time')
        if rst:
            rst = float(rst)
            timeout = now - rst
            if timeout < MINIMUM_RESTART_TIME:
                log.error(
                    'VM %s restarting too fast (%f seconds since the last '
                    'restart).  Refusing to restart to avoid loops.',
                    self.info['name_label'], timeout)
                self.destroy()
                return

        self._writeVm('xend/previous_restart_time', str(now))

This is from 3.1.4. Perhaps it was broken when you tried it, but it
certainly seems to do its intended job on 3.3.2pre for me.
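
For readers without the xend source to hand, the same guard can be restated as a self-contained sketch. This is a paraphrase, not the xend code: a plain dict stands in for `_readVm`/`_writeVm`, `may_restart` is a made-up name, and the value given to `MINIMUM_RESTART_TIME` is an assumption chosen for the example.

```python
MINIMUM_RESTART_TIME = 60.0  # seconds; assumed value for illustration only

RESTART_KEY = 'xend/previous_restart_time'


def may_restart(store, now):
    """Return True if a restart is allowed, recording its time in `store`."""
    prev = store.get(RESTART_KEY)
    if prev is not None and now - float(prev) < MINIMUM_RESTART_TIME:
        # Restarting too fast: refuse, and leave the stored time untouched,
        # as the fragment above does by returning before the write.
        return False
    store[RESTART_KEY] = str(now)
    return True


store = {}
first = may_restart(store, 1000.0)     # no previous restart recorded
too_soon = may_restart(store, 1005.0)  # 5 seconds after the last restart
later = may_restart(store, 1100.0)     # 100 seconds after the last restart
```

The point of the guard is that a domain which crashes immediately on boot is restarted at most once per interval rather than in a tight loop.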

regards,
john

Ian Jackson

2009-Feb-23 16:58 UTC

Re: [Xen-devel] SHUTDOWN_crash and vcpu deferrals

John Levon writes ("Re: [Xen-devel] SHUTDOWN_crash and vcpu deferrals"):
> This is from 3.1.4. Perhaps it was broken when you tried it, but it
> certainly seems to do its intended job on 3.3.2pre for me.
Oh, great.  I put the comment there because I remembered it happening
to me once (with some kind of pre-3.2 unstable tree I think) but
perhaps I misremembered or there was something else wrong.  I didn't
try to reproduce it.

Well, in that case we should definitely fix Xen so that the guest can
be crashed and get rid of my bogus comment.

Regards,
Ian.
