Andrew Cooper
2013-Feb-04 14:25 UTC
[PATCH RFC] hvm: Allow triple fault to imply crash rather than reboot
While the triple fault action on native hardware will result in a system
reset, any modern operating system can and will make use of less violent
reboot methods. As a result, the most likely cause of a triple fault is a
fatal software bug.

This patch allows the toolstack to indicate that a triple fault should mean a
crash rather than a reboot. The default of reboot still remains the same.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

diff -r 5af4f2ab06f3 -r 6f8c532df545 xen/arch/x86/hvm/hvm.c
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -1233,9 +1233,14 @@ void hvm_hlt(unsigned long rflags)
 void hvm_triple_fault(void)
 {
     struct vcpu *v = current;
+    struct domain * d = v->domain;
+    u8 reason = d->arch.hvm_domain.params[HVM_PARAM_TRIPLE_FAULT_CRASH]
+        ? SHUTDOWN_crash : SHUTDOWN_reboot;
+
     gdprintk(XENLOG_INFO, "Triple fault on VCPU%d - "
-             "invoking HVM system reset.\n", v->vcpu_id);
-    domain_shutdown(v->domain, SHUTDOWN_reboot);
+             "invoking HVM system %s.\n", v->vcpu_id,
+             reason == SHUTDOWN_crash ? "crash" : "reboot");
+    domain_shutdown(v->domain, reason);
 }

 void hvm_inject_trap(struct hvm_trap *trap)
diff -r 5af4f2ab06f3 -r 6f8c532df545 xen/include/public/hvm/params.h
--- a/xen/include/public/hvm/params.h
+++ b/xen/include/public/hvm/params.h
@@ -142,6 +142,9 @@
 #define HVM_PARAM_ACCESS_RING_PFN   28
 #define HVM_PARAM_SHARING_RING_PFN  29

-#define HVM_NR_PARAMS          31
+/* Boolean: Should a triple fault imply crash rather than reboot? */
+#define HVM_PARAM_TRIPLE_FAULT_CRASH 31
+
+#define HVM_NR_PARAMS          32

 #endif /* __XEN_PUBLIC_HVM_PARAMS_H__ */
Jan Beulich
2013-Feb-04 14:46 UTC
Re: [PATCH RFC] hvm: Allow triple fault to imply crash rather than reboot
>>> On 04.02.13 at 15:25, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> While the triple fault action on native hardware will result in a system
> reset, any modern operating system can and will make use of less violent
> reboot methods. As a result, the most likely cause of a triple fault is a
> fatal software bug.
>
> This patch allows the toolstack to indicate that a triple fault should mean a
> crash rather than a reboot. The default of reboot still remains the same.

Makes sense to me; minor nits below (no need to resend just
because of that, but would be nice to be addressed if you had
to rev the patch anyway).

> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -1233,9 +1233,14 @@ void hvm_hlt(unsigned long rflags)
>  void hvm_triple_fault(void)
>  {
>      struct vcpu *v = current;
> +    struct domain * d = v->domain;

Stray blank.

> +    u8 reason = d->arch.hvm_domain.params[HVM_PARAM_TRIPLE_FAULT_CRASH]
> +        ? SHUTDOWN_crash : SHUTDOWN_reboot;
> +
>      gdprintk(XENLOG_INFO, "Triple fault on VCPU%d - "
> -             "invoking HVM system reset.\n", v->vcpu_id);
> -    domain_shutdown(v->domain, SHUTDOWN_reboot);
> +             "invoking HVM system %s.\n", v->vcpu_id,
> +             reason == SHUTDOWN_crash ? "crash" : "reboot");
> +    domain_shutdown(v->domain, reason);

So you have d cached in a local variable now, yet you still use
v->domain here?

Also, I'd prefer for the message to continue to say "reset".

Jan
Andrew Cooper
2013-Feb-04 14:50 UTC
Re: [PATCH RFC] hvm: Allow triple fault to imply crash rather than reboot
On 04/02/13 14:46, Jan Beulich wrote:
>>>> On 04.02.13 at 15:25, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>> While the triple fault action on native hardware will result in a system
>> reset, any modern operating system can and will make use of less violent
>> reboot methods. As a result, the most likely cause of a triple fault is a
>> fatal software bug.
>>
>> This patch allows the toolstack to indicate that a triple fault should mean a
>> crash rather than a reboot. The default of reboot still remains the same.
> Makes sense to me; minor nits below (no need to resend just
> because of that, but would be nice to be addressed if you had
> to rev the patch anyway).
>
>> --- a/xen/arch/x86/hvm/hvm.c
>> +++ b/xen/arch/x86/hvm/hvm.c
>> @@ -1233,9 +1233,14 @@ void hvm_hlt(unsigned long rflags)
>>  void hvm_triple_fault(void)
>>  {
>>      struct vcpu *v = current;
>> +    struct domain * d = v->domain;
> Stray blank.

Space between * and d ?

>
>> +    u8 reason = d->arch.hvm_domain.params[HVM_PARAM_TRIPLE_FAULT_CRASH]
>> +        ? SHUTDOWN_crash : SHUTDOWN_reboot;
>> +
>>      gdprintk(XENLOG_INFO, "Triple fault on VCPU%d - "
>> -             "invoking HVM system reset.\n", v->vcpu_id);
>> -    domain_shutdown(v->domain, SHUTDOWN_reboot);
>> +             "invoking HVM system %s.\n", v->vcpu_id,
>> +             reason == SHUTDOWN_crash ? "crash" : "reboot");
>> +    domain_shutdown(v->domain, reason);
> So you have d cached in a local variable now, yet you still use
> v->domain here?

Doh - missed that.

>
> Also, I'd prefer for the message to continue to say "reset".
>
> Jan
>

Ok - I will respin and send as non-rfc.
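
For reference, the three review nits above (pointer spacing, reusing the cached d, keeping the word "reset") could be folded in roughly as follows. This is a standalone model rather than the respun patch: the structs, the params array and domain_shutdown() are simplified stand-ins, the vcpu is passed as an argument instead of coming from the hypervisor's current, and printing "reset" only in the non-crash case is one reading of the last comment.

```c
/* Standalone model, NOT Xen code: types and domain_shutdown() are stubs. */
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define SHUTDOWN_reboot 1   /* values as in xen/include/public/sched.h */
#define SHUTDOWN_crash  3

#define HVM_PARAM_TRIPLE_FAULT_CRASH 31   /* index proposed by the patch */
#define HVM_NR_PARAMS                32

struct domain {
    unsigned long params[HVM_NR_PARAMS];  /* stand-in for arch.hvm_domain.params */
    int shutdown_code;                    /* written by the stub below */
};

struct vcpu {
    int vcpu_id;
    struct domain *domain;
};

/* Stub: only records the reason so the caller can inspect it. */
static void domain_shutdown(struct domain *d, unsigned char reason)
{
    d->shutdown_code = reason;
}

/* Assumption: vcpu passed explicitly; Xen reads it from 'current'. */
static void hvm_triple_fault(struct vcpu *v)
{
    assert(v != NULL && v->domain != NULL);

    struct domain *d = v->domain;         /* no blank between '*' and 'd' */
    unsigned char reason = d->params[HVM_PARAM_TRIPLE_FAULT_CRASH]
        ? SHUTDOWN_crash : SHUTDOWN_reboot;

    /* printf stands in for gdprintk(XENLOG_INFO, ...). */
    printf("Triple fault on VCPU%d - invoking HVM system %s.\n",
           v->vcpu_id, reason == SHUTDOWN_crash ? "crash" : "reset");

    domain_shutdown(d, reason);           /* use the cached pointer */
}
```

With the parameter left at zero the model reports SHUTDOWN_reboot, matching the patch's stated default; any non-zero value selects SHUTDOWN_crash.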
Ian Campbell
2013-Feb-04 15:26 UTC
Re: [PATCH RFC] hvm: Allow triple fault to imply crash rather than reboot
On Mon, 2013-02-04 at 14:25 +0000, Andrew Cooper wrote:
> While the triple fault action on native hardware will result in a system
> reset, any modern operating system can and will make use of less violent
> reboot methods. As a result, the most likely cause of a triple fault is a
> fatal software bug.
>
> This patch allows the toolstack to indicate that a triple fault should mean a
> crash rather than a reboot. The default of reboot still remains the same.

Just a random thought -- what about adding SHUTDOWN_triple_fault as an
explicit thing, then the toolstack can decide what to do?
Keir Fraser
2013-Feb-04 16:46 UTC
Re: [PATCH RFC] hvm: Allow triple fault to imply crash rather than reboot
On 04/02/2013 15:26, "Ian Campbell" <Ian.Campbell@citrix.com> wrote:

> On Mon, 2013-02-04 at 14:25 +0000, Andrew Cooper wrote:
>> While the triple fault action on native hardware will result in a system
>> reset, any modern operating system can and will make use of less violent
>> reboot methods. As a result, the most likely cause of a triple fault is a
>> fatal software bug.
>>
>> This patch allows the toolstack to indicate that a triple fault should mean a
>> crash rather than a reboot. The default of reboot still remains the same.
>
> Just a random thought -- what about adding SHUTDOWN_triple_fault as an
> explicit thing, then the toolstack can decide what to do?

I kind of prefer that, although it will require changes to every toolstack.

An alternative would be to do that, *and* still have the new HVM_PARAM, so
that any SHUTDOWN_* code can be generated by a triple fault (including new
SHUTDOWN_triple_fault) -- but defaulting to SHUTDOWN_reboot so that the
default behaviour is still unchanged.

Or, in any case, I'm not dead against the existing patch, it just seems less
flexible than it could be. But maybe that flexibility is pointless.

 -- Keir
Andrew Cooper
2013-Feb-04 17:12 UTC
Re: [PATCH RFC] hvm: Allow triple fault to imply crash rather than reboot
On 04/02/13 16:46, Keir Fraser wrote:
> On 04/02/2013 15:26, "Ian Campbell" <Ian.Campbell@citrix.com> wrote:
>
>> On Mon, 2013-02-04 at 14:25 +0000, Andrew Cooper wrote:
>>> While the triple fault action on native hardware will result in a system
>>> reset, any modern operating system can and will make use of less violent
>>> reboot methods. As a result, the most likely cause of a triple fault is a
>>> fatal software bug.
>>>
>>> This patch allows the toolstack to indicate that a triple fault should mean a
>>> crash rather than a reboot. The default of reboot still remains the same.
>> Just a random thought -- what about adding SHUTDOWN_triple_fault as an
>> explicit thing, then the toolstack can decide what to do?
> I kind of prefer that, although it will require changes to every toolstack.
>
> An alternative would be to do that, *and* still have the new HVM_PARAM, so
> that any SHUTDOWN_* code can be generated by a triple fault (including new
> SHUTDOWN_triple_fault) -- but defaulting to SHUTDOWN_reboot so that the
> default behaviour is still unchanged.
>
> Or, in any case, I'm not dead against the existing patch, it just seems less
> flexible than it could be. But maybe that flexibility is pointless.
>
> -- Keir

I considered this approach originally, but decided against it.

SHUTDOWN_triple_fault would be meaningless as a standard SCHEDOP_shutdown
parameter, and having the toolstack differentiate between _crash and
_triple_fault seems pointless.

I thought that the ideal end result would be specifying

on_triple_fault="reboot"|"crash"

in the vm.cfg file.

The on_{crash,reboot} actions would still then take effect as usual.

Having said that, if _triple_fault is preferred, I am not overly
attached to this specific implementation.

If it isn't obvious, the motivation behind this patch is because I am
currently chasing a Windows triple fault on Xen-4.2. It appears machine
specific, but related to our PV driver, and takes a long time to
reproduce. Having automated tests fail soon with a triple fault is
better than having the domain in question sit in a reboot loop until the
hour long timeout kicks in.

~Andrew
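
To make the on_triple_fault idea concrete, a toolstack would need to turn the config string into the boolean parameter value. The sketch below shows one way that mapping might look; parse_on_triple_fault() is a hypothetical helper and not an existing libxl or xl function, and treating a missing option as "reboot" is an assumption chosen to match the patch's default.

```c
/* Hypothetical toolstack-side helper; not part of any existing Xen library. */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Map the proposed on_triple_fault config string to the value that would be
 * written to the boolean HVM_PARAM_TRIPLE_FAULT_CRASH.
 * Returns 0 on success, -1 if the string is not recognised.
 */
static int parse_on_triple_fault(const char *s, unsigned long *param)
{
    assert(param != NULL);

    /* Assumption: an absent option keeps the existing reboot behaviour. */
    if (s == NULL || strcmp(s, "reboot") == 0) {
        *param = 0;
        return 0;
    }
    if (strcmp(s, "crash") == 0) {
        *param = 1;
        return 0;
    }
    return -1;   /* anything else is a configuration error */
}
```

Once the domain shuts down with the resulting reason, the existing on_crash or on_reboot action would apply unchanged, as described above.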
Keir Fraser
2013-Feb-04 17:55 UTC
Re: [PATCH RFC] hvm: Allow triple fault to imply crash rather than reboot
On 04/02/2013 17:12, "Andrew Cooper" <andrew.cooper3@citrix.com> wrote:

>> An alternative would be to do that, *and* still have the new HVM_PARAM, so
>> that any SHUTDOWN_* code can be generated by a triple fault (including new
>> SHUTDOWN_triple_fault) -- but defaulting to SHUTDOWN_reboot so that the
>> default behaviour is still unchanged.
>>
>> Or, in any case, I'm not dead against the existing patch, it just seems less
>> flexible than it could be. But maybe that flexibility is pointless.
>>
>> -- Keir
>
> I considered this approach originally, but decided against it.
>
> SHUTDOWN_triple_fault would be meaningless as a standard SCHEDOP_shutdown
> parameter, and having the toolstack differentiate between _crash and
> _triple_fault seems pointless.

How about letting the HVM_PARAM accept any SHUTDOWN_ code? Rather than being
a boolean? That's a trivial change, just seems a bit cleaner than a boolean
to me.

Also adding the SHUTDOWN_triple_fault seemed like a maybe-nice-to-have. I
don't really care that much, and indeed it probably is pointless.

> I thought that the ideal end result would be specifying
>
> on_triple_fault="reboot"|"crash"
>
> in the vm.cfg file.
>
> The on_{crash,reboot} actions would still then take effect as usual.
>
> Having said that, if _triple_fault is preferred, I am not overly
> attached to this specific implementation.

No let's drop the idea of a SHUTDOWN_triple_fault. :)

> If it isn't obvious, the motivation behind this patch is because I am
> currently chasing a Windows triple fault on Xen-4.2. It appears machine
> specific, but related to our PV driver, and takes a long time to
> reproduce. Having automated tests fail soon with a triple fault is
> better than having the domain in question sit in a reboot loop until the
> hour long timeout kicks in.

Yep, agreed, a patch along these lines of some sort is a very good idea!

 -- Keir
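
The "any SHUTDOWN_ code" variant suggested above has one wrinkle worth spelling out: a zero-initialised parameter would read as SHUTDOWN_poweroff (0), so the reboot default has to be set explicitly. A standalone model of that variant might look like the following; the parameter name HVM_PARAM_TRIPLE_FAULT_REASON and both helper functions are hypothetical, the domain struct is a stub, and accepting only the four codes listed is an assumption.

```c
/* Standalone model, NOT Xen code: names marked hypothetical are illustrative. */
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SHUTDOWN_poweroff 0   /* values as in xen/include/public/sched.h */
#define SHUTDOWN_reboot   1
#define SHUTDOWN_suspend  2
#define SHUTDOWN_crash    3

#define HVM_PARAM_TRIPLE_FAULT_REASON 31   /* hypothetical name */
#define HVM_NR_PARAMS                 32

struct domain {
    unsigned long params[HVM_NR_PARAMS];
    int shutdown_code;                     /* -1 while the domain is running */
};

/* Hypothetical init hook: zero would mean poweroff, so set reboot explicitly. */
static void hvm_domain_init_params(struct domain *d)
{
    assert(d != NULL);
    memset(d, 0, sizeof(*d));
    d->params[HVM_PARAM_TRIPLE_FAULT_REASON] = SHUTDOWN_reboot;
    d->shutdown_code = -1;
}

/* Hypothetical setter: reject values that are not a known SHUTDOWN_ code. */
static int hvm_set_triple_fault_reason(struct domain *d, unsigned long value)
{
    if (value > SHUTDOWN_crash)
        return -1;
    d->params[HVM_PARAM_TRIPLE_FAULT_REASON] = value;
    return 0;
}

/* The fault path passes the stored code straight through. */
static void hvm_triple_fault(struct domain *d)
{
    d->shutdown_code = (int)d->params[HVM_PARAM_TRIPLE_FAULT_REASON];
}
```

Compared with the boolean, the fault path loses its conditional, at the cost of an init step and a range check when the parameter is set.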