Andrew Cooper
2012-Nov-22 15:00 UTC
[PATCH V3] vmx/nmi: Do not use self_nmi() in VMEXIT handler
The self_nmi() code causes an NMI to be triggered by sending an APIC
message to the local processor.  However, NMIs are blocked by the
VMEXIT, until the next iret or VMENTER.

Volume 3 Chapter 27 Section 1 of the Intel SDM states:

  An NMI causes subsequent NMIs to be blocked, but only after the VM
  exit completes.

As a result, as soon as the VMENTER happens, an immediate VMEXIT
happens because of the queued NMI.  We have seen hundreds of iterations
of this VMEXIT/VMENTER loop before the HVM guest resumes normal
operation.

Signed-off-by: Malcolm Crossley <malcolm.crossley@citrix.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

--
Changes since v2
 * Switch from 'int $2' to do_nmi()
 * Reworked commit message to more clearly explain the problem

diff -r 2489c2926698 -r d7ea938044ac xen/arch/x86/hvm/vmx/vmx.c
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -2269,6 +2269,14 @@ void vmx_vmexit_handler(struct cpu_user_
         vector = intr_info & INTR_INFO_VECTOR_MASK;
         if ( vector == TRAP_machine_check )
             do_machine_check(regs);
+        else if ( vector == TRAP_nmi &&
+                  ( (intr_info & INTR_INFO_INTR_TYPE_MASK) ==
+                    (X86_EVENTTYPE_NMI << 8) ) )
+            /* Must be called before interrupts are enabled to ensure
+             * the NMI handler code is run before the first IRET.  The
+             * IRET unblocks subsequent NMIs (Intel SDM Vol 3, 6.7.1)
+             */
+            do_nmi();
         break;
     case EXIT_REASON_MCE_DURING_VMENTRY:
         do_machine_check(regs);
@@ -2442,7 +2450,6 @@ void vmx_vmexit_handler(struct cpu_user_
              (X86_EVENTTYPE_NMI << 8) )
             goto exit_and_crash;
         HVMTRACE_0D(NMI);
-        self_nmi();  /* Real NMI, vector 2: normal processing. */
         break;
     case TRAP_machine_check:
         HVMTRACE_0D(MCE);
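[Editorial note: the VMEXIT/VMENTER loop described in the commit message can be modelled with a small simulation.  This is an illustrative sketch only, not Xen code -- every name in it is invented for the example.  It encodes the two rules from the SDM quote: an NMI VMEXIT leaves NMIs blocked, and IRET or VM entry unblocks them, at which point any latched NMI is delivered immediately.]

```c
#include <stdbool.h>

/* Minimal model of the blocking behaviour described above.
 * All names are illustrative; this is not Xen code. */
struct cpu {
    bool nmi_blocked;   /* set by an NMI VMEXIT, cleared by IRET / VM entry */
    bool nmi_pending;   /* an NMI latched at the local APIC */
    int  exits;         /* number of NMI VMEXITs taken so far */
};

static void self_nmi(struct cpu *c) { c->nmi_pending = true; }

/* VM entry unblocks NMIs; a pending NMI then fires before the guest
 * executes a single instruction, causing an immediate new VMEXIT. */
static bool vmenter(struct cpu *c)
{
    c->nmi_blocked = false;
    if (c->nmi_pending) {
        c->nmi_pending = false;
        c->nmi_blocked = true;   /* the new VMEXIT blocks NMIs again */
        c->exits++;
        return false;            /* back in the VMEXIT handler */
    }
    return true;                 /* guest actually runs */
}

/* Old behaviour: re-raise the NMI at ourselves.  It stays latched,
 * because NMIs are still blocked, and only fires at the next VM entry. */
static void handler_old(struct cpu *c) { self_nmi(c); }

/* Patched behaviour: call the NMI handler directly; nothing is left
 * pending for the next VM entry. */
static void handler_new(struct cpu *c) { (void)c; /* do_nmi() runs here */ }

/* Count NMI VMEXITs taken before the guest makes forward progress,
 * capped so the old behaviour's loop terminates in the model. */
static int exits_until_guest_runs(void (*handler)(struct cpu *), int cap)
{
    struct cpu c = { .nmi_blocked = true, .nmi_pending = false, .exits = 1 };
    while (c.exits < cap) {
        handler(&c);
        if (vmenter(&c))
            break;
    }
    return c.exits;
}
```

With `handler_new` the guest resumes after the single original VMEXIT; with `handler_old` the model spins until the cap, mirroring the "hundreds of iterations" observed in practice.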
Jan Beulich
2012-Nov-22 15:15 UTC
Re: [PATCH V3] vmx/nmi: Do not use self_nmi() in VMEXIT handler
>>> On 22.11.12 at 16:00, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> The self_nmi() code causes an NMI to be triggered by sending an APIC
> message to the local processor.  However, NMIs are blocked by the
> VMEXIT, until the next iret or VMENTER.
[...]
> +        else if ( vector == TRAP_nmi &&
> +                  ( (intr_info & INTR_INFO_INTR_TYPE_MASK) ==
> +                    (X86_EVENTTYPE_NMI << 8) ) )
> +            /* Must be called before interrupts are enabled to ensure
> +             * the NMI handler code is run before the first IRET.  The
> +             * IRET unblocks subsequent NMIs (Intel SDM Vol 3, 6.7.1)
> +             */
> +            do_nmi();

But that's only half of it, at least as far as I recall the outcome of
the discussion: You want an IRET (or VMRESUME) before possibly going
into the scheduler (and hence not back to the current VM).  And the
same also on the PV exit path from an NMI.

Jan
Andrew Cooper
2012-Nov-22 15:16 UTC
Re: [PATCH V3] vmx/nmi: Do not use self_nmi() in VMEXIT handler
On 22/11/12 15:15, Jan Beulich wrote:
> But that's only half of it, at least as far as I recall the outcome of
> the discussion: You want an IRET (or VMRESUME) before possibly going
> into the scheduler (and hence not back to the current VM).  And the
> same also on the PV exit path from an NMI.
>
> Jan

When I read this codepath, there seemed to be no consideration for
re-scheduling.  I will double check, but I think the VMRESUME is
unconditional if the VMEXIT reason was a real NMI.

~Andrew

--
Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
T: +44 (0)1223 225 900, http://www.citrix.com
Jan Beulich
2012-Nov-22 15:21 UTC
Re: [PATCH V3] vmx/nmi: Do not use self_nmi() in VMEXIT handler
>>> On 22.11.12 at 16:16, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> When I read this codepath, there seemed to be no consideration for
> re-scheduling.  I will double check, but I think the VMRESUME is
> unconditional if the VMEXIT reason was a real NMI.

Interesting - I see nothing NMI-related in vmx/entry.S, i.e. it is my
understanding that just like in any other case you may end up calling
do_softirq on the way out from handling the NMI.

Jan
Mats Petersson
2012-Nov-22 15:22 UTC
Re: [PATCH V3] vmx/nmi: Do not use self_nmi() in VMEXIT handler
On 22/11/12 15:16, Andrew Cooper wrote:
> When I read this codepath, there seemed to be no consideration for
> re-scheduling.  I will double check, but I think the VMRESUME is
> unconditional if the VMEXIT reason was a real NMI.

Technically, we could probably get a timer interrupt which causes a
scheduling event, between the NMI and the VMEXIT.  However, that timer
interrupt will have an IRET at the end of it, so it fixes itself.
Andrew Cooper
2012-Nov-22 15:37 UTC
Re: [PATCH V3] vmx/nmi: Do not use self_nmi() in VMEXIT handler
On 22/11/12 15:21, Jan Beulich wrote:
> Interesting - I see nothing NMI-related in vmx/entry.S, i.e. it is my
> understanding that just like in any other case you may end up calling
> do_softirq on the way out from handling the NMI.
>
> Jan

Ah yes - vmx_asm_do_vmentry does have the potential to process
softirqs, which we will want to do with NMIs enabled.  I missed that
last time I looked.

So we need to deliberately execute an iretd.  int $2 is unusable
because it adds an extra dimension to the fake NMI corruption issue if
you get the combination int $2 -> MCE -> NMI before the int $2 has
correctly saved its exception frame.

A quick solution would be to execute a noop function with
run_in_exception_handler().  Alternatively, I can code enable_nmi() or
so which does an inline iret to itself.  Which of these would you
prefer?

--
Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
T: +44 (0)1223 225 900, http://www.citrix.com
Jan Beulich
2012-Nov-22 15:55 UTC
Re: [PATCH V3] vmx/nmi: Do not use self_nmi() in VMEXIT handler
>>> On 22.11.12 at 16:37, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> A quick solution would be to execute a noop function with
> run_in_exception_handler().  Alternatively, I can code enable_nmi() or
> so which does an inline iret to itself.  Which of these would you
> prefer?

I would actually consider avoiding to run softirqs altogether in that
case, just like native Linux doesn't do any event or scheduler
processing in that case.

Jan
Jan Beulich
2012-Nov-22 16:00 UTC
Re: [PATCH V3] vmx/nmi: Do not use self_nmi() in VMEXIT handler
>>> On 22.11.12 at 16:22, Mats Petersson <mats.petersson@citrix.com> wrote:
> Technically, we could probably get a timer interrupt which causes a
> scheduling event, between the NMI and the VMEXIT.  However, that timer
> interrupt will have an IRET at the end of it, so it fixes itself.

Not exactly - such a timer interrupt wouldn't cause any scheduling to
happen (this and other softirqs get run only when exiting to a guest,
or from an idle vCPU); when exiting to the hypervisor nothing like that
happens (and it would be catastrophic if it did).

Furthermore, interrupts should _never_ get enabled while handling an
NMI.

Jan
Andrew Cooper
2012-Nov-22 16:05 UTC
Re: [PATCH V3] vmx/nmi: Do not use self_nmi() in VMEXIT handler
On 22/11/12 15:55, Jan Beulich wrote:
> I would actually consider avoiding to run softirqs altogether in that
> case, just like native Linux doesn't do any event or scheduler
> processing in that case.
>
> Jan

That would probably be the easiest solution.

I was already going to do the same for the reentrant NMI and MCE
handling case (and also the process pending upcalls checking), due to
the complexities of fixing the race condition at the end of the
handler.

Unfortunately, I don't think I have time to look at this issue
immediately, but if it is ok to wait till the beginning of next week, I
can roll it into the main dev work for the other NMI/MCE work.

--
Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
T: +44 (0)1223 225 900, http://www.citrix.com
Jan Beulich
2012-Nov-22 16:12 UTC
Re: [PATCH V3] vmx/nmi: Do not use self_nmi() in VMEXIT handler
>>> On 22.11.12 at 17:05, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> That would probably be the easiest solution.
[...]
> Unfortunately, I don't think I have time to look at this issue
> immediately, but if it is ok to wait till the beginning of next week, I

That's fine of course.

> can roll it into the main dev work for the other NMI/MCE work.

Hmm, that would mean it's going to be more difficult to backport.  I'd
prefer having this in as simple a fashion as possible in the patch
here, and then have your other patch(es) build on top of that.

Jan
Andrew Cooper
2012-Nov-22 16:31 UTC
Re: [PATCH V3] vmx/nmi: Do not use self_nmi() in VMEXIT handler
On 22/11/12 16:12, Jan Beulich wrote:
> Hmm, that would mean it's going to be more difficult to backport.  I'd
> prefer having this in as simple a fashion as possible in the patch
> here, and then have your other patch(es) build on top of that.
>
> Jan

I will ensure that the fix for this is suitably separate in the patch
series, and in the easiest backportable form.

--
Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
T: +44 (0)1223 225 900, http://www.citrix.com
Tim Deegan
2012-Nov-22 17:34 UTC
Re: [PATCH V3] vmx/nmi: Do not use self_nmi() in VMEXIT handler
At 15:00 +0000 on 22 Nov (1353596446), Andrew Cooper wrote:
> diff -r 2489c2926698 -r d7ea938044ac xen/arch/x86/hvm/vmx/vmx.c
> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -2269,6 +2269,14 @@ void vmx_vmexit_handler(struct cpu_user_
>          vector = intr_info & INTR_INFO_VECTOR_MASK;
>          if ( vector == TRAP_machine_check )
>              do_machine_check(regs);
> +        else if ( vector == TRAP_nmi &&
> +                  ( (intr_info & INTR_INFO_INTR_TYPE_MASK) ==
> +                    (X86_EVENTTYPE_NMI << 8) ) )
> +            /* Must be called before interrupts are enabled to ensure
> +             * the NMI handler code is run before the first IRET.  The
> +             * IRET unblocks subsequent NMIs (Intel SDM Vol 3, 6.7.1)
> +             */
> +            do_nmi();
>          break;
>      case EXIT_REASON_MCE_DURING_VMENTRY:
>          do_machine_check(regs);

I think it might also make sense to move the HVMTRACE invocations that
are just above here down to after this block.  There's quite a lot of
code behind there and though I don't see any potential faults, they
might well get added later (including in places like printk() and
tasklet_schedule()).

I had a look at the rest of the code that runs between the vmexit and
here, and it looks OK to me.

George, would that make the tracing more confusing?

Tim.
George Dunlap
2012-Nov-26 11:50 UTC
Re: [PATCH V3] vmx/nmi: Do not use self_nmi() in VMEXIT handler
On 22/11/12 17:34, Tim Deegan wrote:
> I think it might also make sense to move the HVMTRACE invocations that
> are just above here down to after this block.
[...]
> George, would that make the tracing more confusing?

Well, it would mildly, because you'd get the trace in vmx_do_extint()
before the trace for the VMEXIT that caused it. :-)

If it's important for correctness, then xenalyze will just have to
deal.  But it would be a lot nicer not to have to deal.

 -George
Jan Beulich
2013-Feb-28 09:58 UTC
Re: [PATCH V3] vmx/nmi: Do not use self_nmi() in VMEXIT handler
>>> On 22.11.12 at 17:12, Jan Beulich wrote:
>>>> On 22.11.12 at 17:05, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>> Unfortunately, I don't think I have time to look at this issue
>> immediately, but if it is ok to wait till the beginning of next week, I
>
> That's fine of course.

So that was 3 months ago, and we're likely going to need to do further
unfixed releases if this won't move forward.  Can you estimate whether
you'll be able to get back to this?

Jan
Andrew Cooper
2013-Feb-28 12:32 UTC
Re: [PATCH V3] vmx/nmi: Do not use self_nmi() in VMEXIT handler
On 28/02/13 09:58, Jan Beulich wrote:
> So that was 3 months ago, and we're likely going to need to do further
> unfixed releases if this won't move forward.  Can you estimate whether
> you'll be able to get back to this?
>
> Jan

With the new enable_nmi() function, I can spin a workaround patch quite
quickly, and will try to get that done this week.

The remainder of the reentrant tangle is still happening very slowly.

~Andrew
Tim Deegan
2013-Feb-28 13:00 UTC
Re: [PATCH V3] vmx/nmi: Do not use self_nmi() in VMEXIT handler
At 09:58 +0000 on 28 Feb (1362045494), Jan Beulich wrote:
> So that was 3 months ago, and we're likely going to need to do further
> unfixed releases if this won't move forward.  Can you estimate whether
> you'll be able to get back to this?

Let's make a step in the right direction before 4.3, at least:

commit d278beed1df2d226911dce92295411018c9bba2f
Author: Tim Deegan <tim@xen.org>
Date:   Thu Feb 28 12:42:15 2013 +0000

    vmx: handle NMIs before re-enabling interrupts.

    Also, switch to calling do_nmi() and explicitly re-enabling NMIs
    rather than raising a fake NMI and relying on the NMI processing to
    IRET, since that handling code is likely to change a fair amount in
    future.

    Signed-off-by: Tim Deegan <tim@xen.org>

diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 5378928..462bb0f 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -2313,6 +2313,13 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
         vector = intr_info & INTR_INFO_VECTOR_MASK;
         if ( vector == TRAP_machine_check )
             do_machine_check(regs);
+        if ( vector == TRAP_machine_check
+             && ((intr_info & INTR_INFO_INTR_TYPE_MASK) ==
+                 (X86_EVENTTYPE_NMI << 8)) )
+        {
+            do_nmi(regs);
+            enable_nmis();
+        }
         break;
     case EXIT_REASON_MCE_DURING_VMENTRY:
         do_machine_check(regs);
@@ -2486,7 +2493,7 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
              (X86_EVENTTYPE_NMI << 8) )
             goto exit_and_crash;
         HVMTRACE_0D(NMI);
-        self_nmi();  /* Real NMI, vector 2: normal processing. */
+        /* Already handled above. */
         break;
     case TRAP_machine_check:
         HVMTRACE_0D(MCE);
Andrew Cooper
2013-Feb-28 13:12 UTC
Re: [PATCH V3] vmx/nmi: Do not use self_nmi() in VMEXIT handler
On 28/02/13 13:00, Tim Deegan wrote:> At 09:58 +0000 on 28 Feb (1362045494), Jan Beulich wrote: >>>>> On 22.11.12 at 17:12, Jan Beulich wrote: >>>>>> On 22.11.12 at 17:05, Andrew Cooper <andrew.cooper3@citrix.com> wrote: >>>> On 22/11/12 15:55, Jan Beulich wrote: >>>>>>>> On 22.11.12 at 16:37, Andrew Cooper <andrew.cooper3@citrix.com> wrote: >>>>>> A quick solution would be to execute a noop function with >>>>>> run_in_exception_handler(). Alternatively, I can code enable_nmi() or >>>>>> so which does an inline iret to itself. Which of these would you prefer? >>>>> I would actually consider avoiding to run softirqs altogether in that >>>>> case, just like native Linux doesn''t do any event or scheduler >>>>> processing in that case. >>>> That would probably be the easiest solution. >>>> >>>> I was already going to do the same for the rentrant NMI and MCE handling >>>> case (and also the process pending upcalls checking), due to the >>>> complexities of fixing the race condition at the end of the handler. >>>> >>>> Unfortunately, I don''t think I have time to look at this issue >>>> immediately, but if it is ok to wait till the beginning of next week, I >>> That''s fine of course. >> So that was 3 months ago, and we''re likely going to need to do >> further unfixed releases if this won''t move forward. Can you >> estimate whether you''ll be able to get back to this? > Let''s make a step in the right direction before 4.3, at least: > > commit d278beed1df2d226911dce92295411018c9bba2f > Author: Tim Deegan <tim@xen.org> > Date: Thu Feb 28 12:42:15 2013 +0000 > > vmx: handle NMIs before re-enabling interrupts. > > Also, switch to calling do_nmi() and explicitly re-enabling NMIs > rather than raising a fake NMI and relying on the NMI processing to > IRET, since that handling code is likely to change a fair amount in > future. 
>
> Signed-off-by: Tim Deegan <tim@xen.org>

Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>

As this is functionally equivalent to the patch I was just testing.

> diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
> index 5378928..462bb0f 100644
> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -2313,6 +2313,13 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
>          vector = intr_info & INTR_INFO_VECTOR_MASK;
>          if ( vector == TRAP_machine_check )
>              do_machine_check(regs);
> +        if ( vector == TRAP_machine_check
> +             && ((intr_info & INTR_INFO_INTR_TYPE_MASK) ==
> +                 (X86_EVENTTYPE_NMI << 8)) )
> +        {
> +            do_nmi(regs);
> +            enable_nmis();
> +        }
>          break;
>      case EXIT_REASON_MCE_DURING_VMENTRY:
>          do_machine_check(regs);
> @@ -2486,7 +2493,7 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
>               (X86_EVENTTYPE_NMI << 8) )
>              goto exit_and_crash;
>          HVMTRACE_0D(NMI);
> -        self_nmi(); /* Real NMI, vector 2: normal processing. */
> +        /* Already handled above. */
>          break;
>      case TRAP_machine_check:
>          HVMTRACE_0D(MCE);
Jan Beulich
2013-Feb-28 13:39 UTC
Re: [PATCH V3] vmx/nmi: Do not use self_nmi() in VMEXIT handler
>>> On 28.02.13 at 14:00, Tim Deegan <tim@xen.org> wrote:
> commit d278beed1df2d226911dce92295411018c9bba2f
> Author: Tim Deegan <tim@xen.org>
> Date:   Thu Feb 28 12:42:15 2013 +0000
>
>     vmx: handle NMIs before re-enabling interrupts.
>
>     Also, switch to calling do_nmi() and explicitly re-enabling NMIs
>     rather than raising a fake NMI and relying on the NMI processing to
>     IRET, since that handling code is likely to change a fair amount in
>     future.

Isn't that a gross understatement, considering that you or Malcolm
had found that the use of self_nmi() here was actively broken?

> Signed-off-by: Tim Deegan <tim@xen.org>

Acked-by: Jan Beulich <jbeulich@suse.com>

(realizing that dealing with the PV side of the issue will be left to
me in the end)
Jan Beulich
2013-Feb-28 13:42 UTC
Re: [PATCH V3] vmx/nmi: Do not use self_nmi() in VMEXIT handler
>>> On 28.02.13 at 14:00, Tim Deegan <tim@xen.org> wrote:
>     vmx: handle NMIs before re-enabling interrupts.
>
>     Also, switch to calling do_nmi() and explicitly re-enabling NMIs
>     rather than raising a fake NMI and relying on the NMI processing to
>     IRET, since that handling code is likely to change a fair amount in
>     future.
>
> Signed-off-by: Tim Deegan <tim@xen.org>

Sorry, my Acked-by just sent requires one change to be done first:

> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -2313,6 +2313,13 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
>          vector = intr_info & INTR_INFO_VECTOR_MASK;
>          if ( vector == TRAP_machine_check )
>              do_machine_check(regs);
> +        if ( vector == TRAP_machine_check

This almost certainly needs to be TRAP_nmi.

Jan

> +             && ((intr_info & INTR_INFO_INTR_TYPE_MASK) ==
> +                 (X86_EVENTTYPE_NMI << 8)) )
> +        {
> +            do_nmi(regs);
> +            enable_nmis();
> +        }
>          break;
>      case EXIT_REASON_MCE_DURING_VMENTRY:
>          do_machine_check(regs);
> @@ -2486,7 +2493,7 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
>               (X86_EVENTTYPE_NMI << 8) )
>              goto exit_and_crash;
>          HVMTRACE_0D(NMI);
> -        self_nmi(); /* Real NMI, vector 2: normal processing. */
> +        /* Already handled above. */
>          break;
>      case TRAP_machine_check:
>          HVMTRACE_0D(MCE);
Tim Deegan
2013-Feb-28 14:04 UTC
Re: [PATCH V3] vmx/nmi: Do not use self_nmi() in VMEXIT handler
At 13:42 +0000 on 28 Feb (1362058925), Jan Beulich wrote:
> >>> On 28.02.13 at 14:00, Tim Deegan <tim@xen.org> wrote:
> > vmx: handle NMIs before re-enabling interrupts.
> >
> > Also, switch to calling do_nmi() and explicitly re-enabling NMIs
> > rather than raising a fake NMI and relying on the NMI processing to
> > IRET, since that handling code is likely to change a fair amount in
> > future.
> >
> > Signed-off-by: Tim Deegan <tim@xen.org>
>
> Sorry, my Acked-by just sent requires one change to be done first:
>
> > --- a/xen/arch/x86/hvm/vmx/vmx.c
> > +++ b/xen/arch/x86/hvm/vmx/vmx.c
> > @@ -2313,6 +2313,13 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
> >          vector = intr_info & INTR_INFO_VECTOR_MASK;
> >          if ( vector == TRAP_machine_check )
> >              do_machine_check(regs);
> > +        if ( vector == TRAP_machine_check
>
> This almost certainly needs to be TRAP_nmi.

! I was foolishly using the watchdog for testing, and of course losing
NMIs doesn't cause the watchdog to trigger. :|

Here's v2, with that fixed and updating the description:

commit 7dd3b06ff031c9a8c727df16c5def2afb382101c
Author: Tim Deegan <tim@xen.org>
Date:   Thu Feb 28 12:42:15 2013 +0000

    vmx: fix handling of NMI VMEXIT.

    Call do_nmi() directly and explicitly re-enable NMIs rather than
    raising an NMI through the APIC.  Since NMIs are disabled after the
    VMEXIT, the raised NMI would be blocked until the next IRET
    instruction (i.e. the next real interrupt, or after scheduling a PV
    guest) and in the meantime the guest will spin taking NMI VMEXITS.

    Also, handle NMIs before re-enabling interrupts, since if we handle
    an interrupt (and therefore IRET) before calling do_nmi(), we may
    end up running the NMI handler with NMIs enabled.
Signed-off-by: Tim Deegan <tim@xen.org>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>

diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 5378928..04dbefb 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -2313,6 +2313,13 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
         vector = intr_info & INTR_INFO_VECTOR_MASK;
         if ( vector == TRAP_machine_check )
             do_machine_check(regs);
+        if ( vector == TRAP_nmi
+             && ((intr_info & INTR_INFO_INTR_TYPE_MASK) ==
+                 (X86_EVENTTYPE_NMI << 8)) )
+        {
+            do_nmi(regs);
+            enable_nmis();
+        }
         break;
     case EXIT_REASON_MCE_DURING_VMENTRY:
         do_machine_check(regs);
@@ -2486,7 +2493,7 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
              (X86_EVENTTYPE_NMI << 8) )
             goto exit_and_crash;
         HVMTRACE_0D(NMI);
-        self_nmi(); /* Real NMI, vector 2: normal processing. */
+        /* Already handled above. */
         break;
     case TRAP_machine_check:
         HVMTRACE_0D(MCE);
Tim Deegan
2013-Feb-28 14:25 UTC
Re: [PATCH V3] vmx/nmi: Do not use self_nmi() in VMEXIT handler
At 13:39 +0000 on 28 Feb (1362058773), Jan Beulich wrote:
> >>> On 28.02.13 at 14:00, Tim Deegan <tim@xen.org> wrote:
> > commit d278beed1df2d226911dce92295411018c9bba2f
> > Author: Tim Deegan <tim@xen.org>
> > Date:   Thu Feb 28 12:42:15 2013 +0000
> >
> >     vmx: handle NMIs before re-enabling interrupts.
> >
> >     Also, switch to calling do_nmi() and explicitly re-enabling NMIs
> >     rather than raising a fake NMI and relying on the NMI processing to
> >     IRET, since that handling code is likely to change a fair amount in
> >     future.
>
> Isn't that a gross understatement, considering that you or Malcolm
> had found that the use of self_nmi() here was actively broken?
>
> > Signed-off-by: Tim Deegan <tim@xen.org>
>
> Acked-by: Jan Beulich <jbeulich@suse.com>
>
> (realizing that dealing with the PV side of the issue will be left to
> me in the end)

For PV, would you be happy with something like this, or do you want to
avoid the extra IRET in cases where we would be returning with IRET
anyway?

diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 04dbefb..598a5b5 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -2316,10 +2316,7 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
         if ( vector == TRAP_nmi
              && ((intr_info & INTR_INFO_INTR_TYPE_MASK) ==
                  (X86_EVENTTYPE_NMI << 8)) )
-        {
             do_nmi(regs);
-            enable_nmis();
-        }
         break;
     case EXIT_REASON_MCE_DURING_VMENTRY:
         do_machine_check(regs);
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index d36eddd..2a66080 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -3211,7 +3211,7 @@ void do_nmi(struct cpu_user_regs *regs)
     ++nmi_count(cpu);
 
     if ( nmi_callback(regs, cpu) )
-        return;
+        goto out;
 
     if ( nmi_watchdog )
         nmi_watchdog_tick(regs);
@@ -3227,6 +3227,9 @@ void do_nmi(struct cpu_user_regs *regs)
         if ( !(reason & 0xc0) && !nmi_watchdog )
             unknown_nmi_error(regs, reason);
     }
+
+out:
+    enable_nmis();
 }
 
 void set_nmi_callback(nmi_callback_t callback)
Jan Beulich
2013-Feb-28 14:42 UTC
Re: [PATCH V3] vmx/nmi: Do not use self_nmi() in VMEXIT handler
>>> On 28.02.13 at 15:25, Tim Deegan <tim@xen.org> wrote:
> At 13:39 +0000 on 28 Feb (1362058773), Jan Beulich wrote:
>> (realizing that dealing with the PV side of the issue will be left to
>> me in the end)
>
> For PV, would you be happy with something like this, or do you want to
> avoid the extra IRET in cases where we would be returning with IRET
> anyway?

No, and not so much because of the redundant IRET, but because ...

> --- a/xen/arch/x86/traps.c
> +++ b/xen/arch/x86/traps.c
> @@ -3211,7 +3211,7 @@ void do_nmi(struct cpu_user_regs *regs)
>      ++nmi_count(cpu);
>  
>      if ( nmi_callback(regs, cpu) )
> -        return;
> +        goto out;
>  
>      if ( nmi_watchdog )
>          nmi_watchdog_tick(regs);
> @@ -3227,6 +3227,9 @@ void do_nmi(struct cpu_user_regs *regs)
>          if ( !(reason & 0xc0) && !nmi_watchdog )
>              unknown_nmi_error(regs, reason);
>      }
> +
> +out:
> +    enable_nmis();

... this must not be done when on the NMI stack (i.e. when the NMI was
raised while in hypervisor context). Checking for this here would be
strait forward, but I was really considering to do all of this in the
assembly exit path, and I was still undecided whether we shouldn't
follow Linux in skipping softirq processing (and hence scheduling) on
the way out from an NMI (I don't think we'd need to do the same for
MCE).

Jan

> }
>
> void set_nmi_callback(nmi_callback_t callback)
Andrew Cooper
2013-Feb-28 14:45 UTC
Re: [PATCH V3] vmx/nmi: Do not use self_nmi() in VMEXIT handler
On 28/02/13 14:42, Jan Beulich wrote:
>>>> On 28.02.13 at 15:25, Tim Deegan <tim@xen.org> wrote:
>> At 13:39 +0000 on 28 Feb (1362058773), Jan Beulich wrote:
>>> (realizing that dealing with the PV side of the issue will be left to
>>> me in the end)
>> For PV, would you be happy with something like this, or do you want to
>> avoid the extra IRET in cases where we would be returning with IRET
>> anyway?
> No, and not so much because of the redundant IRET, but
> because ...
>
>> --- a/xen/arch/x86/traps.c
>> +++ b/xen/arch/x86/traps.c
>> @@ -3211,7 +3211,7 @@ void do_nmi(struct cpu_user_regs *regs)
>>      ++nmi_count(cpu);
>>  
>>      if ( nmi_callback(regs, cpu) )
>> -        return;
>> +        goto out;
>>  
>>      if ( nmi_watchdog )
>>          nmi_watchdog_tick(regs);
>> @@ -3227,6 +3227,9 @@ void do_nmi(struct cpu_user_regs *regs)
>>          if ( !(reason & 0xc0) && !nmi_watchdog )
>>              unknown_nmi_error(regs, reason);
>>      }
>> +
>> +out:
>> +    enable_nmis();
> ... this must not be done when on the NMI stack (i.e. when the
> NMI was raised while in hypervisor context). Checking for this
> here would be strait forward, but I was really considering to do
> all of this in the assembly exit path, and I was still undecided
> whether we shouldn't follow Linux in skipping softirq processing
> (and hence scheduling) on the way out from an NMI (I don't
> think we'd need to do the same for MCE).
>
> Jan

It is furthermore pointless. If we interrupted a guest with the NMI, we
would have moved onto the main stack. We would only be on the NMI stack
at this point if we interrupted Xen with the NMI, at which point we will
be iret'ing back anyway.

~Andrew

>> }
>>
>> void set_nmi_callback(nmi_callback_t callback)
Tim Deegan
2013-Feb-28 14:49 UTC
Re: [PATCH V3] vmx/nmi: Do not use self_nmi() in VMEXIT handler
At 14:42 +0000 on 28 Feb (1362062542), Jan Beulich wrote:
> > +out:
> > +    enable_nmis();
>
> ... this must not be done when on the NMI stack (i.e. when the
> NMI was raised while in hypervisor context).

Right; I had forgotten that we didn't switch stacks in that case.

> Checking for this
> here would be strait forward, but I was really considering to do
> all of this in the assembly exit path, and I was still undecided
> whether we shouldn't follow Linux in skipping softirq processing
> (and hence scheduling) on the way out from an NMI (I don't
> think we'd need to do the same for MCE).

That sounds sensible, if it simplifies the logic (i.e. if it avoids
needing an extra IRET just to re-enable NMIs when we might as well IRET
to the guest). Or is there any other reason to avoid it?

Tim.
Konrad Rzeszutek Wilk
2013-Feb-28 14:51 UTC
Re: [PATCH V3] vmx/nmi: Do not use self_nmi() in VMEXIT handler
On Thu, Feb 28, 2013 at 01:00:08PM +0000, Tim Deegan wrote:
> At 09:58 +0000 on 28 Feb (1362045494), Jan Beulich wrote:
> > >>> On 22.11.12 at 17:12, Jan Beulich wrote:
> > >>>> On 22.11.12 at 17:05, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> > > > On 22/11/12 15:55, Jan Beulich wrote:
> > > >>>>> On 22.11.12 at 16:37, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> > > >>> A quick solution would be to execute a noop function with
> > > >>> run_in_exception_handler(). Alternatively, I can code enable_nmi() or
> > > >>> so which does an inline iret to itself. Which of these would you prefer?
> > > >> I would actually consider avoiding to run softirqs altogether in that
> > > >> case, just like native Linux doesn't do any event or scheduler
> > > >> processing in that case.
> > > > That would probably be the easiest solution.
> > > >
> > > > I was already going to do the same for the rentrant NMI and MCE handling
> > > > case (and also the process pending upcalls checking), due to the
> > > > complexities of fixing the race condition at the end of the handler.
> > > >
> > > > Unfortunately, I don't think I have time to look at this issue
> > > > immediately, but if it is ok to wait till the beginning of next week, I
> > > That's fine of course.
> >
> > So that was 3 months ago, and we're likely going to need to do
> > further unfixed releases if this won't move forward. Can you
> > estimate whether you'll be able to get back to this?
>
> Let's make a step in the right direction before 4.3, at least:

FYI, this also fixes oprofile being able to collect HVM guest data.

> commit d278beed1df2d226911dce92295411018c9bba2f
> Author: Tim Deegan <tim@xen.org>
> Date:   Thu Feb 28 12:42:15 2013 +0000
>
>     vmx: handle NMIs before re-enabling interrupts.
>
>     Also, switch to calling do_nmi() and explicitly re-enabling NMIs
>     rather than raising a fake NMI and relying on the NMI processing to
>     IRET, since that handling code is likely to change a fair amount in
>     future.
>
> Signed-off-by: Tim Deegan <tim@xen.org>
>
> diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
> index 5378928..462bb0f 100644
> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -2313,6 +2313,13 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
>          vector = intr_info & INTR_INFO_VECTOR_MASK;
>          if ( vector == TRAP_machine_check )
>              do_machine_check(regs);
> +        if ( vector == TRAP_machine_check
> +             && ((intr_info & INTR_INFO_INTR_TYPE_MASK) ==
> +                 (X86_EVENTTYPE_NMI << 8)) )
> +        {
> +            do_nmi(regs);
> +            enable_nmis();
> +        }
>          break;
>      case EXIT_REASON_MCE_DURING_VMENTRY:
>          do_machine_check(regs);
> @@ -2486,7 +2493,7 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
>               (X86_EVENTTYPE_NMI << 8) )
>              goto exit_and_crash;
>          HVMTRACE_0D(NMI);
> -        self_nmi(); /* Real NMI, vector 2: normal processing. */
> +        /* Already handled above. */
>          break;
>      case TRAP_machine_check:
>          HVMTRACE_0D(MCE);
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel
Jan Beulich
2013-Feb-28 15:01 UTC
Re: [PATCH V3] vmx/nmi: Do not use self_nmi() in VMEXIT handler
>>> On 28.02.13 at 15:49, Tim Deegan <tim@xen.org> wrote:
> At 14:42 +0000 on 28 Feb (1362062542), Jan Beulich wrote:
>> Checking for this
>> here would be strait forward, but I was really considering to do
>> all of this in the assembly exit path, and I was still undecided
>> whether we shouldn't follow Linux in skipping softirq processing
>> (and hence scheduling) on the way out from an NMI (I don't
>> think we'd need to do the same for MCE).
>
> That sounds sensible, if it simplifies the logic (i.e. if it avoids
> needing an extra IRET just to re-enable NMIs when we might as well IRET
> to the guest). Or is there any other reason to avoid it?

No, I don't think so. We only need to make sure we don't execute one
as long as we're on the NMI stack.

Jan
Jan Beulich
2013-Feb-28 15:41 UTC
Re: [PATCH V3] vmx/nmi: Do not use self_nmi() in VMEXIT handler
>>> On 28.02.13 at 15:42, "Jan Beulich" <JBeulich@suse.com> wrote:
> ... this must not be done when on the NMI stack (i.e. when the
> NMI was raised while in hypervisor context). Checking for this
> here would be strait forward, but I was really considering to do
> all of this in the assembly exit path, and I was still undecided
> whether we shouldn't follow Linux in skipping softirq processing
> (and hence scheduling) on the way out from an NMI (I don't
> think we'd need to do the same for MCE).

Like this:

x86: skip processing events on the NMI exit path

Otherwise, we may end up in the scheduler, keeping NMIs masked for a
possibly unbounded time (until whenever the next IRET gets executed).

Of course it's open for discussion whether to always use the strait
exit path from handle_ist_exception.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/x86_64/compat/entry.S
+++ b/xen/arch/x86/x86_64/compat/entry.S
@@ -171,7 +171,7 @@ compat_bad_hypercall:
         jmp   compat_test_all_events
 
 /* %rbx: struct vcpu, interrupts disabled */
-compat_restore_all_guest:
+ENTRY(compat_restore_all_guest)
         ASSERT_INTERRUPTS_DISABLED
         RESTORE_ALL adj=8 compat=1
 .Lft0:  iretq
--- a/xen/arch/x86/x86_64/entry.S
+++ b/xen/arch/x86/x86_64/entry.S
@@ -635,6 +635,9 @@ ENTRY(early_page_fault)
         jmp   restore_all_xen
         .popsection
 
+ENTRY(nmi)
+        pushq $0
+        movl  $TRAP_nmi,4(%rsp)
 handle_ist_exception:
         SAVE_ALL
         testb $3,UREGS_cs(%rsp)
@@ -649,12 +652,17 @@ handle_ist_exception:
         movzbl UREGS_entry_vector(%rsp),%eax
         leaq  exception_table(%rip),%rdx
         callq *(%rdx,%rax,8)
-        jmp   ret_from_intr
+        cmpb  $TRAP_nmi,UREGS_entry_vector(%rsp)
+        jne   ret_from_intr
 
-ENTRY(nmi)
-        pushq $0
-        movl  $TRAP_nmi,4(%rsp)
-        jmp   handle_ist_exception
+        /* We want to get strait to the IRET in the NMI exit path. */
+        testb $3,UREGS_cs(%rsp)
+        GET_CURRENT(%rbx)
+        jz    restore_all_xen
+        movq  VCPU_domain(%rbx),%rax
+        testb $1,DOMAIN_is_32bit_pv(%rax)
+        jz    restore_all_guest
+        jmp   compat_restore_all_guest
 
 ENTRY(nmi_crash)
         pushq $0
Andrew Cooper
2013-Feb-28 15:52 UTC
Re: [PATCH V3] vmx/nmi: Do not use self_nmi() in VMEXIT handler
On 28/02/13 15:41, Jan Beulich wrote:
>>>> On 28.02.13 at 15:42, "Jan Beulich" <JBeulich@suse.com> wrote:
>> ... this must not be done when on the NMI stack (i.e. when the
>> NMI was raised while in hypervisor context). Checking for this
>> here would be strait forward, but I was really considering to do
>> all of this in the assembly exit path, and I was still undecided
>> whether we shouldn't follow Linux in skipping softirq processing
>> (and hence scheduling) on the way out from an NMI (I don't
>> think we'd need to do the same for MCE).
> Like this:
>
> x86: skip processing events on the NMI exit path
>
> Otherwise, we may end up in the scheduler, keeping NMIs masked for a
> possibly unbounded time (until whenever the next IRET gets executed).
>
> Of course it's open for discussion whether to always use the strait
> exit path from handle_ist_exception.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

The logic looks good to me.

As for the other IST exceptions, we will never return from a double
fault, and will rarely return from an MCE, so I would say it doesn't
really matter at the moment.
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>

~Andrew

> --- a/xen/arch/x86/x86_64/compat/entry.S
> +++ b/xen/arch/x86/x86_64/compat/entry.S
> @@ -171,7 +171,7 @@ compat_bad_hypercall:
>          jmp   compat_test_all_events
>  
>  /* %rbx: struct vcpu, interrupts disabled */
> -compat_restore_all_guest:
> +ENTRY(compat_restore_all_guest)
>          ASSERT_INTERRUPTS_DISABLED
>          RESTORE_ALL adj=8 compat=1
>  .Lft0:  iretq
> --- a/xen/arch/x86/x86_64/entry.S
> +++ b/xen/arch/x86/x86_64/entry.S
> @@ -635,6 +635,9 @@ ENTRY(early_page_fault)
>          jmp   restore_all_xen
>          .popsection
>  
> +ENTRY(nmi)
> +        pushq $0
> +        movl  $TRAP_nmi,4(%rsp)
>  handle_ist_exception:
>          SAVE_ALL
>          testb $3,UREGS_cs(%rsp)
> @@ -649,12 +652,17 @@ handle_ist_exception:
>          movzbl UREGS_entry_vector(%rsp),%eax
>          leaq  exception_table(%rip),%rdx
>          callq *(%rdx,%rax,8)
> -        jmp   ret_from_intr
> +        cmpb  $TRAP_nmi,UREGS_entry_vector(%rsp)
> +        jne   ret_from_intr
>  
> -ENTRY(nmi)
> -        pushq $0
> -        movl  $TRAP_nmi,4(%rsp)
> -        jmp   handle_ist_exception
> +        /* We want to get strait to the IRET in the NMI exit path. */
> +        testb $3,UREGS_cs(%rsp)
> +        GET_CURRENT(%rbx)
> +        jz    restore_all_xen
> +        movq  VCPU_domain(%rbx),%rax
> +        testb $1,DOMAIN_is_32bit_pv(%rax)
> +        jz    restore_all_guest
> +        jmp   compat_restore_all_guest
>  
>  ENTRY(nmi_crash)
>          pushq $0
Tim Deegan
2013-Feb-28 15:55 UTC
Re: [PATCH V3] vmx/nmi: Do not use self_nmi() in VMEXIT handler
At 15:41 +0000 on 28 Feb (1362066080), Jan Beulich wrote:
> --- a/xen/arch/x86/x86_64/entry.S
> +++ b/xen/arch/x86/x86_64/entry.S
> @@ -635,6 +635,9 @@ ENTRY(early_page_fault)
>          jmp   restore_all_xen
>          .popsection
>  
> +ENTRY(nmi)
> +        pushq $0
> +        movl  $TRAP_nmi,4(%rsp)
>  handle_ist_exception:
>          SAVE_ALL
>          testb $3,UREGS_cs(%rsp)
> @@ -649,12 +652,17 @@ handle_ist_exception:
>          movzbl UREGS_entry_vector(%rsp),%eax
>          leaq  exception_table(%rip),%rdx
>          callq *(%rdx,%rax,8)
> -        jmp   ret_from_intr
> +        cmpb  $TRAP_nmi,UREGS_entry_vector(%rsp)
> +        jne   ret_from_intr
>  
> -ENTRY(nmi)
> -        pushq $0
> -        movl  $TRAP_nmi,4(%rsp)
> -        jmp   handle_ist_exception
> +        /* We want to get strait to the IRET in the NMI exit path. */

Language nit: 'straight', meaning linear or directly, is suitable here.
'strait' means narrow or constrained (as in strait-jacket).

> +        testb $3,UREGS_cs(%rsp)
> +        GET_CURRENT(%rbx)
> +        jz    restore_all_xen

GET_CURRENT() contains an addq, which updates ZF. Swap it with the
testb?

Otherwise, looks good to me.

Reviewed-by: Tim Deegan <tim@xen.org>

> +        movq  VCPU_domain(%rbx),%rax
> +        testb $1,DOMAIN_is_32bit_pv(%rax)
> +        jz    restore_all_guest
> +        jmp   compat_restore_all_guest
>  
>  ENTRY(nmi_crash)
>          pushq $0
Keir Fraser
2013-Feb-28 16:01 UTC
Re: [PATCH V3] vmx/nmi: Do not use self_nmi() in VMEXIT handler
On 28/02/2013 15:41, "Jan Beulich" <JBeulich@suse.com> wrote:

>>>> On 28.02.13 at 15:42, "Jan Beulich" <JBeulich@suse.com> wrote:
>>> ... this must not be done when on the NMI stack (i.e. when the
>>> NMI was raised while in hypervisor context). Checking for this
>>> here would be strait forward, but I was really considering to do
>>> all of this in the assembly exit path, and I was still undecided
>>> whether we shouldn't follow Linux in skipping softirq processing
>>> (and hence scheduling) on the way out from an NMI (I don't
>>> think we'd need to do the same for MCE).
>>
>> Like this:
>>
>> x86: skip processing events on the NMI exit path
>>
>> Otherwise, we may end up in the scheduler, keeping NMIs masked for a
>> possibly unbounded time (until whenever the next IRET gets executed).

Is the alternative that we might not process events for an unbounded time?
No, I guess not -- either we would interrupt the notifying IPI and we will
be IRETing into that IPI's handler, or the notifying IPI is delayed until
the NMI handler's IRET.

What about if the NMI handler itself raises an event (eg softirq)?
Perhaps there are no very essential ones of those?

> Of course it's open for discussion whether to always use the strait
> exit path from handle_ist_exception.

s/strait/straight (and below in the code comment that you add).

> Signed-off-by: Jan Beulich <jbeulich@suse.com>

> --- a/xen/arch/x86/x86_64/compat/entry.S
> +++ b/xen/arch/x86/x86_64/compat/entry.S
> @@ -171,7 +171,7 @@ compat_bad_hypercall:
>          jmp   compat_test_all_events
>  
>  /* %rbx: struct vcpu, interrupts disabled */
> -compat_restore_all_guest:
> +ENTRY(compat_restore_all_guest)
>          ASSERT_INTERRUPTS_DISABLED
>          RESTORE_ALL adj=8 compat=1
>  .Lft0:  iretq
> --- a/xen/arch/x86/x86_64/entry.S
> +++ b/xen/arch/x86/x86_64/entry.S
> @@ -635,6 +635,9 @@ ENTRY(early_page_fault)
>          jmp   restore_all_xen
>          .popsection
>  
> +ENTRY(nmi)
> +        pushq $0
> +        movl  $TRAP_nmi,4(%rsp)
>  handle_ist_exception:
>          SAVE_ALL
>          testb $3,UREGS_cs(%rsp)
> @@ -649,12 +652,17 @@ handle_ist_exception:
>          movzbl UREGS_entry_vector(%rsp),%eax
>          leaq  exception_table(%rip),%rdx
>          callq *(%rdx,%rax,8)
> -        jmp   ret_from_intr
> +        cmpb  $TRAP_nmi,UREGS_entry_vector(%rsp)
> +        jne   ret_from_intr
>  
> -ENTRY(nmi)
> -        pushq $0
> -        movl  $TRAP_nmi,4(%rsp)
> -        jmp   handle_ist_exception
> +        /* We want to get strait to the IRET in the NMI exit path. */
> +        testb $3,UREGS_cs(%rsp)
> +        GET_CURRENT(%rbx)
> +        jz    restore_all_xen
> +        movq  VCPU_domain(%rbx),%rax
> +        testb $1,DOMAIN_is_32bit_pv(%rax)
> +        jz    restore_all_guest
> +        jmp   compat_restore_all_guest
>  
>  ENTRY(nmi_crash)
>          pushq $0
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel
Jan Beulich
2013-Feb-28 16:12 UTC
Re: [PATCH V3] vmx/nmi: Do not use self_nmi() in VMEXIT handler
>>> On 28.02.13 at 16:55, Tim Deegan <tim@xen.org> wrote:
> At 15:41 +0000 on 28 Feb (1362066080), Jan Beulich wrote:
>> +        testb $3,UREGS_cs(%rsp)
>> +        GET_CURRENT(%rbx)
>> +        jz    restore_all_xen
>
> GET_CURRENT() contains an addq, which updates ZF. Swap it with the
> testb?

I'm glad you spotted that - thanks. I moved the GET_CURRENT() down
though instead of up.

Jan
Jan Beulich
2013-Feb-28 16:17 UTC
Re: [PATCH V3] vmx/nmi: Do not use self_nmi() in VMEXIT handler
>>> On 28.02.13 at 17:01, Keir Fraser <keir@xen.org> wrote:
> On 28/02/2013 15:41, "Jan Beulich" <JBeulich@suse.com> wrote:
>>>>> On 28.02.13 at 15:42, "Jan Beulich" <JBeulich@suse.com> wrote:
>>> ... this must not be done when on the NMI stack (i.e. when the
>>> NMI was raised while in hypervisor context). Checking for this
>>> here would be strait forward, but I was really considering to do
>>> all of this in the assembly exit path, and I was still undecided
>>> whether we shouldn't follow Linux in skipping softirq processing
>>> (and hence scheduling) on the way out from an NMI (I don't
>>> think we'd need to do the same for MCE).
>>
>> Like this:
>>
>> x86: skip processing events on the NMI exit path
>>
>> Otherwise, we may end up in the scheduler, keeping NMIs masked for a
>> possibly unbounded time (until whenever the next IRET gets executed).
>
> Is the alternative that we might not process events for an unbounded time?
> No, I guess not -- either we would interrupt the notifying IPI and we will
> be IRETing into that IPI's handler, or the notifying IPI is delayed until
> the NMI handler's IRET.
>
> What about if the NMI handler itself raises an event (eg softirq)? Perhaps
> there are no very essential ones of those?

We do raise PCI_SERR_SOFTIRQ, and the possibly injected NMI (to Dom0)
might get slightly deferred too. The latter seems of little concern
(Dom0 will get the event eventually). For the former, we might want to
explicitly send a self-IPI with EVENT_CHECK_VECTOR following the
raise_softirq()?

Jan
Keir Fraser
2013-Feb-28 19:02 UTC
Re: [PATCH V3] vmx/nmi: Do not use self_nmi() in VMEXIT handler
On 28/02/2013 16:17, "Jan Beulich" <JBeulich@suse.com> wrote:

>> Is the alternative that we might not process events for an unbounded time?
>> No, I guess not -- either we would interrupt the notifying IPI and we will
>> be IRETing into that IPI's handler, or the notifying IPI is delayed until
>> the NMI handler's IRET.
>>
>> What about if the NMI handler itself raises an event (eg softirq)? Perhaps
>> there are no very essential ones of those?
>
> We do raise PCI_SERR_SOFTIRQ, and the possibly injected NMI
> (to Dom0) might get slightly deferred too. The latter seems of
> little concern (Dom0 will get the event eventually). For the
> former, we might want to explicitly send a self-IPI with
> EVENT_CHECK_VECTOR following the raise_softirq()?

Or perhaps self-IPI on the NMI exit path if softirq_pending is non-zero?
Conservative but more generic.

 -- Keir
Jan Beulich
2013-Mar-01 10:49 UTC
[PATCH v2 0/2] x86: defer processing events on the NMI exit path
1: defer processing events on the NMI exit path
2: don't rely on __softirq_pending to be the first field in irq_cpustat_t

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Jan Beulich
2013-Mar-01 10:56 UTC
[PATCH v2 1/2] x86: defer processing events on the NMI exit path
Otherwise, we may end up in the scheduler, keeping NMIs masked for a
possibly unbounded period of time (until whenever the next IRET gets
executed). Enforce timely event processing by sending a self IPI.

Of course it's open for discussion whether to always use the straight
exit path from handle_ist_exception.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v2: Send IPI to ourselves to enforce event processing. Dropped all ACKs.

--- a/xen/arch/x86/x86_64/compat/entry.S
+++ b/xen/arch/x86/x86_64/compat/entry.S
@@ -171,7 +171,7 @@ compat_bad_hypercall:
         jmp   compat_test_all_events
 
 /* %rbx: struct vcpu, interrupts disabled */
-compat_restore_all_guest:
+ENTRY(compat_restore_all_guest)
         ASSERT_INTERRUPTS_DISABLED
         RESTORE_ALL adj=8 compat=1
.Lft0:  iretq
--- a/xen/arch/x86/x86_64/entry.S
+++ b/xen/arch/x86/x86_64/entry.S
@@ -11,6 +11,7 @@
 #include <asm/apicdef.h>
 #include <asm/page.h>
 #include <public/xen.h>
+#include <irq_vectors.h>
 
         ALIGN
 /* %rbx: struct vcpu */
@@ -635,6 +636,9 @@ ENTRY(early_page_fault)
         jmp   restore_all_xen
         .popsection
 
+ENTRY(nmi)
+        pushq $0
+        movl  $TRAP_nmi,4(%rsp)
 handle_ist_exception:
         SAVE_ALL
         testb $3,UREGS_cs(%rsp)
@@ -649,12 +653,25 @@ handle_ist_exception:
         movzbl UREGS_entry_vector(%rsp),%eax
         leaq  exception_table(%rip),%rdx
         callq *(%rdx,%rax,8)
-        jmp   ret_from_intr
+        cmpb  $TRAP_nmi,UREGS_entry_vector(%rsp)
+        jne   ret_from_intr
 
-ENTRY(nmi)
-        pushq $0
-        movl  $TRAP_nmi,4(%rsp)
-        jmp   handle_ist_exception
+        /* We want to get straight to the IRET on the NMI exit path. */
+        testb $3,UREGS_cs(%rsp)
+        jz    restore_all_xen
+        GET_CURRENT(%rbx)
+        /* Send an IPI to ourselves to cover for the lack of event checking. */
+        movl  VCPU_processor(%rbx),%eax
+        shll  $IRQSTAT_shift,%eax
+        leaq  irq_stat(%rip),%rcx
+        cmpl  $0,(%rcx,%rax,1)
+        je    1f
+        movl  $EVENT_CHECK_VECTOR,%edi
+        call  send_IPI_self
+1:      movq  VCPU_domain(%rbx),%rax
+        cmpb  $0,DOMAIN_is_32bit_pv(%rax)
+        je    restore_all_guest
+        jmp   compat_restore_all_guest
 
 ENTRY(nmi_crash)
         pushq $0

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
Jan Beulich
2013-Mar-01 10:57 UTC
[PATCH v2 2/2] x86: don't rely on __softirq_pending to be the first field in irq_cpustat_t
This is even more so as the field doesn't have a comment to that
effect in the structure definition.

Once modifying the respective assembly code, also convert the
IRQSTAT_shift users to do a 32-bit shift only (as we won't support
48M CPUs any time soon) and use "cmpl" instead of "testl" when
checking the field (both reducing code size).

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/hvm/svm/entry.S
+++ b/xen/arch/x86/hvm/svm/entry.S
@@ -41,10 +41,10 @@ ENTRY(svm_asm_do_resume)
         CLGI
 
         mov   VCPU_processor(%rbx),%eax
-        shl   $IRQSTAT_shift,%rax
-        lea   irq_stat(%rip),%rdx
-        testl $~0,(%rdx,%rax,1)
-        jnz   .Lsvm_process_softirqs
+        shl   $IRQSTAT_shift,%eax
+        lea   irq_stat+IRQSTAT_softirq_pending(%rip),%rdx
+        cmpl  $0,(%rdx,%rax,1)
+        jne   .Lsvm_process_softirqs
 
         testb $0, VCPU_nsvm_hap_enabled(%rbx)
UNLIKELY_START(nz, nsvm_hap)
--- a/xen/arch/x86/hvm/vmx/entry.S
+++ b/xen/arch/x86/hvm/vmx/entry.S
@@ -97,8 +97,8 @@ vmx_asm_do_vmentry:
         cli
 
         mov   VCPU_processor(%rbx),%eax
-        shl   $IRQSTAT_shift,%rax
-        lea   irq_stat(%rip),%rdx
+        shl   $IRQSTAT_shift,%eax
+        lea   irq_stat+IRQSTAT_softirq_pending(%rip),%rdx
         cmpl  $0,(%rdx,%rax,1)
         jnz   .Lvmx_process_softirqs
--- a/xen/arch/x86/x86_64/asm-offsets.c
+++ b/xen/arch/x86/x86_64/asm-offsets.c
@@ -162,6 +162,7 @@ void __dummy__(void)
 #endif
 
     DEFINE(IRQSTAT_shift, LOG_2(sizeof(irq_cpustat_t)));
+    OFFSET(IRQSTAT_softirq_pending, irq_cpustat_t, __softirq_pending);
     BLANK();
 
     OFFSET(CPUINFO86_ext_features, struct cpuinfo_x86, x86_capability[1]);
--- a/xen/arch/x86/x86_64/compat/entry.S
+++ b/xen/arch/x86/x86_64/compat/entry.S
@@ -96,10 +96,10 @@ ENTRY(compat_test_all_events)
         cli   # tests must not race interrupts
 /*compat_test_softirqs:*/
         movl  VCPU_processor(%rbx),%eax
-        shlq  $IRQSTAT_shift,%rax
-        leaq  irq_stat(%rip),%rcx
-        testl $~0,(%rcx,%rax,1)
-        jnz   compat_process_softirqs
+        shll  $IRQSTAT_shift,%eax
+        leaq  irq_stat+IRQSTAT_softirq_pending(%rip),%rcx
+        cmpl  $0,(%rcx,%rax,1)
+        jne   compat_process_softirqs
         testb $1,VCPU_mce_pending(%rbx)
         jnz   compat_process_mce
.Lcompat_test_guest_nmi:
--- a/xen/arch/x86/x86_64/entry.S
+++ b/xen/arch/x86/x86_64/entry.S
@@ -195,8 +195,8 @@ test_all_events:
         cli   # tests must not race interrupts
 /*test_softirqs:*/
         movl  VCPU_processor(%rbx),%eax
-        shl   $IRQSTAT_shift,%rax
-        leaq  irq_stat(%rip),%rcx
+        shll  $IRQSTAT_shift,%eax
+        leaq  irq_stat+IRQSTAT_softirq_pending(%rip),%rcx
         cmpl  $0,(%rcx,%rax,1)
         jne   process_softirqs
         testb $1,VCPU_mce_pending(%rbx)
@@ -663,7 +663,7 @@ handle_ist_exception:
         /* Send an IPI to ourselves to cover for the lack of event checking. */
         movl  VCPU_processor(%rbx),%eax
         shll  $IRQSTAT_shift,%eax
-        leaq  irq_stat(%rip),%rcx
+        leaq  irq_stat+IRQSTAT_softirq_pending(%rip),%rcx
         cmpl  $0,(%rcx,%rax,1)
         je    1f
         movl  $EVENT_CHECK_VECTOR,%edi

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
Andrew Cooper
2013-Mar-01 11:37 UTC
Re: [PATCH v2 1/2] x86: defer processing events on the NMI exit path
On 01/03/13 10:56, Jan Beulich wrote:
> Otherwise, we may end up in the scheduler, keeping NMIs masked for a
> possibly unbounded period of time (until whenever the next IRET gets
> executed). Enforce timely event processing by sending a self IPI.
>
> Of course it's open for discussion whether to always use the straight
> exit path from handle_ist_exception.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> ---
> v2: Send IPI to ourselves to enforce event processing. Dropped all ACKs.
>
> --- a/xen/arch/x86/x86_64/compat/entry.S
> +++ b/xen/arch/x86/x86_64/compat/entry.S
> @@ -171,7 +171,7 @@ compat_bad_hypercall:
>          jmp   compat_test_all_events
>
>  /* %rbx: struct vcpu, interrupts disabled */
> -compat_restore_all_guest:
> +ENTRY(compat_restore_all_guest)
>          ASSERT_INTERRUPTS_DISABLED
>          RESTORE_ALL adj=8 compat=1
> .Lft0:   iretq
> --- a/xen/arch/x86/x86_64/entry.S
> +++ b/xen/arch/x86/x86_64/entry.S
> @@ -11,6 +11,7 @@
>  #include <asm/apicdef.h>
>  #include <asm/page.h>
>  #include <public/xen.h>
> +#include <irq_vectors.h>
>
>          ALIGN
>  /* %rbx: struct vcpu */
> @@ -635,6 +636,9 @@ ENTRY(early_page_fault)
>          jmp   restore_all_xen
>          .popsection
>
> +ENTRY(nmi)
> +        pushq $0
> +        movl  $TRAP_nmi,4(%rsp)
> handle_ist_exception:
>          SAVE_ALL
>          testb $3,UREGS_cs(%rsp)
> @@ -649,12 +653,25 @@ handle_ist_exception:
>          movzbl UREGS_entry_vector(%rsp),%eax
>          leaq  exception_table(%rip),%rdx
>          callq *(%rdx,%rax,8)
> -        jmp   ret_from_intr
> +        cmpb  $TRAP_nmi,UREGS_entry_vector(%rsp)
> +        jne   ret_from_intr
>
> -ENTRY(nmi)
> -        pushq $0
> -        movl  $TRAP_nmi,4(%rsp)
> -        jmp   handle_ist_exception
> +        /* We want to get straight to the IRET on the NMI exit path. */
> +        testb $3,UREGS_cs(%rsp)
> +        jz    restore_all_xen
> +        GET_CURRENT(%rbx)
> +        /* Send an IPI to ourselves to cover for the lack of event checking. */
> +        movl  VCPU_processor(%rbx),%eax
> +        shll  $IRQSTAT_shift,%eax
> +        leaq  irq_stat(%rip),%rcx
> +        cmpl  $0,(%rcx,%rax,1)

__softirq_pending is an unsigned long. Would it not be prudent to use
cmpq to save obscure bugs if the implementation changes, or are we
sufficiently sure that this won't happen?

~Andrew

> +        je    1f
> +        movl  $EVENT_CHECK_VECTOR,%edi
> +        call  send_IPI_self
> +1:      movq  VCPU_domain(%rbx),%rax
> +        cmpb  $0,DOMAIN_is_32bit_pv(%rax)
> +        je    restore_all_guest
> +        jmp   compat_restore_all_guest
>
>  ENTRY(nmi_crash)
>          pushq $0
Jan Beulich
2013-Mar-01 11:53 UTC
Re: [PATCH v2 1/2] x86: defer processing events on the NMI exit path
>>> On 01.03.13 at 12:37, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> On 01/03/13 10:56, Jan Beulich wrote:
>> Otherwise, we may end up in the scheduler, keeping NMIs masked for a
>> possibly unbounded period of time (until whenever the next IRET gets
>> executed). Enforce timely event processing by sending a self IPI.
>>
>> Of course it's open for discussion whether to always use the straight
>> exit path from handle_ist_exception.
>>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>> ---
>> v2: Send IPI to ourselves to enforce event processing. Dropped all ACKs.
>>
>> --- a/xen/arch/x86/x86_64/compat/entry.S
>> +++ b/xen/arch/x86/x86_64/compat/entry.S
>> @@ -171,7 +171,7 @@ compat_bad_hypercall:
>>          jmp   compat_test_all_events
>>
>>  /* %rbx: struct vcpu, interrupts disabled */
>> -compat_restore_all_guest:
>> +ENTRY(compat_restore_all_guest)
>>          ASSERT_INTERRUPTS_DISABLED
>>          RESTORE_ALL adj=8 compat=1
>> .Lft0:   iretq
>> --- a/xen/arch/x86/x86_64/entry.S
>> +++ b/xen/arch/x86/x86_64/entry.S
>> @@ -11,6 +11,7 @@
>>  #include <asm/apicdef.h>
>>  #include <asm/page.h>
>>  #include <public/xen.h>
>> +#include <irq_vectors.h>
>>
>>          ALIGN
>>  /* %rbx: struct vcpu */
>> @@ -635,6 +636,9 @@ ENTRY(early_page_fault)
>>          jmp   restore_all_xen
>>          .popsection
>>
>> +ENTRY(nmi)
>> +        pushq $0
>> +        movl  $TRAP_nmi,4(%rsp)
>> handle_ist_exception:
>>          SAVE_ALL
>>          testb $3,UREGS_cs(%rsp)
>> @@ -649,12 +653,25 @@ handle_ist_exception:
>>          movzbl UREGS_entry_vector(%rsp),%eax
>>          leaq  exception_table(%rip),%rdx
>>          callq *(%rdx,%rax,8)
>> -        jmp   ret_from_intr
>> +        cmpb  $TRAP_nmi,UREGS_entry_vector(%rsp)
>> +        jne   ret_from_intr
>>
>> -ENTRY(nmi)
>> -        pushq $0
>> -        movl  $TRAP_nmi,4(%rsp)
>> -        jmp   handle_ist_exception
>> +        /* We want to get straight to the IRET on the NMI exit path. */
>> +        testb $3,UREGS_cs(%rsp)
>> +        jz    restore_all_xen
>> +        GET_CURRENT(%rbx)
>> +        /* Send an IPI to ourselves to cover for the lack of event checking. */
>> +        movl  VCPU_processor(%rbx),%eax
>> +        shll  $IRQSTAT_shift,%eax
>> +        leaq  irq_stat(%rip),%rcx
>> +        cmpl  $0,(%rcx,%rax,1)
>
> __softirq_pending is an unsigned long. Would it not be prudent to use
> cmpq to save obscure bugs if the implementation changes, or are we
> sufficiently sure that this won't happen?

All other existing instances of similar assembly code use testl or
cmpl, and in fact I merely copied some other instance.

As we're not getting close to 32, I think we might rather want to
adjust the __softirq_pending type. Keir?

Jan
Keir Fraser
2013-Mar-01 15:55 UTC
Re: [PATCH v2 0/2] x86: defer processing events on the NMI exit path
On 01/03/2013 10:49, "Jan Beulich" <JBeulich@suse.com> wrote:
> 1: defer processing events on the NMI exit path
> 2: don't rely on __softirq_pending to be the first field in irq_cpustat_t
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Nice. Acked-by: Keir Fraser <keir@xen.org>
Keir Fraser
2013-Mar-01 15:56 UTC
Re: [PATCH v2 1/2] x86: defer processing events on the NMI exit path
On 01/03/2013 11:53, "Jan Beulich" <JBeulich@suse.com> wrote:
>> __softirq_pending is an unsigned long. Would it not be prudent to use
>> cmpq to save obscure bugs if the implementation changes, or are we
>> sufficiently sure that this won't happen?
>
> All other existing instances of similar assembly code use testl or
> cmpl, and in fact I merely copied some other instance.
>
> As we're not getting close to 32, I think we might rather want to
> adjust the __softirq_pending type. Keir?

Yup, I don't see any reason why it couldn't be a uint32_t.

 -- Keir
Andrew Cooper
2013-Mar-01 16:01 UTC
Re: [PATCH v2 1/2] x86: defer processing events on the NMI exit path
On 01/03/13 15:56, Keir Fraser wrote:
> On 01/03/2013 11:53, "Jan Beulich" <JBeulich@suse.com> wrote:
>
>>> __softirq_pending is an unsigned long. Would it not be prudent to use
>>> cmpq to save obscure bugs if the implementation changes, or are we
>>> sufficiently sure that this won't happen?
>> All other existing instances of similar assembly code use testl or
>> cmpl, and in fact I merely copied some other instance.
>>
>> As we're not getting close to 32, I think we might rather want to
>> adjust the __softirq_pending type. Keir?
> Yup, I don't see any reason why it couldn't be a uint32_t.
>
> -- Keir

Further to that, is there any reason that it can't be per-cpu, to save
having information like this moving around pcpus in a hot cache line?

~Andrew
Jan Beulich
2013-Mar-01 16:08 UTC
Re: [PATCH v2 1/2] x86: defer processing events on the NMI exit path
>>> On 01.03.13 at 17:01, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> On 01/03/13 15:56, Keir Fraser wrote:
>> On 01/03/2013 11:53, "Jan Beulich" <JBeulich@suse.com> wrote:
>>
>>>> __softirq_pending is an unsigned long. Would it not be prudent to use
>>>> cmpq to save obscure bugs if the implementation changes, or are we
>>>> sufficiently sure that this won't happen?
>>> All other existing instances of similar assembly code use testl or
>>> cmpl, and in fact I merely copied some other instance.
>>>
>>> As we're not getting close to 32, I think we might rather want to
>>> adjust the __softirq_pending type. Keir?
>> Yup, I don't see any reason why it couldn't be a uint32_t.
>
> Further to that, is there any reason that it can't be per-cpu, to save
> having information like this moving around pcpus in a hot cache line?

The structure definition has __cacheline_aligned, there shouldn't be
much moving around (as long as our cache line size definition is still
suitable for modern CPUs' L1 caches at least).

Jan