Andrew Cooper
2012-Nov-22 15:00 UTC
[PATCH V3] vmx/nmi: Do not use self_nmi() in VMEXIT handler
The self_nmi() code causes an NMI to be triggered by sending an APIC
message to the local processor.  However, NMIs are blocked by the
VMEXIT, until the next iret or VMENTER.

Volume 3 Chapter 27 Section 1 of the Intel SDM states:

  An NMI causes subsequent NMIs to be blocked, but only after the VM
  exit completes.

As a result, as soon as the VMENTER happens, an immediate VMEXIT
happens because of the queued NMI.  We have seen hundreds of iterations
of this VMEXIT/VMENTER loop before the HVM guest resumes normal
operation.

Signed-off-by: Malcolm Crossley <malcolm.crossley@citrix.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

--
Changes since v2
 * Switch from 'int $2' to do_nmi()
 * Reworked commit message to more clearly explain the problem

diff -r 2489c2926698 -r d7ea938044ac xen/arch/x86/hvm/vmx/vmx.c
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -2269,6 +2269,14 @@ void vmx_vmexit_handler(struct cpu_user_
         vector = intr_info & INTR_INFO_VECTOR_MASK;
         if ( vector == TRAP_machine_check )
             do_machine_check(regs);
+        else if ( vector == TRAP_nmi &&
+                  ( (intr_info & INTR_INFO_INTR_TYPE_MASK) ==
+                    (X86_EVENTTYPE_NMI << 8) ) )
+            /* Must be called before interrupts are enabled to ensure
+             * the NMI handler code is run before the first IRET.  The
+             * IRET unblocks subsequent NMIs (Intel SDM Vol 3, 6.7.1)
+             */
+            do_nmi();
         break;
     case EXIT_REASON_MCE_DURING_VMENTRY:
         do_machine_check(regs);
@@ -2442,7 +2450,6 @@ void vmx_vmexit_handler(struct cpu_user_
              (X86_EVENTTYPE_NMI << 8) )
             goto exit_and_crash;
         HVMTRACE_0D(NMI);
-        self_nmi();  /* Real NMI, vector 2: normal processing. */
         break;
     case TRAP_machine_check:
         HVMTRACE_0D(MCE);
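[Editorial note: the VMEXIT/VMENTER loop described in the commit message can be modelled with a small simulation.  This is an illustrative sketch only, not Xen code -- every name in it is invented for the example.  It encodes the two rules from the SDM quote: an NMI VMEXIT leaves NMIs blocked, and IRET or VM entry unblocks them, at which point any latched NMI is delivered immediately.]

```c
#include <stdbool.h>

/* Minimal model of the blocking behaviour described above.
 * All names are illustrative; this is not Xen code. */
struct cpu {
    bool nmi_blocked;   /* set by an NMI VMEXIT, cleared by IRET / VM entry */
    bool nmi_pending;   /* an NMI latched at the local APIC */
    int  exits;         /* number of NMI VMEXITs taken so far */
};

static void self_nmi(struct cpu *c) { c->nmi_pending = true; }

/* VM entry unblocks NMIs; a pending NMI then fires before the guest
 * executes a single instruction, causing an immediate new VMEXIT. */
static bool vmenter(struct cpu *c)
{
    c->nmi_blocked = false;
    if (c->nmi_pending) {
        c->nmi_pending = false;
        c->nmi_blocked = true;   /* the new VMEXIT blocks NMIs again */
        c->exits++;
        return false;            /* back in the VMEXIT handler */
    }
    return true;                 /* guest actually runs */
}

/* Old behaviour: re-raise the NMI at ourselves.  It stays latched,
 * because NMIs are still blocked, and only fires at the next VM entry. */
static void handler_old(struct cpu *c) { self_nmi(c); }

/* Patched behaviour: call the NMI handler directly; nothing is left
 * pending for the next VM entry. */
static void handler_new(struct cpu *c) { (void)c; /* do_nmi() runs here */ }

/* Count NMI VMEXITs taken before the guest makes forward progress,
 * capped so the old behaviour's loop terminates in the model. */
static int exits_until_guest_runs(void (*handler)(struct cpu *), int cap)
{
    struct cpu c = { .nmi_blocked = true, .nmi_pending = false, .exits = 1 };
    while (c.exits < cap) {
        handler(&c);
        if (vmenter(&c))
            break;
    }
    return c.exits;
}
```

With `handler_new` the guest resumes after the single original VMEXIT; with `handler_old` the model spins until the cap, mirroring the "hundreds of iterations" observed in practice.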
Jan Beulich
2012-Nov-22 15:15 UTC
Re: [PATCH V3] vmx/nmi: Do not use self_nmi() in VMEXIT handler
>>> On 22.11.12 at 16:00, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> The self_nmi() code causes an NMI to be triggered by sending an APIC
> message to the local processor.  However, NMIs are blocked by the
> VMEXIT, until the next iret or VMENTER.
[...]
> +        else if ( vector == TRAP_nmi &&
> +                  ( (intr_info & INTR_INFO_INTR_TYPE_MASK) ==
> +                    (X86_EVENTTYPE_NMI << 8) ) )
> +            /* Must be called before interrupts are enabled to ensure
> +             * the NMI handler code is run before the first IRET.  The
> +             * IRET unblocks subsequent NMIs (Intel SDM Vol 3, 6.7.1)
> +             */
> +            do_nmi();

But that's only half of it, at least as far as I recall the outcome of
the discussion: You want an IRET (or VMRESUME) before possibly going
into the scheduler (and hence not back to the current VM).  And the
same also on the PV exit path from an NMI.

Jan
Andrew Cooper
2012-Nov-22 15:16 UTC
Re: [PATCH V3] vmx/nmi: Do not use self_nmi() in VMEXIT handler
On 22/11/12 15:15, Jan Beulich wrote:
> But that's only half of it, at least as far as I recall the outcome of
> the discussion: You want an IRET (or VMRESUME) before possibly going
> into the scheduler (and hence not back to the current VM).  And the
> same also on the PV exit path from an NMI.
>
> Jan

When I read this codepath, there seemed to be no consideration for
re-scheduling.  I will double check, but I think the VMRESUME is
unconditional if the VMEXIT reason was a real NMI.

~Andrew

--
Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
T: +44 (0)1223 225 900, http://www.citrix.com
Jan Beulich
2012-Nov-22 15:21 UTC
Re: [PATCH V3] vmx/nmi: Do not use self_nmi() in VMEXIT handler
>>> On 22.11.12 at 16:16, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> When I read this codepath, there seemed to be no consideration for
> re-scheduling.  I will double check, but I think the VMRESUME is
> unconditional if the VMEXIT reason was a real NMI.

Interesting - I see nothing NMI-related in vmx/entry.S, i.e. it is my
understanding that just like in any other case you may end up calling
do_softirq on the way out from handling the NMI.

Jan
Mats Petersson
2012-Nov-22 15:22 UTC
Re: [PATCH V3] vmx/nmi: Do not use self_nmi() in VMEXIT handler
On 22/11/12 15:16, Andrew Cooper wrote:
> When I read this codepath, there seemed to be no consideration for
> re-scheduling.  I will double check, but I think the VMRESUME is
> unconditional if the VMEXIT reason was a real NMI.

Technically, we could probably get a timer interrupt which causes a
scheduling event, between the NMI and the VMEXIT.  However, that timer
interrupt will have an IRET at the end of it, so it fixes itself.
Andrew Cooper
2012-Nov-22 15:37 UTC
Re: [PATCH V3] vmx/nmi: Do not use self_nmi() in VMEXIT handler
On 22/11/12 15:21, Jan Beulich wrote:
> Interesting - I see nothing NMI-related in vmx/entry.S, i.e. it is my
> understanding that just like in any other case you may end up calling
> do_softirq on the way out from handling the NMI.
>
> Jan

Ah yes - vmx_asm_do_vmentry does have the potential to process
softirqs, which we will want to do with NMIs enabled.  I missed that
last time I looked.

So we need to deliberately execute an iretd.  int $2 is unusable
because it adds an extra dimension to the fake NMI corruption issue if
you get the combination int $2 -> MCE -> NMI before the int $2 has
correctly saved its exception frame.

A quick solution would be to execute a noop function with
run_in_exception_handler().  Alternatively, I can code enable_nmi() or
so which does an inline iret to itself.  Which of these would you
prefer?

--
Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
T: +44 (0)1223 225 900, http://www.citrix.com
Jan Beulich
2012-Nov-22 15:55 UTC
Re: [PATCH V3] vmx/nmi: Do not use self_nmi() in VMEXIT handler
>>> On 22.11.12 at 16:37, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> A quick solution would be to execute a noop function with
> run_in_exception_handler().  Alternatively, I can code enable_nmi() or
> so which does an inline iret to itself.  Which of these would you
> prefer?

I would actually consider avoiding to run softirqs altogether in that
case, just like native Linux doesn't do any event or scheduler
processing in that case.

Jan
Jan Beulich
2012-Nov-22 16:00 UTC
Re: [PATCH V3] vmx/nmi: Do not use self_nmi() in VMEXIT handler
>>> On 22.11.12 at 16:22, Mats Petersson <mats.petersson@citrix.com> wrote:
> Technically, we could probably get a timer interrupt which causes a
> scheduling event, between the NMI and the VMEXIT.  However, that timer
> interrupt will have an IRET at the end of it, so it fixes itself.

Not exactly - such a timer interrupt wouldn't cause any scheduling to
happen (this and other softirqs get run only when exiting to a guest,
or from an idle vCPU); when exiting to the hypervisor nothing like that
happens (and it would be catastrophic if it did).

Furthermore, interrupts should _never_ get enabled while handling an
NMI.

Jan
Andrew Cooper
2012-Nov-22 16:05 UTC
Re: [PATCH V3] vmx/nmi: Do not use self_nmi() in VMEXIT handler
On 22/11/12 15:55, Jan Beulich wrote:
> I would actually consider avoiding to run softirqs altogether in that
> case, just like native Linux doesn't do any event or scheduler
> processing in that case.
>
> Jan

That would probably be the easiest solution.

I was already going to do the same for the reentrant NMI and MCE
handling case (and also the process pending upcalls checking), due to
the complexities of fixing the race condition at the end of the
handler.

Unfortunately, I don't think I have time to look at this issue
immediately, but if it is ok to wait till the beginning of next week, I
can roll it into the main dev work for the other NMI/MCE work.

--
Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
T: +44 (0)1223 225 900, http://www.citrix.com
Jan Beulich
2012-Nov-22 16:12 UTC
Re: [PATCH V3] vmx/nmi: Do not use self_nmi() in VMEXIT handler
>>> On 22.11.12 at 17:05, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> That would probably be the easiest solution.
[...]
> Unfortunately, I don't think I have time to look at this issue
> immediately, but if it is ok to wait till the beginning of next week, I

That's fine of course.

> can roll it into the main dev work for the other NMI/MCE work.

Hmm, that would mean it's going to be more difficult to backport.  I'd
prefer having this in as simple a fashion as possible in the patch
here, and then have your other patch(es) build on top of that.

Jan
Andrew Cooper
2012-Nov-22 16:31 UTC
Re: [PATCH V3] vmx/nmi: Do not use self_nmi() in VMEXIT handler
On 22/11/12 16:12, Jan Beulich wrote:
> Hmm, that would mean it's going to be more difficult to backport.  I'd
> prefer having this in as simple a fashion as possible in the patch
> here, and then have your other patch(es) build on top of that.
>
> Jan

I will ensure that the fix for this is suitably separate in the patch
series, and in the easiest backportable form.

--
Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
T: +44 (0)1223 225 900, http://www.citrix.com
Tim Deegan
2012-Nov-22 17:34 UTC
Re: [PATCH V3] vmx/nmi: Do not use self_nmi() in VMEXIT handler
At 15:00 +0000 on 22 Nov (1353596446), Andrew Cooper wrote:
> diff -r 2489c2926698 -r d7ea938044ac xen/arch/x86/hvm/vmx/vmx.c
> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -2269,6 +2269,14 @@ void vmx_vmexit_handler(struct cpu_user_
>          vector = intr_info & INTR_INFO_VECTOR_MASK;
>          if ( vector == TRAP_machine_check )
>              do_machine_check(regs);
> +        else if ( vector == TRAP_nmi &&
> +                  ( (intr_info & INTR_INFO_INTR_TYPE_MASK) ==
> +                    (X86_EVENTTYPE_NMI << 8) ) )
> +            /* Must be called before interrupts are enabled to ensure
> +             * the NMI handler code is run before the first IRET.  The
> +             * IRET unblocks subsequent NMIs (Intel SDM Vol 3, 6.7.1)
> +             */
> +            do_nmi();
>          break;
>      case EXIT_REASON_MCE_DURING_VMENTRY:
>          do_machine_check(regs);

I think it might also make sense to move the HVMTRACE invocations that
are just above here down to after this block.  There's quite a lot of
code behind there and though I don't see any potential faults, they
might well get added later (including in places like printk() and
tasklet_schedule()).

I had a look at the rest of the code that runs between the vmexit and
here, and it looks OK to me.

George, would that make the tracing more confusing?

Tim.
George Dunlap
2012-Nov-26 11:50 UTC
Re: [PATCH V3] vmx/nmi: Do not use self_nmi() in VMEXIT handler
On 22/11/12 17:34, Tim Deegan wrote:
> I think it might also make sense to move the HVMTRACE invocations that
> are just above here down to after this block.
[...]
> George, would that make the tracing more confusing?

Well, it would mildly, because you'd get the trace in vmx_do_extint()
before the trace for the VMEXIT that caused it. :-)

If it's important for correctness, then xenalyze will just have to
deal.  But it would be a lot nicer not to have to deal.

 -George
Jan Beulich
2013-Feb-28 09:58 UTC
Re: [PATCH V3] vmx/nmi: Do not use self_nmi() in VMEXIT handler
>>> On 22.11.12 at 17:12, Jan Beulich wrote:
>>>> On 22.11.12 at 17:05, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>> Unfortunately, I don't think I have time to look at this issue
>> immediately, but if it is ok to wait till the beginning of next week, I
>
> That's fine of course.

So that was 3 months ago, and we're likely going to need to do further
unfixed releases if this won't move forward.  Can you estimate whether
you'll be able to get back to this?

Jan
Andrew Cooper
2013-Feb-28 12:32 UTC
Re: [PATCH V3] vmx/nmi: Do not use self_nmi() in VMEXIT handler
On 28/02/13 09:58, Jan Beulich wrote:
> So that was 3 months ago, and we're likely going to need to do further
> unfixed releases if this won't move forward.  Can you estimate whether
> you'll be able to get back to this?
>
> Jan

With the new enable_nmi() function, I can spin a workaround patch quite
quickly, and will try to get that done this week.

The remainder of the reentrant tangle is still happening very slowly.

~Andrew
Tim Deegan
2013-Feb-28 13:00 UTC
Re: [PATCH V3] vmx/nmi: Do not use self_nmi() in VMEXIT handler
At 09:58 +0000 on 28 Feb (1362045494), Jan Beulich wrote:
> So that was 3 months ago, and we're likely going to need to do further
> unfixed releases if this won't move forward.  Can you estimate whether
> you'll be able to get back to this?

Let's make a step in the right direction before 4.3, at least:

commit d278beed1df2d226911dce92295411018c9bba2f
Author: Tim Deegan <tim@xen.org>
Date:   Thu Feb 28 12:42:15 2013 +0000

    vmx: handle NMIs before re-enabling interrupts.

    Also, switch to calling do_nmi() and explicitly re-enabling NMIs
    rather than raising a fake NMI and relying on the NMI processing to
    IRET, since that handling code is likely to change a fair amount in
    future.

    Signed-off-by: Tim Deegan <tim@xen.org>

diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 5378928..462bb0f 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -2313,6 +2313,13 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
         vector = intr_info & INTR_INFO_VECTOR_MASK;
         if ( vector == TRAP_machine_check )
             do_machine_check(regs);
+        if ( vector == TRAP_machine_check
+             && ((intr_info & INTR_INFO_INTR_TYPE_MASK) ==
+                 (X86_EVENTTYPE_NMI << 8)) )
+        {
+            do_nmi(regs);
+            enable_nmis();
+        }
         break;
     case EXIT_REASON_MCE_DURING_VMENTRY:
         do_machine_check(regs);
@@ -2486,7 +2493,7 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
              (X86_EVENTTYPE_NMI << 8) )
             goto exit_and_crash;
         HVMTRACE_0D(NMI);
-        self_nmi();  /* Real NMI, vector 2: normal processing. */
+        /* Already handled above. */
         break;
     case TRAP_machine_check:
         HVMTRACE_0D(MCE);
Andrew Cooper
2013-Feb-28 13:12 UTC
Re: [PATCH V3] vmx/nmi: Do not use self_nmi() in VMEXIT handler
On 28/02/13 13:00, Tim Deegan wrote:> At 09:58 +0000 on 28 Feb (1362045494), Jan Beulich wrote: >>>>> On 22.11.12 at 17:12, Jan Beulich wrote: >>>>>> On 22.11.12 at 17:05, Andrew Cooper <andrew.cooper3@citrix.com> wrote: >>>> On 22/11/12 15:55, Jan Beulich wrote: >>>>>>>> On 22.11.12 at 16:37, Andrew Cooper <andrew.cooper3@citrix.com> wrote: >>>>>> A quick solution would be to execute a noop function with >>>>>> run_in_exception_handler(). Alternatively, I can code enable_nmi() or >>>>>> so which does an inline iret to itself. Which of these would you prefer? >>>>> I would actually consider avoiding to run softirqs altogether in that >>>>> case, just like native Linux doesn''t do any event or scheduler >>>>> processing in that case. >>>> That would probably be the easiest solution. >>>> >>>> I was already going to do the same for the rentrant NMI and MCE handling >>>> case (and also the process pending upcalls checking), due to the >>>> complexities of fixing the race condition at the end of the handler. >>>> >>>> Unfortunately, I don''t think I have time to look at this issue >>>> immediately, but if it is ok to wait till the beginning of next week, I >>> That''s fine of course. >> So that was 3 months ago, and we''re likely going to need to do >> further unfixed releases if this won''t move forward. Can you >> estimate whether you''ll be able to get back to this? > Let''s make a step in the right direction before 4.3, at least: > > commit d278beed1df2d226911dce92295411018c9bba2f > Author: Tim Deegan <tim@xen.org> > Date: Thu Feb 28 12:42:15 2013 +0000 > > vmx: handle NMIs before re-enabling interrupts. > > Also, switch to calling do_nmi() and explicitly re-enabling NMIs > rather than raising a fake NMI and relying on the NMI processing to > IRET, since that handling code is likely to change a fair amount in > future. 
>
> Signed-off-by: Tim Deegan <tim@xen.org>

Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>

As this is functionally equivalent to the patch I was just testing.

> diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
> index 5378928..462bb0f 100644
> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -2313,6 +2313,13 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
>          vector = intr_info & INTR_INFO_VECTOR_MASK;
>          if ( vector == TRAP_machine_check )
>              do_machine_check(regs);
> +        if ( vector == TRAP_machine_check
> +             && ((intr_info & INTR_INFO_INTR_TYPE_MASK) ==
> +                 (X86_EVENTTYPE_NMI << 8)) )
> +        {
> +            do_nmi(regs);
> +            enable_nmis();
> +        }
>          break;
>      case EXIT_REASON_MCE_DURING_VMENTRY:
>          do_machine_check(regs);
> @@ -2486,7 +2493,7 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
>               (X86_EVENTTYPE_NMI << 8) )
>              goto exit_and_crash;
>          HVMTRACE_0D(NMI);
> -        self_nmi(); /* Real NMI, vector 2: normal processing. */
> +        /* Already handled above. */
>          break;
>      case TRAP_machine_check:
>          HVMTRACE_0D(MCE);
Jan Beulich
2013-Feb-28 13:39 UTC
Re: [PATCH V3] vmx/nmi: Do not use self_nmi() in VMEXIT handler
>>> On 28.02.13 at 14:00, Tim Deegan <tim@xen.org> wrote:
> commit d278beed1df2d226911dce92295411018c9bba2f
> Author: Tim Deegan <tim@xen.org>
> Date:   Thu Feb 28 12:42:15 2013 +0000
>
>     vmx: handle NMIs before re-enabling interrupts.
>
>     Also, switch to calling do_nmi() and explicitly re-enabling NMIs
>     rather than raising a fake NMI and relying on the NMI processing to
>     IRET, since that handling code is likely to change a fair amount in
>     future.

Isn't that a gross understatement, considering that you or Malcolm
had found that the use of self_nmi() here was actively broken?

> Signed-off-by: Tim Deegan <tim@xen.org>

Acked-by: Jan Beulich <jbeulich@suse.com>

(realizing that dealing with the PV side of the issue will be left to
me in the end)
Jan Beulich
2013-Feb-28 13:42 UTC
Re: [PATCH V3] vmx/nmi: Do not use self_nmi() in VMEXIT handler
>>> On 28.02.13 at 14:00, Tim Deegan <tim@xen.org> wrote:
>     vmx: handle NMIs before re-enabling interrupts.
>
>     Also, switch to calling do_nmi() and explicitly re-enabling NMIs
>     rather than raising a fake NMI and relying on the NMI processing to
>     IRET, since that handling code is likely to change a fair amount in
>     future.
>
> Signed-off-by: Tim Deegan <tim@xen.org>

Sorry, my Acked-by just sent requires one change to be done first:

> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -2313,6 +2313,13 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
>          vector = intr_info & INTR_INFO_VECTOR_MASK;
>          if ( vector == TRAP_machine_check )
>              do_machine_check(regs);
> +        if ( vector == TRAP_machine_check

This almost certainly needs to be TRAP_nmi.

Jan

> +             && ((intr_info & INTR_INFO_INTR_TYPE_MASK) ==
> +                 (X86_EVENTTYPE_NMI << 8)) )
> +        {
> +            do_nmi(regs);
> +            enable_nmis();
> +        }
>          break;
>      case EXIT_REASON_MCE_DURING_VMENTRY:
>          do_machine_check(regs);
> @@ -2486,7 +2493,7 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
>               (X86_EVENTTYPE_NMI << 8) )
>              goto exit_and_crash;
>          HVMTRACE_0D(NMI);
> -        self_nmi(); /* Real NMI, vector 2: normal processing. */
> +        /* Already handled above. */
>          break;
>      case TRAP_machine_check:
>          HVMTRACE_0D(MCE);
Tim Deegan
2013-Feb-28 14:04 UTC
Re: [PATCH V3] vmx/nmi: Do not use self_nmi() in VMEXIT handler
At 13:42 +0000 on 28 Feb (1362058925), Jan Beulich wrote:
> >>> On 28.02.13 at 14:00, Tim Deegan <tim@xen.org> wrote:
> > vmx: handle NMIs before re-enabling interrupts.
> >
> > Also, switch to calling do_nmi() and explicitly re-enabling NMIs
> > rather than raising a fake NMI and relying on the NMI processing to
> > IRET, since that handling code is likely to change a fair amount in
> > future.
> >
> > Signed-off-by: Tim Deegan <tim@xen.org>
>
> Sorry, my Acked-by just sent requires one change to be done first:
>
> > --- a/xen/arch/x86/hvm/vmx/vmx.c
> > +++ b/xen/arch/x86/hvm/vmx/vmx.c
> > @@ -2313,6 +2313,13 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
> >          vector = intr_info & INTR_INFO_VECTOR_MASK;
> >          if ( vector == TRAP_machine_check )
> >              do_machine_check(regs);
> > +        if ( vector == TRAP_machine_check
>
> This almost certainly needs to be TRAP_nmi.

! I was foolishly using the watchdog for testing, and of course losing
NMIs doesn't cause the watchdog to trigger. :|

Here's v2, with that fixed and updating the description:

commit 7dd3b06ff031c9a8c727df16c5def2afb382101c
Author: Tim Deegan <tim@xen.org>
Date:   Thu Feb 28 12:42:15 2013 +0000

    vmx: fix handling of NMI VMEXIT.

    Call do_nmi() directly and explicitly re-enable NMIs rather than
    raising an NMI through the APIC.  Since NMIs are disabled after the
    VMEXIT, the raised NMI would be blocked until the next IRET
    instruction (i.e. the next real interrupt, or after scheduling a PV
    guest) and in the meantime the guest will spin taking NMI VMEXITS.

    Also, handle NMIs before re-enabling interrupts, since if we handle
    an interrupt (and therefore IRET) before calling do_nmi(), we may
    end up running the NMI handler with NMIs enabled.
Signed-off-by: Tim Deegan <tim@xen.org>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>

diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 5378928..04dbefb 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -2313,6 +2313,13 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
         vector = intr_info & INTR_INFO_VECTOR_MASK;
         if ( vector == TRAP_machine_check )
             do_machine_check(regs);
+        if ( vector == TRAP_nmi
+             && ((intr_info & INTR_INFO_INTR_TYPE_MASK) ==
+                 (X86_EVENTTYPE_NMI << 8)) )
+        {
+            do_nmi(regs);
+            enable_nmis();
+        }
         break;
     case EXIT_REASON_MCE_DURING_VMENTRY:
         do_machine_check(regs);
@@ -2486,7 +2493,7 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
              (X86_EVENTTYPE_NMI << 8) )
             goto exit_and_crash;
         HVMTRACE_0D(NMI);
-        self_nmi(); /* Real NMI, vector 2: normal processing. */
+        /* Already handled above. */
         break;
     case TRAP_machine_check:
         HVMTRACE_0D(MCE);
Tim Deegan
2013-Feb-28 14:25 UTC
Re: [PATCH V3] vmx/nmi: Do not use self_nmi() in VMEXIT handler
At 13:39 +0000 on 28 Feb (1362058773), Jan Beulich wrote:
> >>> On 28.02.13 at 14:00, Tim Deegan <tim@xen.org> wrote:
> > commit d278beed1df2d226911dce92295411018c9bba2f
> > Author: Tim Deegan <tim@xen.org>
> > Date:   Thu Feb 28 12:42:15 2013 +0000
> >
> >     vmx: handle NMIs before re-enabling interrupts.
> >
> >     Also, switch to calling do_nmi() and explicitly re-enabling NMIs
> >     rather than raising a fake NMI and relying on the NMI processing to
> >     IRET, since that handling code is likely to change a fair amount in
> >     future.
>
> Isn't that a gross understatement, considering that you or Malcolm
> had found that the use of self_nmi() here was actively broken?
>
> > Signed-off-by: Tim Deegan <tim@xen.org>
>
> Acked-by: Jan Beulich <jbeulich@suse.com>
>
> (realizing that dealing with the PV side of the issue will be left to
> me in the end)

For PV, would you be happy with something like this, or do you want to
avoid the extra IRET in cases where we would be returning with IRET
anyway?

diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 04dbefb..598a5b5 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -2316,10 +2316,7 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
         if ( vector == TRAP_nmi
              && ((intr_info & INTR_INFO_INTR_TYPE_MASK) ==
                  (X86_EVENTTYPE_NMI << 8)) )
-        {
             do_nmi(regs);
-            enable_nmis();
-        }
         break;
     case EXIT_REASON_MCE_DURING_VMENTRY:
         do_machine_check(regs);
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index d36eddd..2a66080 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -3211,7 +3211,7 @@ void do_nmi(struct cpu_user_regs *regs)
     ++nmi_count(cpu);
 
     if ( nmi_callback(regs, cpu) )
-        return;
+        goto out;
 
     if ( nmi_watchdog )
         nmi_watchdog_tick(regs);
@@ -3227,6 +3227,9 @@ void do_nmi(struct cpu_user_regs *regs)
         if ( !(reason & 0xc0) && !nmi_watchdog )
             unknown_nmi_error(regs, reason);
     }
+
+out:
+    enable_nmis();
 }
 
 void set_nmi_callback(nmi_callback_t callback)
Jan Beulich
2013-Feb-28 14:42 UTC
Re: [PATCH V3] vmx/nmi: Do not use self_nmi() in VMEXIT handler
>>> On 28.02.13 at 15:25, Tim Deegan <tim@xen.org> wrote:
> At 13:39 +0000 on 28 Feb (1362058773), Jan Beulich wrote:
>> (realizing that dealing with the PV side of the issue will be left to
>> me in the end)
>
> For PV, would you be happy with something like this, or do you want to
> avoid the extra IRET in cases where we would be returning with IRET
> anyway?

No, and not so much because of the redundant IRET, but because ...

> --- a/xen/arch/x86/traps.c
> +++ b/xen/arch/x86/traps.c
> @@ -3211,7 +3211,7 @@ void do_nmi(struct cpu_user_regs *regs)
>      ++nmi_count(cpu);
>  
>      if ( nmi_callback(regs, cpu) )
> -        return;
> +        goto out;
>  
>      if ( nmi_watchdog )
>          nmi_watchdog_tick(regs);
> @@ -3227,6 +3227,9 @@ void do_nmi(struct cpu_user_regs *regs)
>          if ( !(reason & 0xc0) && !nmi_watchdog )
>              unknown_nmi_error(regs, reason);
>      }
> +
> +out:
> +    enable_nmis();

... this must not be done when on the NMI stack (i.e. when the NMI was
raised while in hypervisor context). Checking for this here would be
strait forward, but I was really considering to do all of this in the
assembly exit path, and I was still undecided whether we shouldn't
follow Linux in skipping softirq processing (and hence scheduling) on
the way out from an NMI (I don't think we'd need to do the same for
MCE).

Jan

> }
>
> void set_nmi_callback(nmi_callback_t callback)
Andrew Cooper
2013-Feb-28 14:45 UTC
Re: [PATCH V3] vmx/nmi: Do not use self_nmi() in VMEXIT handler
On 28/02/13 14:42, Jan Beulich wrote:
>>>> On 28.02.13 at 15:25, Tim Deegan <tim@xen.org> wrote:
>> At 13:39 +0000 on 28 Feb (1362058773), Jan Beulich wrote:
>>> (realizing that dealing with the PV side of the issue will be left to
>>> me in the end)
>> For PV, would you be happy with something like this, or do you want to
>> avoid the extra IRET in cases where we would be returning with IRET
>> anyway?
> No, and not so much because of the redundant IRET, but
> because ...
>
>> --- a/xen/arch/x86/traps.c
>> +++ b/xen/arch/x86/traps.c
>> @@ -3211,7 +3211,7 @@ void do_nmi(struct cpu_user_regs *regs)
>>      ++nmi_count(cpu);
>>  
>>      if ( nmi_callback(regs, cpu) )
>> -        return;
>> +        goto out;
>>  
>>      if ( nmi_watchdog )
>>          nmi_watchdog_tick(regs);
>> @@ -3227,6 +3227,9 @@ void do_nmi(struct cpu_user_regs *regs)
>>          if ( !(reason & 0xc0) && !nmi_watchdog )
>>              unknown_nmi_error(regs, reason);
>>      }
>> +
>> +out:
>> +    enable_nmis();
> ... this must not be done when on the NMI stack (i.e. when the
> NMI was raised while in hypervisor context). Checking for this
> here would be strait forward, but I was really considering to do
> all of this in the assembly exit path, and I was still undecided
> whether we shouldn't follow Linux in skipping softirq processing
> (and hence scheduling) on the way out from an NMI (I don't
> think we'd need to do the same for MCE).
>
> Jan

It is furthermore pointless. If we interrupted a guest with the NMI, we
would have moved onto the main stack. We would only be on the NMI stack
at this point if we interrupted Xen with the NMI, at which point we will
be iret'ing back anyway.

~Andrew

>> }
>>
>> void set_nmi_callback(nmi_callback_t callback)
Tim Deegan
2013-Feb-28 14:49 UTC
Re: [PATCH V3] vmx/nmi: Do not use self_nmi() in VMEXIT handler
At 14:42 +0000 on 28 Feb (1362062542), Jan Beulich wrote:
> > +out:
> > +    enable_nmis();
>
> ... this must not be done when on the NMI stack (i.e. when the
> NMI was raised while in hypervisor context).

Right; I had forgotten that we didn't switch stacks in that case.

> Checking for this
> here would be strait forward, but I was really considering to do
> all of this in the assembly exit path, and I was still undecided
> whether we shouldn't follow Linux in skipping softirq processing
> (and hence scheduling) on the way out from an NMI (I don't
> think we'd need to do the same for MCE).

That sounds sensible, if it simplifies the logic (i.e. if it avoids
needing an extra IRET just to re-enable NMIs when we might as well IRET
to the guest). Or is there any other reason to avoid it?

Tim.
Konrad Rzeszutek Wilk
2013-Feb-28 14:51 UTC
Re: [PATCH V3] vmx/nmi: Do not use self_nmi() in VMEXIT handler
On Thu, Feb 28, 2013 at 01:00:08PM +0000, Tim Deegan wrote:
> At 09:58 +0000 on 28 Feb (1362045494), Jan Beulich wrote:
> > >>> On 22.11.12 at 17:12, Jan Beulich wrote:
> > >>>> On 22.11.12 at 17:05, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> > > > On 22/11/12 15:55, Jan Beulich wrote:
> > > >>>>> On 22.11.12 at 16:37, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> > > >>> A quick solution would be to execute a noop function with
> > > >>> run_in_exception_handler(). Alternatively, I can code enable_nmi() or
> > > >>> so which does an inline iret to itself. Which of these would you prefer?
> > > >> I would actually consider avoiding to run softirqs altogether in that
> > > >> case, just like native Linux doesn't do any event or scheduler
> > > >> processing in that case.
> > > > That would probably be the easiest solution.
> > > >
> > > > I was already going to do the same for the rentrant NMI and MCE handling
> > > > case (and also the process pending upcalls checking), due to the
> > > > complexities of fixing the race condition at the end of the handler.
> > > >
> > > > Unfortunately, I don't think I have time to look at this issue
> > > > immediately, but if it is ok to wait till the beginning of next week, I
> > > That's fine of course.
> >
> > So that was 3 months ago, and we're likely going to need to do
> > further unfixed releases if this won't move forward. Can you
> > estimate whether you'll be able to get back to this?
>
> Let's make a step in the right direction before 4.3, at least:

FYI, this also fixes oprofile being able to collect HVM guest data.

> commit d278beed1df2d226911dce92295411018c9bba2f
> Author: Tim Deegan <tim@xen.org>
> Date:   Thu Feb 28 12:42:15 2013 +0000
>
>     vmx: handle NMIs before re-enabling interrupts.
>
>     Also, switch to calling do_nmi() and explicitly re-enabling NMIs
>     rather than raising a fake NMI and relying on the NMI processing to
>     IRET, since that handling code is likely to change a fair amount in
>     future.
>
> Signed-off-by: Tim Deegan <tim@xen.org>
>
> diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
> index 5378928..462bb0f 100644
> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -2313,6 +2313,13 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
>          vector = intr_info & INTR_INFO_VECTOR_MASK;
>          if ( vector == TRAP_machine_check )
>              do_machine_check(regs);
> +        if ( vector == TRAP_machine_check
> +             && ((intr_info & INTR_INFO_INTR_TYPE_MASK) ==
> +                 (X86_EVENTTYPE_NMI << 8)) )
> +        {
> +            do_nmi(regs);
> +            enable_nmis();
> +        }
>          break;
>      case EXIT_REASON_MCE_DURING_VMENTRY:
>          do_machine_check(regs);
> @@ -2486,7 +2493,7 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
>               (X86_EVENTTYPE_NMI << 8) )
>              goto exit_and_crash;
>          HVMTRACE_0D(NMI);
> -        self_nmi(); /* Real NMI, vector 2: normal processing. */
> +        /* Already handled above. */
>          break;
>      case TRAP_machine_check:
>          HVMTRACE_0D(MCE);
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel
Jan Beulich
2013-Feb-28 15:01 UTC
Re: [PATCH V3] vmx/nmi: Do not use self_nmi() in VMEXIT handler
>>> On 28.02.13 at 15:49, Tim Deegan <tim@xen.org> wrote:
> At 14:42 +0000 on 28 Feb (1362062542), Jan Beulich wrote:
>> Checking for this
>> here would be strait forward, but I was really considering to do
>> all of this in the assembly exit path, and I was still undecided
>> whether we shouldn't follow Linux in skipping softirq processing
>> (and hence scheduling) on the way out from an NMI (I don't
>> think we'd need to do the same for MCE).
>
> That sounds sensible, if it simplifies the logic (i.e. if it avoids
> needing an extra IRET just to re-enable NMIs when we might as well IRET
> to the guest). Or is there any other reason to avoid it?

No, I don't think so. We only need to make sure we don't execute one
as long as we're on the NMI stack.

Jan
Jan Beulich
2013-Feb-28 15:41 UTC
Re: [PATCH V3] vmx/nmi: Do not use self_nmi() in VMEXIT handler
>>> On 28.02.13 at 15:42, "Jan Beulich" <JBeulich@suse.com> wrote:
> ... this must not be done when on the NMI stack (i.e. when the
> NMI was raised while in hypervisor context). Checking for this
> here would be strait forward, but I was really considering to do
> all of this in the assembly exit path, and I was still undecided
> whether we shouldn't follow Linux in skipping softirq processing
> (and hence scheduling) on the way out from an NMI (I don't
> think we'd need to do the same for MCE).

Like this:

x86: skip processing events on the NMI exit path

Otherwise, we may end up in the scheduler, keeping NMIs masked for a
possibly unbounded time (until whenever the next IRET gets executed).

Of course it's open for discussion whether to always use the strait
exit path from handle_ist_exception.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/x86_64/compat/entry.S
+++ b/xen/arch/x86/x86_64/compat/entry.S
@@ -171,7 +171,7 @@ compat_bad_hypercall:
         jmp   compat_test_all_events
 
 /* %rbx: struct vcpu, interrupts disabled */
-compat_restore_all_guest:
+ENTRY(compat_restore_all_guest)
         ASSERT_INTERRUPTS_DISABLED
         RESTORE_ALL adj=8 compat=1
 .Lft0:  iretq
--- a/xen/arch/x86/x86_64/entry.S
+++ b/xen/arch/x86/x86_64/entry.S
@@ -635,6 +635,9 @@ ENTRY(early_page_fault)
         jmp   restore_all_xen
         .popsection
 
+ENTRY(nmi)
+        pushq $0
+        movl  $TRAP_nmi,4(%rsp)
 handle_ist_exception:
         SAVE_ALL
         testb $3,UREGS_cs(%rsp)
@@ -649,12 +652,17 @@ handle_ist_exception:
         movzbl UREGS_entry_vector(%rsp),%eax
         leaq  exception_table(%rip),%rdx
         callq *(%rdx,%rax,8)
-        jmp   ret_from_intr
+        cmpb  $TRAP_nmi,UREGS_entry_vector(%rsp)
+        jne   ret_from_intr
 
-ENTRY(nmi)
-        pushq $0
-        movl  $TRAP_nmi,4(%rsp)
-        jmp   handle_ist_exception
+        /* We want to get strait to the IRET in the NMI exit path. */
+        testb $3,UREGS_cs(%rsp)
+        GET_CURRENT(%rbx)
+        jz    restore_all_xen
+        movq  VCPU_domain(%rbx),%rax
+        testb $1,DOMAIN_is_32bit_pv(%rax)
+        jz    restore_all_guest
+        jmp   compat_restore_all_guest
 
 ENTRY(nmi_crash)
         pushq $0
Andrew Cooper
2013-Feb-28 15:52 UTC
Re: [PATCH V3] vmx/nmi: Do not use self_nmi() in VMEXIT handler
On 28/02/13 15:41, Jan Beulich wrote:
>>>> On 28.02.13 at 15:42, "Jan Beulich" <JBeulich@suse.com> wrote:
>> ... this must not be done when on the NMI stack (i.e. when the
>> NMI was raised while in hypervisor context). Checking for this
>> here would be strait forward, but I was really considering to do
>> all of this in the assembly exit path, and I was still undecided
>> whether we shouldn't follow Linux in skipping softirq processing
>> (and hence scheduling) on the way out from an NMI (I don't
>> think we'd need to do the same for MCE).
> Like this:
>
> x86: skip processing events on the NMI exit path
>
> Otherwise, we may end up in the scheduler, keeping NMIs masked for a
> possibly unbounded time (until whenever the next IRET gets executed).
>
> Of course it's open for discussion whether to always use the strait
> exit path from handle_ist_exception.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

The logic looks good to me.

As for the other IST exceptions, we will never return from a double
fault, and will rarely return from an MCE, so I would say it doesn't
really matter at the moment.
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>

~Andrew

> --- a/xen/arch/x86/x86_64/compat/entry.S
> +++ b/xen/arch/x86/x86_64/compat/entry.S
> @@ -171,7 +171,7 @@ compat_bad_hypercall:
>          jmp   compat_test_all_events
>  
>  /* %rbx: struct vcpu, interrupts disabled */
> -compat_restore_all_guest:
> +ENTRY(compat_restore_all_guest)
>          ASSERT_INTERRUPTS_DISABLED
>          RESTORE_ALL adj=8 compat=1
>  .Lft0:  iretq
> --- a/xen/arch/x86/x86_64/entry.S
> +++ b/xen/arch/x86/x86_64/entry.S
> @@ -635,6 +635,9 @@ ENTRY(early_page_fault)
>          jmp   restore_all_xen
>          .popsection
>  
> +ENTRY(nmi)
> +        pushq $0
> +        movl  $TRAP_nmi,4(%rsp)
>  handle_ist_exception:
>          SAVE_ALL
>          testb $3,UREGS_cs(%rsp)
> @@ -649,12 +652,17 @@ handle_ist_exception:
>          movzbl UREGS_entry_vector(%rsp),%eax
>          leaq  exception_table(%rip),%rdx
>          callq *(%rdx,%rax,8)
> -        jmp   ret_from_intr
> +        cmpb  $TRAP_nmi,UREGS_entry_vector(%rsp)
> +        jne   ret_from_intr
>  
> -ENTRY(nmi)
> -        pushq $0
> -        movl  $TRAP_nmi,4(%rsp)
> -        jmp   handle_ist_exception
> +        /* We want to get strait to the IRET in the NMI exit path. */
> +        testb $3,UREGS_cs(%rsp)
> +        GET_CURRENT(%rbx)
> +        jz    restore_all_xen
> +        movq  VCPU_domain(%rbx),%rax
> +        testb $1,DOMAIN_is_32bit_pv(%rax)
> +        jz    restore_all_guest
> +        jmp   compat_restore_all_guest
>  
>  ENTRY(nmi_crash)
>          pushq $0
Tim Deegan
2013-Feb-28 15:55 UTC
Re: [PATCH V3] vmx/nmi: Do not use self_nmi() in VMEXIT handler
At 15:41 +0000 on 28 Feb (1362066080), Jan Beulich wrote:
> --- a/xen/arch/x86/x86_64/entry.S
> +++ b/xen/arch/x86/x86_64/entry.S
> @@ -635,6 +635,9 @@ ENTRY(early_page_fault)
>          jmp   restore_all_xen
>          .popsection
>  
> +ENTRY(nmi)
> +        pushq $0
> +        movl  $TRAP_nmi,4(%rsp)
>  handle_ist_exception:
>          SAVE_ALL
>          testb $3,UREGS_cs(%rsp)
> @@ -649,12 +652,17 @@ handle_ist_exception:
>          movzbl UREGS_entry_vector(%rsp),%eax
>          leaq  exception_table(%rip),%rdx
>          callq *(%rdx,%rax,8)
> -        jmp   ret_from_intr
> +        cmpb  $TRAP_nmi,UREGS_entry_vector(%rsp)
> +        jne   ret_from_intr
>  
> -ENTRY(nmi)
> -        pushq $0
> -        movl  $TRAP_nmi,4(%rsp)
> -        jmp   handle_ist_exception
> +        /* We want to get strait to the IRET in the NMI exit path. */

Language nit: 'straight', meaning linear or directly, is suitable here.
'strait' means narrow or constrained (as in strait-jacket).

> +        testb $3,UREGS_cs(%rsp)
> +        GET_CURRENT(%rbx)
> +        jz    restore_all_xen

GET_CURRENT() contains an addq, which updates ZF. Swap it with the
testb?

Otherwise, looks good to me.

Reviewed-by: Tim Deegan <tim@xen.org>

> +        movq  VCPU_domain(%rbx),%rax
> +        testb $1,DOMAIN_is_32bit_pv(%rax)
> +        jz    restore_all_guest
> +        jmp   compat_restore_all_guest
>  
>  ENTRY(nmi_crash)
>          pushq $0
Keir Fraser
2013-Feb-28 16:01 UTC
Re: [PATCH V3] vmx/nmi: Do not use self_nmi() in VMEXIT handler
On 28/02/2013 15:41, "Jan Beulich" <JBeulich@suse.com> wrote:

>>>> On 28.02.13 at 15:42, "Jan Beulich" <JBeulich@suse.com> wrote:
>>> ... this must not be done when on the NMI stack (i.e. when the
>>> NMI was raised while in hypervisor context). Checking for this
>>> here would be strait forward, but I was really considering to do
>>> all of this in the assembly exit path, and I was still undecided
>>> whether we shouldn't follow Linux in skipping softirq processing
>>> (and hence scheduling) on the way out from an NMI (I don't
>>> think we'd need to do the same for MCE).
>>
>> Like this:
>>
>> x86: skip processing events on the NMI exit path
>>
>> Otherwise, we may end up in the scheduler, keeping NMIs masked for a
>> possibly unbounded time (until whenever the next IRET gets executed).

Is the alternative that we might not process events for an unbounded time?
No, I guess not -- either we would interrupt the notifying IPI and we will
be IRETing into that IPI's handler, or the notifying IPI is delayed until
the NMI handler's IRET.

What about if the NMI handler itself raises an event (eg softirq)?
Perhaps there are no very essential ones of those?

> Of course it's open for discussion whether to always use the strait
> exit path from handle_ist_exception.

s/strait/straight (and below in the code comment that you add).

> Signed-off-by: Jan Beulich <jbeulich@suse.com>

> --- a/xen/arch/x86/x86_64/compat/entry.S
> +++ b/xen/arch/x86/x86_64/compat/entry.S
> @@ -171,7 +171,7 @@ compat_bad_hypercall:
>          jmp   compat_test_all_events
>  
>  /* %rbx: struct vcpu, interrupts disabled */
> -compat_restore_all_guest:
> +ENTRY(compat_restore_all_guest)
>          ASSERT_INTERRUPTS_DISABLED
>          RESTORE_ALL adj=8 compat=1
>  .Lft0:  iretq
> --- a/xen/arch/x86/x86_64/entry.S
> +++ b/xen/arch/x86/x86_64/entry.S
> @@ -635,6 +635,9 @@ ENTRY(early_page_fault)
>          jmp   restore_all_xen
>          .popsection
>  
> +ENTRY(nmi)
> +        pushq $0
> +        movl  $TRAP_nmi,4(%rsp)
>  handle_ist_exception:
>          SAVE_ALL
>          testb $3,UREGS_cs(%rsp)
> @@ -649,12 +652,17 @@ handle_ist_exception:
>          movzbl UREGS_entry_vector(%rsp),%eax
>          leaq  exception_table(%rip),%rdx
>          callq *(%rdx,%rax,8)
> -        jmp   ret_from_intr
> +        cmpb  $TRAP_nmi,UREGS_entry_vector(%rsp)
> +        jne   ret_from_intr
>  
> -ENTRY(nmi)
> -        pushq $0
> -        movl  $TRAP_nmi,4(%rsp)
> -        jmp   handle_ist_exception
> +        /* We want to get strait to the IRET in the NMI exit path. */
> +        testb $3,UREGS_cs(%rsp)
> +        GET_CURRENT(%rbx)
> +        jz    restore_all_xen
> +        movq  VCPU_domain(%rbx),%rax
> +        testb $1,DOMAIN_is_32bit_pv(%rax)
> +        jz    restore_all_guest
> +        jmp   compat_restore_all_guest
>  
>  ENTRY(nmi_crash)
>          pushq $0
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel
Jan Beulich
2013-Feb-28 16:12 UTC
Re: [PATCH V3] vmx/nmi: Do not use self_nmi() in VMEXIT handler
>>> On 28.02.13 at 16:55, Tim Deegan <tim@xen.org> wrote:
> At 15:41 +0000 on 28 Feb (1362066080), Jan Beulich wrote:
>> +        testb $3,UREGS_cs(%rsp)
>> +        GET_CURRENT(%rbx)
>> +        jz    restore_all_xen
>
> GET_CURRENT() contains an addq, which updates ZF. Swap it with the
> testb?

I'm glad you spotted that - thanks. I moved the GET_CURRENT() down
though instead of up.

Jan
Jan Beulich
2013-Feb-28 16:17 UTC
Re: [PATCH V3] vmx/nmi: Do not use self_nmi() in VMEXIT handler
>>> On 28.02.13 at 17:01, Keir Fraser <keir@xen.org> wrote:
> On 28/02/2013 15:41, "Jan Beulich" <JBeulich@suse.com> wrote:
>>>>> On 28.02.13 at 15:42, "Jan Beulich" <JBeulich@suse.com> wrote:
>>> ... this must not be done when on the NMI stack (i.e. when the
>>> NMI was raised while in hypervisor context). Checking for this
>>> here would be strait forward, but I was really considering to do
>>> all of this in the assembly exit path, and I was still undecided
>>> whether we shouldn't follow Linux in skipping softirq processing
>>> (and hence scheduling) on the way out from an NMI (I don't
>>> think we'd need to do the same for MCE).
>>
>> Like this:
>>
>> x86: skip processing events on the NMI exit path
>>
>> Otherwise, we may end up in the scheduler, keeping NMIs masked for a
>> possibly unbounded time (until whenever the next IRET gets executed).
>
> Is the alternative that we might not process events for an unbounded time?
> No, I guess not -- either we would interrupt the notifying IPI and we will
> be IRETing into that IPI's handler, or the notifying IPI is delayed until
> the NMI handler's IRET.
>
> What about if the NMI handler itself raises an event (eg softirq)? Perhaps
> there are no very essential ones of those?

We do raise PCI_SERR_SOFTIRQ, and the possibly injected NMI (to Dom0)
might get slightly deferred too. The latter seems of little concern
(Dom0 will get the event eventually). For the former, we might want to
explicitly send a self-IPI with EVENT_CHECK_VECTOR following the
raise_softirq()?

Jan
Keir Fraser
2013-Feb-28 19:02 UTC
Re: [PATCH V3] vmx/nmi: Do not use self_nmi() in VMEXIT handler
On 28/02/2013 16:17, "Jan Beulich" <JBeulich@suse.com> wrote:

>> Is the alternative that we might not process events for an unbounded time?
>> No, I guess not -- either we would interrupt the notifying IPI and we will
>> be IRETing into that IPI's handler, or the notifying IPI is delayed until
>> the NMI handler's IRET.
>>
>> What about if the NMI handler itself raises an event (eg softirq)? Perhaps
>> there are no very essential ones of those?
>
> We do raise PCI_SERR_SOFTIRQ, and the possibly injected NMI
> (to Dom0) might get slightly deferred too. The latter seems of
> little concern (Dom0 will get the event eventually). For the
> former, we might want to explicitly send a self-IPI with
> EVENT_CHECK_VECTOR following the raise_softirq()?

Or perhaps self-IPI on the NMI exit path if softirq_pending is non-zero?
Conservative but more generic.

 -- Keir
Jan Beulich
2013-Mar-01 10:49 UTC
[PATCH v2 0/2] x86: defer processing events on the NMI exit path
1: defer processing events on the NMI exit path
2: don't rely on __softirq_pending to be the first field in irq_cpustat_t

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Jan Beulich
2013-Mar-01 10:56 UTC
[PATCH v2 1/2] x86: defer processing events on the NMI exit path
Otherwise, we may end up in the scheduler, keeping NMIs masked for a
possibly unbounded period of time (until whenever the next IRET gets
executed). Enforce timely event processing by sending a self IPI.

Of course it's open for discussion whether to always use the straight
exit path from handle_ist_exception.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v2: Send IPI to ourselves to enforce event processing. Dropped all ACKs.

--- a/xen/arch/x86/x86_64/compat/entry.S
+++ b/xen/arch/x86/x86_64/compat/entry.S
@@ -171,7 +171,7 @@ compat_bad_hypercall:
         jmp   compat_test_all_events
 
 /* %rbx: struct vcpu, interrupts disabled */
-compat_restore_all_guest:
+ENTRY(compat_restore_all_guest)
         ASSERT_INTERRUPTS_DISABLED
         RESTORE_ALL adj=8 compat=1
.Lft0:  iretq
--- a/xen/arch/x86/x86_64/entry.S
+++ b/xen/arch/x86/x86_64/entry.S
@@ -11,6 +11,7 @@
 #include <asm/apicdef.h>
 #include <asm/page.h>
 #include <public/xen.h>
+#include <irq_vectors.h>
 
         ALIGN
 /* %rbx: struct vcpu */
@@ -635,6 +636,9 @@ ENTRY(early_page_fault)
         jmp   restore_all_xen
         .popsection
 
+ENTRY(nmi)
+        pushq $0
+        movl  $TRAP_nmi,4(%rsp)
 handle_ist_exception:
         SAVE_ALL
         testb $3,UREGS_cs(%rsp)
@@ -649,12 +653,25 @@ handle_ist_exception:
         movzbl UREGS_entry_vector(%rsp),%eax
         leaq  exception_table(%rip),%rdx
         callq *(%rdx,%rax,8)
-        jmp   ret_from_intr
+        cmpb  $TRAP_nmi,UREGS_entry_vector(%rsp)
+        jne   ret_from_intr
 
-ENTRY(nmi)
-        pushq $0
-        movl  $TRAP_nmi,4(%rsp)
-        jmp   handle_ist_exception
+        /* We want to get straight to the IRET on the NMI exit path. */
+        testb $3,UREGS_cs(%rsp)
+        jz    restore_all_xen
+        GET_CURRENT(%rbx)
+        /* Send an IPI to ourselves to cover for the lack of event checking. */
+        movl  VCPU_processor(%rbx),%eax
+        shll  $IRQSTAT_shift,%eax
+        leaq  irq_stat(%rip),%rcx
+        cmpl  $0,(%rcx,%rax,1)
+        je    1f
+        movl  $EVENT_CHECK_VECTOR,%edi
+        call  send_IPI_self
+1:      movq  VCPU_domain(%rbx),%rax
+        cmpb  $0,DOMAIN_is_32bit_pv(%rax)
+        je    restore_all_guest
+        jmp   compat_restore_all_guest
 
 ENTRY(nmi_crash)
         pushq $0

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
Jan Beulich
2013-Mar-01 10:57 UTC
[PATCH v2 2/2] x86: don't rely on __softirq_pending to be the first field in irq_cpustat_t
This is even more so as the field doesn't have a comment to that
effect in the structure definition.

Once modifying the respective assembly code, also convert the
IRQSTAT_shift users to do a 32-bit shift only (as we won't support
48M CPUs any time soon) and use "cmpl" instead of "testl" when
checking the field (both reducing code size).

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/hvm/svm/entry.S
+++ b/xen/arch/x86/hvm/svm/entry.S
@@ -41,10 +41,10 @@ ENTRY(svm_asm_do_resume)
         CLGI
 
         mov   VCPU_processor(%rbx),%eax
-        shl   $IRQSTAT_shift,%rax
-        lea   irq_stat(%rip),%rdx
-        testl $~0,(%rdx,%rax,1)
-        jnz   .Lsvm_process_softirqs
+        shl   $IRQSTAT_shift,%eax
+        lea   irq_stat+IRQSTAT_softirq_pending(%rip),%rdx
+        cmpl  $0,(%rdx,%rax,1)
+        jne   .Lsvm_process_softirqs
 
         testb $0, VCPU_nsvm_hap_enabled(%rbx)
UNLIKELY_START(nz, nsvm_hap)
--- a/xen/arch/x86/hvm/vmx/entry.S
+++ b/xen/arch/x86/hvm/vmx/entry.S
@@ -97,8 +97,8 @@ vmx_asm_do_vmentry:
         cli
 
         mov   VCPU_processor(%rbx),%eax
-        shl   $IRQSTAT_shift,%rax
-        lea   irq_stat(%rip),%rdx
+        shl   $IRQSTAT_shift,%eax
+        lea   irq_stat+IRQSTAT_softirq_pending(%rip),%rdx
         cmpl  $0,(%rdx,%rax,1)
         jnz   .Lvmx_process_softirqs
--- a/xen/arch/x86/x86_64/asm-offsets.c
+++ b/xen/arch/x86/x86_64/asm-offsets.c
@@ -162,6 +162,7 @@ void __dummy__(void)
 #endif
 
     DEFINE(IRQSTAT_shift, LOG_2(sizeof(irq_cpustat_t)));
+    OFFSET(IRQSTAT_softirq_pending, irq_cpustat_t, __softirq_pending);
     BLANK();
 
     OFFSET(CPUINFO86_ext_features, struct cpuinfo_x86, x86_capability[1]);
--- a/xen/arch/x86/x86_64/compat/entry.S
+++ b/xen/arch/x86/x86_64/compat/entry.S
@@ -96,10 +96,10 @@ ENTRY(compat_test_all_events)
         cli   # tests must not race interrupts
 /*compat_test_softirqs:*/
         movl  VCPU_processor(%rbx),%eax
-        shlq  $IRQSTAT_shift,%rax
-        leaq  irq_stat(%rip),%rcx
-        testl $~0,(%rcx,%rax,1)
-        jnz   compat_process_softirqs
+        shll  $IRQSTAT_shift,%eax
+        leaq  irq_stat+IRQSTAT_softirq_pending(%rip),%rcx
+        cmpl  $0,(%rcx,%rax,1)
+        jne   compat_process_softirqs
         testb $1,VCPU_mce_pending(%rbx)
         jnz   compat_process_mce
.Lcompat_test_guest_nmi:
--- a/xen/arch/x86/x86_64/entry.S
+++ b/xen/arch/x86/x86_64/entry.S
@@ -195,8 +195,8 @@ test_all_events:
         cli   # tests must not race interrupts
 /*test_softirqs:*/
         movl  VCPU_processor(%rbx),%eax
-        shl   $IRQSTAT_shift,%rax
-        leaq  irq_stat(%rip),%rcx
+        shll  $IRQSTAT_shift,%eax
+        leaq  irq_stat+IRQSTAT_softirq_pending(%rip),%rcx
         cmpl  $0,(%rcx,%rax,1)
         jne   process_softirqs
         testb $1,VCPU_mce_pending(%rbx)
@@ -663,7 +663,7 @@ handle_ist_exception:
         /* Send an IPI to ourselves to cover for the lack of event checking. */
         movl  VCPU_processor(%rbx),%eax
         shll  $IRQSTAT_shift,%eax
-        leaq  irq_stat(%rip),%rcx
+        leaq  irq_stat+IRQSTAT_softirq_pending(%rip),%rcx
         cmpl  $0,(%rcx,%rax,1)
         je    1f
         movl  $EVENT_CHECK_VECTOR,%edi

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
Andrew Cooper
2013-Mar-01 11:37 UTC
Re: [PATCH v2 1/2] x86: defer processing events on the NMI exit path
On 01/03/13 10:56, Jan Beulich wrote:
> Otherwise, we may end up in the scheduler, keeping NMIs masked for a
> possibly unbounded period of time (until whenever the next IRET gets
> executed). Enforce timely event processing by sending a self IPI.
>
> Of course it's open for discussion whether to always use the straight
> exit path from handle_ist_exception.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> ---
> v2: Send IPI to ourselves to enforce event processing. Dropped all ACKs.
>
> --- a/xen/arch/x86/x86_64/compat/entry.S
> +++ b/xen/arch/x86/x86_64/compat/entry.S
> @@ -171,7 +171,7 @@ compat_bad_hypercall:
>          jmp   compat_test_all_events
>
>  /* %rbx: struct vcpu, interrupts disabled */
> -compat_restore_all_guest:
> +ENTRY(compat_restore_all_guest)
>          ASSERT_INTERRUPTS_DISABLED
>          RESTORE_ALL adj=8 compat=1
> .Lft0:   iretq
> --- a/xen/arch/x86/x86_64/entry.S
> +++ b/xen/arch/x86/x86_64/entry.S
> @@ -11,6 +11,7 @@
>  #include <asm/apicdef.h>
>  #include <asm/page.h>
>  #include <public/xen.h>
> +#include <irq_vectors.h>
>
>          ALIGN
>  /* %rbx: struct vcpu */
> @@ -635,6 +636,9 @@ ENTRY(early_page_fault)
>          jmp   restore_all_xen
>          .popsection
>
> +ENTRY(nmi)
> +        pushq $0
> +        movl  $TRAP_nmi,4(%rsp)
> handle_ist_exception:
>          SAVE_ALL
>          testb $3,UREGS_cs(%rsp)
> @@ -649,12 +653,25 @@ handle_ist_exception:
>          movzbl UREGS_entry_vector(%rsp),%eax
>          leaq  exception_table(%rip),%rdx
>          callq *(%rdx,%rax,8)
> -        jmp   ret_from_intr
> +        cmpb  $TRAP_nmi,UREGS_entry_vector(%rsp)
> +        jne   ret_from_intr
>
> -ENTRY(nmi)
> -        pushq $0
> -        movl  $TRAP_nmi,4(%rsp)
> -        jmp   handle_ist_exception
> +        /* We want to get straight to the IRET on the NMI exit path. */
> +        testb $3,UREGS_cs(%rsp)
> +        jz    restore_all_xen
> +        GET_CURRENT(%rbx)
> +        /* Send an IPI to ourselves to cover for the lack of event checking. */
> +        movl  VCPU_processor(%rbx),%eax
> +        shll  $IRQSTAT_shift,%eax
> +        leaq  irq_stat(%rip),%rcx
> +        cmpl  $0,(%rcx,%rax,1)

__softirq_pending is an unsigned long. Would it not be prudent to use
cmpq to save obscure bugs if the implementation changes, or are we
sufficiently sure that this won't happen?

~Andrew

> +        je    1f
> +        movl  $EVENT_CHECK_VECTOR,%edi
> +        call  send_IPI_self
> +1:      movq  VCPU_domain(%rbx),%rax
> +        cmpb  $0,DOMAIN_is_32bit_pv(%rax)
> +        je    restore_all_guest
> +        jmp   compat_restore_all_guest
>
>  ENTRY(nmi_crash)
>          pushq $0
Jan Beulich
2013-Mar-01 11:53 UTC
Re: [PATCH v2 1/2] x86: defer processing events on the NMI exit path
>>> On 01.03.13 at 12:37, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> On 01/03/13 10:56, Jan Beulich wrote:
>> Otherwise, we may end up in the scheduler, keeping NMIs masked for a
>> possibly unbounded period of time (until whenever the next IRET gets
>> executed). Enforce timely event processing by sending a self IPI.
>>
>> Of course it's open for discussion whether to always use the straight
>> exit path from handle_ist_exception.
>>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>> ---
>> v2: Send IPI to ourselves to enforce event processing. Dropped all ACKs.
>>
>> --- a/xen/arch/x86/x86_64/compat/entry.S
>> +++ b/xen/arch/x86/x86_64/compat/entry.S
>> @@ -171,7 +171,7 @@ compat_bad_hypercall:
>>          jmp   compat_test_all_events
>>
>>  /* %rbx: struct vcpu, interrupts disabled */
>> -compat_restore_all_guest:
>> +ENTRY(compat_restore_all_guest)
>>          ASSERT_INTERRUPTS_DISABLED
>>          RESTORE_ALL adj=8 compat=1
>> .Lft0:   iretq
>> --- a/xen/arch/x86/x86_64/entry.S
>> +++ b/xen/arch/x86/x86_64/entry.S
>> @@ -11,6 +11,7 @@
>>  #include <asm/apicdef.h>
>>  #include <asm/page.h>
>>  #include <public/xen.h>
>> +#include <irq_vectors.h>
>>
>>          ALIGN
>>  /* %rbx: struct vcpu */
>> @@ -635,6 +636,9 @@ ENTRY(early_page_fault)
>>          jmp   restore_all_xen
>>          .popsection
>>
>> +ENTRY(nmi)
>> +        pushq $0
>> +        movl  $TRAP_nmi,4(%rsp)
>> handle_ist_exception:
>>          SAVE_ALL
>>          testb $3,UREGS_cs(%rsp)
>> @@ -649,12 +653,25 @@ handle_ist_exception:
>>          movzbl UREGS_entry_vector(%rsp),%eax
>>          leaq  exception_table(%rip),%rdx
>>          callq *(%rdx,%rax,8)
>> -        jmp   ret_from_intr
>> +        cmpb  $TRAP_nmi,UREGS_entry_vector(%rsp)
>> +        jne   ret_from_intr
>>
>> -ENTRY(nmi)
>> -        pushq $0
>> -        movl  $TRAP_nmi,4(%rsp)
>> -        jmp   handle_ist_exception
>> +        /* We want to get straight to the IRET on the NMI exit path. */
>> +        testb $3,UREGS_cs(%rsp)
>> +        jz    restore_all_xen
>> +        GET_CURRENT(%rbx)
>> +        /* Send an IPI to ourselves to cover for the lack of event checking. */
>> +        movl  VCPU_processor(%rbx),%eax
>> +        shll  $IRQSTAT_shift,%eax
>> +        leaq  irq_stat(%rip),%rcx
>> +        cmpl  $0,(%rcx,%rax,1)
>
> __softirq_pending is an unsigned long. Would it not be prudent to use
> cmpq to save obscure bugs if the implementation changes, or are we
> sufficiently sure that this won't happen?

All other existing instances of similar assembly code use testl or
cmpl, and in fact I merely copied some other instance.

As we're not getting close to 32, I think we might rather want to
adjust the __softirq_pending type. Keir?

Jan
Keir Fraser
2013-Mar-01 15:55 UTC
Re: [PATCH v2 0/2] x86: defer processing events on the NMI exit path
On 01/03/2013 10:49, "Jan Beulich" <JBeulich@suse.com> wrote:
> 1: defer processing events on the NMI exit path
> 2: don't rely on __softirq_pending to be the first field in irq_cpustat_t
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Nice. Acked-by: Keir Fraser <keir@xen.org>
Keir Fraser
2013-Mar-01 15:56 UTC
Re: [PATCH v2 1/2] x86: defer processing events on the NMI exit path
On 01/03/2013 11:53, "Jan Beulich" <JBeulich@suse.com> wrote:
>> __softirq_pending is an unsigned long. Would it not be prudent to use
>> cmpq to save obscure bugs if the implementation changes, or are we
>> sufficiently sure that this won't happen?
>
> All other existing instances of similar assembly code use testl or
> cmpl, and in fact I merely copied some other instance.
>
> As we're not getting close to 32, I think we might rather want to
> adjust the __softirq_pending type. Keir?

Yup, I don't see any reason why it couldn't be a uint32_t.

 -- Keir
Andrew Cooper
2013-Mar-01 16:01 UTC
Re: [PATCH v2 1/2] x86: defer processing events on the NMI exit path
On 01/03/13 15:56, Keir Fraser wrote:
> On 01/03/2013 11:53, "Jan Beulich" <JBeulich@suse.com> wrote:
>
>>> __softirq_pending is an unsigned long. Would it not be prudent to use
>>> cmpq to save obscure bugs if the implementation changes, or are we
>>> sufficiently sure that this won't happen?
>> All other existing instances of similar assembly code use testl or
>> cmpl, and in fact I merely copied some other instance.
>>
>> As we're not getting close to 32, I think we might rather want to
>> adjust the __softirq_pending type. Keir?
> Yup, I don't see any reason why it couldn't be a uint32_t.
>
> -- Keir

Further to that, is there any reason that it can't be per-cpu, to save
having information like this moving around pcpus in a hot cache line?

~Andrew
Jan Beulich
2013-Mar-01 16:08 UTC
Re: [PATCH v2 1/2] x86: defer processing events on the NMI exit path
>>> On 01.03.13 at 17:01, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> On 01/03/13 15:56, Keir Fraser wrote:
>> On 01/03/2013 11:53, "Jan Beulich" <JBeulich@suse.com> wrote:
>>
>>>> __softirq_pending is an unsigned long. Would it not be prudent to use
>>>> cmpq to save obscure bugs if the implementation changes, or are we
>>>> sufficiently sure that this won't happen?
>>> All other existing instances of similar assembly code use testl or
>>> cmpl, and in fact I merely copied some other instance.
>>>
>>> As we're not getting close to 32, I think we might rather want to
>>> adjust the __softirq_pending type. Keir?
>> Yup, I don't see any reason why it couldn't be a uint32_t.
>
> Further to that, is there any reason that it can't be per-cpu, to save
> having information like this moving around pcpus in a hot cache line?

The structure definition has __cacheline_aligned, there shouldn't be
much moving around (as long as our cache line size definition is still
suitable for modern CPUs' L1 caches at least).

Jan