Peter Zijlstra
2021-May-19 17:54 UTC
[PATCH v2 5/8] x86/sev-es: Leave NMI-mode before sending signals
On Wed, May 19, 2021 at 03:52:48PM +0200, Joerg Roedel wrote:> --- a/arch/x86/kernel/sev.c > +++ b/arch/x86/kernel/sev.c > @@ -1343,9 +1343,10 @@ DEFINE_IDTENTRY_VC_SAFE_STACK(exc_vmm_communication) > return; > } > > + instrumentation_begin(); > + > irq_state = irqentry_nmi_enter(regs); > lockdep_assert_irqs_disabled(); > - instrumentation_begin(); > > /* > * This is invoked through an interrupt gate, so IRQs are disabled. TheThat's just plain wrong. No instrumentation is allowed before you enter the exception context.> @@ -1395,13 +1396,19 @@ DEFINE_IDTENTRY_VC_SAFE_STACK(exc_vmm_communication) > BUG(); > } > > -out: > - instrumentation_end(); > irqentry_nmi_exit(regs, irq_state); > + instrumentation_end();And this can't be right either, same issue, no instrumentation is allowed after you leave the exception context.> > return; > > fail: > + /* > + * Leave NMI mode - the GHCB is not busy anymore and depending on where > + * the #VC came from this code is about to either kill the task (when in > + * task context) or kill the machine. > + */ > + irqentry_nmi_exit(regs, irq_state); > +And this is wrong too; because at this point the handler doesn't run in _any_ context anymore, certainly not one you can call regular C code from.> if (user_mode(regs)) { > /* > * Do not kill the machine if user-space triggered the > @@ -1423,7 +1430,9 @@ DEFINE_IDTENTRY_VC_SAFE_STACK(exc_vmm_communication) > panic("Returned from Terminate-Request to Hypervisor\n"); > } > > - goto out; > + instrumentation_end(); > + > + return; > }You either get to do what MCE does, or what MCE does. That is, either use task_work or MCE_USER and have the _user() handler use irqentry_enter_from_user_mode(). The above is an absolute no-go.
Joerg Roedel
2021-May-19 19:13 UTC
[PATCH v2 5/8] x86/sev-es: Leave NMI-mode before sending signals
Hi Peter, thanks for your review. On Wed, May 19, 2021 at 07:54:50PM +0200, Peter Zijlstra wrote:> On Wed, May 19, 2021 at 03:52:48PM +0200, Joerg Roedel wrote: > > --- a/arch/x86/kernel/sev.c > > +++ b/arch/x86/kernel/sev.c > > @@ -1343,9 +1343,10 @@ DEFINE_IDTENTRY_VC_SAFE_STACK(exc_vmm_communication) > > return; > > } > > > > + instrumentation_begin(); > > + > > irq_state = irqentry_nmi_enter(regs); > > lockdep_assert_irqs_disabled(); > > - instrumentation_begin(); > > > > /* > > * This is invoked through an interrupt gate, so IRQs are disabled. The > > That's just plain wrong. No instrumentation is allowed before you enter > the exception context.Okay.> > + irqentry_nmi_exit(regs, irq_state); > > + > > And this is wrong too; because at this point the handler doesn't run in > _any_ context anymore, certainly not one you can call regular C code > from.The #VC handler is at this point not running on the IST stack anymore, but on the stack it came from or on the task stack. So my believe was that at this point it inherits the context it came from (just like the page-fault handler). But I also don't fully understand the context tracking, so is my assumption wrong? Regards, Joerg