thr3ads.net - Virtualization - Should SEV-ES #VC use IST? (Re: [PATCH] Allow RDTSC and RDTSCP from userspace) [Jun 2020]

If this information is useful, please help other people find it:
Share via:

Joerg Roedel

2020-Jun-23 14:49 UTC

Should SEV-ES #VC use IST? (Re: [PATCH] Allow RDTSC and RDTSCP from userspace)

On Tue, Jun 23, 2020 at 03:03:22PM +0200, Peter Zijlstra
wrote:> On Tue, Jun 23, 2020 at 02:12:37PM +0200, Joerg Roedel wrote:
> > On Tue, Jun 23, 2020 at 01:50:14PM +0200, Peter Zijlstra wrote:
> > > If SNP is the sole reason #VC needs to be IST, then I'd
strongly urge
> > > you to only make it IST if/when you try and make SNP happen, not
before.
> > 
> > It is not the only reason, when ES guests gain debug register support
> > then #VC also needs to be IST, because #DB can be promoted into #VC
> > then, and as #DB is IST for a reason, #VC needs to be too.
> 
> Didn't I read somewhere that that is only so for Rome/Naples but not
for
> the later chips (Milan) which have #DB pass-through?
Probably, not sure which chips will get debug register virtualization
under SEV-ES. But even when it is supported, the HV can (and sometimes
will) intercept #DB, which then causes it to be promoted to #VC.
> We're talking about the 3rd case where the only reason things
'work' is
> because we'll have to panic():
> 
>  - #MC
Okay, #MC is special and can only be handled on a best-effort basis, as
#MC could happen anytime, also while already executing the #MC handler.
>  - #DB with BUS LOCK DEBUG EXCEPTION
If I understand the problem correctly, this can be solved by moving off
the IST stack to the current task stack in the #DB handler, like I plan
to do for #VC, no?
>  - #VC SNP
This has to panic for other reasons that can't be worked around. It
boils down to detecting that the HV is doing something fishy and bail
out to avoid further harm (like in the #MC handler).


Regards,

	Joerg

Peter Zijlstra

2020-Jun-23 15:16 UTC

head link

Should SEV-ES #VC use IST? (Re: [PATCH] Allow RDTSC and RDTSCP from userspace)

On Tue, Jun 23, 2020 at 04:49:40PM +0200, Joerg Roedel
wrote:> > We're talking about the 3rd case where the only reason things
'work' is
> > because we'll have to panic():
> > 
> >  - #MC
> 
> Okay, #MC is special and can only be handled on a best-effort basis, as
> #MC could happen anytime, also while already executing the #MC handler.
I think the hardware has a MCE-mask bit somewhere. Flaky though because
clearing it isn't 'atomic' with IRET, so there's a
'funny' window.

It also interacts really bad with the NMI handler. If we get an #MC
early in the NMI, where we hard-rely on the NMI-mask being set to set-up
the recursion stack, then the #MC IRET will clear the NMI-mask, and
we're toast.

Andy has wild and crazy ideas, but I don't think we need more crazy
here.

#VC SNP has a similar problem vs NMI, that needs to die() irrespective
of the #VC IST recursion.
> >  - #DB with BUS LOCK DEBUG EXCEPTION
> 
> If I understand the problem correctly, this can be solved by moving off
> the IST stack to the current task stack in the #DB handler, like I plan
> to do for #VC, no?
Hmm, probably. Would take a bit of care, but should be doable.
> >  - #VC SNP
> 
> This has to panic for other reasons that can't be worked around. It
> boils down to detecting that the HV is doing something fishy and bail
> out to avoid further harm (like in the #MC handler).
Right, but it doesn't take away that IST-any-time vectors are
fundamentally screwy.

Both the MCE and NMI have masks that are, as per the above, differently
funny, but the other ISTs do not. Also, even if they had masks, the
interaction between them is still screwy.

#VC would've been so much better if it would've had a mask bit
somewhere, then at least we could've had the exception entry covered.
Another #VC with the mask set should probably result in #DF or Shutdown,
but that's all water under the bridge I suspect.

Andrew Cooper

2020-Jun-23 15:32 UTC

head link

Should SEV-ES #VC use IST? (Re: [PATCH] Allow RDTSC and RDTSCP from userspace)

On 23/06/2020 16:16, Peter Zijlstra wrote:> On Tue, Jun 23, 2020 at 04:49:40PM +0200, Joerg Roedel wrote:
>>> We're talking about the 3rd case where the only reason things
'work' is
>>> because we'll have to panic():
>>>
>>>  - #MC
>> Okay, #MC is special and can only be handled on a best-effort basis, as
>> #MC could happen anytime, also while already executing the #MC handler.
> I think the hardware has a MCE-mask bit somewhere. Flaky though because
> clearing it isn't 'atomic' with IRET, so there's a
'funny' window.
MSR_MCG_STATUS.MCIP, and yes - any #MC before that point will
immediately Shutdown.? Any #MC between that point and IRET will clobber
its IST stack and end up sad.
> It also interacts really bad with the NMI handler. If we get an #MC
> early in the NMI, where we hard-rely on the NMI-mask being set to set-up
> the recursion stack, then the #MC IRET will clear the NMI-mask, and
> we're toast.
>
> Andy has wild and crazy ideas, but I don't think we need more crazy
> here.
Want, certainly not.? Need, maybe :-/>>>  - #DB with BUS LOCK DEBUG EXCEPTION
>> If I understand the problem correctly, this can be solved by moving off
>> the IST stack to the current task stack in the #DB handler, like I plan
>> to do for #VC, no?
> Hmm, probably. Would take a bit of care, but should be doable.
Andy and I have spent some time considering this.

Having exactly one vector move off IST and onto an in-use task-stack is
doable, I think, so long as it can sort out self-recursion protections.

Having more than one vector do this is very complicated.? You've got to
take care to walk up the list of IST-nesting to see if any interrupted
context is in the middle of trying to copy themselves onto the stack, so
you don't clobber the frame they're in the middle of copying.

~Andrew

Virtualization - Jun 2020 - Should SEV-ES #VC use IST? (Re: [PATCH] Allow RDTSC and RDTSCP from userspace)

Should SEV-ES #VC use IST? (Re: [PATCH] Allow RDTSC and RDTSCP from userspace)

Should SEV-ES #VC use IST? (Re: [PATCH] Allow RDTSC and RDTSCP from userspace)

Should SEV-ES #VC use IST? (Re: [PATCH] Allow RDTSC and RDTSCP from userspace)