thr3ads.net - Virtualization - Should SEV-ES #VC use IST? (Re: [PATCH] Allow RDTSC and RDTSCP from userspace) [Jun 2020]

If this information is useful, please help other people find it:
Share via:

Joerg Roedel

2020-Jun-23 12:04 UTC

Should SEV-ES #VC use IST? (Re: [PATCH] Allow RDTSC and RDTSCP from userspace)

On Tue, Jun 23, 2020 at 01:48:18PM +0200, Peter Zijlstra
wrote:> On Tue, Jun 23, 2020 at 01:30:07PM +0200, Joerg Roedel wrote:
> But you cannot do a recursion check in #VC, because the NMI can happen
> on the first instruction of #VC, before we can increment our counter,
> and then the #VC can happen on NMI because the IST stack is a goner, and
> we're fscked again (or on a per-cpu variable we touch in our elaborate
> NMI setup, etc..).
No, the recursion check is fine, because overwriting an already used IST
stack doesn't matter (as long as it can be detected) if we are going to
panic anyway. It doesn't matter because the kernel will not leave the
currently running handler anymore.

I agree there is no way to keep the system running if that happens, but
that is also not what is wanted. If stack recursion happens, something
malicious from the HV side is going on, and all the kernel needs to be
able to is to safely and reliably detect the situation and panic the VM
to prevent any data corruption or loss or even leakage.
> I'll keep repeating this, x86_64 exceptions are a trainwreck, and IST
in
> specific is utter crap.
I agree, but don't forget the most prominent underlying reason for IST:
The SYSCALL gap. If SYSCALL would switch stacks most of those issues
would not exist. IST would still be needed because there are no task
gates in x86-64, but still...

Regards,

	Joerg

Peter Zijlstra

2020-Jun-23 12:52 UTC

head link

Should SEV-ES #VC use IST? (Re: [PATCH] Allow RDTSC and RDTSCP from userspace)

On Tue, Jun 23, 2020 at 02:04:33PM +0200, Joerg Roedel
wrote:> On Tue, Jun 23, 2020 at 01:48:18PM +0200, Peter Zijlstra wrote:
> > On Tue, Jun 23, 2020 at 01:30:07PM +0200, Joerg Roedel wrote:
> 
> > But you cannot do a recursion check in #VC, because the NMI can happen
> > on the first instruction of #VC, before we can increment our counter,
> > and then the #VC can happen on NMI because the IST stack is a goner,
and
> > we're fscked again (or on a per-cpu variable we touch in our
elaborate
> > NMI setup, etc..).
> 
> No, the recursion check is fine, because overwriting an already used IST
> stack doesn't matter (as long as it can be detected) if we are going to
> panic anyway. It doesn't matter because the kernel will not leave the
> currently running handler anymore.
You only have that guarantee when any SNP #VC from kernel is an
automatic panic. But in that case, what's the point of having the
recursion count?
> > I'll keep repeating this, x86_64 exceptions are a trainwreck, and
IST in
> > specific is utter crap.
> 
> I agree, but don't forget the most prominent underlying reason for IST:
> The SYSCALL gap. If SYSCALL would switch stacks most of those issues
> would not exist. IST would still be needed because there are no task
> gates in x86-64, but still...
We could all go back to int80 ;-) /me runs like heck

Joerg Roedel

2020-Jun-23 13:40 UTC

head link

Should SEV-ES #VC use IST? (Re: [PATCH] Allow RDTSC and RDTSCP from userspace)

On Tue, Jun 23, 2020 at 02:52:01PM +0200, Peter Zijlstra
wrote:> On Tue, Jun 23, 2020 at 02:04:33PM +0200, Joerg Roedel wrote:
> > No, the recursion check is fine, because overwriting an already used
IST
> > stack doesn't matter (as long as it can be detected) if we are
going to
> > panic anyway. It doesn't matter because the kernel will not leave
the
> > currently running handler anymore.
> 
> You only have that guarantee when any SNP #VC from kernel is an
> automatic panic. But in that case, what's the point of having the
> recursion count?
It is not a recursion count, it is a stack-recursion check. Basically
walk down the stack and look if your current stack is already in use.
Yes, this can be optimized, but that is what is needed.

IIRC the current prototype code for SNP just pre-validates all memory in
the VM and doesn't support moving pages around on the host. So any #VC
SNP exception would be fatal, yes.

In a scenario with on-demand validation of guest pages and support for
guest-assisted page-moving on the HV side it would be more complicated.
Basically all memory that is accessed during #VC exception handling must
stay validated at all times, including the IST stack.

So saying this, I don't understand why _all_ SNP #VC exceptions from
kernel space must be fatal?

Regards,

	Joerg

Virtualization - Jun 2020 - Should SEV-ES #VC use IST? (Re: [PATCH] Allow RDTSC and RDTSCP from userspace)

Should SEV-ES #VC use IST? (Re: [PATCH] Allow RDTSC and RDTSCP from userspace)

Should SEV-ES #VC use IST? (Re: [PATCH] Allow RDTSC and RDTSCP from userspace)

Should SEV-ES #VC use IST? (Re: [PATCH] Allow RDTSC and RDTSCP from userspace)