From: Joerg Roedel <jroedel at suse.de> Hi, here is a collection of fixes for the SEV-ES guest support. They fix simple and more severe issues and are all targeted for v5.13. The most important fixes are the revert of 7024f60d6552 which just doesn't work in any context the #VC handler could run and the fix to forward #PF exceptions caused during emulation. The issue that 7024f60d6552 intended to fix should be fixed correctly with these patches too. Please review and test. Regards, Joerg Joerg Roedel (6): x86/sev-es: Don't return NULL from sev_es_get_ghcb() x86/sev-es: Forward page-faults which happen during emulation x86/sev-es: Use __put_user()/__get_user Revert "x86/sev-es: Handle string port IO to kernel memory properly" x86/sev-es: Fix error message in runtime #VC handler x86/sev-es: Leave NMI-mode before sending signals arch/x86/kernel/sev.c | 76 +++++++++++++++++++++---------------------- 1 file changed, 38 insertions(+), 38 deletions(-) base-commit: 059e5c321a65657877924256ea8ad9c0df257b45 -- 2.31.1
Joerg Roedel
2021-May-12  07:54 UTC
[PATCH 1/6] x86/sev-es: Don't return NULL from sev_es_get_ghcb()
From: Joerg Roedel <jroedel at suse.de>
The sev_es_get_ghcb() is called from several places, but only one of
them checks the return value. The reaction to returning NULL is always
the same: Calling panic() and kill the machine.
Instead of adding checks to all call-places, move the panic() into the
function itself so that it will no longer return NULL.
Fixes: 0786138c78e7 ("x86/sev-es: Add a Runtime #VC Exception
Handler")
Cc: stable at vger.kernel.org # v5.10+
Signed-off-by: Joerg Roedel <jroedel at suse.de>
---
 arch/x86/kernel/sev.c | 25 ++++++++++++-------------
 1 file changed, 12 insertions(+), 13 deletions(-)
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index 9578c82832aa..c49270c7669e 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -203,8 +203,18 @@ static __always_inline struct ghcb *sev_es_get_ghcb(struct
ghcb_state *state)
 	if (unlikely(data->ghcb_active)) {
 		/* GHCB is already in use - save its contents */
 
-		if (unlikely(data->backup_ghcb_active))
-			return NULL;
+		if (unlikely(data->backup_ghcb_active)) {
+			/*
+			 * Backup-GHCB is also already in use. There is no way
+			 * to continue here so just kill the machine. To make
+			 * panic() work, mark GHCBs inactive so that messages
+			 * can be printed out.
+			 */
+			data->ghcb_active        = false;
+			data->backup_ghcb_active = false;
+
+			panic("Unable to handle #VC exception! GHCB and Backup GHCB are already
in use");
+		}
 
 		/* Mark backup_ghcb active before writing to it */
 		data->backup_ghcb_active = true;
@@ -1284,7 +1294,6 @@ static __always_inline bool on_vc_fallback_stack(struct
pt_regs *regs)
  */
 DEFINE_IDTENTRY_VC_SAFE_STACK(exc_vmm_communication)
 {
-	struct sev_es_runtime_data *data = this_cpu_read(runtime_data);
 	irqentry_state_t irq_state;
 	struct ghcb_state state;
 	struct es_em_ctxt ctxt;
@@ -1310,16 +1319,6 @@ DEFINE_IDTENTRY_VC_SAFE_STACK(exc_vmm_communication)
 	 */
 
 	ghcb = sev_es_get_ghcb(&state);
-	if (!ghcb) {
-		/*
-		 * Mark GHCBs inactive so that panic() is able to print the
-		 * message.
-		 */
-		data->ghcb_active        = false;
-		data->backup_ghcb_active = false;
-
-		panic("Unable to handle #VC exception! GHCB and Backup GHCB are already
in use");
-	}
 
 	vc_ghcb_invalidate(ghcb);
 	result = vc_init_em_ctxt(&ctxt, regs, error_code);
-- 
2.31.1
Joerg Roedel
2021-May-12  07:54 UTC
[PATCH 2/6] x86/sev-es: Forward page-faults which happen during emulation
From: Joerg Roedel <jroedel at suse.de>
When emulating guest instructions for MMIO or IOIO accesses the #VC
handler might get a page-fault and will not be able to complete. Forward
the page-fault in this case to the correct handler instead of killing
the machine.
Fixes: 0786138c78e7 ("x86/sev-es: Add a Runtime #VC Exception
Handler")
Cc: stable at vger.kernel.org # v5.10+
Signed-off-by: Joerg Roedel <jroedel at suse.de>
---
 arch/x86/kernel/sev.c | 4 ++++
 1 file changed, 4 insertions(+)
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index c49270c7669e..6530a844eb61 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -1265,6 +1265,10 @@ static __always_inline void vc_forward_exception(struct
es_em_ctxt *ctxt)
 	case X86_TRAP_UD:
 		exc_invalid_op(ctxt->regs);
 		break;
+	case X86_TRAP_PF:
+		write_cr2(ctxt->fi.cr2);
+		exc_page_fault(ctxt->regs, error_code);
+		break;
 	case X86_TRAP_AC:
 		exc_alignment_check(ctxt->regs, error_code);
 		break;
-- 
2.31.1
From: Joerg Roedel <jroedel at suse.de>
The put_user() and get_user() functions do checks on the address which is
passed to them. They check whether the address is actually a user-space
address and whether its fine to access it. They also call might_fault()
to indicate that they could fault and possibly sleep.
All of these checks are neither wanted nor required in the #VC exception
handler, which can be invoked from almost any context and also for MMIO
instructions from kernel space on kernel memory. All the #VC handler
wants to know is whether a fault happened when the access was tried.
This is provided by __put_user()/__get_user(), which just do the access
no matter what.
Fixes: f980f9c31a92 ("x86/sev-es: Compile early handler code into kernel
image")
Cc: stable at vger.kernel.org # v5.10+
Signed-off-by: Joerg Roedel <jroedel at suse.de>
---
 arch/x86/kernel/sev.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index 6530a844eb61..110b39345b40 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -342,22 +342,22 @@ static enum es_result vc_write_mem(struct es_em_ctxt
*ctxt,
 	switch (size) {
 	case 1:
 		memcpy(&d1, buf, 1);
-		if (put_user(d1, target))
+		if (__put_user(d1, target))
 			goto fault;
 		break;
 	case 2:
 		memcpy(&d2, buf, 2);
-		if (put_user(d2, target))
+		if (__put_user(d2, target))
 			goto fault;
 		break;
 	case 4:
 		memcpy(&d4, buf, 4);
-		if (put_user(d4, target))
+		if (__put_user(d4, target))
 			goto fault;
 		break;
 	case 8:
 		memcpy(&d8, buf, 8);
-		if (put_user(d8, target))
+		if (__put_user(d8, target))
 			goto fault;
 		break;
 	default:
@@ -396,22 +396,22 @@ static enum es_result vc_read_mem(struct es_em_ctxt *ctxt,
 
 	switch (size) {
 	case 1:
-		if (get_user(d1, s))
+		if (__get_user(d1, s))
 			goto fault;
 		memcpy(buf, &d1, 1);
 		break;
 	case 2:
-		if (get_user(d2, s))
+		if (__get_user(d2, s))
 			goto fault;
 		memcpy(buf, &d2, 2);
 		break;
 	case 4:
-		if (get_user(d4, s))
+		if (__get_user(d4, s))
 			goto fault;
 		memcpy(buf, &d4, 4);
 		break;
 	case 8:
-		if (get_user(d8, s))
+		if (__get_user(d8, s))
 			goto fault;
 		memcpy(buf, &d8, 8);
 		break;
-- 
2.31.1
Joerg Roedel
2021-May-12  07:54 UTC
[PATCH 4/6] Revert "x86/sev-es: Handle string port IO to kernel memory properly"
From: Joerg Roedel <jroedel at suse.de>
This reverts commit 7024f60d655272bd2ca1d3a4c9e0a63319b1eea1.
The commit reverted here introduces a short-cut into the #VC handlers
memory access code which only works reliably in task context. But the
kernels #VC handler can be invoked from any context, making the
access_ok() call trigger a warning with CONFIG_DEBUG_ATOMIC_SLEEP
enabled.
Also the memcpy() used in the reverted patch is wrong, as it has no
page-fault handling. Access to kernel memory can also fault due to
kernel bugs, and those should not be reported as faults from the #VC
handler but as bugs of their real call-site, which is correctly later
done from vc_forward_exception().
Fixes: 7024f60d6552 ("x86/sev-es: Handle string port IO to kernel memory
properly")
Cc: stable at vger.kernel.org # v5.11
Signed-off-by: Joerg Roedel <jroedel at suse.de>
---
 arch/x86/kernel/sev.c | 12 ------------
 1 file changed, 12 deletions(-)
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index 110b39345b40..f4f319004713 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -333,12 +333,6 @@ static enum es_result vc_write_mem(struct es_em_ctxt *ctxt,
 	u16 d2;
 	u8  d1;
 
-	/* If instruction ran in kernel mode and the I/O buffer is in kernel space */
-	if (!user_mode(ctxt->regs) && !access_ok(target, size)) {
-		memcpy(dst, buf, size);
-		return ES_OK;
-	}
-
 	switch (size) {
 	case 1:
 		memcpy(&d1, buf, 1);
@@ -388,12 +382,6 @@ static enum es_result vc_read_mem(struct es_em_ctxt *ctxt,
 	u16 d2;
 	u8  d1;
 
-	/* If instruction ran in kernel mode and the I/O buffer is in kernel space */
-	if (!user_mode(ctxt->regs) && !access_ok(s, size)) {
-		memcpy(buf, src, size);
-		return ES_OK;
-	}
-
 	switch (size) {
 	case 1:
 		if (__get_user(d1, s))
-- 
2.31.1
Joerg Roedel
2021-May-12  07:54 UTC
[PATCH 5/6] x86/sev-es: Fix error message in runtime #VC handler
From: Joerg Roedel <jroedel at suse.de>
The runtime #VC handler is not "early" anymore. Fix the copy&paste
error
and remove that word from the error message.
Signed-off-by: Joerg Roedel <jroedel at suse.de>
---
 arch/x86/kernel/sev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index f4f319004713..77155c4f07f5 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -1326,7 +1326,7 @@ DEFINE_IDTENTRY_VC_SAFE_STACK(exc_vmm_communication)
 		vc_finish_insn(&ctxt);
 		break;
 	case ES_UNSUPPORTED:
-		pr_err_ratelimited("Unsupported exit-code 0x%02lx in early #VC exception
(IP: 0x%lx)\n",
+		pr_err_ratelimited("Unsupported exit-code 0x%02lx in #VC exception (IP:
0x%lx)\n",
 				   error_code, regs->ip);
 		goto fail;
 	case ES_VMM_ERROR:
-- 
2.31.1
Joerg Roedel
2021-May-12  07:54 UTC
[PATCH 6/6] x86/sev-es: Leave NMI-mode before sending signals
From: Joerg Roedel <jroedel at suse.de>
The error path in the runtime #VC handler sends a signal to kill the
current task if the exception was raised from user-space. Some parts of
the #VC handler run in NMI mode, because it is critical that it is not
interrupted (except from an NMI) while the GHCB is in use.
But sending signals in NMI-mode is actually broken and triggers lockdep
warnings. On the other side, when the signal is sent, there is no reason
for the handler to still be in NMI-mode, as the GHCB is not used
anymore.
Leave NMI-mode before entering the error path to get rid of the lockdep
warnings.
Fixes: 62441a1fb532 ("x86/sev-es: Correctly track IRQ states in runtime #VC
handler")
Signed-off-by: Joerg Roedel <jroedel at suse.de>
---
 arch/x86/kernel/sev.c | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index 77155c4f07f5..79cbed2f28de 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -1300,9 +1300,10 @@ DEFINE_IDTENTRY_VC_SAFE_STACK(exc_vmm_communication)
 		return;
 	}
 
+	instrumentation_begin();
+
 	irq_state = irqentry_nmi_enter(regs);
 	lockdep_assert_irqs_disabled();
-	instrumentation_begin();
 
 	/*
 	 * This is invoked through an interrupt gate, so IRQs are disabled. The
@@ -1352,13 +1353,19 @@ DEFINE_IDTENTRY_VC_SAFE_STACK(exc_vmm_communication)
 		BUG();
 	}
 
-out:
-	instrumentation_end();
 	irqentry_nmi_exit(regs, irq_state);
+	instrumentation_end();
 
 	return;
 
 fail:
+	/*
+	 * Leave NMI mode - the GHCB is not busy anymore and depending on where
+	 * the #VC came from this code is about to either kill the task (when in
+	 * task context) or kill the machine.
+	 */
+	irqentry_nmi_exit(regs, irq_state);
+
 	if (user_mode(regs)) {
 		/*
 		 * Do not kill the machine if user-space triggered the
@@ -1380,7 +1387,9 @@ DEFINE_IDTENTRY_VC_SAFE_STACK(exc_vmm_communication)
 		panic("Returned from Terminate-Request to Hypervisor\n");
 	}
 
-	goto out;
+	instrumentation_end();
+
+	return;
 }
 
 /* This handler runs on the #VC fall-back stack. It can cause further #VC
exceptions */
-- 
2.31.1