Srivatsa S. Bhat
2023-Jan-16 06:01 UTC
[PATCH v2] x86/hotplug: Do not put offline vCPUs in mwait idle state
From: "Srivatsa S. Bhat (VMware)" <srivatsa at csail.mit.edu>

Under hypervisors that support mwait passthrough, a vCPU in mwait
CPU-idle state remains in guest context (instead of yielding to the
hypervisor via VMEXIT), which helps speed up wakeups from idle.

However, this runs into problems with CPU hotplug, because the Linux
CPU offline path prefers to put the vCPU-to-be-offlined in mwait
state, whenever mwait is available. As a result, since a vCPU in mwait
remains in guest context and does not yield to the hypervisor, an
offline vCPU *appears* to be 100% busy as viewed from the host, which
prevents the hypervisor from running other vCPUs or workloads on the
corresponding pCPU. [ Note that such a vCPU is not actually busy
spinning though; it remains in mwait idle state in the guest ].

Fix this by preventing the use of mwait idle state in the vCPU offline
play_dead() path for any hypervisor, even if mwait support is
available.

Suggested-by: Peter Zijlstra (Intel) <peterz at infradead.org>
Signed-off-by: Srivatsa S. Bhat (VMware) <srivatsa at csail.mit.edu>
Cc: Thomas Gleixner <tglx at linutronix.de>
Cc: Peter Zijlstra <peterz at infradead.org>
Cc: Ingo Molnar <mingo at redhat.com>
Cc: Borislav Petkov <bp at alien8.de>
Cc: Dave Hansen <dave.hansen at linux.intel.com>
Cc: "H. Peter Anvin" <hpa at zytor.com>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki at intel.com>
Cc: "Paul E. McKenney" <paulmck at kernel.org>
Cc: Wyes Karny <wyes.karny at amd.com>
Cc: Lewis Caroll <lewis.carroll at amd.com>
Cc: Tom Lendacky <thomas.lendacky at amd.com>
Cc: Alexey Makhalov <amakhalov at vmware.com>
Cc: Juergen Gross <jgross at suse.com>
Cc: x86 at kernel.org
Cc: VMware PV-Drivers Reviewers <pv-drivers at vmware.com>
Cc: virtualization at lists.linux-foundation.org
Cc: kvm at vger.kernel.org
Cc: xen-devel at lists.xenproject.org
---

v1: https://lore.kernel.org/lkml/165843627080.142207.12667479241667142176.stgit at csail.mit.edu/

 arch/x86/kernel/smpboot.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 55cad72715d9..125a5d4bfded 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1763,6 +1763,15 @@ static inline void mwait_play_dead(void)
 		return;
 	if (!this_cpu_has(X86_FEATURE_CLFLUSH))
 		return;
+
+	/*
+	 * Do not use mwait in CPU offline play_dead if running under
+	 * any hypervisor, to make sure that the offline vCPU actually
+	 * yields to the hypervisor (which may not happen otherwise if
+	 * the hypervisor supports mwait passthrough).
+	 */
+	if (this_cpu_has(X86_FEATURE_HYPERVISOR))
+		return;
 	if (__this_cpu_read(cpu_info.cpuid_level) < CPUID_MWAIT_LEAF)
 		return;

--
2.25.1
Juergen Gross
2023-Jan-16 08:38 UTC
[PATCH v2] x86/hotplug: Do not put offline vCPUs in mwait idle state
On 16.01.23 07:01, Srivatsa S. Bhat wrote:
> From: "Srivatsa S. Bhat (VMware)" <srivatsa at csail.mit.edu>
>
> Under hypervisors that support mwait passthrough, a vCPU in mwait
> CPU-idle state remains in guest context (instead of yielding to the
> hypervisor via VMEXIT), which helps speed up wakeups from idle.
>
> However, this runs into problems with CPU hotplug, because the Linux
> CPU offline path prefers to put the vCPU-to-be-offlined in mwait
> state, whenever mwait is available. As a result, since a vCPU in mwait
> remains in guest context and does not yield to the hypervisor, an
> offline vCPU *appears* to be 100% busy as viewed from the host, which
> prevents the hypervisor from running other vCPUs or workloads on the
> corresponding pCPU. [ Note that such a vCPU is not actually busy
> spinning though; it remains in mwait idle state in the guest ].
>
> Fix this by preventing the use of mwait idle state in the vCPU offline
> play_dead() path for any hypervisor, even if mwait support is
> available.
>
> Suggested-by: Peter Zijlstra (Intel) <peterz at infradead.org>
> Signed-off-by: Srivatsa S. Bhat (VMware) <srivatsa at csail.mit.edu>
> Cc: Thomas Gleixner <tglx at linutronix.de>
> Cc: Peter Zijlstra <peterz at infradead.org>
> Cc: Ingo Molnar <mingo at redhat.com>
> Cc: Borislav Petkov <bp at alien8.de>
> Cc: Dave Hansen <dave.hansen at linux.intel.com>
> Cc: "H. Peter Anvin" <hpa at zytor.com>
> Cc: "Rafael J. Wysocki" <rafael.j.wysocki at intel.com>
> Cc: "Paul E. McKenney" <paulmck at kernel.org>
> Cc: Wyes Karny <wyes.karny at amd.com>
> Cc: Lewis Caroll <lewis.carroll at amd.com>
> Cc: Tom Lendacky <thomas.lendacky at amd.com>
> Cc: Alexey Makhalov <amakhalov at vmware.com>
> Cc: Juergen Gross <jgross at suse.com>
> Cc: x86 at kernel.org
> Cc: VMware PV-Drivers Reviewers <pv-drivers at vmware.com>
> Cc: virtualization at lists.linux-foundation.org
> Cc: kvm at vger.kernel.org
> Cc: xen-devel at lists.xenproject.org

Reviewed-by: Juergen Gross <jgross at suse.com>


Juergen
Igor Mammedov
2023-Jan-16 14:55 UTC
[PATCH v2] x86/hotplug: Do not put offline vCPUs in mwait idle state
On Sun, 15 Jan 2023 22:01:34 -0800 "Srivatsa S. Bhat" <srivatsa at csail.mit.edu> wrote:

> From: "Srivatsa S. Bhat (VMware)" <srivatsa at csail.mit.edu>
>
> Under hypervisors that support mwait passthrough, a vCPU in mwait
> CPU-idle state remains in guest context (instead of yielding to the
> hypervisor via VMEXIT), which helps speed up wakeups from idle.
>
> However, this runs into problems with CPU hotplug, because the Linux
> CPU offline path prefers to put the vCPU-to-be-offlined in mwait
> state, whenever mwait is available. As a result, since a vCPU in mwait
> remains in guest context and does not yield to the hypervisor, an
> offline vCPU *appears* to be 100% busy as viewed from the host, which
> prevents the hypervisor from running other vCPUs or workloads on the
> corresponding pCPU. [ Note that such a vCPU is not actually busy
> spinning though; it remains in mwait idle state in the guest ].
>
> Fix this by preventing the use of mwait idle state in the vCPU offline
> play_dead() path for any hypervisor, even if mwait support is
> available.

If mwait is enabled, the guest very likely has cpuidle enabled and is
using the same mwait as well. So exiting early from mwait_play_dead()
might just punt the workflow further down:

  native_play_dead()
        ...
        mwait_play_dead();
        if (cpuidle_play_dead())   <- possible mwait here
                hlt_play_dead();

and it will end up in mwait again; only if that fails will it go the
HLT route and maybe transition to the VMM.

Instead of a workaround on the guest side, shouldn't the hypervisor
force a VMEXIT on the vCPU being unplugged when it is actually
hot-unplugging the vCPU? (For example, QEMU kicks the vCPU out of guest
context when it is removing the vCPU, among other things.)

> Suggested-by: Peter Zijlstra (Intel) <peterz at infradead.org>
> Signed-off-by: Srivatsa S. Bhat (VMware) <srivatsa at csail.mit.edu>
> Cc: Thomas Gleixner <tglx at linutronix.de>
> Cc: Peter Zijlstra <peterz at infradead.org>
> Cc: Ingo Molnar <mingo at redhat.com>
> Cc: Borislav Petkov <bp at alien8.de>
> Cc: Dave Hansen <dave.hansen at linux.intel.com>
> Cc: "H. Peter Anvin" <hpa at zytor.com>
> Cc: "Rafael J. Wysocki" <rafael.j.wysocki at intel.com>
> Cc: "Paul E. McKenney" <paulmck at kernel.org>
> Cc: Wyes Karny <wyes.karny at amd.com>
> Cc: Lewis Caroll <lewis.carroll at amd.com>
> Cc: Tom Lendacky <thomas.lendacky at amd.com>
> Cc: Alexey Makhalov <amakhalov at vmware.com>
> Cc: Juergen Gross <jgross at suse.com>
> Cc: x86 at kernel.org
> Cc: VMware PV-Drivers Reviewers <pv-drivers at vmware.com>
> Cc: virtualization at lists.linux-foundation.org
> Cc: kvm at vger.kernel.org
> Cc: xen-devel at lists.xenproject.org
> ---
>
> v1: https://lore.kernel.org/lkml/165843627080.142207.12667479241667142176.stgit at csail.mit.edu/
>
>  arch/x86/kernel/smpboot.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
>
> diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
> index 55cad72715d9..125a5d4bfded 100644
> --- a/arch/x86/kernel/smpboot.c
> +++ b/arch/x86/kernel/smpboot.c
> @@ -1763,6 +1763,15 @@ static inline void mwait_play_dead(void)
>  		return;
>  	if (!this_cpu_has(X86_FEATURE_CLFLUSH))
>  		return;
> +
> +	/*
> +	 * Do not use mwait in CPU offline play_dead if running under
> +	 * any hypervisor, to make sure that the offline vCPU actually
> +	 * yields to the hypervisor (which may not happen otherwise if
> +	 * the hypervisor supports mwait passthrough).
> +	 */
> +	if (this_cpu_has(X86_FEATURE_HYPERVISOR))
> +		return;
>  	if (__this_cpu_read(cpu_info.cpuid_level) < CPUID_MWAIT_LEAF)
>  		return;
>