Quan Xu
2017-Nov-14 09:38 UTC
[PATCH RFC v3 1/6] x86/paravirt: Add pv_idle_ops to paravirt ops
On 2017/11/14 15:30, Juergen Gross wrote:> On 14/11/17 08:02, Quan Xu wrote: >> >> On 2017/11/13 18:53, Juergen Gross wrote: >>> On 13/11/17 11:06, Quan Xu wrote: >>>> From: Quan Xu <quan.xu0 at gmail.com> >>>> >>>> So far, pv_idle_ops.poll is the only ops for pv_idle. .poll is called >>>> in idle path which will poll for a while before we enter the real idle >>>> state. >>>> >>>> In virtualization, idle path includes several heavy operations >>>> includes timer access(LAPIC timer or TSC deadline timer) which will >>>> hurt performance especially for latency intensive workload like message >>>> passing task. The cost is mainly from the vmexit which is a hardware >>>> context switch between virtual machine and hypervisor. Our solution is >>>> to poll for a while and do not enter real idle path if we can get the >>>> schedule event during polling. >>>> >>>> Poll may cause the CPU waste so we adopt a smart polling mechanism to >>>> reduce the useless poll. >>>> >>>> Signed-off-by: Yang Zhang <yang.zhang.wz at gmail.com> >>>> Signed-off-by: Quan Xu <quan.xu0 at gmail.com> >>>> Cc: Juergen Gross <jgross at suse.com> >>>> Cc: Alok Kataria <akataria at vmware.com> >>>> Cc: Rusty Russell <rusty at rustcorp.com.au> >>>> Cc: Thomas Gleixner <tglx at linutronix.de> >>>> Cc: Ingo Molnar <mingo at redhat.com> >>>> Cc: "H. Peter Anvin" <hpa at zytor.com> >>>> Cc: x86 at kernel.org >>>> Cc: virtualization at lists.linux-foundation.org >>>> Cc: linux-kernel at vger.kernel.org >>>> Cc: xen-devel at lists.xenproject.org >>> Hmm, is the idle entry path really so critical to performance that a new >>> pvops function is necessary? >> Juergen, Here is the data we get when running benchmark netperf: >> ?1. w/o patch and disable kvm dynamic poll (halt_poll_ns=0): >> ??? 29031.6 bit/s -- 76.1 %CPU >> >> ?2. w/ patch and disable kvm dynamic poll (halt_poll_ns=0): >> ??? 35787.7 bit/s -- 129.4 %CPU >> >> ?3. w/ kvm dynamic poll: >> ??? 35735.6 bit/s -- 200.0 %CPU >> >> ?4. w/patch and w/ kvm dynamic poll: >> ??? 42225.3 bit/s -- 198.7 %CPU >> >> ?5. idle=poll >> ??? 37081.7 bit/s -- 998.1 %CPU >> >> >> >> ?w/ this patch, we will improve performance by 23%.. even we could improve >> ?performance by 45.4%, if we use w/patch and w/ kvm dynamic poll. also the >> ?cost of CPU is much lower than 'idle=poll' case.. > I don't question the general idea. I just think pvops isn't the best way > to implement it. > >>> Wouldn't a function pointer, maybe guarded >>> by a static key, be enough? A further advantage would be that this would >>> work on other architectures, too. >> I assume this feature will be ported to other archs.. a new pvops makes????? sorry, a typo.. /other archs/other hypervisors/ ????? it refers hypervisor like Xen, HyperV and VMware)..>> code >> clean and easy to maintain. also I tried to add it into existed pvops, >> but it >> doesn't match. > You are aware that pvops is x86 only?yes, I'm aware..> I really don't see the big difference in maintainability compared to the > static key / function pointer variant: > > void (*guest_idle_poll_func)(void); > struct static_key guest_idle_poll_key __read_mostly; > > static inline void guest_idle_poll(void) > { > if (static_key_false(&guest_idle_poll_key)) > guest_idle_poll_func(); > }thank you for your sample code :) I agree there is no big difference.. I think we are discussion for two things: ?1) x86 VM on different hypervisors ?2) different archs VM on kvm hypervisor What I want to do is x86 VM on different hypervisors, such as kvm / xen / hyperv ..> And KVM would just need to set guest_idle_poll_func and enable the > static key. Works on non-x86 architectures, too. >.. referred to 'pv_mmu_ops', HyperV and Xen can implement their own functions for 'pv_mmu_ops'. I think it is the same to pv_idle_ops. with above explaination, do you still think I need to define the static key/function pointer variant? btw, any interest to port it to Xen HVM guest? :) Quan Alibaba Cloud
Juergen Gross
2017-Nov-14 10:27 UTC
[PATCH RFC v3 1/6] x86/paravirt: Add pv_idle_ops to paravirt ops
On 14/11/17 10:38, Quan Xu wrote:> > > On 2017/11/14 15:30, Juergen Gross wrote: >> On 14/11/17 08:02, Quan Xu wrote: >>> >>> On 2017/11/13 18:53, Juergen Gross wrote: >>>> On 13/11/17 11:06, Quan Xu wrote: >>>>> From: Quan Xu <quan.xu0 at gmail.com> >>>>> >>>>> So far, pv_idle_ops.poll is the only ops for pv_idle. .poll is called >>>>> in idle path which will poll for a while before we enter the real idle >>>>> state. >>>>> >>>>> In virtualization, idle path includes several heavy operations >>>>> includes timer access(LAPIC timer or TSC deadline timer) which will >>>>> hurt performance especially for latency intensive workload like >>>>> message >>>>> passing task. The cost is mainly from the vmexit which is a hardware >>>>> context switch between virtual machine and hypervisor. Our solution is >>>>> to poll for a while and do not enter real idle path if we can get the >>>>> schedule event during polling. >>>>> >>>>> Poll may cause the CPU waste so we adopt a smart polling mechanism to >>>>> reduce the useless poll. >>>>> >>>>> Signed-off-by: Yang Zhang <yang.zhang.wz at gmail.com> >>>>> Signed-off-by: Quan Xu <quan.xu0 at gmail.com> >>>>> Cc: Juergen Gross <jgross at suse.com> >>>>> Cc: Alok Kataria <akataria at vmware.com> >>>>> Cc: Rusty Russell <rusty at rustcorp.com.au> >>>>> Cc: Thomas Gleixner <tglx at linutronix.de> >>>>> Cc: Ingo Molnar <mingo at redhat.com> >>>>> Cc: "H. Peter Anvin" <hpa at zytor.com> >>>>> Cc: x86 at kernel.org >>>>> Cc: virtualization at lists.linux-foundation.org >>>>> Cc: linux-kernel at vger.kernel.org >>>>> Cc: xen-devel at lists.xenproject.org >>>> Hmm, is the idle entry path really so critical to performance that a >>>> new >>>> pvops function is necessary? >>> Juergen, Here is the data we get when running benchmark netperf: >>> ??1. w/o patch and disable kvm dynamic poll (halt_poll_ns=0): >>> ???? 29031.6 bit/s -- 76.1 %CPU >>> >>> ??2. w/ patch and disable kvm dynamic poll (halt_poll_ns=0): >>> ???? 35787.7 bit/s -- 129.4 %CPU >>> >>> ??3. w/ kvm dynamic poll: >>> ???? 35735.6 bit/s -- 200.0 %CPU >>> >>> ??4. w/patch and w/ kvm dynamic poll: >>> ???? 42225.3 bit/s -- 198.7 %CPU >>> >>> ??5. idle=poll >>> ???? 37081.7 bit/s -- 998.1 %CPU >>> >>> >>> >>> ??w/ this patch, we will improve performance by 23%.. even we could >>> improve >>> ??performance by 45.4%, if we use w/patch and w/ kvm dynamic poll. >>> also the >>> ??cost of CPU is much lower than 'idle=poll' case.. >> I don't question the general idea. I just think pvops isn't the best way >> to implement it. >> >>>> Wouldn't a function pointer, maybe guarded >>>> by a static key, be enough? A further advantage would be that this >>>> would >>>> work on other architectures, too. >>> I assume this feature will be ported to other archs.. a new pvops makes > > ????? sorry, a typo.. /other archs/other hypervisors/ > ????? it refers hypervisor like Xen, HyperV and VMware).. > >>> code >>> clean and easy to maintain. also I tried to add it into existed pvops, >>> but it >>> doesn't match. >> You are aware that pvops is x86 only? > > yes, I'm aware.. > >> I really don't see the big difference in maintainability compared to the >> static key / function pointer variant: >> >> void (*guest_idle_poll_func)(void); >> struct static_key guest_idle_poll_key __read_mostly; >> >> static inline void guest_idle_poll(void) >> { >> ????if (static_key_false(&guest_idle_poll_key)) >> ??????? guest_idle_poll_func(); >> } > > > > thank you for your sample code :) > I agree there is no big difference.. I think we are discussion for two > things: > ?1) x86 VM on different hypervisors > ?2) different archs VM on kvm hypervisor > > What I want to do is x86 VM on different hypervisors, such as kvm / xen > / hyperv ..Why limit the solution to x86 if the more general solution isn't harder? As you didn't give any reason why the pvops approach is better other than you don't care for non-x86 platforms you won't get an "Ack" from me for this patch.> >> And KVM would just need to set guest_idle_poll_func and enable the >> static key. Works on non-x86 architectures, too. >> > > .. referred to 'pv_mmu_ops', HyperV and Xen can implement their own > functions for 'pv_mmu_ops'. > I think it is the same to pv_idle_ops. > > with above explaination, do you still think I need to define the static > key/function pointer variant? > > btw, any interest to port it to Xen HVM guest? :)Maybe. But this should work for Xen on ARM, too. Juergen
Quan Xu
2017-Nov-14 11:43 UTC
[PATCH RFC v3 1/6] x86/paravirt: Add pv_idle_ops to paravirt ops
On 2017/11/14 18:27, Juergen Gross wrote:> On 14/11/17 10:38, Quan Xu wrote: >> >> On 2017/11/14 15:30, Juergen Gross wrote: >>> On 14/11/17 08:02, Quan Xu wrote: >>>> On 2017/11/13 18:53, Juergen Gross wrote: >>>>> On 13/11/17 11:06, Quan Xu wrote: >>>>>> From: Quan Xu <quan.xu0 at gmail.com> >>>>>> >>>>>> So far, pv_idle_ops.poll is the only ops for pv_idle. .poll is called >>>>>> in idle path which will poll for a while before we enter the real idle >>>>>> state. >>>>>> >>>>>> In virtualization, idle path includes several heavy operations >>>>>> includes timer access(LAPIC timer or TSC deadline timer) which will >>>>>> hurt performance especially for latency intensive workload like >>>>>> message >>>>>> passing task. The cost is mainly from the vmexit which is a hardware >>>>>> context switch between virtual machine and hypervisor. Our solution is >>>>>> to poll for a while and do not enter real idle path if we can get the >>>>>> schedule event during polling. >>>>>> >>>>>> Poll may cause the CPU waste so we adopt a smart polling mechanism to >>>>>> reduce the useless poll. >>>>>> >>>>>> Signed-off-by: Yang Zhang <yang.zhang.wz at gmail.com> >>>>>> Signed-off-by: Quan Xu <quan.xu0 at gmail.com> >>>>>> Cc: Juergen Gross <jgross at suse.com> >>>>>> Cc: Alok Kataria <akataria at vmware.com> >>>>>> Cc: Rusty Russell <rusty at rustcorp.com.au> >>>>>> Cc: Thomas Gleixner <tglx at linutronix.de> >>>>>> Cc: Ingo Molnar <mingo at redhat.com> >>>>>> Cc: "H. Peter Anvin" <hpa at zytor.com> >>>>>> Cc: x86 at kernel.org >>>>>> Cc: virtualization at lists.linux-foundation.org >>>>>> Cc: linux-kernel at vger.kernel.org >>>>>> Cc: xen-devel at lists.xenproject.org >>>>> Hmm, is the idle entry path really so critical to performance that a >>>>> new >>>>> pvops function is necessary? >>>> Juergen, Here is the data we get when running benchmark netperf: >>>> ??1. w/o patch and disable kvm dynamic poll (halt_poll_ns=0): >>>> ???? 29031.6 bit/s -- 76.1 %CPU >>>> >>>> ??2. w/ patch and disable kvm dynamic poll (halt_poll_ns=0): >>>> ???? 35787.7 bit/s -- 129.4 %CPU >>>> >>>> ??3. w/ kvm dynamic poll: >>>> ???? 35735.6 bit/s -- 200.0 %CPU >>>> >>>> ??4. w/patch and w/ kvm dynamic poll: >>>> ???? 42225.3 bit/s -- 198.7 %CPU >>>> >>>> ??5. idle=poll >>>> ???? 37081.7 bit/s -- 998.1 %CPU >>>> >>>> >>>> >>>> ??w/ this patch, we will improve performance by 23%.. even we could >>>> improve >>>> ??performance by 45.4%, if we use w/patch and w/ kvm dynamic poll. >>>> also the >>>> ??cost of CPU is much lower than 'idle=poll' case.. >>> I don't question the general idea. I just think pvops isn't the best way >>> to implement it. >>> >>>>> Wouldn't a function pointer, maybe guarded >>>>> by a static key, be enough? A further advantage would be that this >>>>> would >>>>> work on other architectures, too. >>>> I assume this feature will be ported to other archs.. a new pvops makes >> ????? sorry, a typo.. /other archs/other hypervisors/ >> ????? it refers hypervisor like Xen, HyperV and VMware).. >> >>>> code >>>> clean and easy to maintain. also I tried to add it into existed pvops, >>>> but it >>>> doesn't match. >>> You are aware that pvops is x86 only? >> yes, I'm aware.. >> >>> I really don't see the big difference in maintainability compared to the >>> static key / function pointer variant: >>> >>> void (*guest_idle_poll_func)(void); >>> struct static_key guest_idle_poll_key __read_mostly; >>> >>> static inline void guest_idle_poll(void) >>> { >>> ????if (static_key_false(&guest_idle_poll_key)) >>> ??????? guest_idle_poll_func(); >>> } >> >> >> thank you for your sample code :) >> I agree there is no big difference.. I think we are discussion for two >> things: >> ?1) x86 VM on different hypervisors >> ?2) different archs VM on kvm hypervisor >> >> What I want to do is x86 VM on different hypervisors, such as kvm / xen >> / hyperv .. > Why limit the solution to x86 if the more general solution isn't > harder? > > As you didn't give any reason why the pvops approach is better other > than you don't care for non-x86 platforms you won't get an "Ack" from > me for this patch.It just looks a little odder to me. I understand you care about no-x86 arch. Are you aware 'pv_time_ops' for arm64/arm/x86 archs, defined in ?? - arch/arm64/include/asm/paravirt.h ?? - arch/x86/include/asm/paravirt_types.h ?? - arch/arm/include/asm/paravirt.h I am unfamilar with arm code. IIUC, if you'd implement pv_idle_ops for arm/arm64 arch, you'd define a same structure in ?? - arch/arm64/include/asm/paravirt.h???? or ?? - arch/arm/include/asm/paravirt.h .. instead of static key / fuction. then implement a real function in ?? - arch/arm/kernel/paravirt.c. Also I wonder HOW/WHERE to define a static key/function, then to benifit x86/no-x86 archs? Quan Alibaba Cloud>>> And KVM would just need to set guest_idle_poll_func and enable the >>> static key. Works on non-x86 architectures, too. >>> >> .. referred to 'pv_mmu_ops', HyperV and Xen can implement their own >> functions for 'pv_mmu_ops'. >> I think it is the same to pv_idle_ops. >> >> with above explaination, do you still think I need to define the static >> key/function pointer variant? >> >> btw, any interest to port it to Xen HVM guest? :) > Maybe. But this should work for Xen on ARM, too. > > > Juergen >
Reasonably Related Threads
- [PATCH RFC v3 1/6] x86/paravirt: Add pv_idle_ops to paravirt ops
- [PATCH RFC v3 1/6] x86/paravirt: Add pv_idle_ops to paravirt ops
- [PATCH RFC v3 1/6] x86/paravirt: Add pv_idle_ops to paravirt ops
- [PATCH RFC v3 1/6] x86/paravirt: Add pv_idle_ops to paravirt ops
- [PATCH RFC v3 1/6] x86/paravirt: Add pv_idle_ops to paravirt ops