Quan Xu
2017-Nov-14 11:43 UTC
[PATCH RFC v3 1/6] x86/paravirt: Add pv_idle_ops to paravirt ops
On 2017/11/14 18:27, Juergen Gross wrote:> On 14/11/17 10:38, Quan Xu wrote: >> >> On 2017/11/14 15:30, Juergen Gross wrote: >>> On 14/11/17 08:02, Quan Xu wrote: >>>> On 2017/11/13 18:53, Juergen Gross wrote: >>>>> On 13/11/17 11:06, Quan Xu wrote: >>>>>> From: Quan Xu <quan.xu0 at gmail.com> >>>>>> >>>>>> So far, pv_idle_ops.poll is the only ops for pv_idle. .poll is called >>>>>> in idle path which will poll for a while before we enter the real idle >>>>>> state. >>>>>> >>>>>> In virtualization, idle path includes several heavy operations >>>>>> includes timer access(LAPIC timer or TSC deadline timer) which will >>>>>> hurt performance especially for latency intensive workload like >>>>>> message >>>>>> passing task. The cost is mainly from the vmexit which is a hardware >>>>>> context switch between virtual machine and hypervisor. Our solution is >>>>>> to poll for a while and do not enter real idle path if we can get the >>>>>> schedule event during polling. >>>>>> >>>>>> Poll may cause the CPU waste so we adopt a smart polling mechanism to >>>>>> reduce the useless poll. >>>>>> >>>>>> Signed-off-by: Yang Zhang <yang.zhang.wz at gmail.com> >>>>>> Signed-off-by: Quan Xu <quan.xu0 at gmail.com> >>>>>> Cc: Juergen Gross <jgross at suse.com> >>>>>> Cc: Alok Kataria <akataria at vmware.com> >>>>>> Cc: Rusty Russell <rusty at rustcorp.com.au> >>>>>> Cc: Thomas Gleixner <tglx at linutronix.de> >>>>>> Cc: Ingo Molnar <mingo at redhat.com> >>>>>> Cc: "H. Peter Anvin" <hpa at zytor.com> >>>>>> Cc: x86 at kernel.org >>>>>> Cc: virtualization at lists.linux-foundation.org >>>>>> Cc: linux-kernel at vger.kernel.org >>>>>> Cc: xen-devel at lists.xenproject.org >>>>> Hmm, is the idle entry path really so critical to performance that a >>>>> new >>>>> pvops function is necessary? >>>> Juergen, Here is the data we get when running benchmark netperf: >>>> ??1. w/o patch and disable kvm dynamic poll (halt_poll_ns=0): >>>> ???? 29031.6 bit/s -- 76.1 %CPU >>>> >>>> ??2. w/ patch and disable kvm dynamic poll (halt_poll_ns=0): >>>> ???? 35787.7 bit/s -- 129.4 %CPU >>>> >>>> ??3. w/ kvm dynamic poll: >>>> ???? 35735.6 bit/s -- 200.0 %CPU >>>> >>>> ??4. w/patch and w/ kvm dynamic poll: >>>> ???? 42225.3 bit/s -- 198.7 %CPU >>>> >>>> ??5. idle=poll >>>> ???? 37081.7 bit/s -- 998.1 %CPU >>>> >>>> >>>> >>>> ??w/ this patch, we will improve performance by 23%.. even we could >>>> improve >>>> ??performance by 45.4%, if we use w/patch and w/ kvm dynamic poll. >>>> also the >>>> ??cost of CPU is much lower than 'idle=poll' case.. >>> I don't question the general idea. I just think pvops isn't the best way >>> to implement it. >>> >>>>> Wouldn't a function pointer, maybe guarded >>>>> by a static key, be enough? A further advantage would be that this >>>>> would >>>>> work on other architectures, too. >>>> I assume this feature will be ported to other archs.. a new pvops makes >> ????? sorry, a typo.. /other archs/other hypervisors/ >> ????? it refers hypervisor like Xen, HyperV and VMware).. >> >>>> code >>>> clean and easy to maintain. also I tried to add it into existed pvops, >>>> but it >>>> doesn't match. >>> You are aware that pvops is x86 only? >> yes, I'm aware.. >> >>> I really don't see the big difference in maintainability compared to the >>> static key / function pointer variant: >>> >>> void (*guest_idle_poll_func)(void); >>> struct static_key guest_idle_poll_key __read_mostly; >>> >>> static inline void guest_idle_poll(void) >>> { >>> ????if (static_key_false(&guest_idle_poll_key)) >>> ??????? guest_idle_poll_func(); >>> } >> >> >> thank you for your sample code :) >> I agree there is no big difference.. I think we are discussion for two >> things: >> ?1) x86 VM on different hypervisors >> ?2) different archs VM on kvm hypervisor >> >> What I want to do is x86 VM on different hypervisors, such as kvm / xen >> / hyperv .. > Why limit the solution to x86 if the more general solution isn't > harder? > > As you didn't give any reason why the pvops approach is better other > than you don't care for non-x86 platforms you won't get an "Ack" from > me for this patch.It just looks a little odder to me. I understand you care about no-x86 arch. Are you aware 'pv_time_ops' for arm64/arm/x86 archs, defined in ?? - arch/arm64/include/asm/paravirt.h ?? - arch/x86/include/asm/paravirt_types.h ?? - arch/arm/include/asm/paravirt.h I am unfamilar with arm code. IIUC, if you'd implement pv_idle_ops for arm/arm64 arch, you'd define a same structure in ?? - arch/arm64/include/asm/paravirt.h???? or ?? - arch/arm/include/asm/paravirt.h .. instead of static key / fuction. then implement a real function in ?? - arch/arm/kernel/paravirt.c. Also I wonder HOW/WHERE to define a static key/function, then to benifit x86/no-x86 archs? Quan Alibaba Cloud>>> And KVM would just need to set guest_idle_poll_func and enable the >>> static key. Works on non-x86 architectures, too. >>> >> .. referred to 'pv_mmu_ops', HyperV and Xen can implement their own >> functions for 'pv_mmu_ops'. >> I think it is the same to pv_idle_ops. >> >> with above explaination, do you still think I need to define the static >> key/function pointer variant? >> >> btw, any interest to port it to Xen HVM guest? :) > Maybe. But this should work for Xen on ARM, too. > > > Juergen >
Juergen Gross
2017-Nov-14 11:58 UTC
[PATCH RFC v3 1/6] x86/paravirt: Add pv_idle_ops to paravirt ops
On 14/11/17 12:43, Quan Xu wrote:> > > On 2017/11/14 18:27, Juergen Gross wrote: >> On 14/11/17 10:38, Quan Xu wrote: >>> >>> On 2017/11/14 15:30, Juergen Gross wrote: >>>> On 14/11/17 08:02, Quan Xu wrote: >>>>> On 2017/11/13 18:53, Juergen Gross wrote: >>>>>> On 13/11/17 11:06, Quan Xu wrote: >>>>>>> From: Quan Xu <quan.xu0 at gmail.com> >>>>>>> >>>>>>> So far, pv_idle_ops.poll is the only ops for pv_idle. .poll is >>>>>>> called >>>>>>> in idle path which will poll for a while before we enter the real >>>>>>> idle >>>>>>> state. >>>>>>> >>>>>>> In virtualization, idle path includes several heavy operations >>>>>>> includes timer access(LAPIC timer or TSC deadline timer) which will >>>>>>> hurt performance especially for latency intensive workload like >>>>>>> message >>>>>>> passing task. The cost is mainly from the vmexit which is a hardware >>>>>>> context switch between virtual machine and hypervisor. Our >>>>>>> solution is >>>>>>> to poll for a while and do not enter real idle path if we can get >>>>>>> the >>>>>>> schedule event during polling. >>>>>>> >>>>>>> Poll may cause the CPU waste so we adopt a smart polling >>>>>>> mechanism to >>>>>>> reduce the useless poll. >>>>>>> >>>>>>> Signed-off-by: Yang Zhang <yang.zhang.wz at gmail.com> >>>>>>> Signed-off-by: Quan Xu <quan.xu0 at gmail.com> >>>>>>> Cc: Juergen Gross <jgross at suse.com> >>>>>>> Cc: Alok Kataria <akataria at vmware.com> >>>>>>> Cc: Rusty Russell <rusty at rustcorp.com.au> >>>>>>> Cc: Thomas Gleixner <tglx at linutronix.de> >>>>>>> Cc: Ingo Molnar <mingo at redhat.com> >>>>>>> Cc: "H. Peter Anvin" <hpa at zytor.com> >>>>>>> Cc: x86 at kernel.org >>>>>>> Cc: virtualization at lists.linux-foundation.org >>>>>>> Cc: linux-kernel at vger.kernel.org >>>>>>> Cc: xen-devel at lists.xenproject.org >>>>>> Hmm, is the idle entry path really so critical to performance that a >>>>>> new >>>>>> pvops function is necessary? >>>>> Juergen, Here is the data we get when running benchmark netperf: >>>>> ???1. w/o patch and disable kvm dynamic poll (halt_poll_ns=0): >>>>> ????? 29031.6 bit/s -- 76.1 %CPU >>>>> >>>>> ???2. w/ patch and disable kvm dynamic poll (halt_poll_ns=0): >>>>> ????? 35787.7 bit/s -- 129.4 %CPU >>>>> >>>>> ???3. w/ kvm dynamic poll: >>>>> ????? 35735.6 bit/s -- 200.0 %CPU >>>>> >>>>> ???4. w/patch and w/ kvm dynamic poll: >>>>> ????? 42225.3 bit/s -- 198.7 %CPU >>>>> >>>>> ???5. idle=poll >>>>> ????? 37081.7 bit/s -- 998.1 %CPU >>>>> >>>>> >>>>> >>>>> ???w/ this patch, we will improve performance by 23%.. even we could >>>>> improve >>>>> ???performance by 45.4%, if we use w/patch and w/ kvm dynamic poll. >>>>> also the >>>>> ???cost of CPU is much lower than 'idle=poll' case.. >>>> I don't question the general idea. I just think pvops isn't the best >>>> way >>>> to implement it. >>>> >>>>>> Wouldn't a function pointer, maybe guarded >>>>>> by a static key, be enough? A further advantage would be that this >>>>>> would >>>>>> work on other architectures, too. >>>>> I assume this feature will be ported to other archs.. a new pvops >>>>> makes >>> ?????? sorry, a typo.. /other archs/other hypervisors/ >>> ?????? it refers hypervisor like Xen, HyperV and VMware).. >>> >>>>> code >>>>> clean and easy to maintain. also I tried to add it into existed pvops, >>>>> but it >>>>> doesn't match. >>>> You are aware that pvops is x86 only? >>> yes, I'm aware.. >>> >>>> I really don't see the big difference in maintainability compared to >>>> the >>>> static key / function pointer variant: >>>> >>>> void (*guest_idle_poll_func)(void); >>>> struct static_key guest_idle_poll_key __read_mostly; >>>> >>>> static inline void guest_idle_poll(void) >>>> { >>>> ?????if (static_key_false(&guest_idle_poll_key)) >>>> ???????? guest_idle_poll_func(); >>>> } >>> >>> >>> thank you for your sample code :) >>> I agree there is no big difference.. I think we are discussion for two >>> things: >>> ??1) x86 VM on different hypervisors >>> ??2) different archs VM on kvm hypervisor >>> >>> What I want to do is x86 VM on different hypervisors, such as kvm / xen >>> / hyperv .. >> Why limit the solution to x86 if the more general solution isn't >> harder? >> >> As you didn't give any reason why the pvops approach is better other >> than you don't care for non-x86 platforms you won't get an "Ack" from >> me for this patch. > > > It just looks a little odder to me. I understand you care about no-x86 > arch. > > Are you aware 'pv_time_ops' for arm64/arm/x86 archs, defined in > ?? - arch/arm64/include/asm/paravirt.h > ?? - arch/x86/include/asm/paravirt_types.h > ?? - arch/arm/include/asm/paravirt.hYes, I know. This is just a hack to make it compile. Other than the same names this has nothing to do with pvops, but is just a function vector.> I am unfamilar with arm code. IIUC, if you'd implement pv_idle_ops > for arm/arm64 arch, you'd define a same structure in > ?? - arch/arm64/include/asm/paravirt.h???? or > ?? - arch/arm/include/asm/paravirt.h > > .. instead of static key / fuction. > > then implement a real function in > ?? - arch/arm/kernel/paravirt.c.So just to use pvops you want to implement it in each arch instead of using a mechanism available everywhere?> Also I wonder HOW/WHERE to define a static key/function, then to benifit > x86/no-x86 archs?What? There are plenty of examples in the kernel. Please stop wasting my time. Either write a patch which is acceptable or let it be. I won't take your pvops approach without a really good reason to do so. And so far you haven't given any reason other than you are too lazy to write a proper patch, sorry. Juergen
Apparently Analagous Threads
- [PATCH RFC v3 1/6] x86/paravirt: Add pv_idle_ops to paravirt ops
- [PATCH RFC v3 1/6] x86/paravirt: Add pv_idle_ops to paravirt ops
- [PATCH RFC v3 1/6] x86/paravirt: Add pv_idle_ops to paravirt ops
- [PATCH RFC v3 1/6] x86/paravirt: Add pv_idle_ops to paravirt ops
- [PATCH RFC v3 1/6] x86/paravirt: Add pv_idle_ops to paravirt ops