Peter Zijlstra
2016-Nov-16 10:23 UTC
[PATCH v7 06/11] x86, paravirt: Add interface to support kvm/xen vcpu preempted check
On Wed, Nov 16, 2016 at 12:19:09PM +0800, Pan Xinhui wrote:> Hi, Peter. > I think we can avoid a function call in a simpler way. How about below > > static inline bool vcpu_is_preempted(int cpu) > { > /* only set in pv case*/ > if (pv_lock_ops.vcpu_is_preempted) > return pv_lock_ops.vcpu_is_preempted(cpu); > return false; > }That is still more expensive. It needs to do an actual load and makes it hard to predict the branch, you'd have to actually wait for the load to complete etc. Also, it generates more code. Paravirt muck should strive to be as cheap as possible when ran on native hardware.
Christian Borntraeger
2016-Nov-16 11:29 UTC
[PATCH v7 06/11] x86, paravirt: Add interface to support kvm/xen vcpu preempted check
On 11/16/2016 11:23 AM, Peter Zijlstra wrote:> On Wed, Nov 16, 2016 at 12:19:09PM +0800, Pan Xinhui wrote: >> Hi, Peter. >> I think we can avoid a function call in a simpler way. How about below >> >> static inline bool vcpu_is_preempted(int cpu) >> { >> /* only set in pv case*/ >> if (pv_lock_ops.vcpu_is_preempted) >> return pv_lock_ops.vcpu_is_preempted(cpu); >> return false; >> } > > That is still more expensive. It needs to do an actual load and makes it > hard to predict the branch, you'd have to actually wait for the load to > complete etc.Out of curiosity, why is that hard to predict? On s390 the branch prediction runs asynchronously ahead of the downstream pipeline (e.g. search for "IBM z Systems Processor Optimization Primer" page 11). given enough capacity, I would assume that modern x86 processors would do the same and be able to predict this is as soon as it becomes hot (and otherwise you would not notice the branch miss anyway). Is x86 behaving differently here?> Also, it generates more code. > > Paravirt muck should strive to be as cheap as possible when ran on > native hardware.As I am interested in this series from the s390 point of view, this is the only thing that block this series? Is there a chance to add a static key around the paravirt ops somehow?
Peter Zijlstra
2016-Nov-16 11:43 UTC
[PATCH v7 06/11] x86, paravirt: Add interface to support kvm/xen vcpu preempted check
On Wed, Nov 16, 2016 at 12:29:44PM +0100, Christian Borntraeger wrote:> On 11/16/2016 11:23 AM, Peter Zijlstra wrote: > > On Wed, Nov 16, 2016 at 12:19:09PM +0800, Pan Xinhui wrote: > >> Hi, Peter. > >> I think we can avoid a function call in a simpler way. How about below > >> > >> static inline bool vcpu_is_preempted(int cpu) > >> { > >> /* only set in pv case*/ > >> if (pv_lock_ops.vcpu_is_preempted) > >> return pv_lock_ops.vcpu_is_preempted(cpu); > >> return false; > >> } > > > > That is still more expensive. It needs to do an actual load and makes it > > hard to predict the branch, you'd have to actually wait for the load to > > complete etc. > > Out of curiosity, why is that hard to predict? > On s390 the branch prediction runs asynchronously ahead of the downstream > pipeline (e.g. search for "IBM z Systems Processor Optimization Primer" page 11). > given enough capacity, I would assume that modern x86 processors would do the same > and be able to predict this is as soon as it becomes hot (and otherwise you would > not notice the branch miss anyway). Is x86 behaving differently here?Not sure how exactly it works, but it seems to me that an immediate assignment to the value you're going to compare would leave very little doubt. Then again, maybe cores aren't that smart and only look at the hysterical btb for prediction.> > Also, it generates more code. > > > > Paravirt muck should strive to be as cheap as possible when ran on > > native hardware. > > As I am interested in this series from the s390 point of view, this is > the only thing that block this series?Ingo was rewriting the changelog, other than that, no, I can do this on top. Just spotted this because Ingo and me talked it over.> Is there a chance to add a static key around the paravirt ops somehow?More code generation still, replacing the call with an immediate assignment to the return register is the shortest possible option I think.
Pan Xinhui
2016-Nov-17 05:16 UTC
[PATCH v7 06/11] x86, paravirt: Add interface to support kvm/xen vcpu preempted check
? 2016/11/16 18:23, Peter Zijlstra ??:> On Wed, Nov 16, 2016 at 12:19:09PM +0800, Pan Xinhui wrote: >> Hi, Peter. >> I think we can avoid a function call in a simpler way. How about below >> >> static inline bool vcpu_is_preempted(int cpu) >> { >> /* only set in pv case*/ >> if (pv_lock_ops.vcpu_is_preempted) >> return pv_lock_ops.vcpu_is_preempted(cpu); >> return false; >> } > > That is still more expensive. It needs to do an actual load and makes it > hard to predict the branch, you'd have to actually wait for the load to > complete etc. >yes, one more load in native case. I think this is acceptable as vcpu_is_preempted is not a critical function. however if we use pv_callee_save_regs_thunk, more unnecessary registers might be save/resotred in pv case. that will introduce a little overhead. but I think I am okay with your idea. I can make another patch based on this patchset with your suggested-by. thanks xinhui> Also, it generates more code. > > Paravirt muck should strive to be as cheap as possible when ran on > native hardware. >
Seemingly Similar Threads
- [PATCH v7 06/11] x86, paravirt: Add interface to support kvm/xen vcpu preempted check
- [PATCH v7 06/11] x86, paravirt: Add interface to support kvm/xen vcpu preempted check
- [PATCH v7 06/11] x86, paravirt: Add interface to support kvm/xen vcpu preempted check
- [PATCH v7 06/11] x86, paravirt: Add interface to support kvm/xen vcpu preempted check
- [PATCH v7 06/11] x86, paravirt: Add interface to support kvm/xen vcpu preempted check