Peter Zijlstra
2017-Feb-13 21:52 UTC
[PATCH v2] x86/paravirt: Don't make vcpu_is_preempted() a callee-save function
On Mon, Feb 13, 2017 at 03:12:45PM -0500, Waiman Long wrote:> On 02/13/2017 02:42 PM, Waiman Long wrote: > > On 02/13/2017 05:53 AM, Peter Zijlstra wrote: > >> On Mon, Feb 13, 2017 at 11:47:16AM +0100, Peter Zijlstra wrote: > >>> That way we'd end up with something like: > >>> > >>> asm(" > >>> push %rdi; > >>> movslq %edi, %rdi; > >>> movq __per_cpu_offset(,%rdi,8), %rax; > >>> cmpb $0, %[offset](%rax); > >>> setne %al; > >>> pop %rdi; > >>> " : : [offset] "i" (((unsigned long)&steal_time) + offsetof(struct steal_time, preempted))); > >>> > >>> And if we could get rid of the sign extend on edi we could avoid all the > >>> push-pop nonsense, but I'm not sure I see how to do that (then again, > >>> this asm foo isn't my strongest point). > >> Maybe: > >> > >> movsql %edi, %rax; > >> movq __per_cpu_offset(,%rax,8), %rax; > >> cmpb $0, %[offset](%rax); > >> setne %al; > >> > >> ? > > Yes, that looks good to me. > > > > Cheers, > > Longman > > > Sorry, I am going to take it back. The displacement or offset can only > be up to 32-bit. So we will still need to use at least one more > register, I think.I don't think that would be a problem, I very much doubt we declare more than 4G worth of per-cpu variables in the kernel. In any case, use "e" or "Z" as constraint (I never quite know when to use which). That are s32 and u32 displacement immediates resp. and should fail compile with a semi-sensible failure if the displacement is too big.
hpa at zytor.com
2017-Feb-13 22:00 UTC
[PATCH v2] x86/paravirt: Don't make vcpu_is_preempted() a callee-save function
On February 13, 2017 1:52:20 PM PST, Peter Zijlstra <peterz at infradead.org> wrote:>On Mon, Feb 13, 2017 at 03:12:45PM -0500, Waiman Long wrote: >> On 02/13/2017 02:42 PM, Waiman Long wrote: >> > On 02/13/2017 05:53 AM, Peter Zijlstra wrote: >> >> On Mon, Feb 13, 2017 at 11:47:16AM +0100, Peter Zijlstra wrote: >> >>> That way we'd end up with something like: >> >>> >> >>> asm(" >> >>> push %rdi; >> >>> movslq %edi, %rdi; >> >>> movq __per_cpu_offset(,%rdi,8), %rax; >> >>> cmpb $0, %[offset](%rax); >> >>> setne %al; >> >>> pop %rdi; >> >>> " : : [offset] "i" (((unsigned long)&steal_time) + >offsetof(struct steal_time, preempted))); >> >>> >> >>> And if we could get rid of the sign extend on edi we could avoid >all the >> >>> push-pop nonsense, but I'm not sure I see how to do that (then >again, >> >>> this asm foo isn't my strongest point). >> >> Maybe: >> >> >> >> movsql %edi, %rax; >> >> movq __per_cpu_offset(,%rax,8), %rax; >> >> cmpb $0, %[offset](%rax); >> >> setne %al; >> >> >> >> ? >> > Yes, that looks good to me. >> > >> > Cheers, >> > Longman >> > >> Sorry, I am going to take it back. The displacement or offset can >only >> be up to 32-bit. So we will still need to use at least one more >> register, I think. > >I don't think that would be a problem, I very much doubt we declare >more >than 4G worth of per-cpu variables in the kernel. > >In any case, use "e" or "Z" as constraint (I never quite know when to >use which). That are s32 and u32 displacement immediates resp. and >should fail compile with a semi-sensible failure if the displacement is >too big.e for signed, Z for unsigned. Obviously you have to use a matching instruction: an immediate or displacement in a 64-bit instruction is sign-extended, in a 32-bit instruction zero-extended. E.g.: movl %0,%%eax # use Z, all of %rax will be set movq %0,%%rax # use e -- Sent from my Android device with K-9 Mail. Please excuse my brevity.
hpa at zytor.com
2017-Feb-13 22:07 UTC
[PATCH v2] x86/paravirt: Don't make vcpu_is_preempted() a callee-save function
On February 13, 2017 1:52:20 PM PST, Peter Zijlstra <peterz at infradead.org> wrote:>On Mon, Feb 13, 2017 at 03:12:45PM -0500, Waiman Long wrote: >> On 02/13/2017 02:42 PM, Waiman Long wrote: >> > On 02/13/2017 05:53 AM, Peter Zijlstra wrote: >> >> On Mon, Feb 13, 2017 at 11:47:16AM +0100, Peter Zijlstra wrote: >> >>> That way we'd end up with something like: >> >>> >> >>> asm(" >> >>> push %rdi; >> >>> movslq %edi, %rdi; >> >>> movq __per_cpu_offset(,%rdi,8), %rax; >> >>> cmpb $0, %[offset](%rax); >> >>> setne %al; >> >>> pop %rdi; >> >>> " : : [offset] "i" (((unsigned long)&steal_time) + >offsetof(struct steal_time, preempted))); >> >>> >> >>> And if we could get rid of the sign extend on edi we could avoid >all the >> >>> push-pop nonsense, but I'm not sure I see how to do that (then >again, >> >>> this asm foo isn't my strongest point). >> >> Maybe: >> >> >> >> movsql %edi, %rax; >> >> movq __per_cpu_offset(,%rax,8), %rax; >> >> cmpb $0, %[offset](%rax); >> >> setne %al; >> >> >> >> ? >> > Yes, that looks good to me. >> > >> > Cheers, >> > Longman >> > >> Sorry, I am going to take it back. The displacement or offset can >only >> be up to 32-bit. So we will still need to use at least one more >> register, I think. > >I don't think that would be a problem, I very much doubt we declare >more >than 4G worth of per-cpu variables in the kernel. > >In any case, use "e" or "Z" as constraint (I never quite know when to >use which). That are s32 and u32 displacement immediates resp. and >should fail compile with a semi-sensible failure if the displacement is >too big.Oh, and unless you are explicitly forcing 32-bit addressing mode, displacements are always "e" (or "m" if you let gcc pick the addressing mode.) -- Sent from my Android device with K-9 Mail. Please excuse my brevity.
Waiman Long
2017-Feb-13 22:34 UTC
[PATCH v2] x86/paravirt: Don't make vcpu_is_preempted() a callee-save function
On 02/13/2017 04:52 PM, Peter Zijlstra wrote:> On Mon, Feb 13, 2017 at 03:12:45PM -0500, Waiman Long wrote: >> On 02/13/2017 02:42 PM, Waiman Long wrote: >>> On 02/13/2017 05:53 AM, Peter Zijlstra wrote: >>>> On Mon, Feb 13, 2017 at 11:47:16AM +0100, Peter Zijlstra wrote: >>>>> That way we'd end up with something like: >>>>> >>>>> asm(" >>>>> push %rdi; >>>>> movslq %edi, %rdi; >>>>> movq __per_cpu_offset(,%rdi,8), %rax; >>>>> cmpb $0, %[offset](%rax); >>>>> setne %al; >>>>> pop %rdi; >>>>> " : : [offset] "i" (((unsigned long)&steal_time) + offsetof(struct steal_time, preempted))); >>>>> >>>>> And if we could get rid of the sign extend on edi we could avoid all the >>>>> push-pop nonsense, but I'm not sure I see how to do that (then again, >>>>> this asm foo isn't my strongest point). >>>> Maybe: >>>> >>>> movsql %edi, %rax; >>>> movq __per_cpu_offset(,%rax,8), %rax; >>>> cmpb $0, %[offset](%rax); >>>> setne %al; >>>> >>>> ? >>> Yes, that looks good to me. >>> >>> Cheers, >>> Longman >>> >> Sorry, I am going to take it back. The displacement or offset can only >> be up to 32-bit. So we will still need to use at least one more >> register, I think. > I don't think that would be a problem, I very much doubt we declare more > than 4G worth of per-cpu variables in the kernel. > > In any case, use "e" or "Z" as constraint (I never quite know when to > use which). That are s32 and u32 displacement immediates resp. and > should fail compile with a semi-sensible failure if the displacement is > too big. >It is the address of &steal_time that will exceed the 32-bit limit. Cheers, Longman
hpa at zytor.com
2017-Feb-13 22:36 UTC
[PATCH v2] x86/paravirt: Don't make vcpu_is_preempted() a callee-save function
On February 13, 2017 2:34:01 PM PST, Waiman Long <longman at redhat.com> wrote:>On 02/13/2017 04:52 PM, Peter Zijlstra wrote: >> On Mon, Feb 13, 2017 at 03:12:45PM -0500, Waiman Long wrote: >>> On 02/13/2017 02:42 PM, Waiman Long wrote: >>>> On 02/13/2017 05:53 AM, Peter Zijlstra wrote: >>>>> On Mon, Feb 13, 2017 at 11:47:16AM +0100, Peter Zijlstra wrote: >>>>>> That way we'd end up with something like: >>>>>> >>>>>> asm(" >>>>>> push %rdi; >>>>>> movslq %edi, %rdi; >>>>>> movq __per_cpu_offset(,%rdi,8), %rax; >>>>>> cmpb $0, %[offset](%rax); >>>>>> setne %al; >>>>>> pop %rdi; >>>>>> " : : [offset] "i" (((unsigned long)&steal_time) + >offsetof(struct steal_time, preempted))); >>>>>> >>>>>> And if we could get rid of the sign extend on edi we could avoid >all the >>>>>> push-pop nonsense, but I'm not sure I see how to do that (then >again, >>>>>> this asm foo isn't my strongest point). >>>>> Maybe: >>>>> >>>>> movsql %edi, %rax; >>>>> movq __per_cpu_offset(,%rax,8), %rax; >>>>> cmpb $0, %[offset](%rax); >>>>> setne %al; >>>>> >>>>> ? >>>> Yes, that looks good to me. >>>> >>>> Cheers, >>>> Longman >>>> >>> Sorry, I am going to take it back. The displacement or offset can >only >>> be up to 32-bit. So we will still need to use at least one more >>> register, I think. >> I don't think that would be a problem, I very much doubt we declare >more >> than 4G worth of per-cpu variables in the kernel. >> >> In any case, use "e" or "Z" as constraint (I never quite know when to >> use which). That are s32 and u32 displacement immediates resp. and >> should fail compile with a semi-sensible failure if the displacement >is >> too big. >> >It is the address of &steal_time that will exceed the 32-bit limit. > >Cheers, >LongmanThat seems odd in the extreme? -- Sent from my Android device with K-9 Mail. Please excuse my brevity.
Peter Zijlstra
2017-Feb-14 09:39 UTC
[PATCH v2] x86/paravirt: Don't make vcpu_is_preempted() a callee-save function
On Mon, Feb 13, 2017 at 05:34:01PM -0500, Waiman Long wrote:> It is the address of &steal_time that will exceed the 32-bit limit.That seems extremely unlikely. That would mean we have more than 4G worth of per-cpu variables declared in the kernel.
Possibly Parallel Threads
- [PATCH v2] x86/paravirt: Don't make vcpu_is_preempted() a callee-save function
- [PATCH v2] x86/paravirt: Don't make vcpu_is_preempted() a callee-save function
- [PATCH v2] x86/paravirt: Don't make vcpu_is_preempted() a callee-save function
- [PATCH v2] x86/paravirt: Don't make vcpu_is_preempted() a callee-save function
- [PATCH v2] x86/paravirt: Don't make vcpu_is_preempted() a callee-save function