Displaying 5 results from an estimated 5 matches for "cmovc".
2020 Jul 23
2
[PATCH v3 5/6] powerpc/pseries: implement paravirt qspinlocks for SPLPAR
...hem, since world+dog uses what distros pick.
Anyway, instead of adding a second per-cpu variable, can you see how
horrible something like this is:
unsigned char adds(unsigned char var, unsigned char val)
{
	unsigned short sat = 0xff, tmp = var;

	asm ("addb %[val], %b[var];"
	     "cmovc %[sat], %[var];"
	     : [var] "+r" (tmp)
	     : [val] "ir" (val), [sat] "r" (sat)
	     );

	return tmp;
}
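For reference, the addb sets the carry flag on 8-bit overflow and the cmovc then substitutes the 0xff saturation value. A rough portable sketch of the same saturating byte add, assuming a GCC/Clang-style __builtin_add_overflow is available (the helper name saturating_addb is illustrative, not from the patch):

#include <limits.h>

/* Portable sketch of the saturating byte add above: on unsigned
 * wrap-around (the condition cmovc tests via the carry flag), clamp
 * the result to 0xff.  Function name is illustrative only. */
static unsigned char saturating_addb(unsigned char var, unsigned char val)
{
	unsigned char sum;

	if (__builtin_add_overflow(var, val, &sum))
		return UCHAR_MAX;
	return sum;
}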
Another thing to try is, instead of threading that lockval throughout
the thing, simply:
#define _Q_LOCKED_VAL this_cpu_read_stable(cpu_sat)
or combined...
2015 Jan 23
2
[LLVMdev] X86TargetLowering::LowerToBT
I suspect that this is because the mask in your example is the result of a variable shift, which (a) has its own performance and flags hazards pre-SHLX and (b) requires additional µops to do with TEST. I expect that ICC is putting a dummy TEST or XOR ahead of the BT to break the false flags dependency, as well.
If the mask were constant, I expect ICC would generate TEST instead (but I don’t
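For context, the source pattern LowerToBT deals with looks roughly like the sketch below; the function name and shape are assumptions, not taken from the thread. With a variable n the compiler can either materialize the 1 << n mask (a variable shift plus TEST) or emit a single BT, which is where the flags-dependency concern above comes in.

/* Illustrative variable bit test: a candidate for either "bt x, n"
 * or a shifted mask plus TEST, as discussed above. */
static int bit_is_set(unsigned int x, unsigned int n)
{
	return (x & (1u << n)) != 0;
}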
2020 Jul 23
4
[PATCH v3 5/6] powerpc/pseries: implement paravirt qspinlocks for SPLPAR
On Thu, Jul 23, 2020 at 02:32:36PM -0400, Waiman Long wrote:
> BTW, do you have any comment on my v2 lock holder cpu info qspinlock patch?
> I will have to update the patch to fix the reported 0-day test problem, but
> I want to collect other feedback before sending out v3.
I want to say I hate it all, it adds instructions to a path we spend an
awful lot of time optimizing without
2018 Nov 06
4
Rather poor code optimisation of current clang/LLVM targeting Intel x86 (both -64 and -32)
Hi @ll,
while clang/LLVM recognizes common bit-twiddling idioms/expressions
like
unsigned int rotate(unsigned int x, unsigned int n)
{
	return (x << n) | (x >> (32 - n));
}
and typically generates "rotate" machine instructions for this
expression, it fails to recognize other, equally common, bit-twiddling
idioms/expressions.
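As an aside not raised in the excerpt: the rotate above shifts by 32 when n == 0, which is undefined behaviour in C. A masked variant avoids that and is, in my experience, still recognized by clang and GCC as a single rotate instruction:

/* Rotate-left with the shift counts masked so that n == 0 is well
 * defined; still pattern-matched to a rol on x86.  This variant is
 * an editorial sketch, not part of the original report. */
unsigned int rotate_masked(unsigned int x, unsigned int n)
{
	return (x << (n & 31)) | (x >> (-n & 31));
}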
The standard IEEE CRC-32 for "big