Displaying 5 results from an estimated 5 matches for "cmovc".
2020 Jul 23
2
[PATCH v3 5/6] powerpc/pseries: implement paravirt qspinlocks for SPLPAR
...hem, since world+dog uses what distros pick.
Anyway, instead of adding a second per-cpu variable, can you see how
horrible something like this is:
unsigned char adds(unsigned char var, unsigned char val)
{
	unsigned short sat = 0xff, tmp = var;

	asm ("addb %[val], %b[var];"
	     "cmovc %[sat], %[var];"
	     : [var] "+r" (tmp)
	     : [val] "ir" (val), [sat] "r" (sat)
	     );

	return tmp;
}
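For reference, the addb sets the carry flag on 8-bit overflow and the cmovc then substitutes the 0xff saturation value. A rough portable sketch of the same saturating byte add, assuming a GCC/Clang-style __builtin_add_overflow is available (the helper name saturating_addb is illustrative, not from the patch):

#include <limits.h>

/* Portable sketch of the saturating byte add above: on unsigned
 * wrap-around (the condition cmovc tests via the carry flag), clamp
 * the result to 0xff.  Function name is illustrative only. */
static unsigned char saturating_addb(unsigned char var, unsigned char val)
{
	unsigned char sum;

	if (__builtin_add_overflow(var, val, &sum))
		return UCHAR_MAX;
	return sum;
}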
Another thing to try is, instead of threading that lockval throughout
the thing, simply:
#define _Q_LOCKED_VAL this_cpu_read_stable(cpu_sat)
or combined...
2015 Jan 23
2
[LLVMdev] X86TargetLowering::LowerToBT
I suspect that this is because the mask in your example is the result of a variable shift, which (a) has its own performance and flags hazards pre-SHLX and (b) requires additional µops to do with TEST. I expect that ICC is putting a dummy TEST or XOR ahead of the BT to break the false flags dependency, as well.
If the mask were constant, I expect ICC would generate TEST instead (but I don’t
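For context, the source pattern LowerToBT deals with looks roughly like the sketch below; the function name and shape are assumptions, not taken from the thread. With a variable n the compiler can either materialize the 1 << n mask (a variable shift plus TEST) or emit a single BT, which is where the flags-dependency concern above comes in.

/* Illustrative variable bit test: a candidate for either "bt x, n"
 * or a shifted mask plus TEST, as discussed above. */
static int bit_is_set(unsigned int x, unsigned int n)
{
	return (x & (1u << n)) != 0;
}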
2020 Jul 23
4
[PATCH v3 5/6] powerpc/pseries: implement paravirt qspinlocks for SPLPAR
On Thu, Jul 23, 2020 at 02:32:36PM -0400, Waiman Long wrote:
> BTW, do you have any comment on my v2 lock holder cpu info qspinlock patch?
> I will have to update the patch to fix the reported 0-day test problem, but
> I want to collect other feedback before sending out v3.
I want to say I hate it all, it adds instructions to a path we spend an
awful lot of time optimizing without
2018 Nov 06
4
Rather poor code optimisation of current clang/LLVM targeting Intel x86 (both -64 and -32)
Hi @ll,
while clang/LLVM recognizes common bit-twiddling idioms/expressions
like
unsigned int rotate(unsigned int x, unsigned int n)
{
	return (x << n) | (x >> (32 - n));
}
and typically generates "rotate" machine instructions for this
expression, it fails to recognize other, equally common, bit-twiddling
idioms/expressions.
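As an aside not raised in the excerpt: the rotate above shifts by 32 when n == 0, which is undefined behaviour in C. A masked variant avoids that and is, in my experience, still recognized by clang and GCC as a single rotate instruction:

/* Rotate-left with the shift counts masked so that n == 0 is well
 * defined; still pattern-matched to a rol on x86.  This variant is
 * an editorial sketch, not part of the original report. */
unsigned int rotate_masked(unsigned int x, unsigned int n)
{
	return (x << (n & 31)) | (x >> (-n & 31));
}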
The standard IEEE CRC-32 for "big