thr3ads.net - search: "neg1"

Displaying 6 results from an estimated 6 matches for "neg1".

2016 Sep 13

undef * 0

Hi Soham, You're right that in LLVM IR arithmetic (with the current definition of `undef`) is not distributive. You can't replace `A * (B + C)` with `A * B + A * C` in general, since (exactly as you said) for A = `undef`, B = `1`, C = `-1` the former always computes `0` while the latter computes `undef`. This is fundamentally because replacing `A * (B + C)` with `A * B + A * C`
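The counterexample above can be checked by brute force. The following is a minimal sketch, assuming a simplified model in which every use of `undef` may independently evaluate to any bit pattern; the 4-bit width and the names `factored` and `distributed` are illustrative choices, not taken from the thread.

```python
from itertools import product

BITS = 4                       # small width so every case can be enumerated
MASK = (1 << BITS) - 1
ALL_VALUES = range(1 << BITS)  # values a single use of undef may take (assumed model)


def factored(b, c):
    """Possible results of A * (B + C) when A is undef: A is used once."""
    return {(a * ((b + c) & MASK)) & MASK for a in ALL_VALUES}


def distributed(b, c):
    """Possible results of A * B + A * C when A is undef.

    A is used twice, and in this model each use may pick a different value.
    """
    return {
        ((a1 * b) + (a2 * c)) & MASK
        for a1, a2 in product(ALL_VALUES, repeat=2)
    }


B = 1
C = -1 & MASK  # two's-complement encoding of -1 at this width

print(sorted(factored(B, C)))
print(len(distributed(B, C)))
```

In this model the factored form can only produce zero, while the distributed form can produce every value of the type, so the rewrite would enlarge the set of possible results.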

[PATCH v5 0/5] nvc0/ir: add support for MAD/FMA PostRALoadPropagation

2017 Mar 26

was "nv50/ir: PostRaConstantFolding improvements" before. nothing really changed from the last version, just minor things.

Karol Herbst (5):
  nv50/ir: restructure and rename postraconstantfolding pass
  nv50/ir: implement mad post ra folding for nvc0+
  gk110/ir: add LIMM form of mad
  gm107/ir: add LIMM form of mad
  nv50/ir: also do PostRaLoadPropagation for FMA

undef * 0

2016 Sep 13

...// clang++ -emit-llvm udf.cpp -S
----------------------
define i32 @_Z3foov() #0 {
entry:
  %a = alloca i32, align 4
  %b = alloca i32, align 4
  %0 = load i32, i32* %a, align 4
  %1 = load i32, i32* %b, align 4
  %neg = xor i32 %1, -1
  %and = and i32 %0, %neg
  %2 = load i32, i32* %a, align 4
  %neg1 = xor i32 %2, -1
  %3 = load i32, i32* %b, align 4
  %and2 = and i32 %neg1, %3
  %or = or i32 %and, %and2
  ret i32 %or
}

Here %or must return some constant e.g. 0.

optimized IR // opt -O3 udf.ll -S
-------------------
define i32 @_Z3foov() #0 {
entry:
  ret i32 undef
}

> Sanjoy Das wrote:...
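One way to explore which values `%or` can take in the unoptimized function is to enumerate it. This is only a sketch under a stated assumption: each of the four loads from the uninitialized allocas is modeled as an independent `undef` that may take any bit pattern. The 3-bit width and the function name are hypothetical and chosen to keep the enumeration small.

```python
from itertools import product

BITS = 3
MASK = (1 << BITS) - 1
ALL_VALUES = range(1 << BITS)


def possible_or_values():
    """Enumerate %or when %0..%3 are modeled as independent undef loads."""
    results = set()
    for v0, v1, v2, v3 in product(ALL_VALUES, repeat=4):
        neg = v1 ^ MASK            # %neg  = xor i32 %1, -1
        and_ = v0 & neg            # %and  = and i32 %0, %neg
        neg1 = v2 ^ MASK           # %neg1 = xor i32 %2, -1
        and2 = neg1 & v3           # %and2 = and i32 %neg1, %3
        results.add(and_ | and2)   # %or   = or i32 %and, %and2
    return results


print(sorted(possible_or_values()))
```

Under that assumption every value of the type is reachable, which is consistent with the function being folded to `ret i32 undef`.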

[PATCH 01/11] nvc0/ir: add emission of dadd/dmul/dmad opcodes, fix minmax

2015 Feb 20

...);
    void emitFMAD(const Instruction *);
+   void emitDMAD(const Instruction *);
    void emitMADSP(const Instruction *);
    void emitNOT(Instruction *);
@@ -523,6 +526,25 @@ CodeEmitterNVC0::emitFMAD(const Instruction *i)
 }
 
 void
+CodeEmitterNVC0::emitDMAD(const Instruction *i)
+{
+   bool neg1 = (i->src(0).mod ^ i->src(1).mod).neg();
+
+   emitForm_A(i, HEX64(20000000, 00000001));
+
+   if (i->src(2).mod.neg())
+      code[0] |= 1 << 8;
+
+   roundMode_A(i);
+
+   if (neg1)
+      code[0] |= 1 << 9;
+
+   assert(!i->saturate);
+   assert(!i->ftz);
+}
+
+void
 C...
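The `neg1` computation in `emitDMAD` XORs the negate modifiers of the two multiplicands. A plausible reading, sketched below, is that the sign of a product depends only on whether exactly one factor is negated, so a single flag can stand in for both modifiers. The helper names are hypothetical and this is not the Mesa code; it only checks the arithmetic identity.

```python
from itertools import product


def apply_neg(value, neg):
    """Negate value when the (hypothetical) neg modifier is set."""
    return -value if neg else value


def mad_with_modifiers(a, b, c, neg_a, neg_b, neg_c):
    """a * b + c with a separate negate modifier on each source."""
    return apply_neg(a, neg_a) * apply_neg(b, neg_b) + apply_neg(c, neg_c)


def mad_with_folded_sign(a, b, c, neg_a, neg_b, neg_c):
    """Same computation with one product-negate flag, as in neg1 = neg_a ^ neg_b."""
    neg_product = neg_a ^ neg_b
    return apply_neg(a * b, neg_product) + apply_neg(c, neg_c)


# Compare both forms for every combination of modifiers.
for flags in product([False, True], repeat=3):
    assert mad_with_modifiers(1.5, 2.0, 0.25, *flags) == \
        mad_with_folded_sign(1.5, 2.0, 0.25, *flags)
```

The two forms agree for all eight modifier combinations, which is why one bit for the product and one bit for the addend suffice in this sketch.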

[PATCH 1/2] nv50/ir: add fp64 support on G200 (NVA0)

2015 Feb 23

...+}
+
+void
 CodeEmitterNV50::emitFADD(const Instruction *i)
 {
    const int neg0 = i->src(0).mod.neg();
@@ -997,6 +1022,25 @@ CodeEmitterNV50::emitFADD(const Instruction *i)
 }
 
 void
+CodeEmitterNV50::emitDADD(const Instruction *i)
+{
+   const int neg0 = i->src(0).mod.neg();
+   const int neg1 = i->src(1).mod.neg() ^ ((i->op == OP_SUB) ? 1 : 0);
+
+   assert(!(i->src(0).mod | i->src(1).mod).abs());
+   assert(!i->saturate);
+   assert(i->encSize == 8);
+
+   code[1] = 0x60000000;
+   code[0] = 0xe0000000;
+
+   emitForm_ADD(i);
+
+   code[1] |= neg0 << 26;
+   cod...

[LLVMdev] [RFC] Add second "failure" AtomicOrdering to cmpxchg instruction

2014 Mar 07

...8, i8 0, i8 1 acquire acquire
 ; X64: lock
 ; X64: cmpxchgb
 ; X32: lock
diff --git a/test/CodeGen/X86/atomic_op.ll b/test/CodeGen/X86/atomic_op.ll
index a378d6e..b3045ed 100644
--- a/test/CodeGen/X86/atomic_op.ll
+++ b/test/CodeGen/X86/atomic_op.ll
@@ -101,11 +101,11 @@ entry:
   %neg1 = sub i32 0, 10 ; <i32> [#uses=1]
   ; CHECK: lock
   ; CHECK: cmpxchgl
-  %16 = cmpxchg i32* %val2, i32 %neg1, i32 1 monotonic
+  %16 = cmpxchg i32* %val2, i32 %neg1, i32 1 monotonic monotonic
   store i32 %16, i32* %old
   ; CHECK: lock
   ; CHECK: cmpxchgl
-  %17 =...
