thr3ads.net - search: "trycollapsechainedmul"

Displaying 10 results from an estimated 10 matches for "trycollapsechainedmul".

[PATCH v2] nv50/ir: constant fold OP_SPLIT

2016 Sep 30

[PATCH v2] nv50/ir: constant fold OP_SPLIT

...= 0; i->defExists(d); ++d) { + bld.mkMov(i->getDef(d), bld.mkImm(val & ((1 << shift) - 1)), type); + val >>= shift; + } + delete_Instruction(prog, i); + } + } + break; case OP_MUL: if (i->dType == TYPE_F32) tryCollapseChainedMULs(i, s, imm0); -- 2.10.0

[PATCH] nv50/ir: constant fold OP_SPLIT

2016 Sep 27

[PATCH] nv50/ir: constant fold OP_SPLIT

..._NONE) { + bld.mkMov(i->getDef(0), bld.mkImm(imm0.reg.data.u64 >> shift), type); + bld.mkMov(i->getDef(1), bld.mkImm(imm0.reg.data.u64), type); + delete_Instruction(prog, i); + } + } + break; case OP_MUL: if (i->dType == TYPE_F32) tryCollapseChainedMULs(i, s, imm0); -- 2.10.0

[PATCH] nv50/ir: constant fold OP_SPLIT

2016 Sep 30

[PATCH] nv50/ir: constant fold OP_SPLIT

...U32. Not sure if that is what we want... other than that that, shorten it like this would be nice! > >> + delete_Instruction(prog, i); >> + } >> + } >> + break; >> case OP_MUL: >> if (i->dType == TYPE_F32) >> tryCollapseChainedMULs(i, s, imm0); >> -- >> 2.10.0 >> >> _______________________________________________ >> Nouveau mailing list >> Nouveau at lists.freedesktop.org >> https://lists.freedesktop.org/mailman/listinfo/nouveau

[PATCH] nv50/ir: use unordered_set instead of list to keep our instructions in uses

2014 Jul 08

[PATCH] nv50/ir: use unordered_set instead of list to keep our instructions in uses

...um/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp index c162ac4..8d052c5 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp @@ -686,7 +686,7 @@ ConstantFolding::tryCollapseChainedMULs(Instruction *mul2, // b = mul a, imm // d = mul b, c -> d = mul_x_imm a, c int s2, t2; - insn = mul2->getDef(0)->uses.front()->getInsn(); + insn = (*mul2->getDef(0)->uses.begin())->getInsn(); if (!insn) return; mul1 = mu...

[PATCH] nv50/ir: constant fold OP_SPLIT

2016 Sep 28

[PATCH] nv50/ir: constant fold OP_SPLIT

...hift) - 1)); val >>= shift; } I think this will account for every case, and with a lot less special-casing. What do you think? > + delete_Instruction(prog, i); > + } > + } > + break; > case OP_MUL: > if (i->dType == TYPE_F32) > tryCollapseChainedMULs(i, s, imm0); > -- > 2.10.0 > > _______________________________________________ > Nouveau mailing list > Nouveau at lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/nouveau

[PATCH] nv50/ir: constant fold OP_SPLIT

2016 Sep 30

[PATCH] nv50/ir: constant fold OP_SPLIT

...>dType), isSignedType(i->dType)) How's that :p > > >> >>> + delete_Instruction(prog, i); >>> + } >>> + } >>> + break; >>> case OP_MUL: >>> if (i->dType == TYPE_F32) >>> tryCollapseChainedMULs(i, s, imm0); >>> -- >>> 2.10.0 >>> >>> _______________________________________________ >>> Nouveau mailing list >>> Nouveau at lists.freedesktop.org >>> https://lists.freedesktop.org/mailman/listinfo/nouveau > >

[PATCH v2] nv50/ir: constant fold OP_SPLIT

2016 Sep 30

[PATCH v2] nv50/ir: constant fold OP_SPLIT

...ov(i->getDef(d), bld.mkImm(val & ((1 << shift) - 1)), type); 1ULL > + val >>= shift; > + } > + delete_Instruction(prog, i); > + } > + } > + break; > case OP_MUL: > if (i->dType == TYPE_F32) > tryCollapseChainedMULs(i, s, imm0); > -- > 2.10.0 > > _______________________________________________ > Nouveau mailing list > Nouveau at lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/nouveau

[PATCH] nv50/ir: take postFactor into account when doing peephole optimizations

2015 Mar 25

[PATCH] nv50/ir: take postFactor into account when doing peephole optimizations

...9,6 +581,7 @@ ConstantFolding::expr(Instruction *i, i->src(0).mod = Modifier(0); i->src(1).mod = Modifier(0); + i->postFactor = 0; i->setSrc(0, new_ImmediateValue(i->bb->getProgram(), res.data.u32)); i->setSrc(1, NULL); @@ -682,7 +685,7 @@ ConstantFolding::tryCollapseChainedMULs(Instruction *mul2, Instruction *insn; Instruction *mul1 = NULL; // mul1 before mul2 int e = 0; - float f = imm2.reg.data.f32; + float f = imm2.reg.data.f32 * exp2f(mul2->postFactor); ImmediateValue imm1; assert(mul2->op == OP_MUL && mul2->dType == TYPE_F3...

[PATCH 1/4] nvc0/ir: avoid jumping to a sched instruction

2015 May 09

[PATCH 1/4] nvc0/ir: avoid jumping to a sched instruction

Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu> --- Pretty sure there's nothing wrong with it, but it looks odd in the code. src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp | 2 ++ src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp | 7 +++++-- src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp | 2 ++ 3 files changed, 9 insertions(+), 2 deletions(-)

[PATCH 1/2] nv50/ir: fix s32 x s32 -> high s32 multiply logic

2014 May 18

[PATCH 1/2] nv50/ir: fix s32 x s32 -> high s32 multiply logic

Retrieving the high 32 bits of a signed multiply is rather annoying. It appears that the simplest way to do this is to compute the absolute value of the arguments, and perform a u32 x u32 -> u64 operation. If the arguments' signs differ, then negate the result. Since there is no u64 support in the cvt instruction, we have the perform the 2's complement negation "by hand".

search for: trycollapsechainedmul