Tobias Klausmann
2014-Jun-03 22:35 UTC
[Nouveau] [PATCH v3 0/4] Constant folding of new Instructions
Yet another try for constant folding of Instructions for nvc0.

Please Review this again! (Hopefully the last time ;-) )

Tobias Klausmann (4):
  nvc0/ir: clear subop when folding constant expressions
  nvc0/ir: Handle reverse subop for OP_EXTBF when folding constant
    expressions
  nvc0/ir: Handle OP_BFIND when folding constant expressions
  nvc0/ir: Handle OP_POPCNT when folding constant expressions

 .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 41 ++++++++++++++++++++--
 1 file changed, 39 insertions(+), 2 deletions(-)

-- 
1.8.4.5
Tobias Klausmann
2014-Jun-03 22:35 UTC
[Nouveau] [PATCH v3 1/4] nvc0/ir: clear subop when folding constant expressions
Some operations (e.g. OP_MUL/OP_MAD/OP_EXTBF) might have a subop set.
After folding, make sure that it is cleared.

Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de>
Reviewed-by: Ilia Mirkin <imirkin at alum.mit.edu>
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index 1a2c2e6..58092f4 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
@@ -563,6 +563,7 @@ ConstantFolding::expr(Instruction *i,
    } else {
       i->op = i->saturate ? OP_SAT : OP_MOV; /* SAT handled by unary() */
    }
+   i->subOp = 0;
 }
 
 void
-- 
1.8.4.5
Tobias Klausmann
2014-Jun-03 22:35 UTC
[Nouveau] [PATCH v3 2/4] nvc0/ir: Handle reverse subop for OP_EXTBF when folding constant expressions
Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de>
---
V2: Handle the instruction right (shift after reverse)
V3: Reverse once, not independently for every TYPE

 src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index 58092f4..538e745 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
@@ -528,9 +528,17 @@ ConstantFolding::expr(Instruction *i,
          rshift = 32 - width;
          lshift = 32 - width - offset;
       }
+      if (i->subOp == NV50_IR_SUBOP_EXTBF_REV)
+         res.data.u32 = util_bitreverse(a->data.u32);
+      else
+         res.data.u32 = a->data.u32;
       switch (i->dType) {
-      case TYPE_S32: res.data.s32 = (a->data.s32 << lshift) >> rshift; break;
-      case TYPE_U32: res.data.u32 = (a->data.u32 << lshift) >> rshift; break;
+      case TYPE_S32:
+         res.data.s32 = (res.data.s32 << lshift) >> rshift;
+         break;
+      case TYPE_U32:
+         res.data.u32 = (res.data.u32 << lshift) >> rshift;
+         break;
       default:
          return;
       }
-- 
1.8.4.5
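For readers following the EXTBF patch above, here is a minimal standalone sketch of the fold it describes: reverse the source once if the reverse subop is set, then extract the bitfield. It is not the Mesa code. `bit_reverse32` is a portable stand-in for `util_bitreverse`, `fold_extbf` and its explicit `offset`/`width` parameters are hypothetical (the diff excerpt does not show how those values are obtained), and returning 0 for an offset of 32 or more is an arbitrary choice of this sketch. Sign extension is done with a mask rather than the shift pair in the patch, to avoid shifting negative values.

```cpp
#include <cassert>
#include <cstdint>

// Portable stand-in for Mesa's util_bitreverse(): reverse all 32 bits.
inline uint32_t bit_reverse32(uint32_t v)
{
   uint32_t r = 0;
   for (int i = 0; i < 32; ++i) {
      r = (r << 1) | (v & 1u);
      v >>= 1;
   }
   return r;
}

// Hypothetical helper: fold a bitfield extract of `width` bits starting at
// bit `offset`. The reversal is applied once, before the extraction.
inline uint32_t fold_extbf(uint32_t a, unsigned offset, unsigned width,
                           bool reverse, bool is_signed)
{
   if (width == 0 || offset >= 32)
      return 0; // out-of-range handling is a choice made by this sketch
   if (reverse)
      a = bit_reverse32(a);
   if (width > 32 - offset)
      width = 32 - offset; // the field runs up to bit 31
   uint32_t field = a >> offset;
   if (width < 32) {
      const uint32_t mask = (1u << width) - 1u;
      field &= mask;
      if (is_signed && ((field >> (width - 1)) & 1u))
         field |= ~mask; // sign-extend from the top bit of the field
   }
   return field;
}

// Small usage example; the asserts document the expected results.
inline void extbf_examples()
{
   assert(fold_extbf(0x000000F0u, 4, 4, false, false) == 0x0000000Fu);
   assert(fold_extbf(0x000000F0u, 4, 4, false, true) == 0xFFFFFFFFu);
   assert(fold_extbf(0x80000000u, 0, 1, true, false) == 1u);
}
```

For fields that fit below bit 32, the mask-based extraction gives the same value as the `(x << lshift) >> rshift` pair in the patch, with an arithmetic right shift in the signed case.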
Tobias Klausmann
2014-Jun-03 22:35 UTC
[Nouveau] [PATCH v3 3/4] nvc0/ir: Handle OP_BFIND when folding constant expressions
Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de>
---
V3: Use BuildUtil for the Immediate instead of a type conversion

 .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index 538e745..e4d91d7 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
@@ -944,6 +944,23 @@ ConstantFolding::opnd(Instruction *i, ImmediateValue &imm0, int s)
    case OP_EX2:
       unary(i, imm0);
       break;
+   case OP_BFIND: {
+      int32_t res;
+      switch (i->dType) {
+      case TYPE_S32: res = util_last_bit_signed(imm0.reg.data.s32) - 1; break;
+      case TYPE_U32: res = util_last_bit(imm0.reg.data.u32) - 1; break;
+      default:
+         return;
+      }
+      if (i->subOp == NV50_IR_SUBOP_BFIND_SAMT && res >= 0)
+         res = 31 - res;
+      bld.setPosition(i, false); /* make sure bld is init'ed */
+      i->setSrc(0, bld.mkImm(res));
+      i->setSrc(1, NULL);
+      i->op = OP_MOV;
+      i->subOp = 0;
+      break;
+   }
    default:
       return;
    }
-- 
1.8.4.5
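The arithmetic in the BFIND patch above can be tried outside the compiler with a sketch like the following. It is not the Mesa code: `last_bit` is a portable stand-in for `util_last_bit` (assumed here to return the 1-based position of the highest set bit, or 0 for an input of 0), the signed case is assumed to search the complemented value when the input is negative, and `fold_bfind` with its boolean parameters is a hypothetical name introduced for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for util_last_bit(): 1-based index of the highest set bit,
// 0 when no bit is set.
inline int last_bit(uint32_t v)
{
   int n = 0;
   while (v) {
      ++n;
      v >>= 1;
   }
   return n;
}

// Hypothetical helper mirroring the fold in the patch: returns the 0-based
// position of the most significant "interesting" bit, or -1 if there is
// none. With shift_amount set (the SAMT subop in the patch), a found
// position p is turned into 31 - p.
inline int32_t fold_bfind(uint32_t src, bool is_signed, bool shift_amount)
{
   if (is_signed && (src & 0x80000000u))
      src = ~src; // negative input: look for the highest zero bit instead
   int32_t res = last_bit(src) - 1;
   if (shift_amount && res >= 0)
      res = 31 - res;
   return res;
}

// Small usage example; the asserts document the expected results.
inline void bfind_examples()
{
   assert(fold_bfind(0x00000001u, false, false) == 0);
   assert(fold_bfind(0x00000000u, false, false) == -1);
   assert(fold_bfind(0x00010000u, false, true) == 15);
}
```

Note that the "not found" result of -1 is left untouched by the shift-amount adjustment, matching the `res >= 0` guard in the patch.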
Tobias Klausmann
2014-Jun-03 22:35 UTC
[Nouveau] [PATCH v3 4/4] nvc0/ir: Handle OP_POPCNT when folding constant expressions
Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de>
---
V2: Add support for a single-argument version of POPCNT for Maxwell (SM5)
V3: Clean up a bit more

 src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index e4d91d7..8dde7c1 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
@@ -544,6 +544,8 @@ ConstantFolding::expr(Instruction *i,
       }
       break;
    }
+   case OP_POPCNT: res.data.u32 = util_bitcount(a->data.u32 & b->data.u32);
+      break;
    default:
       return;
    }
@@ -961,6 +963,15 @@ ConstantFolding::opnd(Instruction *i, ImmediateValue &imm0, int s)
       i->subOp = 0;
       break;
    }
+   case OP_POPCNT: {
+      uint32_t res;
+      res = util_bitcount(imm0.reg.data.u32);
+      i->setSrc(0, new_ImmediateValue(i->bb->getProgram(), res));
+      i->setSrc(1, NULL);
+      i->op = OP_MOV;
+      i->subOp = 0;
+      break;
+   }
    default:
       return;
    }
-- 
1.8.4.5
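The two POPCNT folds in the patch above (two sources: count the bits of `a & b`; single source: count the bits of the one immediate) reduce to the small sketch below. It is not the Mesa code: `bit_count32` is a portable stand-in for `util_bitcount`, and the `fold_popcnt` overloads are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for util_bitcount(): number of set bits in a 32-bit value.
inline uint32_t bit_count32(uint32_t v)
{
   uint32_t n = 0;
   while (v) {
      v &= v - 1; // clear the lowest set bit
      ++n;
   }
   return n;
}

// Two-source form: the sources are ANDed before counting.
inline uint32_t fold_popcnt(uint32_t a, uint32_t b)
{
   return bit_count32(a & b);
}

// Single-source form, as in the second hunk of the patch.
inline uint32_t fold_popcnt(uint32_t a)
{
   return bit_count32(a);
}

// Small usage example; the asserts document the expected results.
inline void popcnt_examples()
{
   assert(fold_popcnt(0x0000F0F0u, 0x000000FFu) == 4u);
   assert(fold_popcnt(0xFFFFFFFFu) == 32u);
}
```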
Ilia Mirkin
2014-Jun-03 22:44 UTC
[Nouveau] [PATCH v3 0/4] Constant folding of new Instructions
Series is Reviewed-by: Ilia Mirkin <imirkin at alum.mit.edu>

There are a few minor formatting items that I'll take care of before
pushing this out later tonight... take a look at the changes I made
for the future :)

On Tue, Jun 3, 2014 at 6:35 PM, Tobias Klausmann
<tobias.johannes.klausmann at mni.thm.de> wrote:
> Yet another try for constant folding of Instructions for nvc0.
>
> Please Review this again! (Hopefully the last time ;-) )
>
>
> Tobias Klausmann (4):
>   nvc0/ir: clear subop when folding constant expressions
>   nvc0/ir: Handle reverse subop for OP_EXTBF when folding constant
>     expressions
>   nvc0/ir: Handle OP_BFIND when folding constant expressions
>   nvc0/ir: Handle OP_POPCNT when folding constant expressions
>
>  .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 41 ++++++++++++++++++++--
>  1 file changed, 39 insertions(+), 2 deletions(-)
>
> --
> 1.8.4.5
>
Carl Worth
2014-Jun-23 23:25 UTC
[Nouveau] [Mesa-dev] [PATCH v3 1/4] nvc0/ir: clear subop when folding constant expressions
Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de> writes:
> Some operations (e.g. OP_MUL/OP_MAD/OP_EXTBF) might have a subop set.
> After folding, make sure that it is cleared
>
> Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de>
> Reviewed-by: Ilia Mirkin <imirkin at alum.mit.edu>

Cc: "10.1 10.2" <mesa-stable at lists.freedesktop.org>

Hi Tobias and Ilia,

This patch isn't picking cleanly over to the 10.1 branch. Can you give
me some guidance here? Either of the following replies would be great:

    Don't worry about this for 10.1 because...

    Here's a backported patch for 10.1...

Thanks!

-Carl

-- 
carl.d.worth at intel.com