Tobias Klausmann
2014-Jun-03 20:58 UTC
[Nouveau] [PATCH v2 0/4] Constant folding of new Instructions
Another attempt at constant folding of new instructions for nvc0.
Please review.
Thanks,
Tobias Klausmann
Tobias Klausmann (4):
nvc0/ir: clear subop when folding constant expressions
nvc0/ir: Handle reverse subop for OP_EXTBF when folding constant
expressions
nvc0/ir: Handle OP_BFIND when folding constant expressions
nvc0/ir: Handle OP_POPCNT when folding constant expressions
.../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 50 +++++++++++++++++++++-
1 file changed, 48 insertions(+), 2 deletions(-)
--
1.8.4.5
Tobias Klausmann
2014-Jun-03 20:58 UTC
[Nouveau] [PATCH v2 1/4] nvc0/ir: clear subop when folding constant expressions
Some operations (e.g. OP_MUL/OP_MAD/OP_EXTBF) might have a subop set.
After folding, make sure that it is cleared.
Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de>
---
src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 1 +
1 file changed, 1 insertion(+)
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index 1a2c2e6..58092f4 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
@@ -563,6 +563,7 @@ ConstantFolding::expr(Instruction *i,
} else {
i->op = i->saturate ? OP_SAT : OP_MOV; /* SAT handled by unary() */
}
+ i->subOp = 0;
}
void
--
1.8.4.5
Tobias Klausmann
2014-Jun-03 20:58 UTC
[Nouveau] [PATCH v2 2/4] nvc0/ir: Handle reverse subop for OP_EXTBF when folding constant expressions
V2: Handle the instruction right (shift after reverse)
Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de>
---
src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 16 ++++++++++++++--
1 file changed, 14 insertions(+), 2 deletions(-)
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index 58092f4..a214ffc 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
@@ -529,8 +529,20 @@ ConstantFolding::expr(Instruction *i,
lshift = 32 - width - offset;
}
switch (i->dType) {
- case TYPE_S32: res.data.s32 = (a->data.s32 << lshift) >> rshift; break;
- case TYPE_U32: res.data.u32 = (a->data.u32 << lshift) >> rshift; break;
+ case TYPE_S32:
+ if (i->subOp == NV50_IR_SUBOP_EXTBF_REV)
+ res.data.s32 = util_bitreverse(a->data.s32);
+ else
+ res.data.s32 = a->data.s32;
+ res.data.s32 = (res.data.s32 << lshift) >> rshift;
+ break;
+ case TYPE_U32:
+ if (i->subOp == NV50_IR_SUBOP_EXTBF_REV)
+ res.data.u32 = util_bitreverse(a->data.u32);
+ else
+ res.data.u32 = a->data.u32;
+ res.data.u32 = (res.data.u32 << lshift) >> rshift;
+ break;
default:
return;
}
--
1.8.4.5
Tobias Klausmann
2014-Jun-03 20:58 UTC
[Nouveau] [PATCH v2 3/4] nvc0/ir: Handle OP_BFIND when folding constant expressions
Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de>
---
.../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index a214ffc..c497335 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
@@ -948,6 +948,24 @@ ConstantFolding::opnd(Instruction *i, ImmediateValue &imm0, int s)
case OP_EX2:
unary(i, imm0);
break;
+ case OP_BFIND: {
+ int32_t res;
+ switch (i->dType) {
+ case TYPE_S32:
+ res = util_last_bit_signed(imm0.reg.data.s32) - 1; break;
+ case TYPE_U32:
+ res = util_last_bit(imm0.reg.data.u32) -1; break;
+ default:
+ return;
+ }
+ if ((i->subOp == NV50_IR_SUBOP_BFIND_SAMT) && (res >= 0))
+ res = 31 - res;
+ i->setSrc(0, new_ImmediateValue(i->bb->getProgram(), (uint32_t)res));
+ i->setSrc(1, NULL);
+ i->op = OP_MOV;
+ i->subOp = 0;
+ break;
+ }
default:
return;
}
--
1.8.4.5
Tobias Klausmann
2014-Jun-03 20:58 UTC
[Nouveau] [PATCH v2 4/4] nvc0/ir: Handle OP_POPCNT when folding constant expressions
V2: Add support for a single-argument version of POPCNT for Maxwell (SM5)
Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de>
---
src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 15 +++++++++++++++
1 file changed, 15 insertions(+)
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index c497335..19767b4 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
@@ -548,6 +548,10 @@ ConstantFolding::expr(Instruction *i,
}
break;
}
+ case OP_POPCNT: {
+ res.data.u32 = util_bitcount(a->data.u32 & b->data.u32); break;
+ break;
+ }
default:
return;
}
@@ -966,6 +970,17 @@ ConstantFolding::opnd(Instruction *i, ImmediateValue &imm0, int s)
i->subOp = 0;
break;
}
+ case OP_POPCNT: {
+ uint32_t res;
+ if (!i->srcExists(1)) {
+ res = util_bitcount(imm0.reg.data.u32);
+ i->setSrc(0, new_ImmediateValue(i->bb->getProgram(), res));
+ i->setSrc(1, NULL);
+ i->op = OP_MOV;
+ i->subOp = 0;
+ }
+ break;
+ }
default:
return;
}
--
1.8.4.5
Ilia Mirkin
2014-Jun-03 21:01 UTC
[Nouveau] [PATCH v2 4/4] nvc0/ir: Handle OP_POPCNT when folding constant expressions
On Tue, Jun 3, 2014 at 4:58 PM, Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de> wrote:
> V2: Add support for a single-argument version of POPCNT for Maxwell (SM5)
>
> Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de>
> ---
> src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 15 +++++++++++++++
> 1 file changed, 15 insertions(+)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> index c497335..19767b4 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> @@ -548,6 +548,10 @@ ConstantFolding::expr(Instruction *i,
> }
> break;
> }
> + case OP_POPCNT: {
> + res.data.u32 = util_bitcount(a->data.u32 & b->data.u32); break;
> + break;

Do you really need 2 breaks here? Also, funny indentation.

> + }
> default:
> return;
> }
> @@ -966,6 +970,17 @@ ConstantFolding::opnd(Instruction *i, ImmediateValue &imm0, int s)
> i->subOp = 0;
> break;
> }
> + case OP_POPCNT: {
> + uint32_t res;
> + if (!i->srcExists(1)) {
> + res = util_bitcount(imm0.reg.data.u32);
> + i->setSrc(0, new_ImmediateValue(i->bb->getProgram(), res));
> + i->setSrc(1, NULL);

A little overkill -- src(1) already doesn't exist... can get rid of that, I think.

> + i->op = OP_MOV;
> + i->subOp = 0;
> + }
> + break;
> + }
> default:
> return;
> }
> --
> 1.8.4.5
>
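For reference, the two POPCNT cases in this patch can be sketched outside of Mesa roughly as follows. This is only an illustration of the arithmetic being folded, not the driver code: `bitcount32` is a stand-in for `util_bitcount`, and `foldPopcnt` is a hypothetical name that does not exist in nv50_ir.

```cpp
#include <bitset>
#include <cassert>
#include <cstdint>

// Stand-in for Mesa's util_bitcount(): number of set bits in a 32-bit word.
static uint32_t bitcount32(uint32_t x)
{
   return static_cast<uint32_t>(std::bitset<32>(x).count());
}

// Two-source form, as folded in ConstantFolding::expr() in the patch:
// count the bits that are set in both operands.
static uint32_t foldPopcnt(uint32_t a, uint32_t b)
{
   return bitcount32(a & b);
}

// Single-source form (the Maxwell case mentioned in the V2 note):
// count the bits of the only operand.
static uint32_t foldPopcnt(uint32_t a)
{
   return bitcount32(a);
}
```

With both operands immediate, `foldPopcnt(0xF0F0F0F0, 0x0000FFFF)` counts the bits of `0x0000F0F0`; the one-operand overload covers the case where src(1) does not exist, which is why the extra `setSrc(1, NULL)` is redundant there.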
Ilia Mirkin
2014-Jun-03 21:03 UTC
[Nouveau] [PATCH v2 2/4] nvc0/ir: Handle reverse subop for OP_EXTBF when folding constant expressions
On Tue, Jun 3, 2014 at 4:58 PM, Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de> wrote:
> V2: Handle the instruction right (shift after reverse)
>
> Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de>
> ---
> src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 16 ++++++++++++++--
> 1 file changed, 14 insertions(+), 2 deletions(-)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> index 58092f4..a214ffc 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> @@ -529,8 +529,20 @@ ConstantFolding::expr(Instruction *i,
> lshift = 32 - width - offset;
> }
> switch (i->dType) {
> - case TYPE_S32: res.data.s32 = (a->data.s32 << lshift) >> rshift; break;
> - case TYPE_U32: res.data.u32 = (a->data.u32 << lshift) >> rshift; break;
> + case TYPE_S32:
> + if (i->subOp == NV50_IR_SUBOP_EXTBF_REV)
> + res.data.s32 = util_bitreverse(a->data.s32);
> + else
> + res.data.s32 = a->data.s32;

Why not do this once outside of the switch statement? The two are actually the same -- util_bitreverse doesn't care about signed/unsigned, and res.data is a union.

> + res.data.s32 = (res.data.s32 << lshift) >> rshift;
> + break;
> + case TYPE_U32:
> + if (i->subOp == NV50_IR_SUBOP_EXTBF_REV)
> + res.data.u32 = util_bitreverse(a->data.u32);
> + else
> + res.data.u32 = a->data.u32;
> + res.data.u32 = (res.data.u32 << lshift) >> rshift;
> + break;
> default:
> return;
> }
> --
> 1.8.4.5
>
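The suggestion above (reverse once, then let only the final shift depend on the type) could look roughly like the standalone sketch below. It is not the Mesa code: `bitreverse32` stands in for `util_bitreverse`, `foldExtbf` and `Type` are hypothetical names, and the `rshift`/`lshift` setup is an assumption reconstructed from the single context line visible in the hunk. The sketch also assumes `offset < 32` and an arithmetic right shift for signed values, which common compilers provide.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for Mesa's util_bitreverse(): reverse the bit order of a 32-bit word.
static uint32_t bitreverse32(uint32_t x)
{
   x = ((x >> 1) & 0x55555555u) | ((x & 0x55555555u) << 1);
   x = ((x >> 2) & 0x33333333u) | ((x & 0x33333333u) << 2);
   x = ((x >> 4) & 0x0f0f0f0fu) | ((x & 0x0f0f0f0fu) << 4);
   x = ((x >> 8) & 0x00ff00ffu) | ((x & 0x00ff00ffu) << 8);
   return (x >> 16) | (x << 16);
}

enum class Type { S32, U32 }; // hypothetical stand-in for i->dType

// Hypothetical helper: extract `width` bits starting at bit `offset`,
// optionally bit-reversing the source first (the EXTBF_REV subop).
static uint32_t foldExtbf(uint32_t a, unsigned offset, unsigned width,
                          Type ty, bool reverse)
{
   if (width == 0 || offset >= 32)
      return 0; // simplification for this sketch, avoids out-of-range shifts

   unsigned rshift = offset;
   unsigned lshift = 0;
   if (width + offset < 32) {
      rshift = 32 - width;
      lshift = 32 - width - offset;
   }

   // Reverse once, before the type switch: bit reversal is sign-agnostic.
   uint32_t v = reverse ? bitreverse32(a) : a;
   v <<= lshift; // left shift done on the unsigned value

   switch (ty) {
   case Type::S32: // arithmetic shift sign-extends the extracted field
      return static_cast<uint32_t>(static_cast<int32_t>(v) >> rshift);
   case Type::U32: // logical shift zero-extends it
      return v >> rshift;
   }
   return 0;
}
```

Only the right shift differs between the signed and unsigned paths, so the `if`/`else` on the subop does not need to be duplicated in each case.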
Ilia Mirkin
2014-Jun-03 21:05 UTC
[Nouveau] [PATCH v2 3/4] nvc0/ir: Handle OP_BFIND when folding constant expressions
On Tue, Jun 3, 2014 at 4:58 PM, Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de> wrote:
> Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de>
> ---
> .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 18 ++++++++++++++++++
> 1 file changed, 18 insertions(+)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> index a214ffc..c497335 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> @@ -948,6 +948,24 @@ ConstantFolding::opnd(Instruction *i, ImmediateValue &imm0, int s)
> case OP_EX2:
> unary(i, imm0);
> break;
> + case OP_BFIND: {
> + int32_t res;
> + switch (i->dType) {
> + case TYPE_S32:
> + res = util_last_bit_signed(imm0.reg.data.s32) - 1; break;

The style elsewhere is to do case TYPE_S32: foo; break; if it fits. Otherwise put the break on a separate line.

> + case TYPE_U32:
> + res = util_last_bit(imm0.reg.data.u32) -1; break;

Missing space between "-" and "1".

> + default:
> + return;
> + }
> + if ((i->subOp == NV50_IR_SUBOP_BFIND_SAMT) && (res >= 0))

No need for the extra parens. && comes after ==.

> + res = 31 - res;
> + i->setSrc(0, new_ImmediateValue(i->bb->getProgram(), (uint32_t)res));

Why the typecast?

> + i->setSrc(1, NULL);
> + i->op = OP_MOV;
> + i->subOp = 0;
> + break;
> + }
> default:
> return;
> }
> --
> 1.8.4.5
>
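As an illustration of what this hunk computes, here is a standalone sketch of the BFIND fold with the style remarks applied. It is not the Mesa code: `lastBit` and `lastBitSigned` are stand-ins written under the assumption that `util_last_bit` returns the 1-based position of the highest set bit (0 for a zero input) and that `util_last_bit_signed` does the same for the highest bit that differs from the sign bit. `foldBfind` and its boolean parameters are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Assumed behaviour of util_last_bit(): 1-based index of the highest set bit,
// or 0 when no bit is set.
static int lastBit(uint32_t u)
{
   int n = 0;
   while (u) {
      ++n;
      u >>= 1;
   }
   return n;
}

// Assumed behaviour of util_last_bit_signed(): for negative inputs, look for
// the highest bit that differs from the sign bit.
static int lastBitSigned(int32_t i)
{
   return i >= 0 ? lastBit(static_cast<uint32_t>(i))
                 : lastBit(~static_cast<uint32_t>(i));
}

// Hypothetical helper mirroring the OP_BFIND case: returns the immediate that
// would replace the instruction's source.
static uint32_t foldBfind(uint32_t src, bool isSigned, bool shiftAmount)
{
   int32_t res;
   if (isSigned)
      res = lastBitSigned(static_cast<int32_t>(src)) - 1;
   else
      res = lastBit(src) - 1;

   // SAMT subop: turn the bit index into a left-shift amount, leaving the
   // "not found" value (-1) untouched.
   if (shiftAmount && res >= 0)
      res = 31 - res;

   return static_cast<uint32_t>(res); // -1 becomes 0xffffffff
}
```

In this sketch the final conversion only changes how the "not found" result is represented; whether the explicit cast is needed in the patch depends on which `new_ImmediateValue` overload is intended.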
Ilia Mirkin
2014-Jun-03 21:06 UTC
[Nouveau] [PATCH v2 1/4] nvc0/ir: clear subop when folding constant expressions
On Tue, Jun 3, 2014 at 4:58 PM, Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de> wrote:
> Some operations (e.g. OP_MUL/OP_MAD/OP_EXTBF) might have a subop set.
> After folding, make sure that it is cleared.
>
> Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de>

Reviewed-by: Ilia Mirkin <imirkin at alum.mit.edu>

> ---
> src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> index 1a2c2e6..58092f4 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> @@ -563,6 +563,7 @@ ConstantFolding::expr(Instruction *i,
> } else {
> i->op = i->saturate ? OP_SAT : OP_MOV; /* SAT handled by unary() */
> }
> + i->subOp = 0;
> }
>
> void
> --
> 1.8.4.5
>