Displaying 20 results from an estimated 37 matches for "op_mad".
2016 Oct 02
1
[PATCH] nv50/ir: Propagate third immediate src when folding OP_MAD
...should have picked that up even in
> its current form. Should try to figure out why it didn't and that is
> likely to "fix" a *lot* more situations.
Actually i was coming from an, given really constrained, addition to the
LoadPropagation pass, where i was told to fix it within OP_MAD :/
> On Sun, Oct 2, 2016 at 2:24 PM, Tobias Klausmann
> <tobias.johannes.klausmann at mni.thm.de> wrote:
>>
>> On 02.10.2016 20:03, Ilia Mirkin wrote:
>>> On Sun, Oct 2, 2016 at 1:58 PM, Tobias Klausmann
>>> <tobias.johannes.klausmann at mni.thm.de>...
2016 Oct 02
0
[PATCH] nv50/ir: Propagate third immediate src when folding OP_MAD
...index 9875738..8bb5cf9 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> @@ -1008,13 +1008,22 @@ ConstantFolding::opnd(Instruction *i, ImmediateValue &imm0, int s)
> break;
> case OP_MAD:
> if (imm0.isInteger(0)) {
> + ImmediateValue imm1;
> i->setSrc(0, i->getSrc(2));
> i->src(0).mod = i->src(2).mod;
> i->setSrc(1, NULL);
> i->setSrc(2, NULL);
> - i->op = i->src(0).mod.getOp...
2016 Oct 02
2
[PATCH] nv50/ir: Propagate third immediate src when folding OP_MAD
...egen/nv50_ir_peephole.cpp
index 9875738..8bb5cf9 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
@@ -1008,13 +1008,22 @@ ConstantFolding::opnd(Instruction *i, ImmediateValue &imm0, int s)
break;
case OP_MAD:
if (imm0.isInteger(0)) {
+ ImmediateValue imm1;
i->setSrc(0, i->getSrc(2));
i->src(0).mod = i->src(2).mod;
i->setSrc(1, NULL);
i->setSrc(2, NULL);
- i->op = i->src(0).mod.getOp();
- if (i->op != OP_CV...
2016 Oct 02
0
[PATCH] nv50/ir: Propagate third immediate src when folding OP_MAD
...allium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
>>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
>>> @@ -1008,13 +1008,22 @@ ConstantFolding::opnd(Instruction *i,
>>> ImmediateValue &imm0, int s)
>>> break;
>>> case OP_MAD:
>>> if (imm0.isInteger(0)) {
>>> + ImmediateValue imm1;
>>> i->setSrc(0, i->getSrc(2));
>>> i->src(0).mod = i->src(2).mod;
>>> i->setSrc(1, NULL);
>>> i->setSrc(2, NU...
2016 Oct 02
2
[PATCH] nv50/ir: Propagate third immediate src when folding OP_MAD
...9 100644
>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
>> @@ -1008,13 +1008,22 @@ ConstantFolding::opnd(Instruction *i, ImmediateValue &imm0, int s)
>> break;
>> case OP_MAD:
>> if (imm0.isInteger(0)) {
>> + ImmediateValue imm1;
>> i->setSrc(0, i->getSrc(2));
>> i->src(0).mod = i->src(2).mod;
>> i->setSrc(1, NULL);
>> i->setSrc(2, NULL);
>> -...
2015 Feb 23
2
[PATCH 1/2] nv50/ir: add fp64 support on G200 (NVA0)
...insn)
emitUADD(insn);
break;
case OP_MUL:
- if (isFloatType(insn->dType))
+ if (insn->dType == TYPE_F64)
+ emitDMUL(insn);
+ else if (isFloatType(insn->dType))
emitFMUL(insn);
else
emitIMUL(insn);
break;
case OP_MAD:
case OP_FMA:
- if (isFloatType(insn->dType))
+ if (insn->dType == TYPE_F64)
+ emitDMAD(insn);
+ else if (isFloatType(insn->dType))
emitFMAD(insn);
else
emitIMAD(insn);
@@ -1912,7 +1986,7 @@ CodeEmitterNV50::getMinEncodingSize(const Ins...
2017 Mar 26
5
[PATCH v5 0/5] nvc0/ir: add support for MAD/FMA PostRALoadPropagation
was "nv50/ir: PostRaConstantFolding improvements" before.
nothing really changed from the last version, just minor things.
Karol Herbst (5):
nv50/ir: restructure and rename postraconstantfolding pass
nv50/ir: implement mad post ra folding for nvc0+
gk110/ir: add LIMM form of mad
gm107/ir: add LIMM form of mad
nv50/ir: also do PostRaLoadPropagation for FMA
2015 Feb 23
2
[Mesa-dev] [PATCH 2/2] nvc0/ir: improve precision of double RCP/RSQ results
...bld.loadImm(NULL, 1.5f));
> +
> + half_input = bld.mkOp2v(OP_MUL, TYPE_F64, bld.getSSA(8), half_input, input);
> + // RSQ: x_{n+1} = x_n * (1.5 - 0.5 * input * x_n^2)
> + guess = bld.mkOp2v(OP_MUL, TYPE_F64, bld.getSSA(8), guess,
> + bld.mkOp3v(OP_MAD, TYPE_F64, bld.getSSA(8), half_input,
> + bld.mkOp2v(OP_MUL, TYPE_F64, bld.getSSA(8), guess, guess),
> + three_half));
> + guess = bld.mkOp2v(OP_MUL, TYPE_F64, bld.getSSA(8), guess,
> + bl...
2014 May 18
1
[PATCH 1/2] nv50/ir: fix s32 x s32 -> high s32 multiply logic
...s
- i[0] = bld->mkSplit(a, halfSize, mul->getSrc(0));
- i[1] = bld->mkSplit(b, halfSize, mul->getSrc(1));
+ i[0] = bld->mkSplit(a, halfSize, s[0]);
+ i[1] = bld->mkSplit(b, halfSize, s[1]);
i[2] = bld->mkOp2(OP_MUL, fTy, t[0], a[0], b[1]);
i[3] = bld->mkOp3(OP_MAD, fTy, t[1], a[1], b[0], t[0]);
@@ -75,24 +92,76 @@ expandIntegerMUL(BuildUtil *bld, Instruction *mul)
i[4] = bld->mkOp3(OP_MAD, fTy, t[3], a[0], b[0], t[2]);
if (highResult) {
- Value *r[4];
+ Value *c[2];
+ Value *r[5];
Value *imm = bld->loadImm(NULL, 1 <&l...
2015 Jan 02
0
[PATCH] nv50/ir: Fold sat into mad
...um/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp
@@ -87,7 +87,7 @@ static const struct opProperties _initProps[] =
{ OP_MUL, 0x3, 0x0, 0x0, 0x0, 0x2, 0x1, 0x1, 0x2 },
{ OP_MAX, 0x3, 0x3, 0x0, 0x0, 0x2, 0x1, 0x1, 0x0 },
{ OP_MIN, 0x3, 0x3, 0x0, 0x0, 0x2, 0x1, 0x1, 0x0 },
- { OP_MAD, 0x7, 0x0, 0x0, 0x0, 0x6, 0x1, 0x1, 0x0 }, // special constraint
+ { OP_MAD, 0x7, 0x0, 0x0, 0x8, 0x6, 0x1, 0x1, 0x0 }, // special constraint
{ OP_ABS, 0x0, 0x0, 0x0, 0x0, 0x0, 0x1, 0x1, 0x0 },
{ OP_NEG, 0x0, 0x1, 0x0, 0x0, 0x0, 0x1, 0x1, 0x0 },
{ OP_CVT, 0x1, 0x1, 0x0,...
2014 Jun 23
2
[Mesa-dev] [PATCH v3 1/4] nvc0/ir: clear subop when folding constant expressions
Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de> writes:
> Some operations (e.g. OP_MUL/OP_MAD/OP_EXTBF might have a subop set.
> After folding, make sure that it is cleared
>
> Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de>
> Reviewed-by: Ilia Mirkin <imirkin at alum.mit.edu>
Cc: "10.1 10.2" <mesa-stable at lists.freedesktop.or...
2015 Feb 23
0
[PATCH 2/2] nvc0/ir: improve precision of double RCP/RSQ results
...64, three_half, TYPE_F32, bld.loadImm(NULL, 1.5f));
+
+ half_input = bld.mkOp2v(OP_MUL, TYPE_F64, bld.getSSA(8), half_input, input);
+ // RSQ: x_{n+1} = x_n * (1.5 - 0.5 * input * x_n^2)
+ guess = bld.mkOp2v(OP_MUL, TYPE_F64, bld.getSSA(8), guess,
+ bld.mkOp3v(OP_MAD, TYPE_F64, bld.getSSA(8), half_input,
+ bld.mkOp2v(OP_MUL, TYPE_F64, bld.getSSA(8), guess, guess),
+ three_half));
+ guess = bld.mkOp2v(OP_MUL, TYPE_F64, bld.getSSA(8), guess,
+ bld.mkOp3v(OP_MAD, TYP...
2014 May 13
1
[PATCH 1/2] nv50/ir: make sure that texprep/texquerylod's args get coalesced
Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu>
Cc: "10.2" <mesa-stable at lists.freedesktop.org>
---
Not 100% sure of the significance of this code, but this seems like the
correct thing to do... will definitely run it through a full piglit run before
pushing out.
src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp | 2 ++
1 file changed, 2 insertions(+)
diff --git
2015 Feb 23
0
[Mesa-dev] [PATCH 2/2] nvc0/ir: improve precision of double RCP/RSQ results
....5f));
>> +
>> + half_input = bld.mkOp2v(OP_MUL, TYPE_F64, bld.getSSA(8), half_input, input);
>> + // RSQ: x_{n+1} = x_n * (1.5 - 0.5 * input * x_n^2)
>> + guess = bld.mkOp2v(OP_MUL, TYPE_F64, bld.getSSA(8), guess,
>> + bld.mkOp3v(OP_MAD, TYPE_F64, bld.getSSA(8), half_input,
>> + bld.mkOp2v(OP_MUL, TYPE_F64, bld.getSSA(8), guess, guess),
>> + three_half));
>> + guess = bld.mkOp2v(OP_MUL, TYPE_F64, bld.getSSA(8), guess,
>> +...
2015 Jan 11
6
[PATCH 1/3] nv50/ir: Add support for MAD short+IMM notation
MAD IMM has a very specific SDST == SSRC2 requirement, so don't emit
Signed-off-by: Roy Spliet <rspliet at eclipso.eu>
---
.../drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp | 18 ++++++++++++------
.../drivers/nouveau/codegen/nv50_ir_target_nv50.cpp | 2 +-
2 files changed, 13 insertions(+), 7 deletions(-)
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp
2014 Jun 03
6
[PATCH v3 0/4] Constant folding of new Instructions
Yet another try for constant folding of Instructions for nvc0.
Please Review this again! (Hopefully the last time ;-) )
Tobias Klausmann (4):
nvc0/ir: clear subop when folding constant expressions
nvc0/ir: Handle reverse subop for OP_EXTBF when folding constant
expressions
nvc0/ir: Handle OP_BFIND when folding constant expressions
nvc0/ir: Handle OP_POPCNT when folding constant
2015 Jan 13
3
nv50/ir: Implement short notation for MAD V2
V2: clarify code, commit msgs, add comments. Drop code to was supposed to
make register assignment prefer SDST == SRC2 (patch 2) for now, because it
didn't quite do what I intended.
2014 Jun 03
8
[PATCH v2 0/4] Constant folding of new Instructions
And another try for constant folding of Instructions for nvc0.
Please Review this!
Thanks,
Tobias Klausmann
Tobias Klausmann (4):
nvc0/ir: clear subop when folding constant expressions
nvc0/ir: Handle reverse subop for OP_EXTBF when folding constant
expressions
nvc0/ir: Handle OP_BFIND when folding constant expressions
nvc0/ir: Handle OP_POPCNT when folding constant expressions
2015 Nov 05
7
[PATCH mesa 0/5] nouveau: codegen: Make use of double immediates
Hi All,
This series implements using double immediates in the nouveau codegen code.
This turns the following (nvc0) code:
1: mov u32 $r2 0x00000000 (8)
2: mov u32 $r3 0x3fe00000 (8)
3: add f64 $r0d $r0d $r2d (8)
Into:
1: add f64 $r0d $r0d 0.500000 (8)
This has been tested with the 2 double shader tests which I just send to
the piglet list. On a gk208 (gk110 / SM35)
2015 Jan 23
3
[PATCH 1/2] nv50/ir: Add support for MAD short+IMM notation
Add emission rules for negative and saturate flags for MAD 4-byte opcodes,
and get rid of constraints. Short MAD has a very specific SDST == SSRC2
requirement, and since MAD IMM is short notation + 4-byte immediate, don't
have the compiler create MAD IMM instructions yet.
V2: Document MAD as supported short form
Signed-off-by: Roy Spliet <rspliet at eclipso.eu>
---