search for: op_mad

Displaying 20 results from an estimated 37 matches for "op_mad".

2016 Oct 02
1
[PATCH] nv50/ir: Propagate third immediate src when folding OP_MAD
...should have picked that up even in > its current form. Should try to figure out why it didn't and that is > likely to "fix" a *lot* more situations. Actually i was coming from an, given really constrained, addition to the LoadPropagation pass, where i was told to fix it within OP_MAD :/ > On Sun, Oct 2, 2016 at 2:24 PM, Tobias Klausmann > <tobias.johannes.klausmann at mni.thm.de> wrote: >> >> On 02.10.2016 20:03, Ilia Mirkin wrote: >>> On Sun, Oct 2, 2016 at 1:58 PM, Tobias Klausmann >>> <tobias.johannes.klausmann at mni.thm.de>...
2016 Oct 02
0
[PATCH] nv50/ir: Propagate third immediate src when folding OP_MAD
...index 9875738..8bb5cf9 100644 > --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp > +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp > @@ -1008,13 +1008,22 @@ ConstantFolding::opnd(Instruction *i, ImmediateValue &imm0, int s) > break; > case OP_MAD: > if (imm0.isInteger(0)) { > + ImmediateValue imm1; > i->setSrc(0, i->getSrc(2)); > i->src(0).mod = i->src(2).mod; > i->setSrc(1, NULL); > i->setSrc(2, NULL); > - i->op = i->src(0).mod.getOp...
2016 Oct 02
2
[PATCH] nv50/ir: Propagate third immediate src when folding OP_MAD
...egen/nv50_ir_peephole.cpp index 9875738..8bb5cf9 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp @@ -1008,13 +1008,22 @@ ConstantFolding::opnd(Instruction *i, ImmediateValue &imm0, int s) break; case OP_MAD: if (imm0.isInteger(0)) { + ImmediateValue imm1; i->setSrc(0, i->getSrc(2)); i->src(0).mod = i->src(2).mod; i->setSrc(1, NULL); i->setSrc(2, NULL); - i->op = i->src(0).mod.getOp(); - if (i->op != OP_CV...
2016 Oct 02
0
[PATCH] nv50/ir: Propagate third immediate src when folding OP_MAD
...allium/drivers/nouveau/codegen/nv50_ir_peephole.cpp >>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp >>> @@ -1008,13 +1008,22 @@ ConstantFolding::opnd(Instruction *i, >>> ImmediateValue &imm0, int s) >>> break; >>> case OP_MAD: >>> if (imm0.isInteger(0)) { >>> + ImmediateValue imm1; >>> i->setSrc(0, i->getSrc(2)); >>> i->src(0).mod = i->src(2).mod; >>> i->setSrc(1, NULL); >>> i->setSrc(2, NU...
2016 Oct 02
2
[PATCH] nv50/ir: Propagate third immediate src when folding OP_MAD
...9 100644 >> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp >> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp >> @@ -1008,13 +1008,22 @@ ConstantFolding::opnd(Instruction *i, ImmediateValue &imm0, int s) >> break; >> case OP_MAD: >> if (imm0.isInteger(0)) { >> + ImmediateValue imm1; >> i->setSrc(0, i->getSrc(2)); >> i->src(0).mod = i->src(2).mod; >> i->setSrc(1, NULL); >> i->setSrc(2, NULL); >> -...
2015 Feb 23
2
[PATCH 1/2] nv50/ir: add fp64 support on G200 (NVA0)
...insn) emitUADD(insn); break; case OP_MUL: - if (isFloatType(insn->dType)) + if (insn->dType == TYPE_F64) + emitDMUL(insn); + else if (isFloatType(insn->dType)) emitFMUL(insn); else emitIMUL(insn); break; case OP_MAD: case OP_FMA: - if (isFloatType(insn->dType)) + if (insn->dType == TYPE_F64) + emitDMAD(insn); + else if (isFloatType(insn->dType)) emitFMAD(insn); else emitIMAD(insn); @@ -1912,7 +1986,7 @@ CodeEmitterNV50::getMinEncodingSize(const Ins...
2017 Mar 26
5
[PATCH v5 0/5] nvc0/ir: add support for MAD/FMA PostRALoadPropagation
was "nv50/ir: PostRaConstantFolding improvements" before. nothing really changed from the last version, just minor things. Karol Herbst (5): nv50/ir: restructure and rename postraconstantfolding pass nv50/ir: implement mad post ra folding for nvc0+ gk110/ir: add LIMM form of mad gm107/ir: add LIMM form of mad nv50/ir: also do PostRaLoadPropagation for FMA
2015 Feb 23
2
[Mesa-dev] [PATCH 2/2] nvc0/ir: improve precision of double RCP/RSQ results
...bld.loadImm(NULL, 1.5f)); > + > + half_input = bld.mkOp2v(OP_MUL, TYPE_F64, bld.getSSA(8), half_input, input); > + // RSQ: x_{n+1} = x_n * (1.5 - 0.5 * input * x_n^2) > + guess = bld.mkOp2v(OP_MUL, TYPE_F64, bld.getSSA(8), guess, > + bld.mkOp3v(OP_MAD, TYPE_F64, bld.getSSA(8), half_input, > + bld.mkOp2v(OP_MUL, TYPE_F64, bld.getSSA(8), guess, guess), > + three_half)); > + guess = bld.mkOp2v(OP_MUL, TYPE_F64, bld.getSSA(8), guess, > + bl...
2014 May 18
1
[PATCH 1/2] nv50/ir: fix s32 x s32 -> high s32 multiply logic
...s - i[0] = bld->mkSplit(a, halfSize, mul->getSrc(0)); - i[1] = bld->mkSplit(b, halfSize, mul->getSrc(1)); + i[0] = bld->mkSplit(a, halfSize, s[0]); + i[1] = bld->mkSplit(b, halfSize, s[1]); i[2] = bld->mkOp2(OP_MUL, fTy, t[0], a[0], b[1]); i[3] = bld->mkOp3(OP_MAD, fTy, t[1], a[1], b[0], t[0]); @@ -75,24 +92,76 @@ expandIntegerMUL(BuildUtil *bld, Instruction *mul) i[4] = bld->mkOp3(OP_MAD, fTy, t[3], a[0], b[0], t[2]); if (highResult) { - Value *r[4]; + Value *c[2]; + Value *r[5]; Value *imm = bld->loadImm(NULL, 1 <&l...
2015 Jan 02
0
[PATCH] nv50/ir: Fold sat into mad
...um/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp @@ -87,7 +87,7 @@ static const struct opProperties _initProps[] = { OP_MUL, 0x3, 0x0, 0x0, 0x0, 0x2, 0x1, 0x1, 0x2 }, { OP_MAX, 0x3, 0x3, 0x0, 0x0, 0x2, 0x1, 0x1, 0x0 }, { OP_MIN, 0x3, 0x3, 0x0, 0x0, 0x2, 0x1, 0x1, 0x0 }, - { OP_MAD, 0x7, 0x0, 0x0, 0x0, 0x6, 0x1, 0x1, 0x0 }, // special constraint + { OP_MAD, 0x7, 0x0, 0x0, 0x8, 0x6, 0x1, 0x1, 0x0 }, // special constraint { OP_ABS, 0x0, 0x0, 0x0, 0x0, 0x0, 0x1, 0x1, 0x0 }, { OP_NEG, 0x0, 0x1, 0x0, 0x0, 0x0, 0x1, 0x1, 0x0 }, { OP_CVT, 0x1, 0x1, 0x0,...
2014 Jun 23
2
[Mesa-dev] [PATCH v3 1/4] nvc0/ir: clear subop when folding constant expressions
Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de> writes: > Some operations (e.g. OP_MUL/OP_MAD/OP_EXTBF might have a subop set. > After folding, make sure that it is cleared > > Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de> > Reviewed-by: Ilia Mirkin <imirkin at alum.mit.edu> Cc: "10.1 10.2" <mesa-stable at lists.freedesktop.or...
2015 Feb 23
0
[PATCH 2/2] nvc0/ir: improve precision of double RCP/RSQ results
...64, three_half, TYPE_F32, bld.loadImm(NULL, 1.5f)); + + half_input = bld.mkOp2v(OP_MUL, TYPE_F64, bld.getSSA(8), half_input, input); + // RSQ: x_{n+1} = x_n * (1.5 - 0.5 * input * x_n^2) + guess = bld.mkOp2v(OP_MUL, TYPE_F64, bld.getSSA(8), guess, + bld.mkOp3v(OP_MAD, TYPE_F64, bld.getSSA(8), half_input, + bld.mkOp2v(OP_MUL, TYPE_F64, bld.getSSA(8), guess, guess), + three_half)); + guess = bld.mkOp2v(OP_MUL, TYPE_F64, bld.getSSA(8), guess, + bld.mkOp3v(OP_MAD, TYP...
2014 May 13
1
[PATCH 1/2] nv50/ir: make sure that texprep/texquerylod's args get coalesced
Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu> Cc: "10.2" <mesa-stable at lists.freedesktop.org> --- Not 100% sure of the significance of this code, but this seems like the correct thing to do... will definitely run it through a full piglit run before pushing out. src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp | 2 ++ 1 file changed, 2 insertions(+) diff --git
2015 Feb 23
0
[Mesa-dev] [PATCH 2/2] nvc0/ir: improve precision of double RCP/RSQ results
....5f)); >> + >> + half_input = bld.mkOp2v(OP_MUL, TYPE_F64, bld.getSSA(8), half_input, input); >> + // RSQ: x_{n+1} = x_n * (1.5 - 0.5 * input * x_n^2) >> + guess = bld.mkOp2v(OP_MUL, TYPE_F64, bld.getSSA(8), guess, >> + bld.mkOp3v(OP_MAD, TYPE_F64, bld.getSSA(8), half_input, >> + bld.mkOp2v(OP_MUL, TYPE_F64, bld.getSSA(8), guess, guess), >> + three_half)); >> + guess = bld.mkOp2v(OP_MUL, TYPE_F64, bld.getSSA(8), guess, >> +...
2015 Jan 11
6
[PATCH 1/3] nv50/ir: Add support for MAD short+IMM notation
MAD IMM has a very specific SDST == SSRC2 requirement, so don't emit Signed-off-by: Roy Spliet <rspliet at eclipso.eu> --- .../drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp | 18 ++++++++++++------ .../drivers/nouveau/codegen/nv50_ir_target_nv50.cpp | 2 +- 2 files changed, 13 insertions(+), 7 deletions(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp
2014 Jun 03
6
[PATCH v3 0/4] Constant folding of new Instructions
Yet another try for constant folding of Instructions for nvc0. Please Review this again! (Hopefully the last time ;-) ) Tobias Klausmann (4): nvc0/ir: clear subop when folding constant expressions nvc0/ir: Handle reverse subop for OP_EXTBF when folding constant expressions nvc0/ir: Handle OP_BFIND when folding constant expressions nvc0/ir: Handle OP_POPCNT when folding constant
2015 Jan 13
3
nv50/ir: Implement short notation for MAD V2
V2: clarify code, commit msgs, add comments. Drop code to was supposed to make register assignment prefer SDST == SRC2 (patch 2) for now, because it didn't quite do what I intended.
2014 Jun 03
8
[PATCH v2 0/4] Constant folding of new Instructions
And another try for constant folding of Instructions for nvc0. Please Review this! Thanks, Tobias Klausmann Tobias Klausmann (4): nvc0/ir: clear subop when folding constant expressions nvc0/ir: Handle reverse subop for OP_EXTBF when folding constant expressions nvc0/ir: Handle OP_BFIND when folding constant expressions nvc0/ir: Handle OP_POPCNT when folding constant expressions
2015 Nov 05
7
[PATCH mesa 0/5] nouveau: codegen: Make use of double immediates
Hi All, This series implements using double immediates in the nouveau codegen code. This turns the following (nvc0) code: 1: mov u32 $r2 0x00000000 (8) 2: mov u32 $r3 0x3fe00000 (8) 3: add f64 $r0d $r0d $r2d (8) Into: 1: add f64 $r0d $r0d 0.500000 (8) This has been tested with the 2 double shader tests which I just send to the piglet list. On a gk208 (gk110 / SM35)
2015 Jan 23
3
[PATCH 1/2] nv50/ir: Add support for MAD short+IMM notation
Add emission rules for negative and saturate flags for MAD 4-byte opcodes, and get rid of constraints. Short MAD has a very specific SDST == SSRC2 requirement, and since MAD IMM is short notation + 4-byte immediate, don't have the compiler create MAD IMM instructions yet. V2: Document MAD as supported short form Signed-off-by: Roy Spliet <rspliet at eclipso.eu> ---