thr3ads.net - search: "mkop3"

Displaying 10 results from an estimated 10 matches for "mkop3".

Did you mean: mkop2

[PATCH] gm107/ir: use lane 0 for manual textureGrad handling

2017 Dec 20

[PATCH] gm107/ir: use lane 0 for manual textureGrad handling

...++l) { Value *src[3], *val; - // mov coordinates from lane l to all lanes + Value *lane = bld.mkImm(l); bld.mkOp(OP_QUADON, TYPE_NONE, NULL); + // Make sure lane 0 has the appropriate array/depth compare values + if (l != 0) { + if (array) + bld.mkOp3(OP_SHFL, TYPE_F32, arr, i->getSrc(0), lane, quad); + if (i->tex.target.isShadow()) + bld.mkOp3(OP_SHFL, TYPE_F32, shadow, i->getSrc(array + dim), lane, quad); + } + + // mov coordinates from lane l to all lanes for (c = 0; c < dim; ++c) { - bl...

[PATCH] gm107/ir: use lane 0 for manual textureGrad handling

2017 Dec 20

[PATCH] gm107/ir: use lane 0 for manual textureGrad handling

...> - // mov coordinates from lane l to all lanes > + Value *lane = bld.mkImm(l); > bld.mkOp(OP_QUADON, TYPE_NONE, NULL); > + // Make sure lane 0 has the appropriate array/depth compare values > + if (l != 0) { > + if (array) > + bld.mkOp3(OP_SHFL, TYPE_F32, arr, i->getSrc(0), lane, quad); > + if (i->tex.target.isShadow()) > + bld.mkOp3(OP_SHFL, TYPE_F32, shadow, i->getSrc(array + dim), lane, quad); In the great argument switcheroo between each SM version, the shadow compare is actually after the in...

[PATCH 1/2] nv50/ir: fix s32 x s32 -> high s32 multiply logic

2014 May 18

[PATCH 1/2] nv50/ir: fix s32 x s32 -> high s32 multiply logic

...halves - i[0] = bld->mkSplit(a, halfSize, mul->getSrc(0)); - i[1] = bld->mkSplit(b, halfSize, mul->getSrc(1)); + i[0] = bld->mkSplit(a, halfSize, s[0]); + i[1] = bld->mkSplit(b, halfSize, s[1]); i[2] = bld->mkOp2(OP_MUL, fTy, t[0], a[0], b[1]); i[3] = bld->mkOp3(OP_MAD, fTy, t[1], a[1], b[0], t[0]); @@ -75,24 +92,76 @@ expandIntegerMUL(BuildUtil *bld, Instruction *mul) i[4] = bld->mkOp3(OP_MAD, fTy, t[3], a[0], b[0], t[2]); if (highResult) { - Value *r[4]; + Value *c[2]; + Value *r[5]; Value *imm = bld->loadImm(NULL, 1...

[PATCH] gm107/ir: fix texture argument order

2014 Sep 25

[PATCH] gm107/ir: fix texture argument order

...// create it if it's not already there, and INSBF it if it already // is. s = (i->tex.rIndirectSrc >= 0) ? 1 : 0; + if (chipset >= NVISA_GM107_CHIPSET) + s += dim; if (i->tex.target.isArray()) { - bld.mkOp3(OP_INSBF, TYPE_U32, i->getSrc(0), + bld.mkOp3(OP_INSBF, TYPE_U32, i->getSrc(s), bld.loadImm(NULL, imm), bld.mkImm(0xc10), i->getSrc(s)); } else { diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp b...

[PATCH 1/2] nv50/ir: make sure that texprep/texquerylod's args get coalesced

2014 May 13

[PATCH 1/2] nv50/ir: make sure that texprep/texquerylod's args get coalesced

Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu> Cc: "10.2" <mesa-stable at lists.freedesktop.org> --- Not 100% sure of the significance of this code, but this seems like the correct thing to do... will definitely run it through a full piglit run before pushing out. src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp | 2 ++ 1 file changed, 2 insertions(+) diff --git

[PATCH] gm107/ir: take relative pfetch offset into account

2014 Sep 26

[PATCH] gm107/ir: take relative pfetch offset into account

..._U32, tmp1, tmp1, bld.mkImm(0xff)); - bld.mkOp1(OP_MOV , TYPE_U32, tmp2, bld.mkImm(i->getSrc(0)->reg.data.u32)); + if (i->getSrc(1)) + bld.mkOp2(OP_ADD , TYPE_U32, tmp2, i->getSrc(0), i->getSrc(1)); + else + bld.mkOp1(OP_MOV , TYPE_U32, tmp2, i->getSrc(0)); bld.mkOp3(OP_MAD , TYPE_U32, tmp0, tmp0, tmp1, tmp2); i->setSrc(0, tmp0); i->setSrc(1, NULL); -- 1.8.5.5

[PATCH 1/2] nvc0/ir: use manual TXD when offsets are involved

2014 Jul 05

[PATCH 1/2] nvc0/ir: use manual TXD when offsets are involved

Something about how we're implementing offsets for TXD is wrong, just flip to the generic quadop-based implementation in that case. This is the minimal fix appropriate for backporting. Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu> Cc: <mesa-stable at lists.freedesktop.org> --- src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp | 3 ++- 1 file changed, 2

[PATCH 1/3] nvc0/ir: add base tex offset for fermi indirect tex case

2014 Aug 08

[PATCH 1/3] nvc0/ir: add base tex offset for fermi indirect tex case

Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu> --- .../drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp index f010767..4a9e48f 100644 ---

[PATCH 01/11] nvc0/ir: add emission of dadd/dmul/dmad opcodes, fix minmax

2015 Feb 20

[PATCH 01/11] nvc0/ir: add emission of dadd/dmul/dmad opcodes, fix minmax

Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu> --- .../drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp | 66 +++++++++++++++++++++- 1 file changed, 63 insertions(+), 3 deletions(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp index dfb093c..e38a3b8 100644 ---

[RFC 0/9] Add precise/invariant semantics to TGSI

2017 Jun 11

[RFC 0/9] Add precise/invariant semantics to TGSI

Running Tomb Raider on Nouveau I found some flicker caused by ignoring precise modifiers on variables inside Nouveau. This series add precise/invariant handling to TGSI, which can be then used by drivers to disable certain unsafe optimisations which may otherwise alter calculations, which depend on having the same result across shaders. This series fixes this bug in Tomb Raider and one CTS test

search for: mkop3