Ilia Mirkin
2014-May-13 15:33 UTC
[Nouveau] [PATCH 1/2] nv50/ir: make sure that texprep/texquerylod's args get coalesced
Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu>
Cc: "10.2" <mesa-stable at lists.freedesktop.org>
---

Not 100% sure of the significance of this code, but this seems like the
correct thing to do... will definitely run it through a full piglit run
before pushing out.

 src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
index 60a6a3f..b284081 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
@@ -998,7 +998,9 @@ GCRA::doCoalesce(ArrayList& insns, unsigned int mask)
       case OP_TXQ:
       case OP_TXD:
       case OP_TXG:
+      case OP_TXLQ:
       case OP_TEXCSAA:
+      case OP_TEXPREP:
          if (!(mask & JOIN_MASK_TEX))
             break;
          for (c = 0; insn->srcExists(c) && c != insn->predSrc; ++c)
-- 
1.8.5.5
Ilia Mirkin
2014-May-13 15:33 UTC
[Nouveau] [PATCH 2/2] nv50/ir: fix integer mul lowering for u32 x u32 -> high u32
UNION appears to expect that all of its defines are conditional. Otherwise
it inserts a mov instruction which overwrites the desired result. This
fixes tests that use UMUL_HI, and much less directly, unsigned integer
division by a constant, which uses this functionality in its lowering.

Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu>
Cc: "10.1 10.2" <mesa-stable at lists.freedesktop.org>
---

The S32 version of this logic is busted too (thankfully there is no IMUL
instruction so it normally doesn't come up). However division by signed
integer is still broken as a result. Fixing it will require more changes,
but those will rely on the unsigned case working.

 src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nv50.cpp | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nv50.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nv50.cpp
index 63db1d7..b17d57d 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nv50.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nv50.cpp
@@ -75,16 +75,17 @@ expandIntegerMUL(BuildUtil *bld, Instruction *mul)
    i[4] = bld->mkOp3(OP_MAD, fTy, t[3], a[0], b[0], t[2]);
 
    if (highResult) {
-      Value *r[3];
+      Value *r[4];
       Value *imm = bld->loadImm(NULL, 1 << (halfSize * 8));
       c[0] = bld->getSSA(1, FILE_FLAGS);
       c[1] = bld->getSSA(1, FILE_FLAGS);
-      for (int j = 0; j < 3; ++j)
+      for (int j = 0; j < 4; ++j)
          r[j] = bld->getSSA(fullSize);
 
       i[8] = bld->mkOp2(OP_SHR, fTy, r[0], t[1], bld->mkImm(halfSize * 8));
       i[6] = bld->mkOp2(OP_ADD, fTy, r[1], r[0], imm);
-      bld->mkOp2(OP_UNION, TYPE_U32, r[2], r[1], r[0]);
+      bld->mkMov(r[3], r[0])->setPredicate(CC_NC, c[0]);
+      bld->mkOp2(OP_UNION, TYPE_U32, r[2], r[1], r[3]);
       i[5] = bld->mkOp3(OP_MAD, fTy, mul->getDef(0), a[1], b[1], r[2]);
 
       // set carry defs / sources
-- 
1.8.5.5
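
For readers following the arithmetic rather than the IR, here is a rough plain C++ sketch of what the high-result path appears to compute for u32 x u32, written with 16-bit halves. It is not the driver code: the function name umul_hi_halves is hypothetical, and it assumes that c[0] carries the overflow of the cross-term sum and c[1] the overflow of the low word, which the quoted hunk does not show. The ternary that picks r1 or r0 stands in for the UNION of two mutually exclusive predicated defines, as I read the patch.

```cpp
#include <cstdint>

// Hypothetical illustration only: high 32 bits of a 32x32 unsigned multiply
// built from 16-bit halves, loosely mirroring the temporaries in the patch.
uint32_t umul_hi_halves(uint32_t a, uint32_t b)
{
    const uint32_t a0 = a & 0xffffu, a1 = a >> 16;
    const uint32_t b0 = b & 0xffffu, b1 = b >> 16;

    // Cross terms. Each product fits in 32 bits, but their sum may wrap.
    const uint32_t t0 = a0 * b1;
    const uint32_t t1 = a1 * b0 + t0;
    const bool carry_mid = t1 < t0;   // assumed to correspond to c[0]

    // Low word of the full product; only its carry matters here.
    const uint32_t t2 = t1 << 16;
    const uint32_t t3 = a0 * b0 + t2;
    const bool carry_low = t3 < t2;   // assumed to correspond to c[1]

    // r0: upper half of the cross-term sum.
    // r1: the same value plus 1 << 16, used when the cross-term sum wrapped.
    const uint32_t r0 = t1 >> 16;
    const uint32_t r1 = r0 + (1u << 16);

    // Exactly one of the two candidates is valid; this models the UNION.
    const uint32_t r2 = carry_mid ? r1 : r0;

    return a1 * b1 + r2 + (carry_low ? 1u : 0u);
}
```

The result can be checked against a 64-bit reference, for example static_cast<uint32_t>((uint64_t(a) * b) >> 32). Inputs such as a = b = 0xffffffff exercise the carry_mid case, which is the one where the choice between r1 and r0 matters.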