Ilia Mirkin
2015-May-23 06:06 UTC
[Nouveau] [PATCH] nv50/ir: avoid messing up arg1 of PFETCH
There can be scenarios where the "indirect" arg of a PFETCH becomes
known, and so the code will attempt to propagate it. Use this
opportunity to just fold it into the first argument, and prevent the
load propagation pass from touching PFETCH further.

This fixes gs-input-array-vec4-index-rd.shader_test and
vs-output-array-vec4-index-wr-before-gs.shader_test on nvc0 at least.

Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu>
Cc: "10.5 10.6" <mesa-stable at lists.freedesktop.org>
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index 72dd31e..98e3d1f 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
@@ -236,6 +236,9 @@ LoadPropagation::visit(BasicBlock *bb)
          if (i->op == OP_CALL) // calls have args as sources, they must be in regs
             continue;
 
+         if (i->op == OP_PFETCH) // pfetch expects arg1 to be a reg
+            continue;
+
          if (i->srcExists(1))
             checkSwapSrc01(i);
 
@@ -581,6 +584,11 @@ ConstantFolding::expr(Instruction *i,
    case OP_POPCNT:
       res.data.u32 = util_bitcount(a->data.u32 & b->data.u32);
       break;
+   case OP_PFETCH:
+      // The two arguments to pfetch are logically added together. Normally
+      // the second argument will not be constant, but that can happen.
+      res.data.u32 = a->data.u32 + b->data.u32;
+      break;
    default:
       return;
    }
@@ -610,6 +618,8 @@ ConstantFolding::expr(Instruction *i,
          bld.setPosition(i, false);
          i->setSrc(1, bld.loadImm(NULL, res.data.u32));
       }
+   } else if (i->op == OP_PFETCH) {
+      // Leave PFETCH alone... we just folded its 2 args into 1.
    } else {
       i->op = i->saturate ? OP_SAT : OP_MOV; /* SAT handled by unary() */
    }
-- 
2.3.6
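
To illustrate the fold described in the commit message, here is a minimal standalone sketch. It does not use the real nv50_ir classes: `ToyPfetch`, its fields, `foldKnownIndirect` and `exampleFold` are hypothetical names made up for this example. The only thing taken from the patch is the rule that the two PFETCH arguments are added as unsigned 32-bit values once the second one is known.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical toy model of a PFETCH; NOT the nv50_ir Instruction class.
struct ToyPfetch {
    uint32_t base;          // arg0: immediate part of the address
    bool hasIndirect;       // whether arg1 is still attached
    bool indirectKnown;     // true once arg1 is a compile-time constant
    uint32_t indirectValue; // arg1's value, meaningful only if indirectKnown
};

// Fold a known arg1 into arg0, mirroring "res.data.u32 = a + b" in the patch.
// Returns true if the fold happened.
bool foldKnownIndirect(ToyPfetch &insn)
{
    if (!insn.hasIndirect || !insn.indirectKnown)
        return false;
    insn.base += insn.indirectValue; // unsigned add, wraps modulo 2^32
    insn.hasIndirect = false;        // arg1 is gone after the fold
    insn.indirectKnown = false;
    insn.indirectValue = 0;
    return true;
}

// Usage: base 0x10 with a known indirect of 0x20 folds to base 0x30.
uint32_t exampleFold()
{
    ToyPfetch insn = {0x10, true, true, 0x20};
    bool folded = foldKnownIndirect(insn);
    assert(folded && !insn.hasIndirect);
    return insn.base;
}
```

In this toy model, an instruction whose indirect value is not known is left untouched, which corresponds to the case where arg1 has to stay in a register.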
Tobias Klausmann
2015-May-23 12:27 UTC
[Nouveau] [Mesa-dev] [PATCH] nv50/ir: avoid messing up arg1 of PFETCH
On 23.05.2015 08:06, Ilia Mirkin wrote:
> [...]
> @@ -610,6 +618,8 @@ ConstantFolding::expr(Instruction *i,
>           bld.setPosition(i, false);
>           i->setSrc(1, bld.loadImm(NULL, res.data.u32));
>        }
> +   } else if (i->op == OP_PFETCH) {
> +      // Leave PFETCH alone... we just folded its 2 args into 1.
>     } else {
>        i->op = i->saturate ? OP_SAT : OP_MOV; /* SAT handled by unary() */
>     }

This last part certainly works, but it is getting ugly. While you are at it, could you change it to a switch statement?
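
One possible shape for the switch-based version requested here, as a standalone sketch only. `ToyOp`, `opAfterFold` and `exampleSwitch` are hypothetical names, and the enum is a made-up subset rather than the real nv50_ir opcode list. The first branch of the original if/else chain is not visible in the quoted hunk, so it is left out here rather than guessed at.

```cpp
#include <cassert>

// Hypothetical opcode subset; the real enum lives in nv50_ir.
enum ToyOp { TOY_PFETCH, TOY_POPCNT, TOY_MOV, TOY_SAT };

// Decide which opcode an instruction carries after its sources were folded.
ToyOp opAfterFold(ToyOp op, bool saturate)
{
    switch (op) {
    case TOY_PFETCH:
        // Leave PFETCH alone: its two args were just folded into one.
        return op;
    default:
        // Everything else becomes a move of the folded constant,
        // or a saturate if the instruction had the saturate flag set.
        return saturate ? TOY_SAT : TOY_MOV;
    }
}

// Usage: PFETCH keeps its opcode, other ops turn into MOV or SAT.
void exampleSwitch()
{
    assert(opAfterFold(TOY_PFETCH, false) == TOY_PFETCH);
    assert(opAfterFold(TOY_POPCNT, false) == TOY_MOV);
}
```

A switch keeps each special case on its own label, so further opcodes that must keep their op after folding could be added as extra `case` lines instead of more `else if` branches.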