Ilia Mirkin
2014-Aug-30 22:02 UTC
[Nouveau] [PATCH 1/2] nvc0/ir: avoid infinite recursion when finding first uses of tex
In certain circumstances, findFirstUses could end up doubling back on instructions it had already processed, resulting in an infinite recursion. Avoid this by keeping track of already-visited instructions. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83079 Tested-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de> Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu> Cc: "10.2 10.3" <mesa-stable at lists.freedesktop.org> --- .../nouveau/codegen/nv50_ir_lowering_nvc0.cpp | 32 +++++++++++++++++----- .../nouveau/codegen/nv50_ir_lowering_nvc0.h | 5 +++- 2 files changed, 29 insertions(+), 8 deletions(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp index 7da9b0b..92f9a15 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp @@ -174,15 +174,31 @@ NVC0LegalizePostRA::findOverwritingDefs(const Instruction *texi, } void -NVC0LegalizePostRA::findFirstUses(const Instruction *texi, - const Instruction *insn, - std::list<TexUse> &uses) +NVC0LegalizePostRA::findFirstUses( + const Instruction *texi, + const Instruction *insn, + std::list<TexUse> &uses, + std::tr1::unordered_set<const Instruction *>& visited) { for (int d = 0; insn->defExists(d); ++d) { Value *v = insn->getDef(d); for (Value::UseIterator u = v->uses.begin(); u != v->uses.end(); ++u) { Instruction *usei = (*u)->getInsn(); + /* XXX HACK ALERT XXX + * + * This shouldn't have to be here, we should always be making forward + * progress by looking at the uses. However this somehow does not + * appear to be the case. Probably because this is being done right + * after RA, when the defs/uses lists have been messed with by node + * merging. This should probably be moved to being done right before + * RA. But this will do for now. + */ + if (visited.find(usei) != visited.end()) + continue; + + visited.insert(usei); + if (usei->op == OP_PHI || usei->op == OP_UNION) { // need a barrier before WAW cases for (int s = 0; usei->srcExists(s); ++s) { @@ -197,11 +213,11 @@ NVC0LegalizePostRA::findFirstUses(const Instruction *texi, usei->op == OP_PHI || usei->op == OP_UNION) { // these uses don't manifest in the machine code - findFirstUses(texi, usei, uses); + findFirstUses(texi, usei, uses, visited); } else if (usei->op == OP_MOV && usei->getDef(0)->equals(usei->getSrc(0)) && usei->subOp != NV50_IR_SUBOP_MOV_FINAL) { - findFirstUses(texi, usei, uses); + findFirstUses(texi, usei, uses, visited); } else { addTexUse(uses, usei, insn); } @@ -257,8 +273,10 @@ NVC0LegalizePostRA::insertTextureBarriers(Function *fn) uses = new std::list<TexUse>[texes.size()]; if (!uses) return false; - for (size_t i = 0; i < texes.size(); ++i) - findFirstUses(texes[i], texes[i], uses[i]); + for (size_t i = 0; i < texes.size(); ++i) { + std::tr1::unordered_set<const Instruction *> visited; + findFirstUses(texes[i], texes[i], uses[i], visited); + } // determine the barrier level at each use for (size_t i = 0; i < texes.size(); ++i) { diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.h b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.h index 7f39c28..d8ff5cd 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.h +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.h @@ -20,6 +20,8 @@ * OTHER DEALINGS IN THE SOFTWARE. */ +#include <tr1/unordered_set> + #include "codegen/nv50_ir.h" #include "codegen/nv50_ir_build_util.h" @@ -69,7 +71,8 @@ private: bool insertTextureBarriers(Function *); inline bool insnDominatedBy(const Instruction *, const Instruction *) const; void findFirstUses(const Instruction *tex, const Instruction *def, - std::list<TexUse>&); + std::list<TexUse>&, + std::tr1::unordered_set<const Instruction *>&); void findOverwritingDefs(const Instruction *tex, Instruction *insn, const BasicBlock *term, std::list<TexUse>&); -- 1.8.5.5
Samplers are only defined up to num_samplers, so set all samplers above nr to NULL so that we don't try to read them again later. Tested-by: Christian Ruppert <idl0r at qasl.de> Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu> Cc: "10.2 10.3" <mesa-stable at lists.freedesktop.org> --- src/gallium/drivers/nouveau/nv50/nv50_state.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/src/gallium/drivers/nouveau/nv50/nv50_state.c b/src/gallium/drivers/nouveau/nv50/nv50_state.c index 48bc079..cf84f88 100644 --- a/src/gallium/drivers/nouveau/nv50/nv50_state.c +++ b/src/gallium/drivers/nouveau/nv50/nv50_state.c @@ -585,9 +585,12 @@ nv50_stage_sampler_states_bind(struct nv50_context *nv50, int s, nv50_screen_tsc_unlock(nv50->screen, old); } assert(nv50->num_samplers[s] <= PIPE_MAX_SAMPLERS); - for (; i < nv50->num_samplers[s]; ++i) - if (nv50->samplers[s][i]) + for (; i < nv50->num_samplers[s]; ++i) { + if (nv50->samplers[s][i]) { nv50_screen_tsc_unlock(nv50->screen, nv50->samplers[s][i]); + nv50->samplers[s][i] = NULL; + } + } nv50->num_samplers[s] = nr; -- 1.8.5.5
Emil Velikov
2014-Aug-30 23:30 UTC
[Nouveau] [Mesa-stable] [PATCH 2/2] nv50: zero out unbound samplers
On 30/08/14 23:02, Ilia Mirkin wrote:> Samplers are only defined up to num_samplers, so set all samplers above > nr to NULL so that we don't try to read them again later. >Would it be worth doing a similar thing with the unlocked samplers below the nr mark ? It seems to me that we might be leaking nv50->samplers[s][i], or perhaps I'm missing something ? -Emil> Tested-by: Christian Ruppert <idl0r at qasl.de> > Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu> > Cc: "10.2 10.3" <mesa-stable at lists.freedesktop.org> > --- > src/gallium/drivers/nouveau/nv50/nv50_state.c | 7 +++++-- > 1 file changed, 5 insertions(+), 2 deletions(-) > > diff --git a/src/gallium/drivers/nouveau/nv50/nv50_state.c b/src/gallium/drivers/nouveau/nv50/nv50_state.c > index 48bc079..cf84f88 100644 > --- a/src/gallium/drivers/nouveau/nv50/nv50_state.c > +++ b/src/gallium/drivers/nouveau/nv50/nv50_state.c > @@ -585,9 +585,12 @@ nv50_stage_sampler_states_bind(struct nv50_context *nv50, int s, > nv50_screen_tsc_unlock(nv50->screen, old); > } > assert(nv50->num_samplers[s] <= PIPE_MAX_SAMPLERS); > - for (; i < nv50->num_samplers[s]; ++i) > - if (nv50->samplers[s][i]) > + for (; i < nv50->num_samplers[s]; ++i) { > + if (nv50->samplers[s][i]) { > nv50_screen_tsc_unlock(nv50->screen, nv50->samplers[s][i]); > + nv50->samplers[s][i] = NULL; > + } > + } > > nv50->num_samplers[s] = nr; > >
Possibly Parallel Threads
- [PATCH RESEND] nv50/ir: use unordered_set instead of list to keep track of var defs
- [Mesa-stable] [PATCH 2/2] nv50: zero out unbound samplers
- [Mesa-stable] [PATCH 2/2] nv50: zero out unbound samplers
- [PATCH] nv50/ir: we can't replace 0x0 with zero reg for SHLADD
- [PATCH] nv50/ir: we can't replace 0x0 with the zero reg for SHLADD