thr3ads.net - Nouveau - [Nouveau] [PATCH] nv50/ir: use unordered_set instead of list to keep our instructions in uses [Jul 2014]

If this information is useful, please help other people find it:
Share via:

Tobias Klausmann

2014-Jul-08 02:19 UTC

[Nouveau] [PATCH] nv50/ir: use unordered_set instead of list to keep our instructions in uses

This shortens runtime of piglit test fp-long-alu to ~22s

No piglit regressions observed on nvc0!

Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de>
---
 src/gallium/drivers/nouveau/codegen/nv50_ir.cpp          | 6 +++---
 src/gallium/drivers/nouveau/codegen/nv50_ir.h            | 7 ++++---
 src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 2 +-
 src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp       | 4 ++--
 4 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp
b/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp
index 13f8cf2..ca3c806 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp
@@ -141,9 +141,9 @@ ValueRef::set(Value *refVal)
    if (value == refVal)
       return;
    if (value)
-      value->uses.remove(this);
+      value->uses.erase(this);
    if (refVal)
-      refVal->uses.push_back(this);
+      refVal->uses.insert(this);
 
    value = refVal;
 }
@@ -206,7 +206,7 @@ ValueDef::replace(const ValueRef &repVal, bool doSet)
       return;
 
    while (!value->uses.empty()) {
-      ValueRef *ref = value->uses.front();
+      ValueRef *ref = *value->uses.begin();
       ref->set(repVal.get());
       ref->mod *= repVal.mod;
    }
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir.h
b/src/gallium/drivers/nouveau/codegen/nv50_ir.h
index 8844030..0ff5e5d 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir.h
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir.h
@@ -29,6 +29,7 @@
 #include <deque>
 #include <list>
 #include <vector>
+#include <tr1/unordered_set>
 
 #include "codegen/nv50_ir_util.h"
 #include "codegen/nv50_ir_graph.h"
@@ -581,10 +582,10 @@ public:
 
    static inline Value *get(Iterator&);
 
-   std::list<ValueRef *> uses;
+   std::tr1::unordered_set<ValueRef *> uses;
    std::list<ValueDef *> defs;
-   typedef std::list<ValueRef *>::iterator UseIterator;
-   typedef std::list<ValueRef *>::const_iterator UseCIterator;
+   typedef std::tr1::unordered_set<ValueRef *>::iterator UseIterator;
+   typedef std::tr1::unordered_set<ValueRef *>::const_iterator
UseCIterator;
    typedef std::list<ValueDef *>::iterator DefIterator;
    typedef std::list<ValueDef *>::const_iterator DefCIterator;
 
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index c162ac4..8d052c5 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
@@ -686,7 +686,7 @@ ConstantFolding::tryCollapseChainedMULs(Instruction *mul2,
       // b = mul a, imm
       // d = mul b, c   -> d = mul_x_imm a, c
       int s2, t2;
-      insn = mul2->getDef(0)->uses.front()->getInsn();
+      insn = (*mul2->getDef(0)->uses.begin())->getInsn();
       if (!insn)
          return;
       mul1 = mul2;
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
index e4f56b1..6c83a60 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
@@ -983,7 +983,7 @@ GCRA::doCoalesce(ArrayList& insns, unsigned int mask)
             break;
          i = NULL;
          if (!insn->getDef(0)->uses.empty())
-            i = insn->getDef(0)->uses.front()->getInsn();
+            i = (*insn->getDef(0)->uses.begin())->getInsn();
          // if this is a contraint-move there will only be a single use
          if (i && i->op == OP_MERGE) // do we really still need this
?
             break;
@@ -1559,7 +1559,7 @@ SpillCodeInserter::run(const
std::list<ValuePair>& lst)
          // Unspill at each use *before* inserting spill instructions,
          // we don't want to have the spill instructions in the use list
here.
          while (!dval->uses.empty()) {
-            ValueRef *u = dval->uses.front();
+            ValueRef *u = *dval->uses.begin();
             Instruction *usei = u->getInsn();
             assert(usei);
             if (usei->isPseudo()) {
-- 
1.8.4.5

Ilia Mirkin

2014-Jul-08 06:00 UTC

head link

[Nouveau] [PATCH] nv50/ir: use unordered_set instead of list to keep our instructions in uses

On Mon, Jul 7, 2014 at 10:19 PM, Tobias Klausmann
<tobias.johannes.klausmann at mni.thm.de> wrote:> This shortens runtime of piglit test fp-long-alu to ~22s
>
> No piglit regressions observed on nvc0!
>
> Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at
mni.thm.de>
This is great. I'm going to run my tests on it and push it out.
There's one small problem, which is already an issue and I'm not _so_
concerned about it, but would be nice to look at (esp as you're also
looking into converting defs):

In SpillCodeInserter::run, we have the following inner loop:

         // Unspill at each use *before* inserting spill instructions,
         // we don't want to have the spill instructions in the use list
here.
         while (!dval->uses.empty()) {
            ValueRef *u = *dval->uses.begin();
            Instruction *usei = u->getInsn();
            assert(usei);
            if (usei->isPseudo()) {
               tmp = (slot->reg.file == FILE_MEMORY_LOCAL) ? NULL : slot;
               last = NULL;
            } else
            if (!last || usei != last->next) { // TODO: sort uses
               tmp = unspill(usei, dval, slot);
               last = usei;
            }
            u->set(tmp);
         }

Note the "TODO: sort uses" bit. This was less important when things
were in an ordered list where things were likely to be semi-sorted
already, however with the ~random hash ordering, we're basically going
to be unspilling before each use, even if we don't need to.

Ideally you would identify strings of instructions and only unspill at
their head. What you can do is something like

(a) compute a unordered_multimap of instruction -> use [multimap means
a single key may have multiple values].
(b) pick an instruction in the set
(c) move the instruction pointer by ->prev until ->prev is no longer
in the set or hits a pseudo-instruction (which generally indicates a
basic block starting and beyond that you'd get into control flow,
which you want nothing to do with)
(d) do the unspill stuff and remove all the instructions from the set
by following ->next pointers (so that you also hit instructions that
come after the random one you pick). note that before you remove each
instruction from the multimap, you should take its value (i.e.
iterable of uses) and do the use->set(tmp) thing on it. i recommend
taking a look at the unordered_multimap api -- e.g. equal_range is
what you want here (returns a pair of iterators)
(e) goto b

This should definitely be a separate patch though. The nice thing is
that with sets, I don't think this will be substantially slower, and
it's only run once, at the end, on values that need to be spilled.
[Hm, in case you're not already aware, spilling is when you run out of
registers and need to store values somewhere else for temporary
storage.]
> ---
>  src/gallium/drivers/nouveau/codegen/nv50_ir.cpp          | 6 +++---
>  src/gallium/drivers/nouveau/codegen/nv50_ir.h            | 7 ++++---
>  src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 2 +-
>  src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp       | 4 ++--
>  4 files changed, 10 insertions(+), 9 deletions(-)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp
b/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp
> index 13f8cf2..ca3c806 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp
> @@ -141,9 +141,9 @@ ValueRef::set(Value *refVal)
>     if (value == refVal)
>        return;
>     if (value)
> -      value->uses.remove(this);
> +      value->uses.erase(this);
>     if (refVal)
> -      refVal->uses.push_back(this);
> +      refVal->uses.insert(this);
>
>     value = refVal;
>  }
> @@ -206,7 +206,7 @@ ValueDef::replace(const ValueRef &repVal, bool
doSet)
>        return;
>
>     while (!value->uses.empty()) {
> -      ValueRef *ref = value->uses.front();
> +      ValueRef *ref = *value->uses.begin();
>        ref->set(repVal.get());
>        ref->mod *= repVal.mod;
>     }
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir.h
b/src/gallium/drivers/nouveau/codegen/nv50_ir.h
> index 8844030..0ff5e5d 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir.h
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir.h
> @@ -29,6 +29,7 @@
>  #include <deque>
>  #include <list>
>  #include <vector>
> +#include <tr1/unordered_set>
>
>  #include "codegen/nv50_ir_util.h"
>  #include "codegen/nv50_ir_graph.h"
> @@ -581,10 +582,10 @@ public:
>
>     static inline Value *get(Iterator&);
>
> -   std::list<ValueRef *> uses;
> +   std::tr1::unordered_set<ValueRef *> uses;
>     std::list<ValueDef *> defs;
> -   typedef std::list<ValueRef *>::iterator UseIterator;
> -   typedef std::list<ValueRef *>::const_iterator UseCIterator;
> +   typedef std::tr1::unordered_set<ValueRef *>::iterator
UseIterator;
> +   typedef std::tr1::unordered_set<ValueRef *>::const_iterator
UseCIterator;
>     typedef std::list<ValueDef *>::iterator DefIterator;
>     typedef std::list<ValueDef *>::const_iterator DefCIterator;
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> index c162ac4..8d052c5 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> @@ -686,7 +686,7 @@ ConstantFolding::tryCollapseChainedMULs(Instruction
*mul2,
>        // b = mul a, imm
>        // d = mul b, c   -> d = mul_x_imm a, c
>        int s2, t2;
> -      insn = mul2->getDef(0)->uses.front()->getInsn();
> +      insn = (*mul2->getDef(0)->uses.begin())->getInsn();
>        if (!insn)
>           return;
>        mul1 = mul2;
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
> index e4f56b1..6c83a60 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
> @@ -983,7 +983,7 @@ GCRA::doCoalesce(ArrayList& insns, unsigned int
mask)
>              break;
>           i = NULL;
>           if (!insn->getDef(0)->uses.empty())
> -            i = insn->getDef(0)->uses.front()->getInsn();
> +            i = (*insn->getDef(0)->uses.begin())->getInsn();
>           // if this is a contraint-move there will only be a single use
>           if (i && i->op == OP_MERGE) // do we really still need
this ?
>              break;
> @@ -1559,7 +1559,7 @@ SpillCodeInserter::run(const
std::list<ValuePair>& lst)
>           // Unspill at each use *before* inserting spill instructions,
>           // we don't want to have the spill instructions in the use
list here.
>           while (!dval->uses.empty()) {
> -            ValueRef *u = dval->uses.front();
> +            ValueRef *u = *dval->uses.begin();
>              Instruction *usei = u->getInsn();
>              assert(usei);
>              if (usei->isPseudo()) {
> --
> 1.8.4.5
>

Apparently Analagous Threads

Search for more reasonably related threads

Nouveau - Jul 2014 - [PATCH] nv50/ir: use unordered_set instead of list to keep our instructions in uses

[Nouveau] [PATCH] nv50/ir: use unordered_set instead of list to keep our instructions in uses

[Nouveau] [PATCH] nv50/ir: use unordered_set instead of list to keep our instructions in uses

Apparently Analagous Threads