thr3ads.net - Nouveau - [Nouveau] [PATCH v2 0/4] Constant folding of new Instructions [Jun 2014]

Tobias Klausmann

2014-Jun-03 20:58 UTC

[Nouveau] [PATCH v2 0/4] Constant folding of new Instructions

Here is another attempt at constant folding of additional instructions for nvc0.

Please review.

Thanks,
Tobias Klausmann

Tobias Klausmann (4):
  nvc0/ir: clear subop when folding constant expressions
  nvc0/ir: Handle reverse subop for OP_EXTBF when folding constant
    expressions
  nvc0/ir: Handle OP_BFIND when folding constant expressions
  nvc0/ir: Handle OP_POPCNT when folding constant expressions

 .../drivers/nouveau/codegen/nv50_ir_peephole.cpp   | 50 +++++++++++++++++++++-
 1 file changed, 48 insertions(+), 2 deletions(-)

-- 
1.8.4.5

Tobias Klausmann

2014-Jun-03 20:58 UTC

[Nouveau] [PATCH v2 1/4] nvc0/ir: clear subop when folding constant expressions

Some operations (e.g. OP_MUL/OP_MAD/OP_EXTBF) might have a subop set.
After folding, make sure that it is cleared.

Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de>
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index 1a2c2e6..58092f4 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
@@ -563,6 +563,7 @@ ConstantFolding::expr(Instruction *i,
    } else {
       i->op = i->saturate ? OP_SAT : OP_MOV; /* SAT handled by unary() */
    }
+   i->subOp = 0;
 }
 
 void
-- 
1.8.4.5

Tobias Klausmann

2014-Jun-03 20:58 UTC

[Nouveau] [PATCH v2 2/4] nvc0/ir: Handle reverse subop for OP_EXTBF when folding constant expressions

V2: Handle the instruction right (shift after reverse)

Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de>
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index 58092f4..a214ffc 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
@@ -529,8 +529,20 @@ ConstantFolding::expr(Instruction *i,
          lshift = 32 - width - offset;
       }
       switch (i->dType) {
-      case TYPE_S32: res.data.s32 = (a->data.s32 << lshift) >> rshift; break;
-      case TYPE_U32: res.data.u32 = (a->data.u32 << lshift) >> rshift; break;
+      case TYPE_S32:
+         if (i->subOp == NV50_IR_SUBOP_EXTBF_REV)
+            res.data.s32 = util_bitreverse(a->data.s32);
+         else
+            res.data.s32 = a->data.s32;
+         res.data.s32 = (res.data.s32 << lshift) >> rshift;
+         break;
+      case TYPE_U32:
+         if (i->subOp == NV50_IR_SUBOP_EXTBF_REV)
+            res.data.u32 = util_bitreverse(a->data.u32);
+         else
+            res.data.u32 = a->data.u32;
+         res.data.u32 = (res.data.u32 << lshift) >> rshift;
+         break;
       default:
          return;
       }
-- 
1.8.4.5

Tobias Klausmann

2014-Jun-03 20:58 UTC

[Nouveau] [PATCH v2 3/4] nvc0/ir: Handle OP_BFIND when folding constant expressions

Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de>
---
 .../drivers/nouveau/codegen/nv50_ir_peephole.cpp       | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index a214ffc..c497335 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
@@ -948,6 +948,24 @@ ConstantFolding::opnd(Instruction *i, ImmediateValue &imm0, int s)
    case OP_EX2:
       unary(i, imm0);
       break;
+   case OP_BFIND: {
+      int32_t res;
+      switch (i->dType) {
+      case TYPE_S32:
+         res = util_last_bit_signed(imm0.reg.data.s32) - 1; break;
+      case TYPE_U32:
+         res = util_last_bit(imm0.reg.data.u32) -1; break;
+      default:
+         return;
+      }
+      if ((i->subOp ==  NV50_IR_SUBOP_BFIND_SAMT) && (res >= 0))
+         res = 31 - res;
+      i->setSrc(0, new_ImmediateValue(i->bb->getProgram(), (uint32_t)res));
+      i->setSrc(1, NULL);
+      i->op = OP_MOV;
+      i->subOp = 0;
+      break;
+   }
    default:
       return;
    }
-- 
1.8.4.5

Tobias Klausmann

2014-Jun-03 20:58 UTC

[Nouveau] [PATCH v2 4/4] nvc0/ir: Handle OP_POPCNT when folding constant expressions

V2: Add support for a single-argument version of POPCNT for Maxwell (SM5)

Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de>
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index c497335..19767b4 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
@@ -548,6 +548,10 @@ ConstantFolding::expr(Instruction *i,
       }
       break;
    }
+   case OP_POPCNT: {
+         res.data.u32 = util_bitcount(a->data.u32 & b->data.u32); break;
+      break;
+   }
    default:
       return;
    }
@@ -966,6 +970,17 @@ ConstantFolding::opnd(Instruction *i, ImmediateValue &imm0, int s)
       i->subOp = 0;
       break;
    }
+   case OP_POPCNT: {
+      uint32_t res;
+      if (!i->srcExists(1)) {
+         res = util_bitcount(imm0.reg.data.u32);
+         i->setSrc(0, new_ImmediateValue(i->bb->getProgram(), res));
+         i->setSrc(1, NULL);
+         i->op = OP_MOV;
+         i->subOp = 0;
+      }
+      break;
+   }
    default:
       return;
    }
-- 
1.8.4.5

Ilia Mirkin

2014-Jun-03 21:01 UTC

[Nouveau] [PATCH v2 4/4] nvc0/ir: Handle OP_POPCNT when folding constant expressions

On Tue, Jun 3, 2014 at 4:58 PM, Tobias Klausmann
<tobias.johannes.klausmann at mni.thm.de> wrote:
> V2: Add support for a single-argument version of POPCNT for Maxwell (SM5)
>
> Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de>
> ---
>  src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 15 +++++++++++++++
>  1 file changed, 15 insertions(+)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> index c497335..19767b4 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> @@ -548,6 +548,10 @@ ConstantFolding::expr(Instruction *i,
>        }
>        break;
>     }
> +   case OP_POPCNT: {
> +         res.data.u32 = util_bitcount(a->data.u32 & b->data.u32); break;
> +      break;

Do you really need 2 breaks here? Also, funny indentation.

> +   }
>     default:
>        return;
>     }
> @@ -966,6 +970,17 @@ ConstantFolding::opnd(Instruction *i, ImmediateValue &imm0, int s)
>        i->subOp = 0;
>        break;
>     }
> +   case OP_POPCNT: {
> +      uint32_t res;
> +      if (!i->srcExists(1)) {
> +         res = util_bitcount(imm0.reg.data.u32);
> +         i->setSrc(0, new_ImmediateValue(i->bb->getProgram(), res));
> +         i->setSrc(1, NULL);

A little overkill -- src(1) already doesn't exist... can get rid of
that, I think.

> +         i->op = OP_MOV;
> +         i->subOp = 0;
> +      }
> +      break;
> +   }
>     default:
>        return;
>     }
> --
> 1.8.4.5
>
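
As a rough illustration of the two POPCNT cases discussed above (not code from the thread), the folded value can be sketched outside Mesa as follows. `popcount32` is a hypothetical stand-in for `util_bitcount`, and `foldPopcnt2` / `foldPopcnt1` are names invented here for the two-source and single-source forms.

```cpp
#include <cstdint>

// Hypothetical stand-in for util_bitcount(): counts the set bits in v.
static inline unsigned popcount32(uint32_t v)
{
   unsigned n = 0;
   for (; v; v &= v - 1) // each iteration clears the lowest set bit
      ++n;
   return n;
}

// Two-source form, as in the expr() hunk: the count is taken over (a & b).
static inline uint32_t foldPopcnt2(uint32_t a, uint32_t b)
{
   return popcount32(a & b);
}

// Single-source form, as in the opnd() hunk: the count is taken over a alone.
static inline uint32_t foldPopcnt1(uint32_t a)
{
   return popcount32(a);
}
```

Each case needs only one statement and one exit, which is the point of the remark about the duplicated `break`.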

Ilia Mirkin

2014-Jun-03 21:03 UTC

[Nouveau] [PATCH v2 2/4] nvc0/ir: Handle reverse subop for OP_EXTBF when folding constant expressions

On Tue, Jun 3, 2014 at 4:58 PM, Tobias Klausmann
<tobias.johannes.klausmann at mni.thm.de> wrote:
> V2: Handle the instruction right (shift after reverse)
>
> Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de>
> ---
>  src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 16 ++++++++++++++--
>  1 file changed, 14 insertions(+), 2 deletions(-)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> index 58092f4..a214ffc 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> @@ -529,8 +529,20 @@ ConstantFolding::expr(Instruction *i,
>           lshift = 32 - width - offset;
>        }
>        switch (i->dType) {
> -      case TYPE_S32: res.data.s32 = (a->data.s32 << lshift) >> rshift; break;
> -      case TYPE_U32: res.data.u32 = (a->data.u32 << lshift) >> rshift; break;
> +      case TYPE_S32:
> +         if (i->subOp == NV50_IR_SUBOP_EXTBF_REV)
> +            res.data.s32 = util_bitreverse(a->data.s32);
> +         else
> +            res.data.s32 = a->data.s32;

Why not do this once outside of the switch statement? The two are
actually the same -- util_bitreverse doesn't care about
signed/unsigned, and res.data is a union.

> +         res.data.s32 = (res.data.s32 << lshift) >> rshift;
> +         break;
> +      case TYPE_U32:
> +         if (i->subOp == NV50_IR_SUBOP_EXTBF_REV)
> +            res.data.u32 = util_bitreverse(a->data.u32);
> +         else
> +            res.data.u32 = a->data.u32;
> +         res.data.u32 = (res.data.u32 << lshift) >> rshift;
> +         break;
>        default:
>           return;
>        }
> --
> 1.8.4.5
>
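
A minimal standalone sketch of the suggestion above, assuming the reverse is applied once to the raw 32 bits and only the final right shift depends on signedness. This is not code from the thread: `bitreverse32` is a hypothetical stand-in for `util_bitreverse`, `extractBits` is a name invented here, and `lshift` / `rshift` are taken as inputs because most of their computation lies outside the quoted hunk.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for util_bitreverse(): mirrors the 32 bits of v.
static inline uint32_t bitreverse32(uint32_t v)
{
   uint32_t r = 0;
   for (int i = 0; i < 32; ++i) {
      r = (r << 1) | (v & 1);
      v >>= 1;
   }
   return r;
}

// Reverse once (if requested), shift left, then shift right. Only the
// right shift differs: arithmetic for signed, logical for unsigned.
static inline uint32_t extractBits(uint32_t a, bool reverse, bool isSigned,
                                   int lshift, int rshift)
{
   assert(lshift >= 0 && lshift < 32 && rshift >= 0 && rshift < 32);
   uint32_t v = reverse ? bitreverse32(a) : a;
   v <<= lshift; // done on the unsigned value, so no signed overflow
   if (isSigned) {
      // The conversion to int32_t and the right shift of a negative value
      // are implementation-defined before C++20; common compilers wrap
      // and sign-extend.
      return (uint32_t)((int32_t)v >> rshift);
   }
   return v >> rshift;
}
```

Because the reverse works on the bit pattern alone, hoisting it removes the duplicated `if`/`else` from both `case` labels.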

Ilia Mirkin

2014-Jun-03 21:05 UTC

[Nouveau] [PATCH v2 3/4] nvc0/ir: Handle OP_BFIND when folding constant expressions

On Tue, Jun 3, 2014 at 4:58 PM, Tobias Klausmann
<tobias.johannes.klausmann at mni.thm.de> wrote:
> Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de>
> ---
>  .../drivers/nouveau/codegen/nv50_ir_peephole.cpp       | 18 ++++++++++++++++++
>  1 file changed, 18 insertions(+)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> index a214ffc..c497335 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> @@ -948,6 +948,24 @@ ConstantFolding::opnd(Instruction *i, ImmediateValue &imm0, int s)
>     case OP_EX2:
>        unary(i, imm0);
>        break;
> +   case OP_BFIND: {
> +      int32_t res;
> +      switch (i->dType) {
> +      case TYPE_S32:
> +         res = util_last_bit_signed(imm0.reg.data.s32) - 1; break;

The style elsewhere is to do

case TYPE_S32: foo; break;

if it fits. Otherwise put the break on a separate line.

> +      case TYPE_U32:
> +         res = util_last_bit(imm0.reg.data.u32) -1; break;

Missing space between "-" and "1".

> +      default:
> +         return;
> +      }
> +      if ((i->subOp ==  NV50_IR_SUBOP_BFIND_SAMT) && (res >= 0))

No need for the extra parens. && has lower precedence than == and >=.

> +         res = 31 - res;
> +      i->setSrc(0, new_ImmediateValue(i->bb->getProgram(), (uint32_t)res));

Why the typecast?

> +      i->setSrc(1, NULL);
> +      i->op = OP_MOV;
> +      i->subOp = 0;
> +      break;
> +   }
>     default:
>        return;
>     }
> --
> 1.8.4.5
>
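
To make the BFIND arithmetic above concrete, here is a small standalone sketch of the value the patch folds to. It is not code from the thread. `lastBit` and `lastBitSigned` are hypothetical stand-ins modeled on my understanding of `util_last_bit` and `util_last_bit_signed` (1-based index of the most significant relevant bit, 0 if there is none), and the `samt` flag stands for the `NV50_IR_SUBOP_BFIND_SAMT` check.

```cpp
#include <cstdint>

// Stand-in for util_last_bit(): 1-based index of the highest set bit,
// or 0 when v == 0.
static inline int lastBit(uint32_t v)
{
   int n = 0;
   while (v) {
      ++n;
      v >>= 1;
   }
   return n;
}

// Stand-in for util_last_bit_signed(): for negative values, look for the
// highest bit that differs from the sign bit.
static inline int lastBitSigned(int32_t v)
{
   return v >= 0 ? lastBit((uint32_t)v) : lastBit(~(uint32_t)v);
}

// Unsigned fold: -1 means "no bit found"; with samt the position is
// converted to 31 - position.
static inline int32_t foldBfindU32(uint32_t v, bool samt)
{
   int32_t res = lastBit(v) - 1;
   if (samt && res >= 0) // no extra parentheses needed around the comparison
      res = 31 - res;
   return res;
}

// Signed fold, same structure with the signed helper.
static inline int32_t foldBfindS32(int32_t v, bool samt)
{
   int32_t res = lastBitSigned(v) - 1;
   if (samt && res >= 0)
      res = 31 - res;
   return res;
}
```

Note that the result is -1 when no bit is found, so the folded value is only non-negative when the input has a relevant bit set.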

Ilia Mirkin

2014-Jun-03 21:06 UTC

[Nouveau] [PATCH v2 1/4] nvc0/ir: clear subop when folding constant expressions

On Tue, Jun 3, 2014 at 4:58 PM, Tobias Klausmann
<tobias.johannes.klausmann at mni.thm.de> wrote:
> Some operations (e.g. OP_MUL/OP_MAD/OP_EXTBF) might have a subop set.
> After folding, make sure that it is cleared.
>
> Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de>

Reviewed-by: Ilia Mirkin <imirkin at alum.mit.edu>

> ---
>  src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> index 1a2c2e6..58092f4 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> @@ -563,6 +563,7 @@ ConstantFolding::expr(Instruction *i,
>     } else {
>        i->op = i->saturate ? OP_SAT : OP_MOV; /* SAT handled by unary() */
>     }
> +   i->subOp = 0;
>  }
>
>  void
> --
> 1.8.4.5
>
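
The reason for clearing the subop can be shown with a toy model. This is a sketch under stated assumptions, not the nv50_ir classes: `ToyInstruction`, `ToyOp`, `TOY_SUBOP_MUL_HIGH` and `foldMul` are all invented here, and the "high half" modifier is only an example of a subop that changes what the original operation computes.

```cpp
#include <cstdint>

// Toy model only: just the fields that matter for the illustration.
enum ToyOp { TOY_OP_MUL, TOY_OP_MOV };
enum : uint8_t { TOY_SUBOP_NONE = 0, TOY_SUBOP_MUL_HIGH = 1 };

struct ToyInstruction {
   ToyOp op;
   uint8_t subOp;
   uint32_t imm; // folded immediate, meaningful once op == TOY_OP_MOV
};

// Folds a MUL of two known constants into a MOV of the result.
static inline void foldMul(ToyInstruction &i, uint32_t a, uint32_t b)
{
   uint64_t wide = (uint64_t)a * b;
   // The subop selects which half of the product the MUL would produce.
   i.imm = (i.subOp == TOY_SUBOP_MUL_HIGH) ? (uint32_t)(wide >> 32)
                                           : (uint32_t)wide;
   i.op = TOY_OP_MOV;
   // The modifier described the MUL; leaving it set would attach a
   // meaningless (or differently interpreted) subop to the MOV.
   i.subOp = TOY_SUBOP_NONE;
}
```

After the fold, the instruction carries only the result and a neutral subop, which mirrors the one-line `i->subOp = 0;` added by the patch.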
