thr3ads.net - similar to: "Add constant folding for new opcodes"

[PATCH v2 0/4] Constant folding of new Instructions

2014 Jun 03

And another try for constant folding of Instructions for nvc0. Please Review this! Thanks, Tobias Klausmann Tobias Klausmann (4): nvc0/ir: clear subop when folding constant expressions nvc0/ir: Handle reverse subop for OP_EXTBF when folding constant expressions nvc0/ir: Handle OP_BFIND when folding constant expressions nvc0/ir: Handle OP_POPCNT when folding constant expressions

[PATCH 2/4] nvc0/ir: Handle reverse subop for OP_EXTBF when folding constant expressions

2014 May 29

Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de> --- src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 14 ++++++++++++-- 1 file changed, 12 insertions(+), 2 deletions(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp index 58092f4..93f7c2a 100644 ---

[PATCH v3 0/4] Constant folding of new Instructions

2014 Jun 03

Yet another try for constant folding of Instructions for nvc0. Please Review this again! (Hopefully the last time ;-) ) Tobias Klausmann (4): nvc0/ir: clear subop when folding constant expressions nvc0/ir: Handle reverse subop for OP_EXTBF when folding constant expressions nvc0/ir: Handle OP_BFIND when folding constant expressions nvc0/ir: Handle OP_POPCNT when folding constant

[PATCH 2/4] nvc0/ir: Handle reverse subop for OP_EXTBF when folding constant expressions

2014 May 29

Tested with: MESA_EXTENSION_OVERRIDE=GL_ARB_gpu_shader5 ./shader_runner ../tests/spec/arb_gpu_shader5/execution/built-in-functions/fs-bitfieldReverse.shader_test -> green output, so this should be ok; the test was not changed though... On 29.05.2014 21:47, Ilia Mirkin wrote: > Can you verify that you tested how the HW handles this, as well as > exactly how you did it (i.e. how did you

[PATCH 4/4] nvc0/ir: Handle OP_BFIND when folding constant expressions

2014 May 29

Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de> --- src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp index 68b9a6d..a56756c 100644 ---

[PATCH 2/4] nvc0/ir: Handle reverse subop for OP_EXTBF when folding constant expressions

2014 May 29

Can you verify that you tested how the HW handles this, as well as exactly how you did it (i.e. how did you modify the code + piglit test, what the results were, etc) On Thu, May 29, 2014 at 3:43 PM, Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de> wrote: > Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de> > --- >

[PATCH 3/4] nvc0/ir: Handle OP_POPCNT when folding constant expressions

2014 May 29

Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de> --- src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp index 93f7c2a..68b9a6d 100644 ---

[PATCH v2 2/4] nvc0/ir: Handle reverse subop for OP_EXTBF when folding constant expressions

2014 Jun 03

V2: Handle the instruction right (shift after reverse) Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de> --- src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 16 ++++++++++++++-- 1 file changed, 14 insertions(+), 2 deletions(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp

[PATCH 1/2] nv50/ir: fix s32 x s32 -> high s32 multiply logic

2014 May 18

Retrieving the high 32 bits of a signed multiply is rather annoying. It appears that the simplest way to do this is to compute the absolute value of the arguments, and perform a u32 x u32 -> u64 operation. If the arguments' signs differ, then negate the result. Since there is no u64 support in the cvt instruction, we have to perform the 2's complement negation "by hand".
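The approach described in this message can be sketched in plain C++. This is only an illustration of the arithmetic, not the code from the patch: the function name mulhi_s32 is hypothetical, and the 64-bit value is held as two 32-bit halves to mimic negating "by hand" without a 64-bit negate.

```cpp
#include <cstdint>

// Hypothetical helper (not from the patch): high 32 bits of s32 x s32.
int32_t mulhi_s32(int32_t a, int32_t b)
{
    // Absolute values, computed in unsigned arithmetic so INT32_MIN is safe.
    uint32_t ua = a < 0 ? 0u - static_cast<uint32_t>(a) : static_cast<uint32_t>(a);
    uint32_t ub = b < 0 ? 0u - static_cast<uint32_t>(b) : static_cast<uint32_t>(b);

    // u32 x u32 -> u64, kept as two 32-bit halves.
    uint64_t product = static_cast<uint64_t>(ua) * ub;
    uint32_t lo = static_cast<uint32_t>(product);
    uint32_t hi = static_cast<uint32_t>(product >> 32);

    // If the signs differ, negate the 64-bit value by hand:
    // invert both halves, add 1 to the low half, carry into the high half.
    if ((a < 0) != (b < 0)) {
        lo = ~lo;
        hi = ~hi;
        lo += 1;
        if (lo == 0)
            hi += 1;
    }

    // Reinterpret the high half as signed (modular conversion assumed).
    return static_cast<int32_t>(hi);
}
```

For example, mulhi_s32(-1, 1) yields -1, because the full 64-bit product is -1 and its upper half is all ones.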

[PATCH 01/11] nvc0/ir: add emission of dadd/dmul/dmad opcodes, fix minmax

2015 Feb 20

Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu> --- .../drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp | 66 +++++++++++++++++++++- 1 file changed, 63 insertions(+), 3 deletions(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp index dfb093c..e38a3b8 100644 ---

[PATCH 1/2] nvc0/ir: detect AND/SHR pairs and convert into EXTBF

2015 Aug 19

Some shaders appear to extract bits using shift/and combos. Detect (some) of those and convert to EXTBF instead. Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu> --- .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 66 +++++++++++++++------- 1 file changed, 46 insertions(+), 20 deletions(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
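As a rough illustration of why a shift/and combination can be expressed as a single bitfield extract, the sketch below models an unsigned extract in C++. The names extract_bits_u32 and mask_to_width are hypothetical, and the exact operand encoding and semantics of EXTBF in the nouveau IR are not taken from the patch; this only shows the equivalence for a mask of the form 2^n - 1.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of an unsigned bitfield extract:
// take `width` bits of `value` starting at bit `offset`.
uint32_t extract_bits_u32(uint32_t value, unsigned offset, unsigned width)
{
    assert(offset < 32 && width >= 1 && offset + width <= 32);
    uint32_t mask = (width == 32) ? 0xFFFFFFFFu : ((1u << width) - 1u);
    return (value >> offset) & mask;
}

// Returns true and sets `width` when `mask` is a contiguous run of low
// one bits (2^n - 1), i.e. when (x >> s) & mask is a plain bitfield extract.
bool mask_to_width(uint32_t mask, unsigned &width)
{
    if (mask == 0 || (mask & (mask + 1u)) != 0)
        return false;
    width = 0;
    while (mask != 0) {
        ++width;
        mask >>= 1;
    }
    return true;
}
```

With a mask of 0xFF and a shift of 8, mask_to_width reports a width of 8, and extract_bits_u32(x, 8, 8) computes the same value as (x >> 8) & 0xFF. A mask such as 0xF0 is rejected, which matches the message's note that only some combinations are detected.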

[PATCH v3 1/2] nv50/ir: Add support for the double Type to BuildUtil

2014 Jul 03

Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de> --- .../drivers/nouveau/codegen/nv50_ir_build_util.cpp | 17 +++++++++++++++++ .../drivers/nouveau/codegen/nv50_ir_build_util.h | 2 ++ 2 files changed, 19 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_build_util.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_build_util.cpp

[PATCH v4] nv50/ir: Handle OP_CVT when folding constant expressions

2014 Jul 05

Folding for conversions: F32/64->(U16/32, S16/32) and (U16/32, S16/32)->F32 No piglit regressions observed on nv50 and nvc0! Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de> --- V2: fix usage of wrong variable V3: enable F64 support V4: - disable F64 support again - handle saturate flag: clamp to min/max if needed
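The saturate handling mentioned in the V4 note (clamp to min/max if needed) could look roughly like the following for one of the listed conversions, F32 to S16. This is a sketch under assumptions, not the patch: the function name is hypothetical, truncation toward zero is assumed as the rounding mode, and mapping NaN to 0 is an arbitrary choice made here for the example.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical fold of an F32 -> S16 conversion with the saturate flag set.
int16_t fold_cvt_f32_to_s16_sat(float f)
{
    // Assumption for this sketch only: NaN folds to 0.
    if (std::isnan(f))
        return 0;

    const float lo = static_cast<float>(std::numeric_limits<int16_t>::min());
    const float hi = static_cast<float>(std::numeric_limits<int16_t>::max());

    // Clamp out-of-range inputs to the destination type's limits.
    if (f <= lo)
        return std::numeric_limits<int16_t>::min();
    if (f >= hi)
        return std::numeric_limits<int16_t>::max();

    // In range: the cast truncates toward zero (assumed rounding mode).
    return static_cast<int16_t>(f);
}
```

Clamping before the cast matters because converting an out-of-range float to an integer type is undefined behavior in C++, so the fold cannot simply cast and clamp afterwards.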

[PATCH] nv50/ir: constant fold OP_SPLIT

2016 Sep 27

Split the source immediate value into two new values and create OP_MOV instructions for the two newly created values. Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de> --- .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 23 ++++++++++++++++++++++ 1 file changed, 23 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
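A minimal sketch of the splitting step, assuming the common case of a 64-bit immediate split into two 32-bit halves (the message does not state the widths, and the helper name split_imm64 is hypothetical). Each returned half would then become the source of its own move.

```cpp
#include <cstdint>
#include <utility>

// Hypothetical helper: returns {low half, high half} of a 64-bit immediate.
std::pair<uint32_t, uint32_t> split_imm64(uint64_t imm)
{
    uint32_t low = static_cast<uint32_t>(imm & 0xFFFFFFFFull);
    uint32_t high = static_cast<uint32_t>(imm >> 32);
    return {low, high};
}
```

For the immediate 0x1122334455667788, the low half is 0x55667788 and the high half is 0x11223344.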

[PATCH v2 3/4] nvc0/ir: Handle OP_BFIND when folding constant expressions

2014 Jun 03

Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de> --- .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp index a214ffc..c497335 100644 ---

[RESEND/PATCH] nv50/ir: Handle OP_CVT when folding constant expressions

2015 Jan 09

Folding for conversions: F32->(U{16/32}, S{16/32}) and (U{16/32}, {S16/32})->F32 Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de> --- .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 109 +++++++++++++++++++++ 1 file changed, 109 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp

[PATCH v2] nv50/ir: Handle OP_CVT when folding constant expressions

2015 Jan 10

Folding for conversions: F32->(U{16/32}, S{16/32}) and (U{16/32}, {S16/32})->F32 Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de> --- V2: beat me, whip me, split out F64 .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 81 ++++++++++++++++++++++ 1 file changed, 81 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp

[PATCH] nv50/ir: constant fold OP_SPLIT

2016 Sep 30

On 28.09.2016 02:01, Ilia Mirkin wrote: > On Tue, Sep 27, 2016 at 7:25 PM, Tobias Klausmann > <tobias.johannes.klausmann at mni.thm.de> wrote: >> Split the source immediate value into two new values and create OP_MOV >> instructions the two newly created values. >> >> Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de> >> ---

[PATCH 1/4] nvc0/ir: avoid jumping to a sched instruction

2015 May 09

Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu> --- Pretty sure there's nothing wrong with it, but it looks odd in the code. src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp | 2 ++ src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp | 7 +++++-- src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp | 2 ++ 3 files changed, 9 insertions(+), 2 deletions(-)

[PATCH 3/4] nvc0/ir: optimize set & 1.0 to produce boolean-float sets

2015 May 09

On 09.05.2015 07:35, Ilia Mirkin wrote: > This has started to happen more now that the backend is producing > KILL_IF more often. > > Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu> > --- > .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 29 ++++++++++++++++++++++ > .../nouveau/codegen/nv50_ir_target_nv50.cpp | 2 ++ > 2 files changed, 31
