thr3ads.net - similar to: "[RESEND/PATCH] nv50/ir: Handle OP

Displaying 20 results from an estimated 1000 matches similar to: "[RESEND/PATCH] nv50/ir: Handle OP_CVT when folding constant expressions"

[PATCH v4] nv50/ir: Handle OP_CVT when folding constant expressions

2014 Jul 05

[PATCH v4] nv50/ir: Handle OP_CVT when folding constant expressions

Folding for conversions: F32/64->(U16/32, S16/32) and (U16/32, S16/32)->F32 No piglit regressions observed on nv50 and nvc0! Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de> --- V2: fix usage of wrong variable V3: enable F64 support V4: - disable F64 support again - handle saturate flag: clamp to min/max if needed

[PATCH v2] nv50/ir: Handle OP_CVT when folding constant expressions

2015 Jan 10

[PATCH v2] nv50/ir: Handle OP_CVT when folding constant expressions

Folding for conversions: F32->(U{16/32}, S{16/32}) and (U{16/32}, {S16/32})->F32 Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de> --- V2: beat me, whip me, split out F64 .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 81 ++++++++++++++++++++++ 1 file changed, 81 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp

[PATCH v3 1/2] nv50/ir: Add support for the double Type to BuildUtil

2014 Jul 03

[PATCH v3 1/2] nv50/ir: Add support for the double Type to BuildUtil

Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de> --- .../drivers/nouveau/codegen/nv50_ir_build_util.cpp | 17 +++++++++++++++++ .../drivers/nouveau/codegen/nv50_ir_build_util.h | 2 ++ 2 files changed, 19 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_build_util.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_build_util.cpp

[PATCH] nv50/ir: Handle OP_CVT when folding constant expressions

2015 Jan 11

[PATCH] nv50/ir: Handle OP_CVT when folding constant expressions

On Sun, Jan 11, 2015 at 5:48 PM, Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de> wrote: > > > On 11.01.2015 23:12, Ilia Mirkin wrote: >> >> On Sun, Jan 11, 2015 at 5:08 PM, Tobias Klausmann >> <tobias.johannes.klausmann at mni.thm.de> wrote: >>> >>> >>> On 11.01.2015 22:54, Ilia Mirkin wrote: >>>>

[PATCH v2] nv50/ir: Handle OP_CVT when folding constant expressions

2015 Jan 11

[PATCH v2] nv50/ir: Handle OP_CVT when folding constant expressions

On 11.01.2015 20:57, Ilia Mirkin wrote: > On Sun, Jan 11, 2015 at 2:56 PM, Tobias Klausmann > <tobias.johannes.klausmann at mni.thm.de> wrote: >> >> On 11.01.2015 20:19, Ilia Mirkin wrote: >>> On Sun, Jan 11, 2015 at 12:27 PM, Tobias Klausmann >>> <tobias.johannes.klausmann at mni.thm.de> wrote: >>>> >>>> On 11.01.2015 01:58,

[PATCH v2] nv50/ir: Handle OP_CVT when folding constant expressions

2015 Jan 11

[PATCH v2] nv50/ir: Handle OP_CVT when folding constant expressions

On 11.01.2015 01:58, Ilia Mirkin wrote: > On Fri, Jan 9, 2015 at 8:24 PM, Tobias Klausmann > <tobias.johannes.klausmann at mni.thm.de> wrote: >> Folding for conversions: F32->(U{16/32}, S{16/32}) and (U{16/32}, {S16/32})->F32 >> >> Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de> >> --- >> V2: beat me, whip me, split

[PATCH] nv50/ir: constant fold OP_SPLIT

2016 Sep 27

[PATCH] nv50/ir: constant fold OP_SPLIT

Split the source immediate value into two new values and create OP_MOV instructions the two newly created values. Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de> --- .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 23 ++++++++++++++++++++++ 1 file changed, 23 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp

[PATCH] nv50/ir: constant fold OP_SPLIT

2016 Sep 30

[PATCH] nv50/ir: constant fold OP_SPLIT

On 28.09.2016 02:01, Ilia Mirkin wrote: > On Tue, Sep 27, 2016 at 7:25 PM, Tobias Klausmann > <tobias.johannes.klausmann at mni.thm.de> wrote: >> Split the source immediate value into two new values and create OP_MOV >> instructions the two newly created values. >> >> Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de> >> ---

[PATCH 1/2] nv50/ir: fix s32 x s32 -> high s32 multiply logic

2014 May 18

[PATCH 1/2] nv50/ir: fix s32 x s32 -> high s32 multiply logic

Retrieving the high 32 bits of a signed multiply is rather annoying. It appears that the simplest way to do this is to compute the absolute value of the arguments, and perform a u32 x u32 -> u64 operation. If the arguments' signs differ, then negate the result. Since there is no u64 support in the cvt instruction, we have the perform the 2's complement negation "by hand".

[PATCH] nv50/ir: Handle OP_CVT when folding constant expressions

2015 Jan 11

[PATCH] nv50/ir: Handle OP_CVT when folding constant expressions

On Sun, Jan 11, 2015 at 5:08 PM, Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de> wrote: > > > On 11.01.2015 22:54, Ilia Mirkin wrote: >> >> On Sun, Jan 11, 2015 at 4:40 PM, Tobias Klausmann >> <tobias.johannes.klausmann at mni.thm.de> wrote: >>> >>> Folding for conversions: F32->(U{16/32}, S{16/32}) and (U{16/32}, >>>

[PATCH] nv50/ir: Handle OP_CVT when folding constant expressions

2015 Jan 11

[PATCH] nv50/ir: Handle OP_CVT when folding constant expressions

On Sun, Jan 11, 2015 at 4:40 PM, Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de> wrote: > Folding for conversions: F32->(U{16/32}, S{16/32}) and (U{16/32}, {S16/32})->F32 > > Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de> > --- > V2: Split out F64 parts > V3: remove handling of saturate for (U/S)32, > >

[PATCH v5] nv50/ir: Handle OP_CVT when folding constant expressions

2014 Jul 06

[PATCH v5] nv50/ir: Handle OP_CVT when folding constant expressions

[PATCH] nv50/ir: Handle OP_CVT when folding constant expressions

2014 Jul 03

[PATCH] nv50/ir: Handle OP_CVT when folding constant expressions

Folding for conversions: F32/64->(U16/32, S16/32) and (U16/32, S16/32)->F32 No piglit regressions observed! Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de> --- .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 74 ++++++++++++++++++++++ 1 file changed, 74 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp

[RESEND/PATCH] nv50/ir: Handle OP_CVT when folding constant expressions

2015 Jan 10

[RESEND/PATCH] nv50/ir: Handle OP_CVT when folding constant expressions

On Fri, Jan 9, 2015 at 6:47 PM, Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de> wrote: > Folding for conversions: F32->(U{16/32}, S{16/32}) and (U{16/32}, {S16/32})->F32 > > Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de> > --- > .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 109 +++++++++++++++++++++ > 1 file

[PATCH v2] nv50/ir: Handle OP_CVT when folding constant expressions

2015 Jan 11

[PATCH v2] nv50/ir: Handle OP_CVT when folding constant expressions

On 11.01.2015 20:19, Ilia Mirkin wrote: > On Sun, Jan 11, 2015 at 12:27 PM, Tobias Klausmann > <tobias.johannes.klausmann at mni.thm.de> wrote: >> >> On 11.01.2015 01:58, Ilia Mirkin wrote: >>> On Fri, Jan 9, 2015 at 8:24 PM, Tobias Klausmann >>> <tobias.johannes.klausmann at mni.thm.de> wrote: >>>> Folding for conversions:

[PATCH 1/2] nvc0/ir: detect AND/SHR pairs and convert into EXTBF

2015 Aug 19

[PATCH 1/2] nvc0/ir: detect AND/SHR pairs and convert into EXTBF

Some shaders appear to extract bits using shift/and combos. Detect (some) of those and convert to EXTBF instead. Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu> --- .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 66 +++++++++++++++------- 1 file changed, 46 insertions(+), 20 deletions(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp

[PATCH v2] nv50/ir: constant fold OP_SPLIT

2016 Sep 30

[PATCH v2] nv50/ir: constant fold OP_SPLIT

Split the source immediate value into two new values and create OP_MOV instructions the two newly created values. V2: get rid of special cases Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de> --- src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) diff --git

[PATCH 01/11] nvc0/ir: add emission of dadd/dmul/dmad opcodes, fix minmax

2015 Feb 20

[PATCH 01/11] nvc0/ir: add emission of dadd/dmul/dmad opcodes, fix minmax

Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu> --- .../drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp | 66 +++++++++++++++++++++- 1 file changed, 63 insertions(+), 3 deletions(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp index dfb093c..e38a3b8 100644 ---

[PATCH] nv50/ir: Propagate third immediate src when folding OP_MAD

2016 Oct 02

[PATCH] nv50/ir: Propagate third immediate src when folding OP_MAD

Previously we'd end up with an unnecessary mov for the thirs immediate value. total instructions in shared programs : 851881 -> 851864 (-0.00%) total gprs used in shared programs : 110295 -> 110295 (0.00%) total local used in shared programs : 1020 -> 1020 (0.00%) local gpr inst bytes helped 0 0 17 17

[PATCH mesa 0/5] nouveau: codegen: Make use of double immediates

2015 Nov 05

[PATCH mesa 0/5] nouveau: codegen: Make use of double immediates

Hi All, This series implements using double immediates in the nouveau codegen code. This turns the following (nvc0) code: 1: mov u32 $r2 0x00000000 (8) 2: mov u32 $r3 0x3fe00000 (8) 3: add f64 $r0d $r0d $r2d (8) Into: 1: add f64 $r0d $r0d 0.500000 (8) This has been tested with the 2 double shader tests which I just send to the piglet list. On a gk208 (gk110 / SM35)

similar to: [RESEND/PATCH] nv50/ir: Handle OP_CVT when folding constant expressions