Displaying 20 results from an estimated 600 matches similar to: "[PATCH] nv50/ir: optimize shl(a, 0) to a"
2017 Apr 29
5
[PATCH v2] nv50/ir: optimize shl(a, 0) to a
helps two alien isolation shaders
shader-db:
total instructions in shared programs : 4251497 -> 4251494 (-0.00%)
total gprs used in shared programs : 513962 -> 513962 (0.00%)
total local used in shared programs : 29797 -> 29797 (0.00%)
total bytes used in shared programs : 38960264 -> 38960232 (-0.00%)
local gpr inst bytes
helped
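The rewrite named in the subject line, shl(a, 0) -> a, can be sketched on a toy IR. This is only an illustration under stated assumptions: `Op`, `Insn` and `foldShlByZero` are hypothetical names and are not the nv50_ir classes or the code from the patch, and source modifiers (raised later in the thread) are not modelled.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical toy IR for illustration only; not the real nv50_ir types.
enum class Op { Mov, Shl };

struct Insn {
    Op op;
    int dst;                       // destination register number
    int src0;                      // first source register number
    std::optional<uint32_t> imm1;  // second source, set only if it is an immediate
};

// If insn is shl(a, 0), rewrite it in place to mov(a) and return true.
// Any other instruction is left untouched and false is returned.
inline bool foldShlByZero(Insn &insn)
{
    if (insn.op != Op::Shl || !insn.imm1 || *insn.imm1 != 0)
        return false;
    insn.op = Op::Mov;   // shifting left by zero bits is the identity
    insn.imm1.reset();   // the mov has a single source
    return true;
}
```

A later copy-propagation pass could then remove the resulting mov, which would account for the small instruction-count change in the statistics above.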
2017 Apr 29
0
[PATCH] nv50/ir: optimize shl(a, 0) to a
On Sat, Apr 29, 2017 at 12:46 PM, Karol Herbst <karolherbst at gmail.com> wrote:
> helps two alien isolation shaders
>
> shader-db:
> total instructions in shared programs : 4251497 -> 4251494 (-0.00%)
> total gprs used in shared programs : 513962 -> 513962 (0.00%)
> total local used in shared programs : 29797 -> 29797 (0.00%)
> total bytes used in shared
2017 Apr 29
0
[PATCH v2] nv50/ir: optimize shl(a, 0) to a
On Sat, Apr 29, 2017 at 6:09 PM, Karol Herbst <karolherbst at gmail.com> wrote:
> helps two alien isolation shaders
>
> shader-db:
> total instructions in shared programs : 4251497 -> 4251494 (-0.00%)
> total gprs used in shared programs : 513962 -> 513962 (0.00%)
> total local used in shared programs : 29797 -> 29797 (0.00%)
> total bytes used in shared
2017 Apr 30
0
[PATCH v2] nv50/ir: optimize shl(a, 0) to a
Maybe in a separate change. I'd want to double check on all gens. I think
the thing I suggested is sufficient.
On Apr 29, 2017 8:09 PM, "Karol Herbst" <karolherbst at gmail.com> wrote:
2017-04-30 0:28 GMT+02:00 Ilia Mirkin <imirkin at alum.mit.edu>:
> On Sat, Apr 29, 2017 at 6:09 PM, Karol Herbst <karolherbst at gmail.com>
wrote:
>> helps two alien
2017 Apr 30
0
[PATCH v2] nv50/ir: optimize shl(a, 0) to a
On Apr 30, 2017 8:14 AM, "Karol Herbst" <karolherbst at gmail.com> wrote:
2017-04-30 2:28 GMT+02:00 Ilia Mirkin <imirkin at alum.mit.edu>:
> Maybe in a separate change. I'd want to double check on all gens. I think
> the thing I suggested is sufficient.
>
well, if I just fixup the op, I kind of have to fix the mod as well.
And if I use getOp, it could also
2015 May 09
2
[PATCH 3/4] nvc0/ir: optimize set & 1.0 to produce boolean-float sets
On 09.05.2015 07:35, Ilia Mirkin wrote:
> This has started to happen more now that the backend is producing
> KILL_IF more often.
>
> Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu>
> ---
> .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 29 ++++++++++++++++++++++
> .../nouveau/codegen/nv50_ir_target_nv50.cpp | 2 ++
> 2 files changed, 31
2015 May 09
5
[PATCH 1/4] nvc0/ir: avoid jumping to a sched instruction
Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu>
---
Pretty sure there's nothing wrong with it, but it looks odd in the code.
src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp | 2 ++
src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp | 7 +++++--
src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp | 2 ++
3 files changed, 9 insertions(+), 2 deletions(-)
2015 Jan 09
3
[RESEND/PATCH] nv50/ir: Handle OP_CVT when folding constant expressions
Folding for conversions: F32->(U{16/32}, S{16/32}) and (U{16/32}, S{16/32})->F32
Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de>
---
.../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 109 +++++++++++++++++++++
1 file changed, 109 insertions(+)
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
2014 Jun 03
8
[PATCH v2 0/4] Constant folding of new Instructions
And another try for constant folding of Instructions for nvc0.
Please Review this!
Thanks,
Tobias Klausmann
Tobias Klausmann (4):
nvc0/ir: clear subop when folding constant expressions
nvc0/ir: Handle reverse subop for OP_EXTBF when folding constant
expressions
nvc0/ir: Handle OP_BFIND when folding constant expressions
nvc0/ir: Handle OP_POPCNT when folding constant expressions
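As a rough sketch of what folding these two ops on a known 32-bit immediate amounts to: the helper names below are hypothetical, and the BFIND behaviour is an assumption borrowed from PTX `bfind.u32` (position of the most significant set bit, 0xffffffff when no bit is set), not something taken from the patches. The signed variant and any subops are not covered.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: population count of a constant (number of set bits).
constexpr uint32_t foldPopcnt(uint32_t v)
{
    uint32_t n = 0;
    while (v) {
        v &= v - 1;  // clear the lowest set bit
        ++n;
    }
    return n;
}

// Hypothetical helper: index of the most significant set bit.
// Assumption: returns 0xffffffff for a zero input, as PTX bfind.u32 does.
constexpr uint32_t foldBfindU32(uint32_t v)
{
    if (v == 0)
        return 0xffffffffu;
    uint32_t pos = 0;
    while (v >>= 1)
        ++pos;
    return pos;
}

static_assert(foldPopcnt(0xffu) == 8, "eight bits set");
static_assert(foldBfindU32(0x10u) == 4, "bit 4 is the highest set bit");
```

With both sources known at compile time, the instruction can be replaced by a move of the computed constant.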
2015 Jan 10
2
[PATCH v2] nv50/ir: Handle OP_CVT when folding constant expressions
Folding for conversions: F32->(U{16/32}, S{16/32}) and (U{16/32}, S{16/32})->F32
Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de>
---
V2: beat me, whip me, split out F64
.../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 81 ++++++++++++++++++++++
1 file changed, 81 insertions(+)
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
2015 Jan 11
2
[PATCH v2] nv50/ir: Handle OP_CVT when folding constant expressions
On 11.01.2015 20:57, Ilia Mirkin wrote:
> On Sun, Jan 11, 2015 at 2:56 PM, Tobias Klausmann
> <tobias.johannes.klausmann at mni.thm.de> wrote:
>>
>> On 11.01.2015 20:19, Ilia Mirkin wrote:
>>> On Sun, Jan 11, 2015 at 12:27 PM, Tobias Klausmann
>>> <tobias.johannes.klausmann at mni.thm.de> wrote:
>>>>
>>>> On 11.01.2015 01:58,
2015 Jan 11
2
[PATCH] nv50/ir: Handle OP_CVT when folding constant expressions
On Sun, Jan 11, 2015 at 5:48 PM, Tobias Klausmann
<tobias.johannes.klausmann at mni.thm.de> wrote:
>
>
> On 11.01.2015 23:12, Ilia Mirkin wrote:
>>
>> On Sun, Jan 11, 2015 at 5:08 PM, Tobias Klausmann
>> <tobias.johannes.klausmann at mni.thm.de> wrote:
>>>
>>>
>>> On 11.01.2015 22:54, Ilia Mirkin wrote:
>>>>
2016 Sep 27
2
[PATCH] nv50/ir: constant fold OP_SPLIT
Split the source immediate value into two new values and create OP_MOV
instructions for the two newly created values.
Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de>
---
.../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 23 ++++++++++++++++++++++
1 file changed, 23 insertions(+)
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
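The split described in the commit message can be illustrated on a plain 64-bit constant. This is a minimal sketch, assuming the source is a 64-bit immediate divided into two 32-bit halves with the low half first; `splitImm64` is a hypothetical name and the ordering is an assumption, not taken from the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Hypothetical helper: split a 64-bit immediate into {low, high} 32-bit halves.
// Each half would then feed its own OP_MOV in the folded code.
constexpr std::pair<uint32_t, uint32_t> splitImm64(uint64_t v)
{
    return { static_cast<uint32_t>(v & 0xffffffffu),  // low 32 bits
             static_cast<uint32_t>(v >> 32) };        // high 32 bits
}

static_assert(splitImm64(0x0123456789abcdefULL).first == 0x89abcdefu, "low half");
static_assert(splitImm64(0x0123456789abcdefULL).second == 0x01234567u, "high half");
```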
2016 Sep 30
2
[PATCH] nv50/ir: constant fold OP_SPLIT
On 28.09.2016 02:01, Ilia Mirkin wrote:
> On Tue, Sep 27, 2016 at 7:25 PM, Tobias Klausmann
> <tobias.johannes.klausmann at mni.thm.de> wrote:
>> Split the source immediate value into two new values and create OP_MOV
>> instructions for the two newly created values.
>>
>> Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de>
>> ---
2016 Sep 30
2
[PATCH v2] nv50/ir: constant fold OP_SPLIT
Split the source immediate value into two new values and create OP_MOV
instructions for the two newly created values.
V2: get rid of special cases
Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de>
---
src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git
2015 Jan 11
2
[PATCH] nv50/ir: Handle OP_CVT when folding constant expressions
On Sun, Jan 11, 2015 at 4:40 PM, Tobias Klausmann
<tobias.johannes.klausmann at mni.thm.de> wrote:
> Folding for conversions: F32->(U{16/32}, S{16/32}) and (U{16/32}, S{16/32})->F32
>
> Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de>
> ---
> V2: Split out F64 parts
> V3: remove handling of saturate for (U/S)32,
>
>
2016 Oct 02
2
[PATCH] nv50/ir: Propagate third immediate src when folding OP_MAD
Previously we'd end up with an unnecessary mov for the third immediate value.
total instructions in shared programs : 851881 -> 851864 (-0.00%)
total gprs used in shared programs : 110295 -> 110295 (0.00%)
total local used in shared programs : 1020 -> 1020 (0.00%)
          local    gpr   inst  bytes
helped        0      0     17     17
2014 Jul 05
1
[PATCH v4] nv50/ir: Handle OP_CVT when folding constant expressions
Folding for conversions: F32/64->(U16/32, S16/32) and (U16/32, S16/32)->F32
No piglit regressions observed on nv50 and nvc0!
Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de>
---
V2: fix usage of wrong variable
V3: enable F64 support
V4:
- disable F64 support again
- handle saturate flag: clamp to min/max if needed
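The "clamp to min/max" step for a saturating F32 -> 16-bit integer conversion might look like the sketch below. The function names are hypothetical, and two behaviours are assumptions rather than facts from the patch: NaN folds to 0, and in-range values are truncated toward zero. Only the 16-bit targets are shown, since the limits of the 32-bit types are not exactly representable as F32 and need more care.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical helper: fold a saturating F32 -> S16 conversion of a constant.
inline int16_t foldCvtF32ToS16Sat(float f)
{
    if (std::isnan(f))
        return 0;                                  // assumption: NaN becomes 0
    if (f <= -32768.0f)
        return std::numeric_limits<int16_t>::min();
    if (f >= 32767.0f)
        return std::numeric_limits<int16_t>::max();
    return static_cast<int16_t>(f);                // in range: truncates toward zero
}

// Hypothetical helper: fold a saturating F32 -> U16 conversion of a constant.
inline uint16_t foldCvtF32ToU16Sat(float f)
{
    if (std::isnan(f) || f <= 0.0f)
        return 0;                                  // negatives clamp to the minimum
    if (f >= 65535.0f)
        return std::numeric_limits<uint16_t>::max();
    return static_cast<uint16_t>(f);               // in range: truncates toward zero
}
```

Clamping before the cast also avoids the undefined behaviour C++ assigns to converting an out-of-range float to an integer type.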
2014 Jun 03
6
[PATCH v3 0/4] Constant folding of new Instructions
Yet another try for constant folding of Instructions for nvc0.
Please Review this again! (Hopefully the last time ;-) )
Tobias Klausmann (4):
nvc0/ir: clear subop when folding constant expressions
nvc0/ir: Handle reverse subop for OP_EXTBF when folding constant
expressions
nvc0/ir: Handle OP_BFIND when folding constant expressions
nvc0/ir: Handle OP_POPCNT when folding constant
2014 Jul 03
1
[PATCH v3 1/2] nv50/ir: Add support for the double Type to BuildUtil
Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de>
---
.../drivers/nouveau/codegen/nv50_ir_build_util.cpp | 17 +++++++++++++++++
.../drivers/nouveau/codegen/nv50_ir_build_util.h | 2 ++
2 files changed, 19 insertions(+)
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_build_util.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_build_util.cpp