similar to: [PATCH v2 0/3] nv50/ir: Preapre for running Opts inside a loop

Displaying 20 results from an estimated 300 matches similar to: "[PATCH v2 0/3] nv50/ir: Preapre for running Opts inside a loop"

2017 Apr 03
3
[PATCH 0/3] nv50/ir: Preapre for running Opts inside a loop
Slowly we are getting to the point, that we miss enough optimization opportunities as the result of our own passes. For this we need to fix AlgebraicOpt to be able to handle mods on sources without creating new issues. The last patch enables looping opts. Karol Herbst (3): nv50/ir: fix AlgebraicOpt for slcts with mods nv50/ir: handle logops with NOT in AlgebraicOpt nv50/ir: run some
2017 Apr 03
0
[PATCH v2 2/3] nv50/ir: handle logops with NOT in AlgebraicOpt
Signed-off-by: Karol Herbst <karolherbst at gmail.com> --- src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp index bd60a84998..0de84fe9fc 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp +++
2015 May 09
5
[PATCH 1/4] nvc0/ir: avoid jumping to a sched instruction
Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu> --- Pretty sure there's nothing wrong with it, but it looks odd in the code. src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp | 2 ++ src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp | 7 +++++-- src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp | 2 ++ 3 files changed, 9 insertions(+), 2 deletions(-)
2015 Aug 19
5
[PATCH 1/2] nvc0/ir: detect AND/SHR pairs and convert into EXTBF
Some shaders appear to extract bits using shift/and combos. Detect (some) of those and convert to EXTBF instead. Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu> --- .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 66 +++++++++++++++------- 1 file changed, 46 insertions(+), 20 deletions(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
2017 Apr 03
0
[PATCH v2 1/3] nv50/ir: fix AlgebraicOpt for slcts with mods
Signed-off-by: Karol Herbst <karolherbst at gmail.com> --- src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp index 4c92a1efb5..bd60a84998 100644 ---
2014 May 10
1
[PATCH] nv50/ir: make sure to reverse cond codes on all the OP_SET variants
Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu> Cc: "10.2 10.1" <mesa-stable at lists.freedesktop.org> --- Found this while tracking a regression on nvc0 for my patch which fixes ir_unop_any to emit or's instead of dp3's. (That patch is fine, this code was always broken.) src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 3 ++- 1 file changed, 2
2015 May 09
2
[PATCH 3/4] nvc0/ir: optimize set & 1.0 to produce boolean-float sets
On 09.05.2015 07:35, Ilia Mirkin wrote: > This has started to happen more now that the backend is producing > KILL_IF more often. > > Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu> > --- > .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 29 ++++++++++++++++++++++ > .../nouveau/codegen/nv50_ir_target_nv50.cpp | 2 ++ > 2 files changed, 31
2015 Feb 20
10
[PATCH 01/11] nvc0/ir: add emission of dadd/dmul/dmad opcodes, fix minmax
Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu> --- .../drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp | 66 +++++++++++++++++++++- 1 file changed, 63 insertions(+), 3 deletions(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp index dfb093c..e38a3b8 100644 ---
2017 Mar 26
5
[PATCH v5 0/5] nvc0/ir: add support for MAD/FMA PostRALoadPropagation
was "nv50/ir: PostRaConstantFolding improvements" before. nothing really changed from the last version, just minor things. Karol Herbst (5): nv50/ir: restructure and rename postraconstantfolding pass nv50/ir: implement mad post ra folding for nvc0+ gk110/ir: add LIMM form of mad gm107/ir: add LIMM form of mad nv50/ir: also do PostRaLoadPropagation for FMA
2015 Jan 13
3
nv50/ir: Implement short notation for MAD V2
V2: clarify code, commit msgs, add comments. Drop code to was supposed to make register assignment prefer SDST == SRC2 (patch 2) for now, because it didn't quite do what I intended.
2015 Jan 23
3
[PATCH 1/2] nv50/ir: Add support for MAD short+IMM notation
Add emission rules for negative and saturate flags for MAD 4-byte opcodes, and get rid of constraints. Short MAD has a very specific SDST == SSRC2 requirement, and since MAD IMM is short notation + 4-byte immediate, don't have the compiler create MAD IMM instructions yet. V2: Document MAD as supported short form Signed-off-by: Roy Spliet <rspliet at eclipso.eu> ---
2019 Jul 25
5
[Bug 111217] New: compilation of vdpau shaders crashes in handleCVT_CVT
https://bugs.freedesktop.org/show_bug.cgi?id=111217 Bug ID: 111217 Summary: compilation of vdpau shaders crashes in handleCVT_CVT Product: Mesa Version: unspecified Hardware: Other OS: All Status: NEW Severity: normal Priority: medium Component: Drivers/DRI/nouveau Assignee:
2015 Jan 11
6
[PATCH 1/3] nv50/ir: Add support for MAD short+IMM notation
MAD IMM has a very specific SDST == SSRC2 requirement, so don't emit Signed-off-by: Roy Spliet <rspliet at eclipso.eu> --- .../drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp | 18 ++++++++++++------ .../drivers/nouveau/codegen/nv50_ir_target_nv50.cpp | 2 +- 2 files changed, 13 insertions(+), 7 deletions(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp
2015 Feb 06
2
[PATCH 1/3] nv50/ir: Add support for MAD 4-byte opcode
Add emission rules for negative and saturate flags for MAD 4-byte opcodes, and get rid of some of the constraints. Obviously tested with a wide variety of shaders. V2: Document MAD as supported short form V3: Split up IMM from short-form modifiers Signed-off-by: Roy Spliet <rspliet at eclipso.eu> --- src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp | 10 ++++------
2001 Oct 18
1
rsync 2.4.6 upload trouble
hi, I'm having rsync trouble I installt an new version of rsync on my server and clients (2.4.6) first every thing workt fine but now I'm having a lot of trouble. I use the following command to upload: rsync -vcrulpog -e ssh --delete --exclude=*lost+found --timeout=60 --password-file=/etc/rsyncpasswd /exports/set1/ user@1.2.3.4:/exports/set1/ and it gives me the following error:
2006 May 14
2
[LLVMdev] __main() function and AliasSet
In a code segment of my pass plugin, I try to gather AliasSets for all StoreInst, LoadInst and CallInst instructions in a function. Some behaviors of the pass puzzled me. Below is the *.ll of the test program which I run the pass on, it was get with "llvm-gcc -Wl,--disable-opt" from a rather simple *.c program. ---------------------------------- ; ModuleID = 'ptralias.bc'
2006 May 14
0
[LLVMdev] Re: __main() function and AliasSet
Oh, I appologize that I should not have asked about __main() ---- it appears in FAQ. But the question remains that why call to __main() can alias stack location? I think the memory location pointed by data_X pointers are not visible to __main(). In comparison, calls to printf() do not have similar effect. On 5/14/06, Nai Xia <nelson.xia at gmail.com> wrote: > > In a code segment of
2006 May 15
2
[LLVMdev] Re: __main() function and AliasSet
Hi Chris, I took a haste look at the "Points-to Analysis in Almost Linear Time" by Steens , your PHD thesis and SteensGaard.cpp in LLVM this afternoon. So I think: 1. Actually the basic algorithm described originally by SteensGaard does not provide MOD/REF information for functions. 2. The context insensitive part of Data Structure Analysis (LocalAnalysis) can be deemed as an
2006 May 17
2
[LLVMdev] Re: __main() function and AliasSet
On Tuesday 16 May 2006 03:19, Chris Lattner wrote: > On Mon, 15 May 2006, Nai Xia wrote: > > > In other words, if I only use -steens-aa and the data_XXXs are all > > external global variables( and so inComplete ), > > Sounds right! > > > the call to printf will > > make the same effect, which I have tested it. > > > > Am I right ? :) >
2017 Jun 11
14
[RFC 0/9] Add precise/invariant semantics to TGSI
Running Tomb Raider on Nouveau I found some flicker caused by ignoring precise modifiers on variables inside Nouveau. This series add precise/invariant handling to TGSI, which can be then used by drivers to disable certain unsafe optimisations which may otherwise alter calculations, which depend on having the same result across shaders. This series fixes this bug in Tomb Raider and one CTS test