thr3ads.net - similar to: "[PATCH] nv50/ir: use unordered_set instead of list to keep track of var defs"

Displaying 20 results from an estimated 200 matches similar to: "[PATCH] nv50/ir: use unordered_set instead of list to keep track of var defs"

[PATCH RESEND] nv50/ir: use unordered_set instead of list to keep track of var defs

2014 Dec 02

The set of variable defs does not need to be ordered in any way, and removing/adding elements is a fairly common operation in various optimization passes. This shortens runtime of piglit test fp-long-alu to ~11s from ~22s. No piglit regressions observed on nvc0! Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de> --- src/gallium/drivers/nouveau/codegen/nv50_ir.cpp
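
The trade-off described in this commit message can be sketched as follows. This is a minimal illustration, not the Mesa code: `Def`, `removeFromList`, and `removeFromSet` are hypothetical names, and the pointer-keyed containers are an assumption about how defs might be tracked.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <unordered_set>

// Hypothetical stand-in for a value definition; not the Mesa type.
struct Def { int id; };

// Removing an element from a std::list by value needs a linear scan.
inline std::size_t removeFromList(std::list<Def *> &defs, Def *d)
{
   assert(d != nullptr);
   const std::size_t before = defs.size();
   defs.remove(d);                 // O(n): compares against every element
   return before - defs.size();    // number of elements removed
}

// Removing from a std::unordered_set is a hash lookup, O(1) on average.
inline std::size_t removeFromSet(std::unordered_set<Def *> &defs, Def *d)
{
   assert(d != nullptr);
   return defs.erase(d);           // number of elements removed (0 or 1)
}
```

The set gives up iteration order, which is acceptable here only because, as the message says, the defs do not need to be ordered.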

[PATCH] nv50/ir: use unordered_set instead of list to keep our instructions in uses

2014 Jul 08

This shortens runtime of piglit test fp-long-alu to ~22s. No piglit regressions observed on nvc0! Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de> --- src/gallium/drivers/nouveau/codegen/nv50_ir.cpp | 6 +++--- src/gallium/drivers/nouveau/codegen/nv50_ir.h | 7 ++++--- src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 2 +-

[PATCH] nv50/ra: Only increment DefValue counter if we are going to spill

2017 Aug 19

This is in preparation for an upcoming patch changing how we keep track of the defs. Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de> --- src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp

[PATCH] nv50/ir: avoid deleting pseudo instructions too early

2014 Sep 25

What happens is that a SPLIT operation is part of the spill node, and as a pseudo op, the instruction gets erased after processing its first def. However, the later defs still need to refer to it, so instead delay spilling until after that whole RA node is done processing. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79462 Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu> Cc:

[PATCH 1/2] nvc0/ir: avoid infinite recursion when finding first uses of tex

2014 Aug 30

In certain circumstances, findFirstUses could end up doubling back on instructions it had already processed, resulting in an infinite recursion. Avoid this by keeping track of already-visited instructions. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83079 Tested-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de> Signed-off-by: Ilia Mirkin <imirkin at
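
The general technique named in this message, tracking already-visited instructions so a recursive walk cannot double back forever, can be sketched as below. This is an illustration only and not the actual `findFirstUses` code: `Insn`, its `users` field, and `visitUses` are hypothetical.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Hypothetical instruction node: each instruction lists the
// instructions that consume its result. Not the Mesa type.
struct Insn {
   int id;
   std::vector<const Insn *> users;
};

// Recursively walk the users of insn. The visited set stops the walk
// from re-entering an instruction it has already processed, so cycles
// in the use graph cannot cause unbounded recursion.
inline void visitUses(const Insn *insn,
                      std::unordered_set<const Insn *> &visited)
{
   assert(insn != nullptr);
   if (!visited.insert(insn).second)
      return;                      // already seen: do not recurse again
   for (const Insn *user : insn->users)
      visitUses(user, visited);
}
```

With two instructions that use each other, the walk terminates after visiting each one once; without the visited check it would recurse indefinitely.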

[PATCH 3/4] nvc0/ir: optimize set & 1.0 to produce boolean-float sets

2015 May 09

On 09.05.2015 07:35, Ilia Mirkin wrote: > This has started to happen more now that the backend is producing > KILL_IF more often. > > Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu> > --- > .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 29 ++++++++++++++++++++++ > .../nouveau/codegen/nv50_ir_target_nv50.cpp | 2 ++ > 2 files changed, 31

[Bug 111167] New: Dividing zero by a uniform in loop header causes segfault in nv50_ir::NVC0LegalizeSSA::handleDIV

2019 Jul 18

https://bugs.freedesktop.org/show_bug.cgi?id=111167 Bug ID: 111167 Summary: Dividing zero by a uniform in loop header causes segfault in nv50_ir::NVC0LegalizeSSA::handleDIV Product: Mesa Version: git Hardware: x86-64 (AMD64) OS: Linux (All) Status: NEW Severity: minor

[PATCH 1/2] nvc0/ir: detect AND/SHR pairs and convert into EXTBF

2015 Aug 19

Some shaders appear to extract bits using shift/and combos. Detect some of those and convert to EXTBF instead. Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu> --- .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 66 +++++++++++++++------- 1 file changed, 46 insertions(+), 20 deletions(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
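
To illustrate why a shift/and pair can be replaced by one bitfield extract, the sketch below compares the two forms on plain integers. It assumes an unsigned field described by an offset and a width; `extractShiftAnd` and `extractBitfield` are hypothetical helpers that emulate the idea in C++ and do not model the hardware EXTBF encoding.

```cpp
#include <cassert>
#include <cstdint>

// The pattern shaders tend to write: shift right, then mask.
inline uint32_t extractShiftAnd(uint32_t value, unsigned offset, unsigned width)
{
   assert(width >= 1 && offset + width <= 32);
   const uint32_t mask = width == 32 ? 0xffffffffu : ((1u << width) - 1u);
   return (value >> offset) & mask;
}

// A generic unsigned bitfield extract: move the field's top bit to
// bit 31, then shift back down so the field starts at bit 0.
inline uint32_t extractBitfield(uint32_t value, unsigned offset, unsigned width)
{
   assert(width >= 1 && offset + width <= 32);
   return (value << (32 - offset - width)) >> (32 - width);
}
```

Both functions return the same result for every valid offset and width, which is the property a peephole pass relies on when it rewrites the pair into a single extract.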

[PATCH] gm107/ir: fix loading z offset for layered 3d image bindings

2019 Oct 14

Unfortunately we don't know if a particular load is a real 2d image (as would be a cube face or 2d array element), or a layer of a 3d image. Since we pass in the TIC reference, the instruction's type has to match what's in the TIC (experimentally). In order to properly support bindless images, this also can't be done by looking at the current bindings and generating appropriate

[RFC PATCH] nv50/ir: allow spilling of def values for constrained MERGES/UNIONS

2017 Jul 31

This lets us spill more values and compile a big shader for Civilization 6. Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de> --- src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp | 2 -- 1 file changed, 2 deletions(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp index b33d7b4010..f29c8a1a95

[Bug 79462] New: [NVC0/Codegen] Shader compilation falis in spill logic

2014 May 30

https://bugs.freedesktop.org/show_bug.cgi?id=79462 Priority: medium Bug ID: 79462 Assignee: nouveau at lists.freedesktop.org Summary: [NVC0/Codegen] Shader compilation falis in spill logic Severity: normal Classification: Unclassified OS: All Reporter: imirkin at alum.mit.edu Hardware: Other

[PATCH v2] nvc0/ir: propagate immediates to CALL input MOVs

2017 Aug 13

When using builtin functions we have to move the inputs to registers $0 and $1; if one of the input values is an immediate, we fail to propagate the immediate: ... mov u32 $r477 0x00000003 (0) ... mov u32 $r0 %r473 (0) mov u32 $r1 $r477 (0) call abs BUILTIN:0 (0) mov u32 %r495 $r1 (0) ... With this patch the immediate is propagated, potentially causing the first MOV to be superfluous, which we'd

[PATCH 3/3] nv50/ir: Fold IMM into MAD

2015 Jan 11

Add a specific optimisation pass for NV50 to check whether SRC0 or SRC1 is a MOV dst, IMM. If so: fold the IMM in and try to drop the MOV. Must be done post-RA because it is required that SDST == SSRC2. Signed-off-by: Roy Spliet <rspliet at eclipso.eu> --- .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 52 ++++++++++++++++++++++ 1 file changed, 52 insertions(+) diff --git

[PATCH 2/3] nv50/ir: Fold IMM into MAD

2015 Jan 13

Add a specific optimisation pass for NV50 to check whether SRC0 or SRC1 is a MOV dst, IMM. If so: fold the IMM in and try to drop the MOV. Must be done post-RA because it requires that SDST == SSRC2. V2: improve readability and add comments to clarify decisions Signed-off-by: Roy Spliet <rspliet at eclipso.eu> --- .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 60

[PATCH 2/2] nv50/ir: Fold IMM into MAD

2015 Jan 23

Add a specific optimisation pass for NV50 to check whether SRC0 or SRC1 is a MOV dst, IMM. If so: fold the IMM in and try to drop the MOV. Must be done post-RA because it requires that SDST == SSRC2. V2: improve readability and add comments to clarify decisions V3: Remove redundant code... compiler already attempts to put the IMM in SSRC1 Signed-off-by: Roy Spliet <rspliet at eclipso.eu>

[PATCH 3/3] nv50/ir: Fold IMM into MAD

2015 Feb 06

Add a specific optimisation pass for NV50 to check whether SRC0 or SRC1 is a MOV dst, IMM. If so: fold the IMM in and try to drop the MOV. Must be done post-RA because it requires that SDST == SSRC2. V2: improve readability and add comments to clarify decisions V3: Remove redundant code... compiler already attempts to put the IMM in SSRC1 Signed-off-by: Roy Spliet <rspliet at eclipso.eu>

[PATCH] nvc0/ir: propagate immediates to CALL input MOVs

2017 Aug 12

On Sat, Aug 12, 2017 at 3:33 PM, Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de> wrote: > On using builtin functions we have to move the input to registers $0 and $1, if > one of the input value is an immediate, we fail to propagate the immediate: > > ... > mov u32 $r477 0x00000003 (0) > ... > mov u32 $r0 %r473 (0) > mov u32 $r1 $r477 (0) > call abs

[PATCH] nvc0/ir: propagate immediates to CALL input MOVs

2017 Aug 12

Use of the C++ standard library in XRay compiler-rt

2017 Mar 15

On Tue, Mar 14, 2017 at 5:34 PM Dean Michael Berris <dean.berris at gmail.com> wrote: > On 13 Mar 2017, at 15:39, David Blaikie <dblaikie at gmail.com> wrote: > > > > On Sun, Mar 12, 2017, 4:10 PM Dean Michael Berris <dean.berris at gmail.com> > wrote: > > > > On 9 Mar 2017, at 09:32, David Blaikie via llvm-dev < > llvm-dev at

Use of the C++ standard library in XRay compiler-rt

2017 Mar 13

On Sun, Mar 12, 2017, 4:10 PM Dean Michael Berris <dean.berris at gmail.com> wrote: > > > On 9 Mar 2017, at 09:32, David Blaikie via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > > > > I agree that we should clean up the standard library usage even just for > consistency. > > > > +1 -- now that I think about it, it should be fairly doable
