thr3ads.net - similar to: "[RFC PATCH] nv50/ir: allow spilling of def values for constrained MERGES/UNIONS"

Displaying 20 results from an estimated 100 matches similar to: "[RFC PATCH] nv50/ir: allow spilling of def values for constrained MERGES/UNIONS"

[PATCH 0/5] nvc0: fp64 preparation

2014 Jul 18

Most of codegen is already FP64-ready. There are a few edge-cases that I ran into, many of which can apply even to non-fp64-enabled programs (although the double-wide registers are not very common without fp64). I've yet to give this a full piglit run, but wanted to send these out in case someone wanted to comment. They do not depend on the preliminary core fp64 work. Ilia Mirkin (5):

[PATCH] nv50/ir: avoid deleting pseudo instructions too early

2014 Sep 25

What happens is that a SPLIT operation is part of the spill node, and as a pseudo op, the instruction gets erased after processing its first def. However the later defs still need to refer to it, so instead delay spilling until after that whole RA node is done processing. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79462 Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu> Cc:

[PATCH] nv50/ir: use unordered_set instead of list to keep track of var defs

2014 Sep 01

The set of variable defs does not need to be ordered in any way, and removing/adding elements is a fairly common operation in various optimization passes. This shortens the runtime of piglit test fp-long-alu to ~11s from ~22s. No piglit regressions observed on nvc0! Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de> --- src/gallium/drivers/nouveau/codegen/nv50_ir.cpp
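
The reasoning in this commit message can be illustrated with a small sketch. It is not the Mesa code: `Def`, `remove_from_list` and `remove_from_set` are hypothetical names chosen for illustration. It only shows why removal by value is cheaper on an unordered set (average constant time) than on a list (linear search first).

```cpp
#include <algorithm>
#include <cassert>
#include <list>
#include <unordered_set>

// Hypothetical stand-in for a value definition; not the nv50_ir type.
struct Def { int id; };

// Removing by value from a list needs a linear search before the erase.
bool remove_from_list(std::list<Def*>& defs, Def* d)
{
    auto it = std::find(defs.begin(), defs.end(), d);
    if (it == defs.end())
        return false;
    defs.erase(it);
    return true;
}

// Removing by key from an unordered_set is a hash lookup (average O(1)).
bool remove_from_set(std::unordered_set<Def*>& defs, Def* d)
{
    return defs.erase(d) > 0;
}

// Small self-check: both containers end up holding the same elements.
void demo_removal()
{
    Def a{1}, b{2}, c{3};
    std::list<Def*> as_list{&a, &b, &c};
    std::unordered_set<Def*> as_set{&a, &b, &c};
    assert(remove_from_list(as_list, &b));
    assert(remove_from_set(as_set, &b));
    assert(as_list.size() == as_set.size());
}
```

The trade-off is that iteration order over the set is unspecified, which is acceptable only because, as the message says, the defs do not need to be ordered.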

[PATCH] nv50/ir: use unordered_set instead of list to keep our instructions in uses

2014 Jul 08

This shortens the runtime of piglit test fp-long-alu to ~22s. No piglit regressions observed on nvc0! Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de> --- src/gallium/drivers/nouveau/codegen/nv50_ir.cpp | 6 +++--- src/gallium/drivers/nouveau/codegen/nv50_ir.h | 7 ++++--- src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 2 +-

NV50 compute support questions

2015 Nov 25

Hi,

On 20-11-15 17:07, Samuel Pitoiset wrote:
> On 11/20/2015 11:36 AM, Hans de Goede wrote:
>> Hi Samuel, et al,
>
> Hi Hans,
>
>> In
>> http://cgit.freedesktop.org/mesa/mesa/commit/src/gallium/drivers/nouveau?id=ff72440b40211326eda118232fabd53965410afd
>>
>> you write: "This compute support has been tested by
>> Pierre

NV50 compute support questions

2015 Nov 26

Hi, On 26-11-15 09:42, Samuel Pitoiset wrote: > Well, if you remove that assert locally, all compute tests in src/gallium/tests/trivial/compute.c pass on GK106, except the atomic ones. Do you mean the: Assertion `pres->target != PIPE_BUFFER' failed. or the: Assertion `tex->defExists(0) && tex->srcExists(0)' failed. assert? Or is the first one not present for

NV50 compute support questions

2015 Nov 20

Hi Samuel, et al, In http://cgit.freedesktop.org/mesa/mesa/commit/src/gallium/drivers/nouveau?id=ff72440b40211326eda118232fabd53965410afd you write: "This compute support has been tested by Pierre Moreau and myself with some compute kernels." Can you provide testing instructions (and the necessary files) so that I can try to reproduce your tests? And once I've reproduced your

[PATCH 1/2] nv50/ir: fix s32 x s32 -> high s32 multiply logic

2014 May 18

Retrieving the high 32 bits of a signed multiply is rather annoying. It appears that the simplest way to do this is to compute the absolute value of the arguments, and perform a u32 x u32 -> u64 operation. If the arguments' signs differ, then negate the result. Since there is no u64 support in the cvt instruction, we have to perform the 2's complement negation "by hand".
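
The approach described in this message can be sketched in plain C++. This is not the codegen from the patch: `mul_high_s32` is a hypothetical helper name, and the 64-bit value is kept as two 32-bit halves only to mirror the "by hand" negation (invert both halves, add one, propagate the carry).

```cpp
#include <cassert>
#include <cstdint>

// High 32 bits of s32 x s32, computed through an unsigned multiply.
int32_t mul_high_s32(int32_t a, int32_t b)
{
    // Absolute values as unsigned; negating in unsigned arithmetic
    // also handles INT32_MIN without overflow.
    uint32_t ua = a < 0 ? 0u - static_cast<uint32_t>(a) : static_cast<uint32_t>(a);
    uint32_t ub = b < 0 ? 0u - static_cast<uint32_t>(b) : static_cast<uint32_t>(b);

    // u32 x u32 -> u64, then split into low and high halves.
    uint64_t prod = static_cast<uint64_t>(ua) * ub;
    uint32_t lo = static_cast<uint32_t>(prod);
    uint32_t hi = static_cast<uint32_t>(prod >> 32);

    if ((a < 0) != (b < 0)) {
        // Two's complement negation of the 64-bit pair "by hand":
        // ~x + 1, with the carry out of the low half added to the high half.
        lo = ~lo + 1u;
        hi = ~hi + (lo == 0 ? 1u : 0u);
    }
    return static_cast<int32_t>(hi);
}

// Self-check against a native 64-bit signed multiply
// (assumes arithmetic right shift for negative values).
void check_mul_high(int32_t a, int32_t b)
{
    int64_t ref = static_cast<int64_t>(a) * static_cast<int64_t>(b);
    assert(mul_high_s32(a, b) == static_cast<int32_t>(ref >> 32));
}
```

For example, `check_mul_high(-1, 1)` exercises the carry path: the unsigned product is 1, and negating it by hand yields an all-ones high half.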

[PATCH] nv50/ir: Initialize all members of GCRA (trivial)

2017 Aug 11

Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de> --- src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp | 3 +++ 1 file changed, 3 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp index 9d70ec3c9c..e4f38c8e46 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp +++

[PATCH v2] nv50/ir: Initialize all members of GCRA (trivial)

2017 Dec 30

v2: use initialization list (Pierre) Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de> Reviewed-by: Pierre Moreau <pierre.morrow at free.fr> --- src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp

[PATCH] nv50/ra: Only increment DefValue counter if we are going to spill

2017 Aug 19

This is in preparation of an upcoming patch changing how we keep track of the defs. Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de> --- src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp

[PATCH] nv50/ir: don't touch degree on physreg RIG nodes

2016 Jan 06

These nodes don't go through reduction, so we shouldn't be increasing their degrees. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91895 Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu> Cc: "11.0 11.1" <mesa-stable at lists.freedesktop.org> --- I would like to see a *bunch* of testing on this before merging it... RA-land is far from my expertise.

Regression caused by 2e9ee44797 ("nv50/ir/ra: some register spilling fixes")

2014 Feb 14

Hi Christoph,

bin/shader_runner tests/spec/glsl-1.40/uniform_buffer/fs-struct-copy-complicated.shader_test -auto
bin/shader_runner tests/spec/glsl-1.40/uniform_buffer/vs-struct-copy-complicated.shader_test -auto
bin/shader_runner tests/spec/glsl-1.50/uniform_buffer/gs-struct-copy-complicated.shader_test -auto

Now all segfault. I reverted 2e9ee44797 ("nv50/ir/ra: some register spilling

[PATCH 2/3] nv50/ir: For MAD, prefer SDST == SSRC2

2015 Jan 11

If liveness analysis indicates it's good, this should improve the chances of being able to emit the short MAD form. Signed-off-by: Roy Spliet <rspliet at eclipso.eu> --- src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp

[PATCH] nv50/ir: Initialize all members of GCRA (trivial)

2017 Dec 29

It looks like this patch was never merged. You could initialise “nodeCount” and “nodes” directly in the member initialisation list. With that changed, this patch is Reviewed-by: Pierre Moreau <pierre.morrow at free.fr> On 2017-08-12 — 01:45, Tobias Klausmann wrote: > Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de> > --- >
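
The review suggestion can be pictured with a minimal sketch. Only the member names `nodes` and `nodeCount` come from the message above; the class `Allocator`, the `Node` type, the member types and the accessors are assumptions made for illustration and are not the Mesa GCRA definition.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical node type; the real RIG node is not shown in the source.
struct Node { int degree = 0; };

// Hypothetical stand-in for the allocator class under review.
class Allocator {
public:
    // Members are initialised directly in the member initialisation list,
    // in declaration order, instead of being assigned in the body.
    Allocator() : nodes(nullptr), nodeCount(0) {}

    const Node* getNodes() const { return nodes; }
    std::size_t getNodeCount() const { return nodeCount; }

private:
    Node* nodes;
    std::size_t nodeCount;
};

// A freshly constructed object has no indeterminate members.
void check_default_state()
{
    Allocator ra;
    assert(ra.getNodes() == nullptr);
    assert(ra.getNodeCount() == 0);
}
```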

rsync filename heuristics

2005 Jan 05

On 5 Jan 2005, Rusty Russell <rusty@rustcorp.com.au> wrote: > On Tue, 2005-01-04 at 18:24 +0100, Robert Lemmen wrote: > > hi rusty, > > > > i read on some webpage about rsync and debian that you wrote a patch to > > rsync that lets it use heuristics when deciding which local file to > > use. could you tell me whether this is planned to be included in

[PATCH] gm107/ir: fix texture argument order

2014 Sep 25

Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu> Cc: "10.3" <mesa-stable at lists.freedesktop.org> --- With this, all the tex-miplevel-selection tests pass on maxwell. There is a minor bit of this change which affects textureGrad on kepler that I have yet to test, but I'm moderately sure it's correct and was only working by luck before. (Changing the insbf to use

[PATCH 1/2] nv50/ir: make sure that texprep/texquerylod's args get coalesced

2014 May 13

Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu> Cc: "10.2" <mesa-stable at lists.freedesktop.org> --- Not 100% sure of the significance of this code, but this seems like the correct thing to do... will definitely run it through a full piglit run before pushing out. src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp | 2 ++ 1 file changed, 2 insertions(+) diff --git

[PATCH] nvc0/ir: move sample id to second source arg to fix sampler2DMS

2014 Mar 20

The nvc0 texfetch instruction expects the sample id to be in the second source (usually used for the offset) rather than as part of the texture coordinate. This fixes all the sampler2DMS/Array tests on nvc0. Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu> Cc: "10.1" <mesa-stable at lists.freedesktop.org> --- Tested on nvc1 with a full piglit run, no regressions,

[PATCH 1/3] nv50/ir: Add support for MAD short+IMM notation

2015 Jan 11

MAD IMM has a very specific SDST == SSRC2 requirement, so don't emit Signed-off-by: Roy Spliet <rspliet at eclipso.eu> --- .../drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp | 18 ++++++++++++------ .../drivers/nouveau/codegen/nv50_ir_target_nv50.cpp | 2 +- 2 files changed, 13 insertions(+), 7 deletions(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp