similar to: [PATCH 1/2] nvc0/ir: avoid infinite recursion when finding first uses of tex

Displaying 20 results from an estimated 400 matches similar to: "[PATCH 1/2] nvc0/ir: avoid infinite recursion when finding first uses of tex"

2014 Dec 02
0
[PATCH RESEND] nv50/ir: use unordered_set instead of list to keep track of var defs
The set of variable defs does not need to be ordered in any way, and removing/adding elements is a fairly common operation in various optimization passes. This shortens runtime of piglit test fp-long-alu from ~22s to ~11s. No piglit regressions observed on nvc0! Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de> --- src/gallium/drivers/nouveau/codegen/nv50_ir.cpp
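The container trade-off described in this patch can be sketched as follows. This is a minimal illustration, not the Mesa code: `Def`, `eraseFromList` and `eraseFromSet` are hypothetical names, and the real patch may organise its containers differently.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <unordered_set>

// Hypothetical stand-in for an IR value definition; not the Mesa type.
struct Def { int id; };

// Removing by value from a list needs a linear scan for the matching pointer.
inline std::size_t eraseFromList(std::list<Def *> &defs, Def *d)
{
   const std::size_t before = defs.size();
   defs.remove(d);               // O(n): walks the whole list
   return before - defs.size();  // number of elements removed
}

// Removing from an unordered_set is a hash lookup, O(1) on average.
inline std::size_t eraseFromSet(std::unordered_set<Def *> &defs, Def *d)
{
   return defs.erase(d);         // returns the number of elements removed
}
```

Since the order of the defs carries no meaning, giving up the list's ordering costs nothing, while each removal no longer scales with the number of defs.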
2014 Aug 30
3
[Mesa-stable] [PATCH 2/2] nv50: zero out unbound samplers
On 30/08/14 23:02, Ilia Mirkin wrote: > Samplers are only defined up to num_samplers, so set all samplers above > nr to NULL so that we don't try to read them again later. > Would it be worth doing a similar thing with the unlocked samplers below the nr mark ? It seems to me that we might be leaking nv50->samplers[s][i], or perhaps I'm missing something ? -Emil >
2014 Aug 31
2
[Mesa-stable] [PATCH 2/2] nv50: zero out unbound samplers
On 31/08/14 00:34, Ilia Mirkin wrote: > On Sat, Aug 30, 2014 at 7:30 PM, Emil Velikov <emil.l.velikov at gmail.com> wrote: >> On 30/08/14 23:02, Ilia Mirkin wrote: >>> Samplers are only defined up to num_samplers, so set all samplers above >>> nr to NULL so that we don't try to read them again later. >>> >> Would it be worth doing a similar thing
2017 Apr 29
2
[PATCH] nv50/ir: we can't replace 0x0 with zero reg for SHLADD
fixes a crash in Alien Isolation Signed-off-by: Karol Herbst <karolherbst at gmail.com> --- src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp index 732e1a93b4..4815d6df07 100644 ---
2017 Apr 29
0
[PATCH] nv50/ir: we can't replace 0x0 with the zero reg for SHLADD
fixes a crash in Alien Isolation Signed-off-by: Karol Herbst <karolherbst at gmail.com> Cc: 13.0 17.0 17.1 <mesa-stable at lists.freedesktop.org> --- src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
2017 Apr 29
0
[PATCH] nv50/ir: we can't replace 0x0 with zero reg for SHLADD
On Sat, Apr 29, 2017 at 10:41 AM, Karol Herbst <karolherbst at gmail.com> wrote: > fixes a crash in Alien Isolation What crash? How did the zero get there? Does this only happen if you do your optimization loop thing? In either case, we still want the replaceZero() logic. However that logic should be aware that the middle argument of a SHLADD is not to be touched. Otherwise we could end
2014 Aug 30
0
[PATCH 2/2] nv50: zero out unbound samplers
Samplers are only defined up to num_samplers, so set all samplers above nr to NULL so that we don't try to read them again later. Tested-by: Christian Ruppert <idl0r at qasl.de> Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu> Cc: "10.2 10.3" <mesa-stable at lists.freedesktop.org> --- src/gallium/drivers/nouveau/nv50/nv50_state.c | 7 +++++-- 1 file changed, 5
2014 Aug 30
0
[Mesa-stable] [PATCH 2/2] nv50: zero out unbound samplers
On Sat, Aug 30, 2014 at 7:30 PM, Emil Velikov <emil.l.velikov at gmail.com> wrote: > On 30/08/14 23:02, Ilia Mirkin wrote: >> Samplers are only defined up to num_samplers, so set all samplers above >> nr to NULL so that we don't try to read them again later. >> > Would it be worth doing a similar thing with the unlocked samplers below the > nr mark ? It seems
2014 Jul 08
1
[PATCH] nv50/ir: use unordered_set instead of list to keep our instructions in uses
This shortens runtime of piglit test fp-long-alu to ~22s. No piglit regressions observed on nvc0! Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de> --- src/gallium/drivers/nouveau/codegen/nv50_ir.cpp | 6 +++--- src/gallium/drivers/nouveau/codegen/nv50_ir.h | 7 ++++--- src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 2 +-
2014 Mar 01
1
[PATCH] nouveau: add valid range tracking to nouveau_buffer
This logic is borrowed from the radeon code. The transfer logic will only get called for PIPE_BUFFER resources, so it shouldn't be necessary to worry about them becoming render targets. Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu> --- A user reported a ~30% FPS improvement with an earlier version of this patch in TF2, and no visual regressions in CS, all on a nv50 card. (Source
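As a rough sketch of what valid range tracking can look like: the snippet does not show the actual data structure, so `ValidRange`, `add` and `intersects` below are hypothetical names, and the idea that a write outside the tracked range needs no synchronization is an assumption about how such a range is typically used.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>

// Hypothetical tracker for the byte range of a buffer that holds valid data.
// Names are illustrative and do not come from the driver.
struct ValidRange {
   unsigned start = UINT_MAX; // empty while start >= end
   unsigned end = 0;

   // Grow the tracked range to cover [s, e).
   void add(unsigned s, unsigned e)
   {
      start = std::min(start, s);
      end = std::max(end, e);
   }

   // True if [s, e) overlaps the tracked range.
   bool intersects(unsigned s, unsigned e) const
   {
      return std::max(start, s) < std::min(end, e);
   }
};
```

Under that assumption, a transfer that writes to a region for which `intersects` returns false touches only data that was never initialised, so it would not have to wait for earlier work on the buffer.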
2014 Aug 08
2
[PATCH 1/3] nvc0/ir: add base tex offset for fermi indirect tex case
Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu> --- .../drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp index f010767..4a9e48f 100644 ---
2014 Jul 05
1
[PATCH 1/2] nvc0/ir: use manual TXD when offsets are involved
Something about how we're implementing offsets for TXD is wrong, just flip to the generic quadop-based implementation in that case. This is the minimal fix appropriate for backporting. Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu> Cc: <mesa-stable at lists.freedesktop.org> --- src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp | 3 ++- 1 file changed, 2
2017 Aug 13
1
[PATCH v2] nvc0/ir: propagate immediates to CALL input MOVs
On using builtin functions we have to move the input to registers $0 and $1, if one of the input values is an immediate, we fail to propagate the immediate: ... mov u32 $r477 0x00000003 (0) ... mov u32 $r0 %r473 (0) mov u32 $r1 $r477 (0) call abs BUILTIN:0 (0) mov u32 %r495 $r1 (0) ... With this patch the immediate is propagated, potentially causing the first MOV to be superfluous, which we'd
2014 Jul 05
0
[PATCH] nvc0: do quadops on the right texture coordinates for TXD
handleTEX moves the layer as the first argument. This makes sure that the quadops deal with the texture coordinates. Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu> Cc: <mesa-stable at lists.freedesktop.org> --- src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git
2015 Jan 04
0
[PATCH] nv50/ir: fix texture offsets in release builds
asserts get compiled out in release builds, so they can't be relied upon to perform logic. Reported-by: Pierre Moreau <pierre.morrow at free.fr> Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu> Cc: "10.2 10.3 10.4" <mesa-stable at lists.freedesktop.org> --- src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nv50.cpp | 3 ++-
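The pitfall named in this patch can be illustrated with a small sketch. The helper `setOffset` and the functions `fragile` and `robust` are hypothetical and do not come from the patch; only the behaviour of the standard `assert` macro under `NDEBUG` is taken as given.

```cpp
#include <cassert>

// Hypothetical helper with a side effect: stores an offset and reports success.
inline bool setOffset(int &slot, int value)
{
   slot = value;
   return true;
}

// Fragile: with NDEBUG defined the whole assert expression disappears,
// so setOffset() is never called and slot keeps its old value.
inline void fragile(int &slot, int value)
{
   assert(setOffset(slot, value));
}

// Robust: do the work unconditionally, then assert only on the result.
inline void robust(int &slot, int value)
{
   const bool ok = setOffset(slot, value);
   assert(ok);
   (void)ok; // avoids an unused-variable warning when NDEBUG is defined
}
```

Calling `robust(slot, 7)` stores 7 in both debug and release builds, whereas `fragile` only does so when assertions are enabled.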
2015 Jan 05
0
[PATCH] nv50/ir: change the way float face is returned
The old way made it impossible for the optimizer to reason about what was going on. The new way is the same number of instructions (the neg gets folded into the cvt) but enables the optimizer to be cleverer if comparing to a constant (most common case). [The optimizer is presently not sufficiently clever to work this out, but it could relatively easily be made to be. The old way would have
2015 Feb 23
0
[PATCH 2/2] nvc0/ir: improve precision of double RCP/RSQ results
Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu> --- Not sure how many steps are needed for the necessary accuracy. Just doing 2 because that seems like a reasonable number. .../nouveau/codegen/nv50_ir_lowering_nvc0.cpp | 42 ++++++++++++++++++++-- 1 file changed, 39 insertions(+), 3 deletions(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
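The "steps" mentioned in the note suggest an iterative refinement of a low-precision estimate. The snippet does not say which iteration the patch uses, so the sketch below assumes Newton-Raphson and a single-precision starting value; the function names and the default of two steps (taken from the note) are illustrative only.

```cpp
#include <cassert>
#include <cmath>

// One Newton-Raphson step for 1/a: x' = x * (2 - a * x).
inline double refineRcp(double a, double x)
{
   return x * (2.0 - a * x);
}

// One Newton-Raphson step for 1/sqrt(a): x' = x * (1.5 - 0.5 * a * x * x).
inline double refineRsq(double a, double x)
{
   return x * (1.5 - 0.5 * a * x * x);
}

// Start from a single-precision estimate (an assumed seed) and refine it.
inline double rcp(double a, int steps = 2)
{
   double x = static_cast<double>(1.0f / static_cast<float>(a));
   for (int i = 0; i < steps; ++i)
      x = refineRcp(a, x);
   return x;
}

inline double rsq(double a, int steps = 2)
{
   double x = static_cast<double>(1.0f / std::sqrt(static_cast<float>(a)));
   for (int i = 0; i < steps; ++i)
      x = refineRsq(a, x);
   return x;
}
```

Each step roughly squares the relative error, which is why a small fixed number of steps can take a single-precision seed close to double precision; how many steps the hardware seed actually needs is the open question raised in the note.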
2014 Sep 25
0
[PATCH] gm107/ir: fix texture argument order
Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu> Cc: "10.3" <mesa-stable at lists.freedesktop.org> --- With this, all the tex-miplevel-selection tests pass on maxwell. There is a minor bit of this change which affects textureGrad on kepler that I have yet to test, but I'm moderately sure it's correct and was only working by luck before. (Changing the insbf to use
2017 Aug 12
0
[PATCH] nvc0/ir: propagate immediates to CALL input MOVs
On Sat, Aug 12, 2017 at 3:33 PM, Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de> wrote: > On using builtin functions we have to move the input to registers $0 and $1, if > one of the input value is an immediate, we fail to propagate the immediate: > > ... > mov u32 $r477 0x00000003 (0) > ... > mov u32 $r0 %r473 (0) > mov u32 $r1 $r477 (0) > call abs
2014 Mar 20
0
[PATCH] nvc0/ir: move sample id to second source arg to fix sampler2DMS
The nvc0 texfetch instruction expects the sample id to be in the second source (usually used for the offset) rather than as part of the texture coordinate. This fixes all the sampler2DMS/Array tests on nvc0. Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu> Cc: "10.1" <mesa-stable at lists.freedesktop.org> --- Tested on nvc1 with a full piglit run, no regressions,