Displaying 20 results from an estimated 300 matches similar to: "[PATCH 1/2] nv50/ir: make sure that texprep/texquerylod's args get coalesced"
2014 May 18
1
[PATCH 1/2] nv50/ir: fix s32 x s32 -> high s32 multiply logic
Retrieving the high 32 bits of a signed multiply is rather annoying. It
appears that the simplest way to do this is to compute the absolute
value of the arguments, and perform a u32 x u32 -> u64 operation. If the
arguments' signs differ, then negate the result. Since there is no u64
support in the cvt instruction, we have the perform the 2's complement
negation "by hand".
2019 Oct 14
1
[PATCH] gm107/ir: fix loading z offset for layered 3d image bindings
Unfortuantely we don't know if a particular load is a real 2d image (as
would be a cube face or 2d array element), or a layer of a 3d image.
Since we pass in the TIC reference, the instruction's type has to match
what's in the TIC (experimentally). In order to properly support
bindless images, this also can't be done by looking at the current
bindings and generating appropriate
2015 Feb 23
2
[PATCH 1/2] nv50/ir: add fp64 support on G200 (NVA0)
Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu>
---
Untested beyond compiling a few shaders to see if they look like they
might work. nvdisasm agrees with envydis's decoding of these things.
Will definitely get ahold of a G200 to run tests on before pushing this.
.../drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp | 94 ++++++++++++++++++---
2014 Jan 13
20
[PATCH 00/19] nv50: add sampler2DMS/GP support to get OpenGL 3.2
OK, so there's a bunch of stuff in here. The geometry stuff is based on the
work started by Bryan Cain and Christoph Bumiller.
Patches 01-12: Add support for geometry shaders and fix related issues
Patches 13-14: Make it possible for fb clears to operate on texture attachments
with an explicit layer set (as is allowed in gl 3.2).
Patches 15-17: Make ARB_texture_multisample work
2015 Feb 20
10
[PATCH 01/11] nvc0/ir: add emission of dadd/dmul/dmad opcodes, fix minmax
Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu>
---
.../drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp | 66 +++++++++++++++++++++-
1 file changed, 63 insertions(+), 3 deletions(-)
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp
index dfb093c..e38a3b8 100644
---
2017 Dec 20
2
[PATCH] gm107/ir: use lane 0 for manual textureGrad handling
This is parallel to the pre-SM50 change which does this. Adjusts the
shuffles / quadops to make the values correct relative to lane 0, and
then splat the results to all lanes for the final move into the target
register.
Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu>
---
Entirely untested beyond compilation. Should check
bin/tex-miplevel-selection textureGrad Cube
2015 Feb 23
2
[Mesa-dev] [PATCH 2/2] nvc0/ir: improve precision of double RCP/RSQ results
Does this give correct results for special floats (0, infs)?
We tried to improve (for single floats) x86 rcp in llvmpipe with
newton-raphson, but unfortunately not being able to give correct results
for these two cases (without even more additional code) meant it got all
disabled in the end (you can still see that code in the driver) since
the problems are at least as bad as those due to bad
2017 Jul 31
1
[RFC PATCH] nv50/ir: allow spilling of def values for constrained MERGES/UNIONS
This lets us spill more values and compile a big shader for Civilization 6.
Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de>
---
src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp | 2 --
1 file changed, 2 deletions(-)
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
index b33d7b4010..f29c8a1a95
2014 Dec 02
0
[PATCH RESEND] nv50/ir: use unordered_set instead of list to keep track of var defs
The set of variable defs does not need to be ordered in any way, and
removing/adding elements is a fairly common operation in various
optimization passes.
This shortens runtime of piglit test fp-long-alu to ~11s from ~22s
No piglit regressions observed on nvc0!
Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de>
---
src/gallium/drivers/nouveau/codegen/nv50_ir.cpp
2014 Sep 01
0
[PATCH] nv50/ir: use unordered_set instead of list to keep track of var defs
The set of variable defs does not need to be ordered in any way, and
removing/adding elements is a fairly common operation in various
optimization passes.
This shortens runtime of piglit test fp-long-alu to ~11s from ~22s
No piglit regressions observed on nvc0!
Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de>
---
src/gallium/drivers/nouveau/codegen/nv50_ir.cpp
2014 Jul 18
5
[PATCH 0/5] nvc0: fp64 preparation
Most of codegen is already FP64-ready. There are a few edge-cases that I ran
into, many of which can apply even to non-fp64-enabled programs (although the
double-wide registers are not very common without fp64).
I've yet to give this a full piglit run, but wanted to send these out in case
someone wanted to comment. They do not depend on the preliminary core fp64
work.
Ilia Mirkin (5):
2014 Feb 28
0
[PATCH] nv50: enable texture query lod
Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu>
---
Note: this applies on top of airlied's r600g-texture-gather branch.
Appears to pass all 4 piglit tests. The conversion from what the instruction
outputs is the same as what the blob does.
src/gallium/drivers/nouveau/codegen/nv50_ir.h | 1 +
.../drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp | 4 ++++
2014 May 20
14
[PATCH 00/12] Cherry-pick nv50/nvc0 patches from gallium-nine
I went through the gallium-nine tree and picked out nouveau patches that are
general bug-fixes. The first bunch I'd like to also get into 10.2. I've
reviewed all of them and they make sense to me, but sending them out for
public review as well in case there are any objections.
Unless I hear objections, I'd like to push this by Friday.
Christoph Bumiller (11):
nv50,nvc0: always pull
2014 Jul 05
1
[PATCH 1/2] nv50/ir: retrieve shadow compare from first arg
This can only happen with texture(samplerCubeShadow, bias), where the
compare will be in the first argument.
Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu>
Cc: <mesa-stable at lists.freedesktop.org>
---
src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git
2014 May 10
2
[PATCH] nv50: fix setting of texture ms info to be per-stage
Different textures may be bound to each slot for each stage. So we need
to be able to upload ms parameters for each one without stages
overwriting each other.
Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu>
Cc: "10.1 10.2" <mesa-stable at lists.freedesktop.org>
---
src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nv50.cpp | 4 ++++
2017 Aug 11
2
[PATCH] nv50/ir: Initialize all members of GCRA (trivial)
Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de>
---
src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp | 3 +++
1 file changed, 3 insertions(+)
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
index 9d70ec3c9c..e4f38c8e46 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
+++
2014 Feb 14
0
Regression caused by 2e9ee44797 ("nv50/ir/ra: some register spilling fixes")
Hi Christoph,
bin/shader_runner
tests/spec/glsl-1.40/uniform_buffer/fs-struct-copy-complicated.shader_test
-auto
bin/shader_runner
tests/spec/glsl-1.40/uniform_buffer/vs-struct-copy-complicated.shader_test
-auto
bin/shader_runner
tests/spec/glsl-1.50/uniform_buffer/gs-struct-copy-complicated.shader_test
-auto
Now all segfault. I reverted 2e9ee44797 ("nv50/ir/ra: some register
spilling
2017 Dec 30
1
[PATCH v2] nv50/ir: Initialize all members of GCRA (trivial)
v2: use initialization list (Pierre)
Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de>
Reviewed-by: Pierre Moreau <pierre.morrow at free.fr>
---
src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
2017 Aug 19
1
[PATCH] nv50/ra: Only increment DefValue counter if we are going to spill
This is in preparation of an upcoming patch changing how we keep track of the
defs.
Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de>
---
src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
2016 Jan 06
1
[PATCH] nv50/ir: don't touch degree on physreg RIG nodes
These nodes don't go through reduction, so we shouldn't be increasing
their degrees.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91895
Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu>
Cc: "11.0 11.1" <mesa-stable at lists.freedesktop.org>
---
I would like to see a *bunch* of testing on this before merging it... RA-land
is far from my expertise.