Displaying 20 results from an estimated 400 matches similar to: "[PATCH] nv50/ir: we can't replace 0x0 with zero reg for SHLADD"
2017 Apr 29
0
[PATCH] nv50/ir: we can't replace 0x0 with zero reg for SHLADD
On Sat, Apr 29, 2017 at 10:41 AM, Karol Herbst <karolherbst at gmail.com> wrote:
> fixes a crash in Alien Isolation
What crash? How did the zero get there? Does this only happen if you
do your optimization loop thing?
In either case, we still want the replaceZero() logic. However that
logic should be aware that the middle argument of a SHLADD is not to
be touched. Otherwise we could end
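The suggestion above (keep the zero replacement, but leave the middle argument of a SHLADD alone) can be sketched roughly as follows. This is only an illustration: `Op`, `Src`, `Insn`, `replaceZeroSketch` and `checkReplaceZeroSketch` are hypothetical stand-ins, not the nv50_ir classes or the actual replaceZero() code, and the sketch does not model why that source has to stay untouched.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, heavily simplified IR types (not the real nv50_ir ones).
enum class Op { ADD, SHLADD };

struct Src {
    bool isImm;      // source is an immediate
    uint32_t imm;    // immediate value, meaningful only if isImm
    bool isZeroReg;  // source was rewritten to the zero register
};

struct Insn {
    Op op;
    std::vector<Src> srcs;
};

// Rewrite immediate 0x0 sources to the zero register, but never touch
// source index 1 (the middle argument) of a SHLADD.
inline void replaceZeroSketch(Insn &insn)
{
    for (std::size_t s = 0; s < insn.srcs.size(); ++s) {
        if (insn.op == Op::SHLADD && s == 1)
            continue; // middle argument of SHLADD stays as it is
        Src &src = insn.srcs[s];
        if (src.isImm && src.imm == 0) {
            src.isImm = false;
            src.isZeroReg = true;
        }
    }
}

// Small self-check: only the outer sources of the SHLADD are rewritten.
inline void checkReplaceZeroSketch()
{
    Insn shladd{Op::SHLADD, {{true, 0, false}, {true, 0, false}, {true, 0, false}}};
    replaceZeroSketch(shladd);
    assert(shladd.srcs[0].isZeroReg);
    assert(shladd.srcs[1].isImm && !shladd.srcs[1].isZeroReg);
    assert(shladd.srcs[2].isZeroReg);
}
```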
2017 Apr 29
0
[PATCH] nv50/ir: we can't replace 0x0 with the zero reg for SHLADD
fixes a crash in Alien Isolation
Signed-off-by: Karol Herbst <karolherbst at gmail.com>
Cc: 13.0 17.0 17.1 <mesa-stable at lists.freedesktop.org>
---
src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
2014 Aug 30
2
[PATCH 1/2] nvc0/ir: avoid infinite recursion when finding first uses of tex
In certain circumstances, findFirstUses could end up doubling back on
instructions it had already processed, resulting in an infinite
recursion. Avoid this by keeping track of already-visited instructions.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83079
Tested-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de>
Signed-off-by: Ilia Mirkin <imirkin at
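The fix described in this commit message, keeping track of already-visited instructions so the walk cannot double back, follows a common pattern that might look roughly like the sketch below. `Insn`, its fields and `findFirstUsesSketch` are hypothetical names for illustration; the actual findFirstUses in the nvc0 lowering code is not shown in this excerpt and may be structured differently.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for an IR instruction node.
struct Insn {
    int id;
    bool usesTex;              // true if this node consumes the tex result
    std::vector<Insn *> next;  // instructions reachable from this one

    explicit Insn(int i, bool u = false) : id(i), usesTex(u) {}
};

// Collect the first users reachable from `insn`. The `visited` set is what
// prevents infinite recursion when the graph contains a cycle.
inline void findFirstUsesSketch(Insn *insn,
                                std::unordered_set<const Insn *> &visited,
                                std::vector<const Insn *> &uses)
{
    assert(insn != nullptr);
    // insert().second is false if the node was already in the set.
    if (!visited.insert(insn).second)
        return;
    if (insn->usesTex) {
        uses.push_back(insn);
        return; // a first use ends the walk along this path
    }
    for (Insn *n : insn->next)
        findFirstUsesSketch(n, visited, uses);
}
```

Without the `visited` check, a graph in which two instructions reach each other would recurse until the stack is exhausted.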
2014 Aug 08
2
[PATCH 1/3] nvc0/ir: add base tex offset for fermi indirect tex case
Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu>
---
.../drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
index f010767..4a9e48f 100644
---
2014 Jul 05
1
[PATCH 1/2] nvc0/ir: use manual TXD when offsets are involved
Something about how we're implementing offsets for TXD is wrong, just
flip to the generic quadop-based implementation in that case.
This is the minimal fix appropriate for backporting.
Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu>
Cc: <mesa-stable at lists.freedesktop.org>
---
src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp | 3 ++-
1 file changed, 2
2017 Aug 12
3
[PATCH] nvc0/ir: propagate immediates to CALL input MOVs
When using builtin functions we have to move the inputs to registers $0 and $1. If
one of the input values is an immediate, we fail to propagate the immediate:
...
mov u32 $r477 0x00000003 (0)
...
mov u32 $r0 %r473 (0)
mov u32 $r1 $r477 (0)
call abs BUILTIN:0 (0)
mov u32 %r495 $r1 (0)
...
With this patch the immediate is propagated, potentially causing the first MOV
to be superfluous, which we'd
2017 Aug 13
1
[PATCH v2] nvc0/ir: propagate immediates to CALL input MOVs
When using builtin functions we have to move the inputs to registers $0 and $1. If
one of the input values is an immediate, we fail to propagate the immediate:
...
mov u32 $r477 0x00000003 (0)
...
mov u32 $r0 %r473 (0)
mov u32 $r1 $r477 (0)
call abs BUILTIN:0 (0)
mov u32 %r495 $r1 (0)
...
With this patch the immediate is propagated, potentially causing the first MOV
to be superfluous, which we'd
2014 Jan 13
20
[PATCH 00/19] nv50: add sampler2DMS/GP support to get OpenGL 3.2
OK, so there's a bunch of stuff in here. The geometry stuff is based on the
work started by Bryan Cain and Christoph Bumiller.
Patches 01-12: Add support for geometry shaders and fix related issues
Patches 13-14: Make it possible for fb clears to operate on texture attachments
with an explicit layer set (as is allowed in gl 3.2).
Patches 15-17: Make ARB_texture_multisample work
2016 Mar 14
2
[RFC mesa] nouveau: Add support for OpenCL global memory buffers
This little "hack" fixes the use of OpenCL global memory buffers with
nouveau, but clearly the #if 0 is not a solution as it breaks buffers
with GLSL.
The reason I'm posting this as an RFC patch is to discuss how to solve
this properly. Two solutions come to mind:
1) Use separate nv50_ir::FILE_MEMORY_xxx values for buffers versus
TGSI_FILE_MEMORY with TGSI_MEMORY_TYPE_GLOBAL,
2016 Mar 14
2
[RFC mesa] nouveau: Add support for OpenCL global memory buffers
Hi,
On 14-03-16 16:05, Ilia Mirkin wrote:
> There's a less hacky and more hacky way forward. The more hacky solution is
> to set file index to -1 or something and then not do the lowering when you
> see that.
>
> The less hacky solution is the one you proposed as #1 - introduce a new
> file for "buffer" memory and lower it to the global file by adding a base
>
2015 Feb 23
2
[Mesa-dev] [PATCH 2/2] nvc0/ir: improve precision of double RCP/RSQ results
Does this give correct results for special floats (0, infs)?
We tried to improve (for single floats) x86 rcp in llvmpipe with
newton-raphson, but unfortunately not being able to give correct results
for these two cases (without even more additional code) meant it got all
disabled in the end (you can still see that code in the driver) since
the problems are at least as bad as those due to bad
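The concern about special floats can be illustrated with a single Newton-Raphson step for a reciprocal, x' = x * (2 - a * x). The sketch below is not the llvmpipe or nvc0 code: `refineRcp`, `rcpRefined` and `checkSpecialCases` are hypothetical names, single precision is used for brevity, and an exact division stands in for the hardware's initial approximation. It assumes IEEE 754 float semantics (no fast-math style flags).

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// One Newton-Raphson step for 1/a, starting from the estimate x.
inline float refineRcp(float a, float x)
{
    return x * (2.0f - a * x);
}

// Hypothetical helper: initial estimate followed by one refinement step.
inline float rcpRefined(float a)
{
    float x = 1.0f / a; // stand-in for a hardware rcp approximation
    return refineRcp(a, x);
}

// For a = 0 the estimate is +inf and a * x is 0 * inf, which is NaN.
// For a = +inf the estimate is 0 and a * x is inf * 0, again NaN.
inline void checkSpecialCases()
{
    const float inf = std::numeric_limits<float>::infinity();
    assert(std::isnan(rcpRefined(0.0f))); // unrefined result would be +inf
    assert(std::isnan(rcpRefined(inf)));  // unrefined result would be 0
    assert(rcpRefined(4.0f) == 0.25f);    // ordinary inputs are unaffected
}
```

Getting the expected results for 0 and infinity back requires extra handling around the refinement step, which is the "more additional code" the message refers to.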
2019 Oct 14
1
[PATCH] gm107/ir: fix loading z offset for layered 3d image bindings
Unfortunately we don't know if a particular load is a real 2d image (as
would be a cube face or 2d array element), or a layer of a 3d image.
Since we pass in the TIC reference, the instruction's type has to match
what's in the TIC (experimentally). In order to properly support
bindless images, this also can't be done by looking at the current
bindings and generating appropriate
2016 Mar 14
2
[RFC mesa] nouveau: Add support for OpenCL global memory buffers
Hi,
On 14-03-16 16:41, Samuel Pitoiset wrote:
>
>
> On 03/14/2016 04:28 PM, Hans de Goede wrote:
>> Hi,
>>
>> On 14-03-16 16:05, Ilia Mirkin wrote:
>>> There's a less hacky and more hacky way forward. The more hacky
>>> solution is
>>> to set file index to -1 or something and then not do the lowering when
>>> you
>>> see
2016 Mar 17
4
[PATCH mesa v2 1/2] nouveau: codegen: Use FILE_MEMORY_BUFFER for buffers
Some of the lowering steps we currently do for FILE_MEMORY_GLOBAL only
apply to buffers, making it impossible to use FILE_MEMORY_GLOBAL for
OpenCL global buffers.
This commit changes the buffer code to use FILE_MEMORY_BUFFER at the
ir_from_tgsi and lowering steps, freeing use of FILE_MEMORY_GLOBAL
for use with OpenCL global buffers.
Note that after lowering buffer accesses use the
2014 Dec 02
0
[PATCH RESEND] nv50/ir: use unordered_set instead of list to keep track of var defs
The set of variable defs does not need to be ordered in any way, and
removing/adding elements is a fairly common operation in various
optimization passes.
This shortens runtime of piglit test fp-long-alu to ~11s from ~22s
No piglit regressions observed on nvc0!
Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de>
---
src/gallium/drivers/nouveau/codegen/nv50_ir.cpp
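The reasoning in this commit message (no ordering needed, frequent add/remove) maps onto the usual trade-off between `std::list` and `std::unordered_set`: removing a given element from a list requires a linear search, while erasing by key from an unordered set is constant time on average. A minimal sketch, with `Value`, `removeDef` and `checkBothContainers` as hypothetical names rather than the actual nv50_ir code:

```cpp
#include <cassert>
#include <list>
#include <unordered_set>

struct Value { int id; }; // hypothetical stand-in for an IR value

// std::list: remove() walks the whole list looking for v, O(n).
inline void removeDef(std::list<Value *> &defs, Value *v)
{
    defs.remove(v);
}

// std::unordered_set: erase() by key is O(1) on average.
inline void removeDef(std::unordered_set<Value *> &defs, Value *v)
{
    defs.erase(v);
}

// Both containers end up holding the same elements after the removal.
inline void checkBothContainers()
{
    Value a{1}, b{2};
    std::list<Value *> asList{&a, &b};
    std::unordered_set<Value *> asSet{&a, &b};
    removeDef(asList, &a);
    removeDef(asSet, &a);
    assert(asList.size() == 1 && asList.front() == &b);
    assert(asSet.size() == 1 && asSet.count(&b) == 1);
}
```

The set gives up iteration order, which the commit message states is not needed for variable defs.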
2016 Apr 08
2
[PATCH mesa v2 1/2] nouveau: codegen: Use FILE_MEMORY_BUFFER for buffers
Hi,
On 23-03-16 23:10, Samuel Pitoiset wrote:
> Are you sure this won't break compute shaders on fermi?
> Could you please double-check that?
I just checked:
lspci:
01:00.0 VGA compatible controller: NVIDIA Corporation GF119 [GeForce GT 610] (rev a1)
Before this patch-set:
[hans at plank piglit]$ ./piglit run -o shader -t '.*arb_shader_storage_buffer_object.*' results/shader
2016 Apr 12
2
[PATCH mesa v2 1/2] nouveau: codegen: Use FILE_MEMORY_BUFFER for buffers
Hi,
On 08-04-16 18:14, Samuel Pitoiset wrote:
>
>
> On 04/08/2016 12:17 PM, Hans de Goede wrote:
>> Hi,
>>
>> On 23-03-16 23:10, Samuel Pitoiset wrote:
>>> Are you sure this won't break compute shaders on fermi?
>>> Could you please double-check that?
>>
>> I just checked:
>>
>> lspci:
>> 01:00.0 VGA compatible
2016 Mar 16
2
[PATCH mesa 4/6] nouveau: codegen: s/FILE_MEMORY_GLOBAL/FILE_MEMORY_BUFFER/
This approach leads to the emitters needing to know about both global and
buffer, even though at that point, they are identical. I was thinking that
in the lowering logic, buffer would just get rewritten as global (with the
offset added), thus not needing any change to the emitters. What do you
think about such an approach?
On Mar 16, 2016 2:24 AM, "Hans de Goede" <hdegoede at
2015 May 17
14
[PATCH 00/12] Tessellation support for nvc0
This is enough to enable tessellation support on nvc0. It seems to
work a lot better on my GF108 than GK208. I suspect that there's some
sort of scheduling shenanigans that need to be adjusted for
kepler+. Or perhaps some shader header things.
Even with the GF108, I still get occasional blue triangles in Heaven,
but I get a *ton* of them on the GK208 -- seemingly the same issue,
but it's
2014 Jul 05
0
[PATCH] nvc0: do quadops on the right texture coordinates for TXD
handleTEX moves the layer as the first argument. This makes sure that
the quadops deal with the texture coordinates.
Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu>
Cc: <mesa-stable at lists.freedesktop.org>
---
src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git