thr3ads.net - similar to: "[PATCH] nv50/ir: allow load propagation when flags are defined"

Displaying 20 results from an estimated 900 matches similar to: "[PATCH] nv50/ir: allow load propagation when flags are defined"

[PATCH 1/2] nv50/ir: fix s32 x s32 -> high s32 multiply logic

2014 May 18

Retrieving the high 32 bits of a signed multiply is rather annoying. It appears that the simplest way to do this is to compute the absolute value of the arguments, and perform a u32 x u32 -> u64 operation. If the arguments' signs differ, then negate the result. Since there is no u64 support in the cvt instruction, we have to perform the 2's complement negation "by hand".
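The approach described in that snippet can be sketched on the host side in plain C++. This is only an illustration of the arithmetic, not the code from the patch: the function name `mulhi_s32` is hypothetical, and the 64-bit product is split into two 32-bit halves to mimic a target that cannot negate a 64-bit value in a single instruction.

```cpp
#include <cstdint>

// Hypothetical helper: high 32 bits of a signed 32x32 multiply, built from
// an unsigned 32x32 -> 64 multiply plus a manual 64-bit negation.
int32_t mulhi_s32(int32_t a, int32_t b)
{
    // Absolute values as u32. Negating in unsigned arithmetic keeps
    // INT32_MIN well defined (it maps to 0x80000000).
    uint32_t ua = a < 0 ? 0u - static_cast<uint32_t>(a) : static_cast<uint32_t>(a);
    uint32_t ub = b < 0 ? 0u - static_cast<uint32_t>(b) : static_cast<uint32_t>(b);

    // u32 x u32 -> u64, then split into the two 32-bit halves.
    uint64_t prod = static_cast<uint64_t>(ua) * ub;
    uint32_t lo = static_cast<uint32_t>(prod);
    uint32_t hi = static_cast<uint32_t>(prod >> 32);

    if ((a < 0) != (b < 0)) {
        // Signs differ: negate the 64-bit value "by hand" as ~x + 1.
        // The +1 carries into the high half only when the low half wraps to 0.
        lo = ~lo + 1u;
        hi = ~hi + (lo == 0u ? 1u : 0u);
    }
    return static_cast<int32_t>(hi);
}
```

For example, `mulhi_s32(-2, 3)` yields -1, since the 64-bit product -6 has all ones in its upper half.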

[PATCH 1/3] nv50/ir: Add support for MAD short+IMM notation

2015 Jan 11

And you're allowing saturate/neg emission on the short form. Is this already in envytools? Also, what's the shortForm thing? This change is probably fine, but the changelog needs work. On Sat, Jan 10, 2015 at 7:22 PM, Roy Spliet <rspliet at eclipso.eu> wrote: > MAD IMM has a very specific SDST == SSRC2 requirement, so don't emit > > Signed-off-by: Roy Spliet <rspliet

[PATCH 1/3] nv50/ir: Add support for MAD short+IMM notation

2015 Jan 11

Op 11-01-15 om 01:34 schreef Ilia Mirkin: > And you're allowing saturate/neg emission on the short form. Yes > Is this already in envytools? Tesla floating point instructions are poorly documented in the RST documents; fmad is no exception. I'll make sure to check envydis. > Also, what's the shortForm thing? Documented in envytools; see

[PATCH mesa v2 1/2] nouveau: codegen: Use FILE_MEMORY_BUFFER for buffers

2016 Mar 23

Are you sure this won't break compute shaders on fermi? Could you please double-check that? One minor comment below. On 03/17/2016 05:07 PM, Hans de Goede wrote: > Some of the lowering steps we currently do for FILE_MEMORY_GLOBAL only > apply to buffers, making it impossible to use FILE_MEMORY_GLOBAL for > OpenCL global buffers. > > This commit changes the buffer code to use

[PATCH mesa v2 1/2] nouveau: codegen: Use FILE_MEMORY_BUFFER for buffers

2016 Apr 08

On 04/08/2016 12:17 PM, Hans de Goede wrote: > Hi, > > On 23-03-16 23:10, Samuel Pitoiset wrote: >> Are you sure this won't break compute shaders on fermi? >> Could you please double-check that? > > I just checked: > > lspci: > 01:00.0 VGA compatible controller: NVIDIA Corporation GF119 [GeForce GT > 610] (rev a1) > > Before this patch-set: >

[PATCH 1/2] nv50/ir: Add support for MAD short+IMM notation

2015 Jan 23

Add emission rules for negative and saturate flags for MAD 4-byte opcodes, and get rid of constraints. Short MAD has a very specific SDST == SSRC2 requirement, and since MAD IMM is short notation + 4-byte immediate, don't have the compiler create MAD IMM instructions yet. V2: Document MAD as supported short form Signed-off-by: Roy Spliet <rspliet at eclipso.eu> ---

[PATCH 1/3] nv50/ir: Add support for MAD short+IMM notation

2015 Jan 11

MAD IMM has a very specific SDST == SSRC2 requirement, so don't emit Signed-off-by: Roy Spliet <rspliet at eclipso.eu> --- .../drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp | 18 ++++++++++++------ .../drivers/nouveau/codegen/nv50_ir_target_nv50.cpp | 2 +- 2 files changed, 13 insertions(+), 7 deletions(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp

[PATCH mesa v2 1/2] nouveau: codegen: Use FILE_MEMORY_BUFFER for buffers

2016 Apr 14

On 04/12/2016 12:04 PM, Hans de Goede wrote: > Hi, > > On 08-04-16 18:14, Samuel Pitoiset wrote: >> >> >> On 04/08/2016 12:17 PM, Hans de Goede wrote: >>> Hi, >>> >>> On 23-03-16 23:10, Samuel Pitoiset wrote: >>>> Are you sure this won't break compute shaders on fermi? >>>> Could you please double-check that?

[PATCH mesa v2 1/2] nouveau: codegen: Use FILE_MEMORY_BUFFER for buffers

2016 Mar 17

Some of the lowering steps we currently do for FILE_MEMORY_GLOBAL only apply to buffers, making it impossible to use FILE_MEMORY_GLOBAL for OpenCL global buffers. This commit changes the buffer code to use FILE_MEMORY_BUFFER at the ir_from_tgsi and lowering steps, freeing FILE_MEMORY_GLOBAL for use with OpenCL global buffers. Note that after lowering, buffer accesses use the

[PATCH mesa v2 1/2] nouveau: codegen: Use FILE_MEMORY_BUFFER for buffers

2016 Apr 08

Hi, On 23-03-16 23:10, Samuel Pitoiset wrote: > Are you sure this won't break compute shaders on fermi? > Could you please double-check that? I just checked: lspci: 01:00.0 VGA compatible controller: NVIDIA Corporation GF119 [GeForce GT 610] (rev a1) Before this patch-set: [hans at plank piglit]$ ./piglit run -o shader -t '.*arb_shader_storage_buffer_object.*' results/shader

[PATCH mesa v2 1/2] nouveau: codegen: Use FILE_MEMORY_BUFFER for buffers

2016 Apr 12

Hi, On 08-04-16 18:14, Samuel Pitoiset wrote: > > > On 04/08/2016 12:17 PM, Hans de Goede wrote: >> Hi, >> >> On 23-03-16 23:10, Samuel Pitoiset wrote: >>> Are you sure this won't break compute shaders on fermi? >>> Could you please double-check that? >> >> I just checked: >> >> lspci: >> 01:00.0 VGA compatible

[PATCH 1/3] nv50/ir: Add support for MAD 4-byte opcode

2015 Feb 06

Add emission rules for negative and saturate flags for MAD 4-byte opcodes, and get rid of some of the constraints. Obviously tested with a wide variety of shaders. V2: Document MAD as supported short form V3: Split up IMM from short-form modifiers Signed-off-by: Roy Spliet <rspliet at eclipso.eu> --- src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp | 10 ++++------

[PATCH mesa 4/6] nouveau: codegen: s/FILE_MEMORY_GLOBAL/FILE_MEMORY_BUFFER/

2016 Mar 16

FILE_MEMORY_GLOBAL is currently only used for buffer handling, as we do not yet have (OpenCL) global memory support. Global memory support actually requires some different handling during lowering, so rename FILE_MEMORY_GLOBAL to FILE_MEMORY_BUFFER to reflect that the current code is for buffer handling. This will allow the later (re-)addition of FILE_MEMORY_GLOBAL for regular global memory.

[PATCH mesa 4/6] nouveau: codegen: s/FILE_MEMORY_GLOBAL/FILE_MEMORY_BUFFER/

2016 Mar 16

Hi, On 16-03-16 15:55, Ilia Mirkin wrote: > This approach leads to the emitters needing to know about both global and > buffer, even though at that point, they are identical. I was thinking that > in the lowering logic, buffer would just get rewritten as global (with the > offset added), thus not needing any change to the emitters. What do you > think about such an approach? I was

[PATCH 1/2] nv50/ir: add fp64 support on G200 (NVA0)

2015 Feb 23

Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu> --- Untested beyond compiling a few shaders to see if they look like they might work. nvdisasm agrees with envydis's decoding of these things. Will definitely get ahold of a G200 to run tests on before pushing this. .../drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp | 94 ++++++++++++++++++---

[PATCH mesa 4/6] nouveau: codegen: s/FILE_MEMORY_GLOBAL/FILE_MEMORY_BUFFER/

2016 Mar 16

This approach leads to the emitters needing to know about both global and buffer, even though at that point, they are identical. I was thinking that in the lowering logic, buffer would just get rewritten as global (with the offset added), thus not needing any change to the emitters. What do you think about such an approach? On Mar 16, 2016 2:24 AM, "Hans de Goede" <hdegoede at

[PATCH mesa 5/6] nouveau: codegen: Add support for OpenCL global memory buffers

2016 Mar 16

Add support for OpenCL global memory buffers. Note that this has only been tested with regular loads and stores and likely needs more work for e.g. atomic ops. Signed-off-by: Hans de Goede <hdegoede at redhat.com> --- src/gallium/drivers/nouveau/codegen/nv50_ir.h | 1 + .../drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp | 31 +++++++++++++++++-----

[PATCH mesa 5/6] nouveau: codegen: Add support for OpenCL global memory buffers

2016 Mar 16

Hi, On 16-03-16 11:37, Samuel Pitoiset wrote: > Could you please get rid of the cosmetic changes (eg. the switch ones)? > Because this doesn't really improve readability and in my opinion these changes should be eventually done in a separate patch. I need at least half of those cosmetic changes, because half of them are not cosmetic, e.g.: - case FILE_MEMORY_BUFFER: code[1] =

[PATCH 3/4] nvc0/ir: optimize set & 1.0 to produce boolean-float sets

2015 May 09

On 09.05.2015 07:35, Ilia Mirkin wrote: > This has started to happen more now that the backend is producing > KILL_IF more often. > > Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu> > --- > .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 29 ++++++++++++++++++++++ > .../nouveau/codegen/nv50_ir_target_nv50.cpp | 2 ++ > 2 files changed, 31

[PATCH mesa 5/6] nouveau: codegen: Add support for OpenCL global memory buffers

2016 Mar 16

Could you please get rid of the cosmetic changes (eg. the switch ones)? Because this doesn't really improve readability and in my opinion these changes should be eventually done in a separate patch. Other than that, this patch is : Reviewed-by: Samuel Pitoiset <samuel.pitoiset at gmail.com> Yes, this probably won't work as is for atomic operations but the lowering pass is
