thr3ads.net - similar to: "nv50/ir: Implement short notation for MAD V2"

Displaying 20 results from an estimated 500 matches similar to: "nv50/ir: Implement short notation for MAD V2"

[PATCH 1/3] nv50/ir: Add support for MAD short+IMM notation

2015 Jan 11

MAD IMM has a very specific SDST == SSRC2 requirement, so don't emit Signed-off-by: Roy Spliet <rspliet at eclipso.eu> --- .../drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp | 18 ++++++++++++------ .../drivers/nouveau/codegen/nv50_ir_target_nv50.cpp | 2 +- 2 files changed, 13 insertions(+), 7 deletions(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp

[PATCH 1/2] nv50/ir: Add support for MAD short+IMM notation

2015 Jan 23

Add emission rules for negative and saturate flags for MAD 4-byte opcodes, and get rid of constraints. Short MAD has a very specific SDST == SSRC2 requirement, and since MAD IMM is short notation + 4-byte immediate, don't have the compiler create MAD IMM instructions yet. V2: Document MAD as supported short form Signed-off-by: Roy Spliet <rspliet at eclipso.eu> ---

[PATCH 1/3] nv50/ir: Add support for MAD 4-byte opcode

2015 Feb 06

Add emission rules for negative and saturate flags for MAD 4-byte opcodes, and get rid of some of the constraints. Obviously tested with a wide variety of shaders. V2: Document MAD as supported short form V3: Split up IMM from short-form modifiers Signed-off-by: Roy Spliet <rspliet at eclipso.eu> --- src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp | 10 ++++------

[PATCH v5 0/5] nvc0/ir: add support for MAD/FMA PostRALoadPropagation

2017 Mar 26

was "nv50/ir: PostRaConstantFolding improvements" before. nothing really changed from the last version, just minor things. Karol Herbst (5): nv50/ir: restructure and rename postraconstantfolding pass nv50/ir: implement mad post ra folding for nvc0+ gk110/ir: add LIMM form of mad gm107/ir: add LIMM form of mad nv50/ir: also do PostRaLoadPropagation for FMA

[PATCH 1/3] nv50/ir: Add support for MAD short+IMM notation

2015 Jan 11

On 11-01-15 at 01:34, Ilia Mirkin wrote: > And you're allowing saturate/neg emission on the short form. Yes > Is this already in envytools? Tesla floating point instructions are poorly documented in the RST documents; fmad is no exception. I'll make sure to check envydis. > Also, what's the shortForm thing? Documented in envytools; see

[PATCH 3/3] nv50/ir: Fold IMM into MAD

2015 Jan 11

Add a specific optimisation pass for NV50 to check whether SRC0 or SRC1 is a MOV dst, IMM. If so: fold the IMM in and try to drop the MOV. Must be done post-RA because it is required that SDST == SSRC2. Signed-off-by: Roy Spliet <rspliet at eclipso.eu> --- .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 52 ++++++++++++++++++++++ 1 file changed, 52 insertions(+) diff --git

[PATCH 2/3] nv50/ir: Fold IMM into MAD

2015 Jan 13

Add a specific optimisation pass for NV50 to check whether SRC0 or SRC1 is a MOV dst, IMM. If so: fold the IMM in and try to drop the MOV. Must be done post-RA because it requires that SDST == SSRC2. V2: improve readability and add comments to clarify decisions Signed-off-by: Roy Spliet <rspliet at eclipso.eu> --- .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 60

[PATCH 2/2] nv50/ir: Fold IMM into MAD

2015 Jan 23

Add a specific optimisation pass for NV50 to check whether SRC0 or SRC1 is a MOV dst, IMM. If so: fold the IMM in and try to drop the MOV. Must be done post-RA because it requires that SDST == SSRC2. V2: improve readability and add comments to clarify decisions V3: Remove redundant code... compiler already attempts to put the IMM in SSRC1 Signed-off-by: Roy Spliet <rspliet at eclipso.eu>

[PATCH 3/3] nv50/ir: Fold IMM into MAD

2015 Feb 06

Add a specific optimisation pass for NV50 to check whether SRC0 or SRC1 is a MOV dst, IMM. If so: fold the IMM in and try to drop the MOV. Must be done post-RA because it requires that SDST == SSRC2. V2: improve readability and add comments to clarify decisions V3: Remove redundant code... compiler already attempts to put the IMM in SSRC1 Signed-off-by: Roy Spliet <rspliet at eclipso.eu>
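The folding described in these "Fold IMM into MAD" patches can be sketched roughly as follows. This is a minimal illustration over a made-up, single-basic-block instruction model, not the code from nv50_ir_peephole.cpp: the types `Insn` and `Operand` and the helpers `findDef`, `hasReader` and `foldImmIntoMad` are all hypothetical names, and the sketch assumes no register is live out of the block.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical post-RA instruction model (NOT Mesa's nv50_ir classes).
enum class Op { Mov, Mad };

struct Operand {
    int reg = -1;                 // physical register id, -1 if unused
    std::optional<uint32_t> imm;  // set when the operand is an immediate
};

struct Insn {
    Op op;
    int dst;          // physical destination register
    Operand src[3];   // MAD computes src[0] * src[1] + src[2]
    bool dead = false;
};

Operand regOp(int r) { Operand o; o.reg = r; return o; }
Operand immOp(uint32_t v) { Operand o; o.imm = v; return o; }

// Index of the latest live instruction before `pos` that writes `reg`, or -1.
int findDef(const std::vector<Insn>& bb, int pos, int reg) {
    for (int i = pos - 1; i >= 0; --i)
        if (!bb[i].dead && bb[i].dst == reg) return i;
    return -1;
}

// True if a live instruction after `def` reads `reg` before it is overwritten.
// Assumption: nothing is live out of the block.
bool hasReader(const std::vector<Insn>& bb, int def, int reg) {
    for (int i = def + 1; i < static_cast<int>(bb.size()); ++i) {
        if (bb[i].dead) continue;
        for (const Operand& s : bb[i].src)
            if (!s.imm && s.reg == reg) return true;
        if (bb[i].dst == reg) return false;  // value overwritten, stop looking
    }
    return false;
}

// Try to fold a MOV-defined immediate into SRC0 or SRC1 of the MAD at `pos`.
bool foldImmIntoMad(std::vector<Insn>& bb, int pos) {
    Insn& mad = bb[pos];
    if (mad.op != Op::Mad) return false;
    // The immediate form needs SDST == SSRC2, which is only known post-RA.
    if (mad.src[2].imm || mad.dst != mad.src[2].reg) return false;
    // Assume a single immediate slot: bail out if one is already in use.
    if (mad.src[0].imm || mad.src[1].imm) return false;

    for (int s = 0; s < 2; ++s) {
        const int reg = mad.src[s].reg;
        const int def = findDef(bb, pos, reg);
        if (def < 0 || bb[def].op != Op::Mov || !bb[def].src[0].imm) continue;
        mad.src[s] = immOp(*bb[def].src[0].imm);       // fold the IMM in
        if (!hasReader(bb, def, reg)) bb[def].dead = true;  // try to drop the MOV
        return true;
    }
    return false;
}
```

The MOV is only marked dead when no other instruction still reads its result, which mirrors the "try to drop the MOV" wording in the commit message. The V3 note says the compiler already tries to place the immediate in SSRC1, so a real pass may only need to inspect that one source.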

[PATCH 00/19] nv50: add sampler2DMS/GP support to get OpenGL 3.2

2014 Jan 13

OK, so there's a bunch of stuff in here. The geometry stuff is based on the work started by Bryan Cain and Christoph Bumiller. Patches 01-12: Add support for geometry shaders and fix related issues Patches 13-14: Make it possible for fb clears to operate on texture attachments with an explicit layer set (as is allowed in gl 3.2). Patches 15-17: Make ARB_texture_multisample work

[PATCH mesa 1/6] tgsi_build: Fix return of uninitialized memory in tgsi_*_instruction_memory

2016 Mar 16

tgsi_default_instruction_memory / tgsi_build_instruction_memory were returning uninitialized memory for tgsi_instruction_memory.Texture and tgsi_instruction_memory.Format. Note 0 means not set, and thus is a correct default initializer for these. Fixes: 3243b6fc97 ("tgsi: add Texture and Format to tgsi_instruction_memory") Cc: Nicolai Hähnle <nicolai.haehnle at amd.com>
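The class of bug described here (a "default" builder returning a struct with some fields never written) can be illustrated with a small sketch. The struct below is a stand-in with made-up bit-field widths, not the real tgsi_instruction_memory layout, and the function names are hypothetical.

```cpp
// Hypothetical stand-in for a packed instruction token; field widths are
// illustrative only and do not reflect the real TGSI definition.
struct InstructionMemory {
    unsigned Qualifier : 3;
    unsigned Texture   : 4;
    unsigned Format    : 10;
    unsigned Padding   : 15;
};

// Value-initialization zeroes every field, so Texture and Format start
// out as 0, which the commit message describes as "not set".
InstructionMemory defaultInstructionMemory() {
    InstructionMemory m{};
    return m;
}

// Builders start from the zeroed default and then set only what they know.
InstructionMemory buildInstructionMemory(unsigned qualifier) {
    InstructionMemory m = defaultInstructionMemory();
    m.Qualifier = qualifier;
    return m;
}
```

Starting every builder from the zeroed default means a field added to the struct later is still initialized, which is the failure mode the `Fixes:` tag points at.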

[PATCH 1/3] nv50/ir: Add support for MAD short+IMM notation

2015 Jan 11

And you're allowing saturate/neg emission on the short form. Is this already in envytools? Also, what's the shortForm thing? This change is probably fine, but the changelog needs work. On Sat, Jan 10, 2015 at 7:22 PM, Roy Spliet <rspliet at eclipso.eu> wrote: > MAD IMM has a very specific SDST == SSRC2 requirement, so don't emit > > Signed-off-by: Roy Spliet <rspliet

[PATCH 01/11] nvc0/ir: add emission of dadd/dmul/dmad opcodes, fix minmax

2015 Feb 20

Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu> --- .../drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp | 66 +++++++++++++++++++++- 1 file changed, 63 insertions(+), 3 deletions(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp index dfb093c..e38a3b8 100644 ---

[PATCH v2 0/3] nv50/ir: Preapre for running Opts inside a loop

2017 Apr 03

Slowly we are getting to the point, that we miss enough optimization opportunities as the result of our own passes. For this we need to fix AlgebraicOpt to be able to handle mods on sources without creating new issues. The last patch enables looping opts. v2: update commit author Karol Herbst (3): nv50/ir: fix AlgebraicOpt for slcts with mods nv50/ir: handle logops with NOT in AlgebraicOpt

[PATCH 0/3] nv50/ir: Preapre for running Opts inside a loop

2017 Apr 03

Slowly we are getting to the point, that we miss enough optimization opportunities as the result of our own passes. For this we need to fix AlgebraicOpt to be able to handle mods on sources without creating new issues. The last patch enables looping opts. Karol Herbst (3): nv50/ir: fix AlgebraicOpt for slcts with mods nv50/ir: handle logops with NOT in AlgebraicOpt nv50/ir: run some

[PATCH mesa 5/6] nouveau: codegen: Add support for OpenCL global memory buffers

2016 Mar 16

Could you please get rid of the cosmetic changes (e.g. the switch ones)? They don't really improve readability, and in my opinion such changes should eventually be done in a separate patch. Other than that, this patch is: Reviewed-by: Samuel Pitoiset <samuel.pitoiset at gmail.com> Yes, this probably won't work as is for atomic operations but the lowering pass is

[PATCH 3/4] nvc0/ir: optimize set & 1.0 to produce boolean-float sets

2015 May 09

On 09.05.2015 07:35, Ilia Mirkin wrote: > This has started to happen more now that the backend is producing > KILL_IF more often. > > Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu> > --- > .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 29 ++++++++++++++++++++++ > .../nouveau/codegen/nv50_ir_target_nv50.cpp | 2 ++ > 2 files changed, 31

[PATCH 2/3] nv50/ir: For MAD, prefer SDST == SSRC2

2015 Jan 11

If liveness analysis indicates it's good, this should improve the chances of being able to emit the short MAD form. Signed-off-by: Roy Spliet <rspliet at eclipso.eu> --- src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
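The register-allocation preference described in this patch can be sketched as a simple hint. This is an illustration under assumed inputs, not the logic in nv50_ir_ra.cpp: `pickMadDst` and its parameters are hypothetical, and a real allocator works on live ranges and interference rather than a single boolean.

```cpp
#include <set>

// Hypothetical helper: choose the destination register for a MAD.
//   src2Reg       register already assigned to SSRC2
//   src2LiveAfter liveness result: is SSRC2's value still needed after the MAD?
//   freeRegs      registers currently unassigned
// Returns -1 when no register is available.
int pickMadDst(int src2Reg, bool src2LiveAfter, const std::set<int>& freeRegs) {
    // If SSRC2 dies at this instruction, reusing its register as SDST is
    // safe and makes the short MAD form (SDST == SSRC2) possible.
    if (!src2LiveAfter) return src2Reg;
    // Otherwise fall back to any free register; the long form is then needed.
    if (freeRegs.empty()) return -1;
    return *freeRegs.begin();
}
```

For example, `pickMadDst(4, false, {1, 2})` reuses register 4, while `pickMadDst(4, true, {1, 2})` has to take a different register because the accumulator value is still live.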

[PATCH mesa 4/6] nouveau: codegen: s/FILE_MEMORY_GLOBAL/FILE_MEMORY_BUFFER/

2016 Mar 16

This approach leads to the emitters needing to know about both global and buffer, even though at that point, they are identical. I was thinking that in the lowering logic, buffer would just get rewritten as global (with the offset added), thus not needing any change to the emitters. What do you think about such an approach? On Mar 16, 2016 2:24 AM, "Hans de Goede" <hdegoede at

[PATCH 1/4] nvc0/ir: avoid jumping to a sched instruction

2015 May 09

Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu> --- Pretty sure there's nothing wrong with it, but it looks odd in the code. src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp | 2 ++ src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp | 7 +++++-- src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp | 2 ++ 3 files changed, 9 insertions(+), 2 deletions(-)
