similar to: [PATCH 1/2] nv50/ir: Add support for MAD short+IMM notation

Displaying 20 results from an estimated 500 matches similar to: "[PATCH 1/2] nv50/ir: Add support for MAD short+IMM notation"

2015 Jan 11
6
[PATCH 1/3] nv50/ir: Add support for MAD short+IMM notation
MAD IMM has a very specific SDST == SSRC2 requirement, so don't emit Signed-off-by: Roy Spliet <rspliet at eclipso.eu> --- .../drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp | 18 ++++++++++++------ .../drivers/nouveau/codegen/nv50_ir_target_nv50.cpp | 2 +- 2 files changed, 13 insertions(+), 7 deletions(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp
2015 Jan 13
3
nv50/ir: Implement short notation for MAD V2
V2: clarify code, commit msgs, add comments. Drop code to was supposed to make register assignment prefer SDST == SRC2 (patch 2) for now, because it didn't quite do what I intended.
2015 Feb 06
2
[PATCH 1/3] nv50/ir: Add support for MAD 4-byte opcode
Add emission rules for negative and saturate flags for MAD 4-byte opcodes, and get rid of some of the constraints. Obviously tested with a wide variety of shaders. V2: Document MAD as supported short form V3: Split up IMM from short-form modifiers Signed-off-by: Roy Spliet <rspliet at eclipso.eu> --- src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp | 10 ++++------
2015 Jan 11
1
[PATCH 1/3] nv50/ir: Add support for MAD short+IMM notation
Op 11-01-15 om 01:34 schreef Ilia Mirkin: > And you're allowing saturate/neg emission on the short form. Yes > Is this already in envytools? Tesla floating point instructions are poorly documented in the RST documents; fmad is no exception. I'll make sure to check envydis. > Also, what's the shortForm thing? Documented in envytools; see
2017 Mar 26
5
[PATCH v5 0/5] nvc0/ir: add support for MAD/FMA PostRALoadPropagation
was "nv50/ir: PostRaConstantFolding improvements" before. nothing really changed from the last version, just minor things. Karol Herbst (5): nv50/ir: restructure and rename postraconstantfolding pass nv50/ir: implement mad post ra folding for nvc0+ gk110/ir: add LIMM form of mad gm107/ir: add LIMM form of mad nv50/ir: also do PostRaLoadPropagation for FMA
2015 Jan 11
0
[PATCH 1/3] nv50/ir: Add support for MAD short+IMM notation
And you're allowing saturate/neg emission on the short form. Is this already in envytools? Also, what's the shortForm thing? This change is probably fine, but the changelog needs work. On Sat, Jan 10, 2015 at 7:22 PM, Roy Spliet <rspliet at eclipso.eu> wrote: > MAD IMM has a very specific SDST == SSRC2 requirement, so don't emit > > Signed-off-by: Roy Spliet <rspliet
2015 Jan 11
0
[PATCH 3/3] nv50/ir: Fold IMM into MAD
Add a specific optimisation pass for NV50 to check whether SRC0 or SRC1 is a MOV dst, IMM. If so: fold the IMM in and try to drop the MOV. Must be done post-RA because it is required that SDST == SSRC2. Signed-off-by: Roy Spliet <rspliet at eclipso.eu> --- .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 52 ++++++++++++++++++++++ 1 file changed, 52 insertions(+) diff --git
2015 Jan 13
0
[PATCH 2/3] nv50/ir: Fold IMM into MAD
Add a specific optimisation pass for NV50 to check whether SRC0 or SRC1 is a MOV dst, IMM. If so: fold the IMM in and try to drop the MOV. Must be done post-RA because it requires that SDST == SSRC2. V2: improve readability and add comments to clarify decisions Signed-off-by: Roy Spliet <rspliet at eclipso.eu> --- .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 60
2015 Jan 23
0
[PATCH 2/2] nv50/ir: Fold IMM into MAD
Add a specific optimisation pass for NV50 to check whether SRC0 or SRC1 is a MOV dst, IMM. If so: fold the IMM in and try to drop the MOV. Must be done post-RA because it requires that SDST == SSRC2. V2: improve readability and add comments to clarify decisions V3: Remove redundant code... compiler already attempts to put the IMM in SSRC1 Signed-off-by: Roy Spliet <rspliet at eclipso.eu>
2015 Feb 06
0
[PATCH 3/3] nv50/ir: Fold IMM into MAD
Add a specific optimisation pass for NV50 to check whether SRC0 or SRC1 is a MOV dst, IMM. If so: fold the IMM in and try to drop the MOV. Must be done post-RA because it requires that SDST == SSRC2. V2: improve readability and add comments to clarify decisions V3: Remove redundant code... compiler already attempts to put the IMM in SSRC1 Signed-off-by: Roy Spliet <rspliet at eclipso.eu>
2014 Jan 13
20
[PATCH 00/19] nv50: add sampler2DMS/GP support to get OpenGL 3.2
OK, so there's a bunch of stuff in here. The geometry stuff is based on the work started by Bryan Cain and Christoph Bumiller. Patches 01-12: Add support for geometry shaders and fix related issues Patches 13-14: Make it possible for fb clears to operate on texture attachments with an explicit layer set (as is allowed in gl 3.2). Patches 15-17: Make ARB_texture_multisample work
2015 Feb 23
2
[PATCH 1/2] nv50/ir: add fp64 support on G200 (NVA0)
Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu> --- Untested beyond compiling a few shaders to see if they look like they might work. nvdisasm agrees with envydis's decoding of these things. Will definitely get ahold of a G200 to run tests on before pushing this. .../drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp | 94 ++++++++++++++++++---
2016 Mar 16
13
[PATCH mesa 1/6] tgsi_build: Fix return of uninitialized memory in tgsi_*_instruction_memory
tgsi_default_instruction_memory / tgsi_build_instruction_memory were returning uninitialized memory for tgsi_instruction_memory.Texture and tgsi_instruction_memory.Format. Note 0 means not set, and thus is a correct default initializer for these. Fixes: 3243b6fc97 ("tgsi: add Texture and Format to tgsi_instruction_memory") Cc: Nicolai Hähnle <nicolai.haehnle at amd.com>
2016 Mar 16
2
[PATCH mesa 5/6] nouveau: codegen: Add support for OpenCL global memory buffers
Could you please get rid of the cosmetic changes (eg. the switch ones)? Because this doesn't really improve readability and in my opinion these changes should be eventually done in a separate patch. Other than that, this patch is : Reviewed-by: Samuel Pitoiset <samuel.pitoiset at gmail.com> Yes, this probably won't work as is for atomic operations but the lowering pass is
2016 Mar 16
2
[PATCH mesa 4/6] nouveau: codegen: s/FILE_MEMORY_GLOBAL/FILE_MEMORY_BUFFER/
This approach leads to the emitters needing to know about both global and buffer, even though at that point, they are identical. I was thinking that in the lowering logic, buffer would just get rewritten as global (with the offset added), thus not needing any change to the emitters. What do you think about such an approach? On Mar 16, 2016 2:24 AM, "Hans de Goede" <hdegoede at
2015 Jan 11
0
[PATCH 2/3] nv50/ir: For MAD, prefer SDST == SSRC2
If liveness analysis indicates it's good, this should improve the chances of being able to emit the short MAD form. Signed-off-by: Roy Spliet <rspliet at eclipso.eu> --- src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
2015 Jul 08
2
[PATCH 2/2] nv50/ir: fix a compiler warning with debug-only code
On 8 July 2015 at 19:27, Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de> wrote: > codegen/nv50_ir_emit_nv50.cpp: In member function > ‘void nv50_ir::CodeEmitterNV50::emitLOAD(const nv50_ir::Instruction*)’: > codegen/nv50_ir_emit_nv50.cpp:620:12: warning: unused variable ‘offset’ > [-Wunused-variable] > int32_t offset = i->getSrc(0)->reg.data.offset; >
2015 Feb 24
2
intercalar elementos de vectores
Gracias, Carlos. Habia pensado en algo similar usando sapply(): sapply(seq(1, ncol(vtmp), by = 2), function(i) c(rbind(as.character(vtmp[, i]), as.character(vtmp[, i+1])))) Dependiendo de la dimension de los datos, quizas mapply() sea mas eficiente que sapply(). Saludos cordiales, Jorge.- 2015-02-25 1:01 GMT+11:00 Carlos Ortega <cof en qualityexcellence.es>: > Hola, > > Este
2015 Feb 24
2
intercalar elementos de vectores
Excelente! Ahora corre muy rápido. No conocía ese método, creo que me va a resultar muy útil. Muchas gracias y saludos. Fernando Macedo El 24/02/15 a las 10:51, Jorge I Velez escribió: Fernando, Podrias intentar R> a <- rep('a', 5) R> b <- rep('b', 5) R> a [1] "a" "a" "a" "a" "a" R> b [1] "b"
2015 Jul 08
5
[PATCH 1/2] nouveau/compiler: fix trivial compiler warnings
nouveau_compiler.c: In function ‘main’: nouveau_compiler.c:216:27: warning: ‘code’ may be used uninitialized in this function [-Wmaybe-uninitialized] printf("%08x ", code[i / 4]); ^ nouveau_compiler.c:215:4: warning: ‘size’ may be used uninitialized in this function [-Wmaybe-uninitialized] for (i = 0; i < size; i += 4) { Signed-off-by: Tobias