thr3ads.net - search: "getsrcs"

Displaying 20 results from an estimated 94 matches for "getsrcs".

[PATCH] nouveau: codegen: Take src swizzle into account on loads

2016 Apr 07

The llvm TGSI backend does things like:
    LOAD TEMP[0].y, MEMORY[0].xxxx, TEMP[0].x
Expecting the data at address TEMP[0].x to get loaded to TEMP[0].y. Before this commit the data at TEMP[0].x + 4 would be loaded instead. This commit fixes this.
Signed-off-by: Hans de Goede <hdegoede at redhat.com>
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp | 8 ++++++--
 1 file changed,

[PATCH RESEND] nv50/ir: use unordered_set instead of list to keep track of var defs

2014 Dec 02

The set of variable defs does not need to be ordered in any way, and removing/adding elements is a fairly common operation in various optimization passes.
This shortens runtime of piglit test fp-long-alu to ~11s from ~22s.
No piglit regressions observed on nvc0!
Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de>
---
 src/gallium/drivers/nouveau/codegen/nv50_ir.cpp
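The trade-off described in this snippet can be sketched as follows. This is only an illustration under stated assumptions, not the code from the patch: `ValueDef`, `removeFromList` and `removeFromSet` are hypothetical names, and the real nv50_ir types are not reproduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <unordered_set>

// Hypothetical stand-in for a definition site; NOT the real nv50_ir type.
struct ValueDef {
    int id;
};

// Ordered container: removing by value has to walk the list, O(n).
// Returns how many entries were removed.
inline std::size_t removeFromList(std::list<ValueDef *> &defs, ValueDef *def)
{
    assert(def != nullptr);
    const std::size_t before = defs.size();
    defs.remove(def);
    return before - defs.size();
}

// Unordered container: erase by key is O(1) on average, at the cost of
// losing any iteration order (which the snippet says is not needed).
inline std::size_t removeFromSet(std::unordered_set<ValueDef *> &defs, ValueDef *def)
{
    assert(def != nullptr);
    return defs.erase(def);
}
```

When passes add and remove defs frequently, the average constant-time erase is the plausible source of the speedup reported above; the sketch does not attempt to reproduce the measured numbers.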

[PATCH mesa v2 1/3] nouveau: codegen: LOAD: Always use component 0 when getting the address

2016 Apr 21

LOAD loads up to 4 components from the specified resource, starting at the passed-in x value of the 2nd source operand; the y, z and w components of the address should not be used.
Signed-off-by: Hans de Goede <hdegoede at redhat.com>
---
Changes in v2:
-New patch in v2 of this patch-set
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp | 2 +-
 1 file changed, 1 insertion(+), 1

[PATCH] nvc0/ir: propagate immediates to CALL input MOVs

2017 Aug 12

On using builtin functions we have to move the input to registers $0 and $1. If one of the input values is an immediate, we fail to propagate the immediate:
...
mov u32 $r477 0x00000003 (0)
...
mov u32 $r0 %r473 (0)
mov u32 $r1 $r477 (0)
call abs BUILTIN:0 (0)
mov u32 %r495 $r1 (0)
...
With this patch the immediate is propagated, potentially causing the first MOV to be superfluous, which we'd

[PATCH v5 0/5] nvc0/ir: add support for MAD/FMA PostRALoadPropagation

2017 Mar 26

Was "nv50/ir: PostRaConstantFolding improvements" before. Nothing really changed from the last version, just minor things.
Karol Herbst (5):
  nv50/ir: restructure and rename postraconstantfolding pass
  nv50/ir: implement mad post ra folding for nvc0+
  gk110/ir: add LIMM form of mad
  gm107/ir: add LIMM form of mad
  nv50/ir: also do PostRaLoadPropagation for FMA

[PATCH mesa 6/6] nouveau: codegen: Disable more old resource handling code

2016 Mar 16

On 03/16/2016 10:23 AM, Hans de Goede wrote:
> Commit c3083c7082 ("nv50/ir: add support for BUFFER accesses") disabled /
> commented out some of the old resource handling code, but not all of it.
>
> Effectively all of it is dead already, if we ever enter the old code
> paths in handleLOAD / handleSTORE / handleATOM we will get an exception
> due to trying to access the

[PATCH v2] nvc0/ir: propagate immediates to CALL input MOVs

2017 Aug 13

[PATCH 3/4] nvc0/ir: optimize set & 1.0 to produce boolean-float sets

2015 May 09

On 09.05.2015 07:35, Ilia Mirkin wrote:
> This has started to happen more now that the backend is producing
> KILL_IF more often.
>
> Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu>
> ---
> .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 29 ++++++++++++++++++++++
> .../nouveau/codegen/nv50_ir_target_nv50.cpp | 2 ++
> 2 files changed, 31

[PATCH v2 0/3] nv50/ir: Prepare for running Opts inside a loop

2017 Apr 03

Slowly we are getting to the point that we miss enough optimization opportunities as the result of our own passes. For this we need to fix AlgebraicOpt to be able to handle mods on sources without creating new issues. The last patch enables looping opts.
v2: update commit author
Karol Herbst (3):
  nv50/ir: fix AlgebraicOpt for slcts with mods
  nv50/ir: handle logops with NOT in AlgebraicOpt

nv50/ir: Implement short notation for MAD V2

2015 Jan 13

V2: clarify code, commit msgs, add comments. Drop code that was supposed to make register assignment prefer SDST == SRC2 (patch 2) for now, because it didn't quite do what I intended.

[PATCH] nouveau: codegen: Take src swizzle into account on loads

2016 Apr 07

That's wrong. The spec for the instruction needs to be clarified... The current nouveau impl is correct - only the .x of the address should be loaded, with up to 16 bytes read into the destination.
On Thu, Apr 7, 2016 at 9:27 AM, Hans de Goede <hdegoede at redhat.com> wrote:
> The llvm TGSI backend does things like:
>
> LOAD TEMP[0].y, MEMORY[0].xxxx, TEMP[0].x
>

[PATCH mesa v2 3/3] nouveau: codegen: LOAD: Take src swizzle into account

2016 Apr 21

The llvm TGSI backend uses pointers in registers and does things like:
    LOAD TEMP[0].y, MEMORY[0], TEMP[0]
Expecting the data at address TEMP[0].x to get loaded to TEMP[0].y. But this will cause the data at TEMP[0].x + 4 to be loaded instead.
This commit adds support for a swizzle suffix for the 1st source operand, which allows using:
    LOAD TEMP[0].y, MEMORY[0].xxxx, TEMP[0]
And actually

[PATCH mesa 6/6] nouveau: codegen: Disable more old resource handling code

2016 Mar 16

Commit c3083c7082 ("nv50/ir: add support for BUFFER accesses") disabled / commented out some of the old resource handling code, but not all of it.
Effectively all of it is dead already, if we ever enter the old code paths in handleLOAD / handleSTORE / handleATOM we will get an exception due to trying to access the now always zero-sized resources vector.
Make non buffer / memory file

[PATCH 1/3] nv50/ir: Add support for MAD short+IMM notation

2015 Jan 11

MAD IMM has a very specific SDST == SSRC2 requirement, so don't emit
Signed-off-by: Roy Spliet <rspliet at eclipso.eu>
---
 .../drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp | 18 ++++++++++++------
 .../drivers/nouveau/codegen/nv50_ir_target_nv50.cpp | 2 +-
 2 files changed, 13 insertions(+), 7 deletions(-)
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp

[PATCH] nv50/ir: use unordered_set instead of list to keep track of var defs

2014 Sep 01

[PATCH] gm107/ir: fix texture argument order

2014 Sep 25

Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu>
Cc: "10.3" <mesa-stable at lists.freedesktop.org>
---
With this, all the tex-miplevel-selection tests pass on maxwell. There is a minor bit of this change which affects textureGrad on kepler that I have yet to test, but I'm moderately sure it's correct and was only working by luck before. (Changing the insbf to use

[PATCH 1/2] nvc0/ir: detect AND/SHR pairs and convert into EXTBF

2015 Aug 19

Some shaders appear to extract bits using shift/and combos. Detect (some) of those and convert to EXTBF instead.
Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu>
---
 .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 66 +++++++++++++++-------
 1 file changed, 46 insertions(+), 20 deletions(-)
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
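The kind of shift/and pattern this snippet mentions can be sketched as below. This is a minimal illustration, not the peephole code from the patch: `BitField`, `matchShrAnd` and `extractBits` are hypothetical names, and EXTBF is assumed here to behave like an unsigned bitfield extract of `width` bits starting at `offset`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical description of a bitfield; not a real nv50_ir type.
struct BitField {
    unsigned offset; // index of the lowest extracted bit
    unsigned width;  // number of extracted bits
};

// True if mask is a nonzero run of ones starting at bit 0 (0x1, 0xff, ...).
inline bool isLowBitMask(uint32_t mask)
{
    return mask != 0 && (mask & (mask + 1u)) == 0;
}

inline unsigned countBits(uint32_t v)
{
    unsigned n = 0;
    for (; v != 0; v &= v - 1u)
        ++n;
    return n;
}

// Try to describe (x >> shift) & mask as a single bitfield extract.
inline bool matchShrAnd(unsigned shift, uint32_t mask, BitField &out)
{
    if (shift >= 32 || !isLowBitMask(mask))
        return false;
    unsigned width = countBits(mask);
    if (shift + width > 32)
        width = 32 - shift; // bits shifted in from above bit 31 are zero
    out = BitField{shift, width};
    return true;
}

// Reference semantics assumed for the extract (unsigned variant).
inline uint32_t extractBits(uint32_t x, BitField f)
{
    assert(f.width >= 1 && f.offset + f.width <= 32);
    const uint32_t mask = f.width == 32 ? 0xffffffffu : ((1u << f.width) - 1u);
    return (x >> f.offset) & mask;
}
```

Only masks that are a contiguous run of low bits are matched, which is consistent with the snippet's remark that just some of these patterns are detected; anything else is left untouched.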

[PATCH] nouveau: codegen: Take src swizzle into account on loads

2016 Apr 08

Hi,
On 07-04-16 15:58, Ilia Mirkin wrote:
> That's wrong.
It used to work with the old RES[] code, and if one cannot specify a source swizzle, then how can I do something like:
    LOAD TEMP[0].y, MEMORY[0], address
and get the data at absolute global memory address "address" into TEMP[0].y? This is a must-have for llvm to be able to generate working TGSI code, I do not see any

[PATCH mesa 6/6] nouveau: codegen: Disable more old resource handling code

2016 Mar 16

Hi,
On 16-03-16 11:45, Samuel Pitoiset wrote:
>
>
> On 03/16/2016 10:23 AM, Hans de Goede wrote:
>> Commit c3083c7082 ("nv50/ir: add support for BUFFER accesses") disabled /
>> commented out some of the old resource handling code, but not all of it.
>>
>> Effectively all of it is dead already, if we ever enter the old code
>> paths in handleLOAD /

[PATCH v2 1/3] nv50/ir: fix AlgebraicOpt for slcts with mods

2017 Apr 03

Signed-off-by: Karol Herbst <karolherbst at gmail.com>
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index 4c92a1efb5..bd60a84998 100644
---