thr3ads.net - search: "swizzling"

Displaying 20 results from an estimated 248 matches for "swizzling".

[PATCH] nouveau: codegen: Take src swizzle into account on loads

2016 Apr 08

...le, then how can I do something like LOAD TEMP[0].y, MEMORY[0], address And get the data at absolute global memory address "address" into TEMP[0].y ? This is a must-have for llvm to be able to generate working TGSI code, I do not see any way around this. AFAIK this is exactly what src-swizzling is for. Also note that this commit does not change anything if no src-swizzling is specified, in that case things work exactly as before. > The spec for the instruction needs to be clarified... > The current nouveau impl is correct - only the .x of the address > should be loaded, with up...

[PATCH] nouveau: codegen: Take src swizzle into account on loads

2016 Apr 08

...ess >> >> And get the data at absolute global memory address "address" into TEMP[0].y >> ? >> >> This is a must-have for llvm to be able to generate working TGSI code, >> I do not see any way around this. >> >> AFAIK this is exactly what src-swizzling is for. Also note that >> this commit does not change anything if no src-swizzling is specified, >> in that case things work exactly as before. >> >>> The spec for the instruction needs to be clarified... >>> >>> The current nouveau impl is correct - onl...

Findings on pre-NV50 miptree layout

2010 Jan 08

...lied IIRC). 4. Swizzled 2D/3D/cube textures don't have any gaps, except for cube map face alignment. The current code contains a strange dimension check. I'm in the process of rewriting the miptree layout code and all the 2D engine code to account for this (and to support all cases, including unswizzling and 3D-swizzling). Here are the findings on NV40. Not sure what happens with compressed textures (which may be currently broken since Doom3 misrenders in non-Ultra quality). I'll check that once the 2D code is otherwise finished and working. * Swizzled 1D/2D/3D textures Mipmaps are laid seque...

[PATCH] nouveau: codegen: Take src swizzle into account on loads

2016 Apr 08

...LOAD TEMP[0].y, MEMORY[0], address > > And get the data at absolute global memory address "address" into TEMP[0].y > ? > > This is a must-have for llvm to be able to generate working TGSI code, > I do not see any way around this. > > AFAIK this is exactly what src-swizzling is for. Also note that > this commit does not change anything if no src-swizzling is specified, > in that case things work exactly as before. > >> The spec for the instruction needs to be clarified... >> >> The current nouveau impl is correct - only the .x of the address...

[LLVMdev] adding new instructions to support "swizzle" and "writemask"

2005 Apr 20

...y, r2.wx Note that the channel y of r1 is replicated in the third instruction. Detailed documentation: <http://msdn.microsoft.com/library/default.asp?url=/library/en-us/directx9_c/directx/graphics/reference/AssemblyLanguageShaders/PixelShaders/Registers/Modifiers/SourceRegisterModifiers/PS_Swizzling.asp> The code must be transformed into SSA form (.ll file). The problem is that no existing LLVM instruction or intrinsic function supports swizzle and writemask. I have a few solutions: (1) Treat each channel of a register as an individual SSA variable. This could generate inefficient machine co...

[PATCH] nouveau: codegen: Take src swizzle into account on loads

2016 Apr 08

...is pointing. But instead it will get the 32 bits of >> data at address (TEMP[0].x + 4). >> >> With the old RES[32767] code one could generate the following TGSI: >> >> LOAD TEMP[0].y, RES[32767].xxxx, TEMP[0] >> >> And things would work fine since the .xxxx swizzling postfix would >> be honored and when storing to y (the only component set in the dest-mask) >> the x component at address (TEMP[0].x) would be loaded, rather than the >> y component at (TEMP[0].y) >> >> Note that another approach would be to not increment the address b...

Textures Twiddling/Swizzling

2018 Sep 19

...3D textures into yuzu, as far as I know they are twiddled in a different manner to 2D textures. Could one of you guys point me in the right direction? I've been meddling around: https://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/drivers/nouveau/nv50/nv50_tex.c but I can't see where the swizzling actually takes place.

[PATCH mesa v2 3/3] nouveau: codegen: LOAD: Take src swizzle into account

2016 Apr 22

Hi, On 22-04-16 09:08, Marek Olšák wrote: > On Thu, Apr 21, 2016 at 7:04 PM, Ilia Mirkin <imirkin at alum.mit.edu> wrote: >> [+radeon folk] >> >> Marek, Nicolai, Bas - please have a look at the doc change and let us >> know if you think this will cause a problem for radeon. >> >> Hans is solving the issue that he wants to swizzle the data loaded

[PATCH mesa v2 3/3] nouveau: codegen: LOAD: Take src swizzle into account

2016 Apr 21

[+radeon folk] Marek, Nicolai, Bas - please have a look at the doc change and let us know if you think this will cause a problem for radeon. Hans is solving the issue that he wants to swizzle the data loaded from the image/buffer/whatever before sticking it into the dst register. -ilia On Thu, Apr 21, 2016 at 8:39 AM, Hans de Goede <hdegoede at redhat.com> wrote: > The llvm TGSI

[PATCH] nouveau: codegen: Take src swizzle into account on loads

2016 Apr 08

...ata > to which TEMP[0].x is pointing. But instead it will get the 32 bits of > data at address (TEMP[0].x + 4). > > With the old RES[32767] code one could generate the following TGSI: > > LOAD TEMP[0].y, RES[32767].xxxx, TEMP[0] > > And things would work fine since the .xxxx swizzling postfix would > be honored and when storing to y (the only component set in the dest-mask) > the x component at address (TEMP[0].x) would be loaded, rather than the > y component at (TEMP[0].y) > > Note that another approach would be to not increment the address by > a 32 bit word...

[PATCH] nouveau: codegen: Take src swizzle into account on loads

2016 Apr 08

...will get the 32 bits of >>> data at address (TEMP[0].x + 4). >>> >>> With the old RES[32767] code one could generate the following TGSI: >>> >>> LOAD TEMP[0].y, RES[32767].xxxx, TEMP[0] >>> >>> And things would work fine since the .xxxx swizzling postfix would >>> be honored and when storing to y (the only component set in the dest-mask) >>> the x component at address (TEMP[0].x) would be loaded, rather than the >>> y component at (TEMP[0].y) >>> >>> Note that another approach would be to not inc...

"enable dri3 support without glamor" causes gnome-shell regression on nv4x

2015 Aug 10

...ll DRI, leading to >>>> software rendering. >>>> >>>> I discussed this with Ben this morning and he suggested that this >>>> is likely a Mesa issue since with DRI3 mesa rather than the ddx >>>> allocs the surfaces. I've tried disabling swizzling in the >>>> mesa code by forcing nv30_miptree_create() to always take >>>> the code path for linear textures, but that leads to the exact >>>> same result as before that change. >>> >>> >>> Ah yes. Very different problem indeed. I actua...

[LLVMdev] Vector swizzling and write masks code generation

2007 Sep 27

...) and hardware (drivers will implement LLVM code-generators) cases. While the software only case is pretty straightforward, I just realized I missed something in my initial evaluation. That is, graphics hardware (basically every single programmable gpu) has instruction level support for vector swizzling and write masks. For example the following represents a valid gpu shader instruction: ADD dst.xyz src1.yxzw src2.zwxy which performs an addition that stores the result to the dst operand (each operand is a vector type of four data elements). The instruction uses source swizzle modifiers and...
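As a rough illustration of the instruction quoted in this excerpt, the sketch below models `ADD dst.xyz src1.yxzw src2.zwxy` on plain Python lists. The helper names `swizzle` and `add_masked` and the register values are made up for the example; this is not code from LLVM or from any driver.

```python
# Hypothetical model of: ADD dst.xyz src1.yxzw src2.zwxy
INDEX = {"x": 0, "y": 1, "z": 2, "w": 3}

def swizzle(vec, pattern):
    """Build a new 4-component vector by picking components named in pattern."""
    return [vec[INDEX[c]] for c in pattern]

def add_masked(dst, src1, src2, mask):
    """Component-wise add; only the components named in mask are written."""
    out = list(dst)
    for c in mask:
        i = INDEX[c]
        out[i] = src1[i] + src2[i]
    return out

dst = [0.0, 0.0, 0.0, 9.0]          # made-up register contents
src1 = [1.0, 2.0, 3.0, 4.0]
src2 = [10.0, 20.0, 30.0, 40.0]

result = add_masked(dst, swizzle(src1, "yxzw"), swizzle(src2, "zwxy"), "xyz")
print(result)  # the w component keeps its old value because the mask is .xyz
```

The source swizzles reorder the inputs before the add, and the write mask decides which destination components are overwritten afterwards.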

[PATCH] nouveau: codegen: Take src swizzle into account on loads

2016 Apr 07

The llvm TGSI backend does things like: LOAD TEMP[0].y, MEMORY[0].xxxx, TEMP[0].x Expecting the data at address TEMP[0].x to get loaded to TEMP[0].y. Before this commit the data at TEMP[0].x + 4 would be loaded instead. This commit fixes this. Signed-off-by: Hans de Goede <hdegoede at redhat.com> --- src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp | 8 ++++++-- 1 file changed,

[LLVMdev] Vector swizzling and write masks code generation

2007 Sep 27

...model), where LLVM would be used both in the > software only (by just JIT executing shaders) and hardware (drivers will > implement LLVM code-generators) cases. Yep, nifty! > That is graphics hardware (basically every single programmable gpu) has > instruction level support for vector swizzling and write masks. ok > For example the following represents a valid gpu shader instruction: > ADD dst.xyz src1.yxzw src2.zwxy > which performs an addition that stores the result to the dst operand (each > operand is a vector type of four data elements) The instruction uses source...

[LLVMdev] How to define complicated instruction in TableGen (Direct3D shader instruction)

2005 Jul 27

Each register is a 4-component (namely, r, g, b, a) vector register. They are actually defined as llvm packed [4xfloat]. The instruction: add_sat r0.a, r1_bias.xxyy, r3_x2.zzzz Explanation: '.a' is a writemask; only the specified component will be updated. '.xxyy' and '.zzzz' are swizzle masks, which specify the component permutation, similar to the Intel SSE permutation
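A minimal sketch of how `add_sat r0.a, r1_bias.xxyy, r3_x2.zzzz` might evaluate. The excerpt explains only the writemask and the swizzles. The meanings used here for the modifiers are assumptions taken from the usual Direct3D shader assembly conventions: `_bias` subtracts 0.5, `_x2` multiplies by 2, and `_sat` clamps the result to [0, 1]. The register values are invented.

```python
# Hypothetical model of: add_sat r0.a, r1_bias.xxyy, r3_x2.zzzz
# Accept both xyzw and rgba component names.
INDEX = {"x": 0, "y": 1, "z": 2, "w": 3, "r": 0, "g": 1, "b": 2, "a": 3}

def swizzle(vec, pattern):
    """Pick source components in the order given by pattern."""
    return [vec[INDEX[c]] for c in pattern]

def add_sat(dst, src1, src2, writemask):
    """Add, clamp to [0, 1] (assumed _sat meaning), write masked components only."""
    out = list(dst)
    for c in writemask:
        i = INDEX[c]
        out[i] = min(1.0, max(0.0, src1[i] + src2[i]))
    return out

r0 = [0.1, 0.2, 0.3, 0.4]             # made-up register contents
r1 = [1.0, 0.75, 0.0, 0.0]
r3 = [0.0, 0.0, 0.5, 0.0]

r1_bias = [v - 0.5 for v in r1]       # assumed meaning of _bias
r3_x2 = [v * 2.0 for v in r3]         # assumed meaning of _x2

r0 = add_sat(r0, swizzle(r1_bias, "xxyy"), swizzle(r3_x2, "zzzz"), "a")
print(r0)  # only the last (a) component changes
```

Because the modifiers here act on each component independently, applying them before or after the swizzle gives the same values.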

"enable dri3 support without glamor" causes gnome-shell regression on nv4x

2015 Aug 03

...at seems to also automatically disable all DRI, leading to >> software rendering. >> >> I discussed this with Ben this morning and he suggested that this >> is likely a Mesa issue since with DRI3 mesa rather than the ddx >> allocs the surfaces. I've tried disabling swizzling in the >> mesa code by forcing nv30_miptree_create() to always take >> the code path for linear textures, but that leads to the exact >> same result as before that change. > > Ah yes. Very different problem indeed. I actually suspect it has to do > with swizzling. Look at...

[PATCH mesa v2 3/3] nouveau: codegen: LOAD: Take src swizzle into account

2016 Apr 21

The llvm TGSI backend uses pointers in registers and does things like: LOAD TEMP[0].y, MEMORY[0], TEMP[0] Expecting the data at address TEMP[0].x to get loaded to TEMP[0].y. But this will cause the data at TEMP[0].x + 4 to be loaded instead. This commit adds support for a swizzle suffix for the 1st source operand, which allows using: LOAD TEMP[0].y, MEMORY[0].xxxx, TEMP[0] And actually
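The behaviour described in this excerpt can be mimicked with a toy model. The sketch below is an illustration only, not nouveau's codegen: memory is a list of 32-bit words, the hypothetical `load` function reads one word per destination component, and the source swizzle decides which word of the loaded vector each destination component receives.

```python
# Toy model of a TGSI-style LOAD with a destination mask and a source swizzle.
INDEX = {"x": 0, "y": 1, "z": 2, "w": 3}

def load(memory, address, dst_mask, src_swizzle="xyzw"):
    """Return {component: value} for every component named in dst_mask.

    memory is a list of 32-bit words and address is a byte address.
    With the identity swizzle, component y reads the word at address + 4.
    """
    result = {}
    for c in dst_mask:
        src_component = src_swizzle[INDEX[c]]   # which loaded word feeds c
        byte_offset = 4 * INDEX[src_component]
        result[c] = memory[(address + byte_offset) // 4]
    return result

memory = [111, 222, 333, 444]   # made-up contents, one entry per 32-bit word

no_swizzle = load(memory, 0, "y")            # like LOAD TEMP[0].y, MEMORY[0], TEMP[0]
with_xxxx = load(memory, 0, "y", "xxxx")     # like LOAD TEMP[0].y, MEMORY[0].xxxx, TEMP[0]
print(no_swizzle, with_xxxx)
```

Without a swizzle, writing to `.y` picks up the word four bytes past the address, while the `.xxxx` swizzle makes the same destination component receive the word at the address itself.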

[PATCH mesa v2 3/3] nouveau: codegen: LOAD: Take src swizzle into account

2016 Apr 22

On Fri, Apr 22, 2016 at 9:23 AM, Hans de Goede <hdegoede at redhat.com> wrote: > Hi, > > On 22-04-16 09:08, Marek Olšák wrote: >> >> On Thu, Apr 21, 2016 at 7:04 PM, Ilia Mirkin <imirkin at alum.mit.edu> wrote: >>> >>> [+radeon folk] >>> >>> Marek, Nicolai, Bas - please have a look at the doc change and let us >>> know

[LLVMdev] Vector LLVM extension v.s. DirectX Shaders

2005 Dec 15

Dear all: To write a compiler for Microsoft Direct3D shaders from our hardware, I have a program which translates the Direct3D shader assembly to LLVM assembly. I added several intrinsics for this purpose. It's a vector ISA and has some special instructions like: * rcp (reciprocal) * frc (the fractional portion of each input component) * dp4 (dot product) * exp (exponential) * max, min These
