search for: swizzle

Displaying 20 results from an estimated 248 matches for "swizzle".

2016 Apr 08
3
[PATCH] nouveau: codegen: Take src swizzle into account on loads
Hi, On 07-04-16 15:58, Ilia Mirkin wrote: > That's wrong. It used to work with the old RES[] code and if one cannot specify a source swizzle, then how can I do something like LOAD TEMP[0].y, MEMORY[0], address And get the data at absolute global memory address "address" into TEMP[0].y ? This is a must-have for llvm to be able to generate working TGSI code, I do not see any way around this. AFAIK this is exactly what src-sw...
2016 Apr 08
2
[PATCH] nouveau: codegen: Take src swizzle into account on loads
...Apr 8, 2016 at 5:27 AM, Hans de Goede <hdegoede at redhat.com> wrote: >> Hi, >> >> On 07-04-16 15:58, Ilia Mirkin wrote: >>> >>> That's wrong. >> >> >> It used to work with the old RES[] code and if one cannot specify >> a source swizzle, then how can I do something like >> >> LOAD TEMP[0].y, MEMORY[0], address >> >> And get the data at absolute global memory address "address" into TEMP[0].y >> ? >> >> This is a must-have for llvm to be able to generate working TGSI code, >>...
2010 Jan 08
0
Findings on pre-NV50 miptree layout
...d367%40mail.gmail.com&forum_name=mesa3d-dev . Here are the findings from running it. The result is that our miptree layout code is partially broken, and overly complex. In particular: 1. 3D textures are broken because they are not laid out like cube maps, but first by level and then by face 2. Swizzled 3D textures have all 3 texture coordinates swizzled together 3. Cube maps have their faces 128-byte aligned, not only 64 like in my patch or unaligned like without it (not applied IIRC). 4. Swizzled 2D/3D/cube textures don't have any gaps, except for cube map face alignment. The current code con...
2016 Apr 08
0
[PATCH] nouveau: codegen: Take src swizzle into account on loads
On Fri, Apr 8, 2016 at 5:27 AM, Hans de Goede <hdegoede at redhat.com> wrote: > Hi, > > On 07-04-16 15:58, Ilia Mirkin wrote: >> >> That's wrong. > > > It used to work with the old RES[] code and if one cannot specify > a source swizzle, then how can I do something like > > LOAD TEMP[0].y, MEMORY[0], address > > And get the data at absolute global memory address "address" into TEMP[0].y > ? > > This is a must-have for llvm to be able to generate working TGSI code, > I do not see any way around thi...
2005 Apr 20
1
[LLVMdev] adding new instructions to support "swizzle" and "writemask"
...', 'a', and each channel is a 32-bit floating point value. It's similar to how the high and low 8 bits of the x86 16-bit general purpose register "AX" can be individually referenced as "AH" and "AL". What's different is that the hardware further supports "source register swizzle" and "writemask". For example: # The following two instructions are equivalent. # They cost the same instruction slot, and have the same # execution time. Four channels are added in parallel. add r0, r1, r2 add r0.xyzw, r1.xyzw, r2.xyzw # equivale...
2016 Apr 08
2
[PATCH] nouveau: codegen: Take src swizzle into account on loads
...roblem adding the > swizzling logic, i.e. the way that LOAD will now work (logically) is > that it will > > (a) fetch 4 values from the coordinates provided (4 sequential dwords > from src1.x in the case of buffer/memory, RGBA colors from src1.xyz in > the case of images) > (b) swizzle them according to the swizzle on the MEMORY/BUFFER/IMAGE argument > (c) store that swizzled result into the destination based on the writemask > > That would sound reasonable to me, and if I understand correctly, is > option 2 of your proposal. Yes that is option 2, and is basically wh...
2018 Sep 19
1
Textures Twiddling/Swizzling
Thanks for the last info, it was truly helpful. Anyways, I'm currently trying to implement 3D textures in yuzu; as far as I know they are twiddled in a different manner from 2D textures. Could one of you guys point me in the right direction? I've been meddling around: https://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/drivers/nouveau/nv50/nv50_tex.c but I can't see where the
2016 Apr 22
2
[PATCH mesa v2 3/3] nouveau: codegen: LOAD: Take src swizzle into account
...t 7:04 PM, Ilia Mirkin <imirkin at alum.mit.edu> wrote: >> [+radeon folk] >> >> Marek, Nicolai, Bas - please have a look at the doc change and let us >> know if you think this will cause a problem for radeon. >> >> Hans is solving the issue that he wants to swizzle the data loaded >> from the image/buffer/whatever before sticking it into the dst >> register. > > Is this something st/mesa needs or just nouveau? If just nouveau needs > it, I don't see a point in updating the TGSI spec, since nouveau can > just add the swizzle when tr...
2016 Apr 21
2
[PATCH mesa v2 3/3] nouveau: codegen: LOAD: Take src swizzle into account
[+radeon folk] Marek, Nicolai, Bas - please have a look at the doc change and let us know if you think this will cause a problem for radeon. Hans is solving the issue that he wants to swizzle the data loaded from the image/buffer/whatever before sticking it into the dst register. -ilia On Thu, Apr 21, 2016 at 8:39 AM, Hans de Goede <hdegoede at redhat.com> wrote: > The llvm TGSI backend uses pointers in registers and does things > like: > > LOAD TEMP[0].y, MEMORY[0...
2016 Apr 08
0
[PATCH] nouveau: codegen: Take src swizzle into account on loads
...IT and the such. Well, I have no problem adding the swizzling logic, i.e. the way that LOAD will now work (logically) is that it will (a) fetch 4 values from the coordinates provided (4 sequential dwords from src1.x in the case of buffer/memory, RGBA colors from src1.xyz in the case of images) (b) swizzle them according to the swizzle on the MEMORY/BUFFER/IMAGE argument (c) store that swizzled result into the destination based on the writemask That would sound reasonable to me, and if I understand correctly, is option 2 of your proposal. We'd need some docs updates and buy-in from the other gal...
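The three logical steps (a), (b), (c) described in this message can be sketched as a small Python model for the buffer/memory case. This is only an illustration under stated assumptions: the function name `load`, the list-of-dwords memory model, and the dword-aligned byte address are all hypothetical and are not part of Mesa or the TGSI spec.

```python
# Illustrative model only; names are hypothetical, not Mesa/TGSI API.
COMPONENTS = "xyzw"


def load(memory, address, swizzle="xyzw", writemask="xyzw", dst=None):
    """Model the three logical LOAD steps for a buffer/memory resource.

    memory:    list of 32-bit values, one entry per dword (assumption)
    address:   byte address, assumed to be dword-aligned
    swizzle:   swizzle on the MEMORY/BUFFER operand, e.g. "xxxx"
    writemask: destination components to update, e.g. "y"
    """
    dst = list(dst) if dst is not None else [0, 0, 0, 0]
    base = address // 4
    # (a) fetch 4 sequential dwords starting at the given address
    fetched = memory[base:base + 4]
    # (b) swizzle the fetched values according to the resource operand
    swizzled = [fetched[COMPONENTS.index(s)] for s in swizzle]
    # (c) store only the components selected by the writemask
    for c in writemask:
        i = COMPONENTS.index(c)
        dst[i] = swizzled[i]
    return dst
```

In this model, something like LOAD TEMP[0].y, MEMORY[0].xxxx, address corresponds to `load(memory, address, "xxxx", "y")`: the dword at `address` ends up in the y component and the other components are left alone.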
2016 Apr 08
0
[PATCH] nouveau: codegen: Take src swizzle into account on loads
...wizzling logic, i.e. the way that LOAD will now work (logically) is >> that it will >> >> (a) fetch 4 values from the coordinates provided (4 sequential dwords >> from src1.x in the case of buffer/memory, RGBA colors from src1.xyz in >> the case of images) >> (b) swizzle them according to the swizzle on the MEMORY/BUFFER/IMAGE argument >> (c) store that swizzled result into the destination based on the writemask >> >> That would sound reasonable to me, and if I understand correctly, is >> option 2 of your proposal. > > Yes that is opti...
2015 Aug 10
2
"enable dri3 support without glamor" causes gnome-shell regression on nv4x
...>> and so the commands never get flushed out. Easily verified by sticking >>>>> PUSH_KICK's everywhere. >>>> >>>> >>>> >>>> I do not believe that that is the problem, in my case it clearly >>>> seems to be a pitch / swizzle problem rather than a synchronization >>>> problem, here is what my desktop with gnome shell looks like when >>>> using DRI2: >>>> >>>> https://fedorapeople.org/~jwrdegoede/nv46-gnome-shell-good.jpg >>>> >>>> And this is what i...
2007 Sep 27
3
[LLVMdev] Vector swizzling and write masks code generation
...for vector swizzling and write masks. For example the following represents a valid gpu shader instruction: ADD dst.xyz src1.yxzw src2.zwxy which performs an addition that stores the result to the dst operand (each operand is a vector type of four data elements). The instruction uses source swizzle modifiers and a destination mask modifier. So if a language is capable of expressing such constructs (as GLSL, HLSL and a few others are) I'd like to make sure that the code generator is actually capable of generating instructions with exactly those semantics. Right now vector operations utili...
2016 Apr 07
2
[PATCH] nouveau: codegen: Take src swizzle into account on loads
...um/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp @@ -2279,12 +2279,16 @@ Converter::handleLOAD(Value *dst0[4]) Value *off = fetchSrc(1, c); Symbol *sym; + uint32_t src0_component_offset = tgsi.getSrc(0).getSwizzle(c) * 4; + if (tgsi.getSrc(1).getFile() == TGSI_FILE_IMMEDIATE) { off = NULL; sym = makeSym(tgsi.getSrc(0).getFile(), r, -1, c, - tgsi.getSrc(1).getValueU32(0, info) + 4 * c); + tgsi.getSrc(1).getValueU32(0, info)...
2007 Sep 27
0
[LLVMdev] Vector swizzling and write masks code generation
...rite masks. ok > For example the following represents a valid gpu shader instruction: > ADD dst.xyz src1.yxzw src2.zwxy > which performs an addition that stores the result to the dst operand (each > operand is a vector type of four data elements). The instruction uses source > swizzle modifiers and a destination mask modifier. Right. > So if a language is capable of expressing such constructs (as GLSL, HLSL and > a few others are) I'd like to make sure that the code generator is actually > capable of generating instructions with exactly those semantics. Ok. Are you...
2005 Jul 27
3
[LLVMdev] How to define complicated instruction in TableGen (Direct3D shader instruction)
...r is a 4-component (namely, r, g, b, a) vector register. They are actually defined as llvm packed [4xfloat]. The instruction: add_sat r0.a, r1_bias.xxyy, r3_x2.zzzz Explanation: '.a' is a writemask. Only the specified component will be updated. '.xxyy' and '.zzzz' are swizzle masks, which specify the component permutation, similar to the Intel SSE permutation instruction SHUFPD. '_bias' and '_x2' are modifiers. They modify the value of source operands and send the modified values to the adder. '_bias' = source - 0.5, '_x2' = source * 2 '_s...
2015 Aug 03
2
"enable dri3 support without glamor" causes gnome-shell regression on nv4x
...rks fine with DRI2, but DRI3 has no synchronization >>> and so the commands never get flushed out. Easily verified by sticking >>> PUSH_KICK's everywhere. >> >> >> I do not believe that that is the problem, in my case it clearly >> seems to be a pitch / swizzle problem rather than a synchronization >> problem, here is what my desktop with gnome shell looks like when >> using DRI2: >> >> https://fedorapeople.org/~jwrdegoede/nv46-gnome-shell-good.jpg >> >> And this is what it looks like when using DRI3: >> >>...
2016 Apr 21
0
[PATCH mesa v2 3/3] nouveau: codegen: LOAD: Take src swizzle into account
The llvm TGSI backend uses pointers in registers and does things like: LOAD TEMP[0].y, MEMORY[0], TEMP[0] Expecting the data at address TEMP[0].x to get loaded to TEMP[0].y. But this will cause the data at TEMP[0].x + 4 to be loaded instead. This commit adds support for a swizzle suffix for the 1st source operand, which allows using: LOAD TEMP[0].y, MEMORY[0].xxxx, TEMP[0] And actually getting the desired behavior Signed-off-by: Hans de Goede <hdegoede at redhat.com> --- Changes in v2: -Tweaked commit msg a bit -Add documentation for this to src/gallium/docs/source...
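The addressing problem described in this commit message can be illustrated with a short Python sketch of the per-component byte offset. It assumes, based on the earlier version of the patch in this thread (`getSwizzle(c) * 4`), that destination component c reads from the base address plus 4 bytes times the selected source component. The function names are hypothetical and do not exist in the nouveau codegen.

```python
# Hypothetical helpers for illustration; not nouveau codegen functions.
COMPONENTS = "xyzw"


def byte_offset_without_swizzle(address, c):
    """Old behaviour: destination component c always reads address + 4 * c."""
    return address + 4 * c


def byte_offset_with_swizzle(address, c, swizzle="xyzw"):
    """New behaviour: the resource operand's swizzle picks the source dword."""
    return address + 4 * COMPONENTS.index(swizzle[c])
```

For destination component y (c = 1), the first function yields address + 4, which is the unwanted load described above, while the second with a swizzle of "xxxx" yields the address itself. With the default identity swizzle both functions agree, so existing shaders would be unaffected in this model.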
2016 Apr 22
0
[PATCH mesa v2 3/3] nouveau: codegen: LOAD: Take src swizzle into account
...alum.mit.edu> wrote: >>> >>> [+radeon folk] >>> >>> Marek, Nicolai, Bas - please have a look at the doc change and let us >>> know if you think this will cause a problem for radeon. >>> >>> Hans is solving the issue that he wants to swizzle the data loaded >>> from the image/buffer/whatever before sticking it into the dst >>> register. >> >> >> Is this something st/mesa needs or just nouveau? If just nouveau needs >> it, I don't see a point in updating the TGSI spec, since nouveau can >...
2005 Dec 15
3
[LLVMdev] Vector LLVM extension v.s. DirectX Shaders
...nt vector // the names of the components are x, y, z, w add r0.xy, r1.zxyw, r2.yyyy The components of r1 and r2 are permuted before the addition, but the permutation result is _not_ written back to r1 and r2. 'zxyw' and 'yyyy' are the permutation patterns (they are called 'swizzle'). 'xy' is called the write mask. The result is written to only the x and y components of r0. z and w are left untouched. _Almost every_ instruction specifies different write masks and swizzles. There will be a lot of extract, combine, and permute LLVA instructions. It may make the transfor...
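The swizzle and write mask semantics described in this message can be modelled in a few lines of Python. This is a sketch with registers represented as 4-element lists; the helper names `swizzle` and `add_masked` are made up for illustration and do not correspond to any LLVM or DirectX API.

```python
# Illustrative model of "add r0.xy, r1.zxyw, r2.yyyy"; names are hypothetical.
COMPONENTS = "xyzw"


def swizzle(reg, pattern):
    """Return a permuted copy of reg; the source register is not modified."""
    return [reg[COMPONENTS.index(p)] for p in pattern]


def add_masked(dst, src1, swz1, src2, swz2, mask):
    """Add two swizzled sources and write only the masked components of dst."""
    a = swizzle(src1, swz1)
    b = swizzle(src2, swz2)
    out = list(dst)
    for c in mask:
        i = COMPONENTS.index(c)
        out[i] = a[i] + b[i]
    return out
```

Calling `add_masked(r0, r1, "zxyw", r2, "yyyy", "xy")` updates only x and y of the result, keeps z and w from the old r0, and leaves r1 and r2 unchanged, which matches the behaviour described in the snippet.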