Displaying 20 results from an estimated 248 matches for "swizzl".

2016 Apr 08
3
[PATCH] nouveau: codegen: Take src swizzle into account on loads
Hi, On 07-04-16 15:58, Ilia Mirkin wrote: > That's wrong. It used to work with the old RES[] code and if one cannot specify a source swizzle, then how can I do something like LOAD TEMP[0].y, MEMORY[0], address And get the data at absolute global memory address "address" into TEMP[0].y ? This is a must-have for llvm to be able to generate working TGSI code, I do not see any way around this. AFAIK this is exactly what src-s...
2016 Apr 08
2
[PATCH] nouveau: codegen: Take src swizzle into account on loads
...Apr 8, 2016 at 5:27 AM, Hans de Goede <hdegoede at redhat.com> wrote: >> Hi, >> >> On 07-04-16 15:58, Ilia Mirkin wrote: >>> >>> That's wrong. >> >> >> It used to work with the old RES[] code and if one cannot specify >> a source swizzle, then how can I do something like >> >> LOAD TEMP[0].y, MEMORY[0], address >> >> And get the data at absolute global memory address "address" into TEMP[0].y >> ? >> >> This is a must-have for llvm to be able to generate working TGSI code, >>...
2010 Jan 08
0
Findings on pre-NV50 miptree layout
...d367%40mail.gmail.com&forum_name=mesa3d-dev . Here are the findings from running it. The result is that our miptree layout code is partially broken, and overly complex. In particular: 1. 3D textures are broken because they are not laid out like cube maps, but first by level and then by face 2. Swizzled 3D textures have all 3 texture coordinates swizzled together 3. Cube maps have their faces 128 byte aligned, not only 64 like in my patch or unaligned like without it (not applied IIRC). 4. Swizzled 2D/3D/cube textures don't have any gaps, except for cube map face alignment. The current code co...
2016 Apr 08
0
[PATCH] nouveau: codegen: Take src swizzle into account on loads
On Fri, Apr 8, 2016 at 5:27 AM, Hans de Goede <hdegoede at redhat.com> wrote: > Hi, > > On 07-04-16 15:58, Ilia Mirkin wrote: >> >> That's wrong. > > > It used to work with the old RES[] code and if one cannot specify > a source swizzle, then how can I do something like > > LOAD TEMP[0].y, MEMORY[0], address > > And get the data at absolute global memory address "address" into TEMP[0].y > ? > > This is a must-have for llvm to be able to generate working TGSI code, > I do not see any way around th...
2005 Apr 20
1
[LLVMdev] adding new instructions to support "swizzle" and "writemask"
...', 'a', and each channel is a 32-bit floating-point value. It's similar to how the high and low 8 bits of an x86 16-bit general purpose register "AX" can be individually referenced as "AH" and "AL". What's different is that the hardware further supports "source register swizzle" and "writemask". For example: # The following two instructions are equivalent. # They cost the same instruction slot, and have the same # execution time. Four channels are added in parallel. add r0, r1, r2 add r0.xyzw, r1.xyzw, r2.xyzw # equival...
2016 Apr 08
2
[PATCH] nouveau: codegen: Take src swizzle into account on loads
...is pointing. But instead it will get the 32 bits of >> data at address (TEMP[0].x + 4). >> >> With the old RES[32767] code one could generate the following TGSI: >> >> LOAD TEMP[0].y, RES[32767].xxxx, TEMP[0] >> >> And things would work fine since the .xxxx swizzling postfix would >> be honored and when storing to y (the only component set in the dest-mask) >> the x component at address (TEMP[0].x) would be loaded, rather than the >> y component at (TEMP[0].y) >> >> Note that another approach would be to not increment the addres...
2018 Sep 19
1
Textures Twiddling/Swizzling
...3D textures into yuzu, as far as I know they are twiddled in a different manner to 2D textures. Could one of you guys point me in the right direction? I've been meddling around: https://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/drivers/nouveau/nv50/nv50_tex.c but I can't see where the swizzling actually takes place.
2016 Apr 22
2
[PATCH mesa v2 3/3] nouveau: codegen: LOAD: Take src swizzle into account
...t 7:04 PM, Ilia Mirkin <imirkin at alum.mit.edu> wrote: >> [+radeon folk] >> >> Marek, Nicolai, Bas - please have a look at the doc change and let us >> know if you think this will cause a problem for radeon. >> >> Hans is solving the issue that he wants to swizzle the data loaded >> from the image/buffer/whatever before sticking it into the dst >> register. > > Is this something st/mesa needs or just nouveau? If just nouveau needs > it, I don't see a point in updating the TGSI spec, since nouveau can > just add the swizzle when t...
2016 Apr 21
2
[PATCH mesa v2 3/3] nouveau: codegen: LOAD: Take src swizzle into account
[+radeon folk] Marek, Nicolai, Bas - please have a look at the doc change and let us know if you think this will cause a problem for radeon. Hans is solving the issue that he wants to swizzle the data loaded from the image/buffer/whatever before sticking it into the dst register. -ilia On Thu, Apr 21, 2016 at 8:39 AM, Hans de Goede <hdegoede at redhat.com> wrote: > The llvm TGSI backend uses pointers in registers and does things > like: > > LOAD TEMP[0].y, MEMORY[...
2016 Apr 08
0
[PATCH] nouveau: codegen: Take src swizzle into account on loads
...ata > to which TEMP[0].x is pointing. But instead it will get the 32 bits of > data at address (TEMP[0].x + 4). > > With the old RES[32767] code one could generate the following TGSI: > > LOAD TEMP[0].y, RES[32767].xxxx, TEMP[0] > > And things would work fine since the .xxxx swizzling postfix would > be honored and when storing to y (the only component set in the dest-mask) > the x component at address (TEMP[0].x) would be loaded, rather than the > y component at (TEMP[0].y) > > Note that another approach would be to not increment the address by > a 32 bit w...
2016 Apr 08
0
[PATCH] nouveau: codegen: Take src swizzle into account on loads
...will get the 32 bits of >>> data at address (TEMP[0].x + 4). >>> >>> With the old RES[32767] code one could generate the following TGSI: >>> >>> LOAD TEMP[0].y, RES[32767].xxxx, TEMP[0] >>> >>> And things would work fine since the .xxxx swizzling postfix would >>> be honored and when storing to y (the only component set in the dest-mask) >>> the x component at address (TEMP[0].x) would be loaded, rather than the >>> y component at (TEMP[0].y) >>> >>> Note that another approach would be to not...
2015 Aug 10
2
"enable dri3 support without glamor" causes gnome-shell regression on nv4x
...>> and so the commands never get flushed out. Easily verified by sticking >>>>> PUSH_KICK's everywhere. >>>> >>>> >>>> >>>> I do not believe that that is the problem, in my case it clearly >>>> seems to be a pitch / swizzle problem rather than a synchronization >>>> problem, here is what my desktop with gnome shell looks like when >>>> using DRI2: >>>> >>>> https://fedorapeople.org/~jwrdegoede/nv46-gnome-shell-good.jpg >>>> >>>> And this is what...
2007 Sep 27
3
[LLVMdev] Vector swizzling and write masks code generation
...) and hardware (drivers will implement LLVM code-generators) cases. While the software only case is pretty straightforward I just realized I missed something in my initial evaluation. That is, graphics hardware (basically every single programmable GPU) has instruction level support for vector swizzling and write masks. For example the following represents a valid GPU shader instruction: ADD dst.xyz src1.yxzw src2.zwxy which performs an addition that stores the result to the dst operand (each operand is a vector type of four data elements). The instruction uses source swizzle modifiers...
2016 Apr 07
2
[PATCH] nouveau: codegen: Take src swizzle into account on loads
...um/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
@@ -2279,12 +2279,16 @@ Converter::handleLOAD(Value *dst0[4])
    Value *off = fetchSrc(1, c);
    Symbol *sym;
+   uint32_t src0_component_offset = tgsi.getSrc(0).getSwizzle(c) * 4;
+
    if (tgsi.getSrc(1).getFile() == TGSI_FILE_IMMEDIATE) {
       off = NULL;
       sym = makeSym(tgsi.getSrc(0).getFile(), r, -1, c,
-                    tgsi.getSrc(1).getValueU32(0, info) + 4 * c);
+                    tgsi.getSrc(1).getValueU32(0, info)...
2007 Sep 27
0
[LLVMdev] Vector swizzling and write masks code generation
...model), where LLVM would be used both in the > software only (by just JIT executing shaders) and hardware (drivers will > implement LLVM code-generators) cases. Yep, nifty! > That is graphics hardware (basically every single programmable gpu) has > instruction level support for vector swizzling and write masks. ok > For example the following represents a valid gpu shader instruction: > ADD dst.xyz src1.yxzw src2.zwxy > which performs an addition that stores the result to the dst operand (each > operand is a vector type of four data elements) The instruction uses sou...
2005 Jul 27
3
[LLVMdev] How to define complicated instruction in TableGen (Direct3D shader instruction)
...r is a 4-component (namely, r, g, b, a) vector register. They are actually defined as llvm packed [4xfloat]. The instruction: add_sat r0.a, r1_bias.xxyy, r3_x2.zzzz Explanation: '.a' is a writemask. Only the specified component will be updated. '.xxyy' and '.zzzz' are swizzle masks, which specify the component permutation, similar to the Intel SSE permutation instruction SHUFPD. '_bias' and '_x2' are modifiers. They modify the value of source operands and send the modified values to the adder. '_bias' = source - 0.5, '_x2' = source * 2 '_...
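The excerpt above can be traced by hand with a small emulation. This is only a sketch under stated assumptions: the mapping of r, g, b, a onto x, y, z, w and the reading of `_sat` as a clamp to [0, 1] are assumed here (the excerpt is cut off before it explains `_sat`), and `swizzle` and `add_sat` are hypothetical helper names, not part of any real shader toolchain.

```python
# Assumption: r/g/b/a alias x/y/z/w, so '.a' names the fourth lane.
LANES = {"x": 0, "y": 1, "z": 2, "w": 3, "r": 0, "g": 1, "b": 2, "a": 3}


def swizzle(reg, pattern):
    """Build a new 4-vector by picking lanes of reg according to pattern."""
    return [reg[LANES[ch]] for ch in pattern]


def add_sat(dst, src1, src2, mask):
    """Add lane-wise, clamp to [0, 1] (assumed _sat meaning), write masked lanes only."""
    out = list(dst)
    for ch in mask:
        i = LANES[ch]
        out[i] = min(1.0, max(0.0, src1[i] + src2[i]))
    return out


r0 = [0.0, 0.0, 0.0, 0.0]
r1 = [0.25, 0.75, 0.0, 0.0]
r3 = [0.0, 0.0, 0.5, 0.0]

# Source modifiers as given in the excerpt: _bias = source - 0.5, _x2 = source * 2.
r1_bias = [v - 0.5 for v in r1]
r3_x2 = [v * 2 for v in r3]

# add_sat r0.a, r1_bias.xxyy, r3_x2.zzzz
r0 = add_sat(r0, swizzle(r1_bias, "xxyy"), swizzle(r3_x2, "zzzz"), "a")
```

With these inputs the fourth lane computes 0.25 + 1.0 = 1.25 and is clamped to 1.0, while the other three lanes of r0 keep their old values because the writemask names only '.a'.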
2015 Aug 03
2
"enable dri3 support without glamor" causes gnome-shell regression on nv4x
...rks fine with DRI2, but DRI3 has no synchronization >>> and so the commands never get flushed out. Easily verified by sticking >>> PUSH_KICK's everywhere. >> >> >> I do not believe that that is the problem, in my case it clearly >> seems to be a pitch / swizzle problem rather than a synchronization >> problem, here is what my desktop with gnome shell looks like when >> using DRI2: >> >> https://fedorapeople.org/~jwrdegoede/nv46-gnome-shell-good.jpg >> >> And this is what it looks like when using DRI3: >> >>...
2016 Apr 21
0
[PATCH mesa v2 3/3] nouveau: codegen: LOAD: Take src swizzle into account
The llvm TGSI backend uses pointers in registers and does things like: LOAD TEMP[0].y, MEMORY[0], TEMP[0] Expecting the data at address TEMP[0].x to get loaded to TEMP[0].y. But this will cause the data at TEMP[0].x + 4 to be loaded instead. This commit adds support for a swizzle suffix for the 1st source operand, which allows using: LOAD TEMP[0].y, MEMORY[0].xxxx, TEMP[0] And actually getting the desired behavior Signed-off-by: Hans de Goede <hdegoede at redhat.com> --- Changes in v2: -Tweaked commit msg a bit -Add documentation for this to src/gallium/docs/sourc...
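The address arithmetic this commit message describes can be modelled in a few lines. This is a sketch only: it assumes 4-byte components and models nothing beyond the per-component byte offset (4 * c without a source swizzle, 4 * swizzle(c) with one); `load_addresses` is a hypothetical helper, not a Mesa function.

```python
COMPONENTS = "xyzw"


def load_addresses(base, dst_mask, src_swizzle="xyzw"):
    """Return the byte address read for each destination component in dst_mask.

    The default "xyzw" swizzle is the identity, so component c reads base + 4 * c.
    With a swizzle, component c reads base + 4 * index(src_swizzle[c]).
    """
    addresses = {}
    for ch in dst_mask:
        c = COMPONENTS.index(ch)
        source_component = COMPONENTS.index(src_swizzle[c])
        addresses[ch] = base + 4 * source_component
    return addresses


base = 0x1000  # stands in for the pointer held in TEMP[0].x

# LOAD TEMP[0].y, MEMORY[0], TEMP[0]
without_swizzle = load_addresses(base, "y")

# LOAD TEMP[0].y, MEMORY[0].xxxx, TEMP[0]
with_xxxx = load_addresses(base, "y", "xxxx")
```

In this model the first form reads from base + 4 when writing to y, and the `.xxxx` form reads from base itself, which matches the behavior the commit message asks for.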
2016 Apr 22
0
[PATCH mesa v2 3/3] nouveau: codegen: LOAD: Take src swizzle into account
...alum.mit.edu> wrote: >>> >>> [+radeon folk] >>> >>> Marek, Nicolai, Bas - please have a look at the doc change and let us >>> know if you think this will cause a problem for radeon. >>> >>> Hans is solving the issue that he wants to swizzle the data loaded >>> from the image/buffer/whatever before sticking it into the dst >>> register. >> >> >> Is this something st/mesa needs or just nouveau? If just nouveau needs >> it, I don't see a point in updating the TGSI spec, since nouveau can >...
2005 Dec 15
3
[LLVMdev] Vector LLVM extension v.s. DirectX Shaders
...nt vector // the names of the components are x, y, z, w add r0.xy, r1.zxyw, r2.yyyy The components of r1 and r2 are permuted before the addition, but the permutation result is _not_ written back to r1 and r2. 'zxyw' and 'yyyy' are the permutation patterns (they are called 'swizzle'). 'xy' is called the write mask. The result is written to only the x and y components of r0. z and w are left untouched. _Almost each_ instruction specifies different write masks and swizzles. There will be a lot of extract, combine, and permute LLVA instructions. It may make the transfo...
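The swizzle and write-mask semantics described in this last excerpt can be checked with a short emulation of `add r0.xy, r1.zxyw, r2.yyyy`. It is a sketch that treats registers as plain 4-element lists; `swizzle` and `add_masked` are hypothetical names introduced for illustration.

```python
COMPONENTS = "xyzw"


def swizzle(reg, pattern):
    """Return a permuted copy of reg; the source register itself is not modified."""
    return [reg[COMPONENTS.index(ch)] for ch in pattern]


def add_masked(dst, src1, swz1, src2, swz2, mask):
    """Emulate `add dst.mask, src1.swz1, src2.swz2` on 4-component lists."""
    a = swizzle(src1, swz1)
    b = swizzle(src2, swz2)
    out = list(dst)
    for ch in mask:
        i = COMPONENTS.index(ch)
        out[i] = a[i] + b[i]  # only lanes named in the write mask are updated
    return out


r0 = [0.0, 0.0, 9.0, 9.0]
r1 = [1.0, 2.0, 3.0, 4.0]
r2 = [10.0, 20.0, 30.0, 40.0]

# add r0.xy, r1.zxyw, r2.yyyy
r0 = add_masked(r0, r1, "zxyw", r2, "yyyy", "xy")
```

Here r1.zxyw is [3, 1, 2, 4] and r2.yyyy is [20, 20, 20, 20], so r0 becomes [23, 21, 9, 9]: x and y receive the sums, z and w keep their previous values, and r1 and r2 are unchanged.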