similar to: [LLVMdev] adding new instructions to support "swizzle" and "writemask"

Displaying 20 results from an estimated 400 matches similar to: "[LLVMdev] adding new instructions to support "swizzle" and "writemask""

2013 Oct 17
1
[LLVMdev] Converting a i32 pointer to a vector of i32 ( C array to LLVM vector)
Both the SLP vectorizer and the Loop vectorizer support vectorizing pointers. The attached code looks like a candidate for the SLP-vectorizer. Can you run the SLP-vectorizer with the flag -mllvm -debug-only=SLP and attach the log ? I think that we are missing the pattern for the roots of the tree. Thanks, Nadav On Oct 16, 2013, at 5:28 PM, Tom Stellard <tom at stellard.net> wrote: >
2016 Apr 08
0
[PATCH] nouveau: codegen: Take src swizzle into account on loads
On Fri, Apr 8, 2016 at 11:28 AM, Hans de Goede <hdegoede at redhat.com> wrote: > When dealing with non vector variables the llvm register allocator > will use TEMP[0].x then TEMP[0].y, etc. > > When loading something from a global buffer it will calculate the > address to use, and store that in say TEMP[0].x, so it ends up > generating: > > LOAD TEMP[0].y, MEMORY[0],
2016 Apr 08
0
[PATCH] nouveau: codegen: Take src swizzle into account on loads
Hi, On 08-04-16 18:06, Hans de Goede wrote: > Hi, > > On 08-04-16 17:45, Ilia Mirkin wrote: >> On Fri, Apr 8, 2016 at 11:28 AM, Hans de Goede <hdegoede at redhat.com> wrote: >>> When dealing with non vector variables the llvm register allocator >>> will use TEMP[0].x then TEMP[0].y, etc. >>> >>> When loading something from a global buffer
2016 Apr 08
2
[PATCH] nouveau: codegen: Take src swizzle into account on loads
Hi, On 08-04-16 17:02, Ilia Mirkin wrote: > On Fri, Apr 8, 2016 at 5:27 AM, Hans de Goede <hdegoede at redhat.com> wrote: >> Hi, >> >> On 07-04-16 15:58, Ilia Mirkin wrote: >>> >>> That's wrong. >> >> >> It used to work with the old RES[] code and if one cannot specify >> a source swizzle, then how can I do something like
2016 Apr 08
0
[PATCH] nouveau: codegen: Take src swizzle into account on loads
On Fri, Apr 8, 2016 at 5:27 AM, Hans de Goede <hdegoede at redhat.com> wrote: > Hi, > > On 07-04-16 15:58, Ilia Mirkin wrote: >> >> That's wrong. > > > It used to work with the old RES[] code and if one cannot specify > a source swizzle, then how can I do something like > > LOAD TEMP[0].y, MEMORY[0], address > > And get the data at absolute
2016 Apr 08
2
[PATCH] nouveau: codegen: Take src swizzle into account on loads
Hi, On 08-04-16 17:45, Ilia Mirkin wrote: > On Fri, Apr 8, 2016 at 11:28 AM, Hans de Goede <hdegoede at redhat.com> wrote: >> When dealing with non vector variables the llvm register allocator >> will use TEMP[0].x then TEMP[0].y, etc. >> >> When loading something from a global buffer it will calculate the >> address to use, and store that in say TEMP[0].x,
2009 Sep 10
0
[PATCH 02/13] nv50: add functions for swizzle resolution
We're going to try to reorder the scalar ops of a vector instr to accomodate swizzles that would otherwise require us to emit to an additional TEMP first (like MOV R0.xy, R0.zx). --- src/gallium/drivers/nv50/nv50_program.c | 148 ++++++++++++++++++++++++------ 1 files changed, 118 insertions(+), 30 deletions(-) diff --git a/src/gallium/drivers/nv50/nv50_program.c
2016 Apr 08
3
[PATCH] nouveau: codegen: Take src swizzle into account on loads
Hi, On 07-04-16 15:58, Ilia Mirkin wrote: > That's wrong. It used to work with the old RES[] code and if one cannot specify a source swizzle, then how can I do something like LOAD TEMP[0].y, MEMORY[0], address And get the data at absolute global memory address "address" into TEMP[0].y ? This is a must-have for llvm to be able to generate working TGSI code, I do not see any
2016 Apr 22
0
[PATCH mesa v2 3/3] nouveau: codegen: LOAD: Take src swizzle into account
On Fri, Apr 22, 2016 at 9:23 AM, Hans de Goede <hdegoede at redhat.com> wrote: > Hi, > > On 22-04-16 09:08, Marek Olšák wrote: >> >> On Thu, Apr 21, 2016 at 7:04 PM, Ilia Mirkin <imirkin at alum.mit.edu> wrote: >>> >>> [+radeon folk] >>> >>> Marek, Nicolai, Bas - please have a look at the doc change and let us >>> know
2007 Sep 27
3
[LLVMdev] Vector swizzling and write masks code generation
Hey, as some of you may know we're in process of experimenting with LLVM in Gallium3D (Mesa's new driver model), where LLVM would be used both in the software only (by just JIT executing shaders) and hardware (drivers will implement LLVM code-generators) cases. While the software only case is pretty straight forward I just realized I missed something in my initial evaluation. That
2016 Apr 21
0
[PATCH mesa v2 3/3] nouveau: codegen: LOAD: Take src swizzle into account
The llvm TGSI backend uses pointers in registers and does things like: LOAD TEMP[0].y, MEMORY[0], TEMP[0] Expecting the data at address TEMP[0].x to get loaded to TEMP[0].y. But this will cause the data at TEMP[0].x + 4 to be loaded instead. This commit adds support for a swizzle suffix for the 1st source operand, which allows using: LOAD TEMP[0].y, MEMORY[0].xxxx, TEMP[0] And actually
2016 Apr 22
0
[PATCH mesa v2 3/3] nouveau: codegen: LOAD: Take src swizzle into account
On Thu, Apr 21, 2016 at 7:04 PM, Ilia Mirkin <imirkin at alum.mit.edu> wrote: > [+radeon folk] > > Marek, Nicolai, Bas - please have a look at the doc change and let us > know if you think this will cause a problem for radeon. > > Hans is solving the issue that he wants to swizzle the data loaded > from the image/buffer/whatever before sticking it into the dst >
2016 Apr 22
2
[PATCH mesa v2 3/3] nouveau: codegen: LOAD: Take src swizzle into account
Hi, On 22-04-16 09:08, Marek Olšák wrote: > On Thu, Apr 21, 2016 at 7:04 PM, Ilia Mirkin <imirkin at alum.mit.edu> wrote: >> [+radeon folk] >> >> Marek, Nicolai, Bas - please have a look at the doc change and let us >> know if you think this will cause a problem for radeon. >> >> Hans is solving the issue that he wants to swizzle the data loaded
2008 Nov 18
1
[LLVMdev] Do I need to add new intrinsic functions for the OpenGL shading language swizzle?
OpenGL shading language (GLSL) is like a C subset language, but it contains some special features, ex: native vector type & swizzle. In GLSL, you can declare vector types: void main() { vec4 a; vec3 b; vec2 c; } You can access the element of vector by using .xyzw, it means the 1st, 2nd, 3rd, 4th element of the vector are x, y, z, w. Ex: void main() { float f; vec4 a = vec4(1.0,
2016 Apr 21
2
[PATCH mesa v2 3/3] nouveau: codegen: LOAD: Take src swizzle into account
[+radeon folk] Marek, Nicolai, Bas - please have a look at the doc change and let us know if you think this will cause a problem for radeon. Hans is solving the issue that he wants to swizzle the data loaded from the image/buffer/whatever before sticking it into the dst register. -ilia On Thu, Apr 21, 2016 at 8:39 AM, Hans de Goede <hdegoede at redhat.com> wrote: > The llvm TGSI
2007 Sep 27
0
[LLVMdev] Vector swizzling and write masks code generation
On Thu, 27 Sep 2007, Zack Rusin wrote: > as some of you may know we're in process of experimenting with LLVM in > Gallium3D (Mesa's new driver model), where LLVM would be used both in the > software only (by just JIT executing shaders) and hardware (drivers will > implement LLVM code-generators) cases. Yep, nifty! > That is graphics hardware (basically every single
2005 Jul 27
3
[LLVMdev] How to define complicated instruction in TableGen (Direct3D shader instruction)
Each register is a 4-component (namely, r, g, b, a) vector register. They are actually defined as llvm packed [4xfloat]. The instruction: add_sat r0.a, r1_bias.xxyy, r3_x2.zzzz Explaination: '.a' is a writemask. only the specified component will be update '.xxyy' and '.zzzz' are swizzle masks, specify the component permutation, simliar to the Intel SSE permutation
2005 Jul 29
0
[LLVMdev] How to define complicated instruction in TableGen (Direct3D shader instruction)
Actually the problems that Tzu-Chien Chiu are encountering are similar to what should be done for generating SSE code in the X86 backend and also other SIMD instruction sets. I think LLVM neeeds to add instructions for permuting components, extracting and injecting elements in packed types. If the architecture has instructions which can do permutations for each instruction (for example
2005 Dec 15
3
[LLVMdev] Vector LLVM extension v.s. DirectX Shaders
Dear all: To write a compiler for Microsoft Direct3D shaders from our hardware, I have a program which translates the Direct3D shader assembly to LLVM assembly. I added several intrinsics for this purpose. It's a vector ISA and has some special instructions like: * rcp (reciprocal) * frc (the fractional portion of each input component) * dp4 (dot product) * exp (exponential) * max, min These
2005 May 06
3
[LLVMdev] avoid live range overlap of "vector" registers
a "vector" register r0 is composed of four 32-bit floating scalar registers, r0.x, r0.y, r0.z, r0.w. each scalar reg can be assigned individually, e.g. mov r0.x, r1.y add r0.y, r1,x, r2.z or assigned simultaneously with vector instructions, e.g. add r0.xyzw, r1.xzyw, r2.xyzw My question is how to define the register in .td file to avoid the code generator overlaps the