thr3ads.net - similar to: "[LLVMdev] adding new instructions to support "swizzle" and "writemask""

Displaying 20 results from an estimated 400 matches similar to: "[LLVMdev] adding new instructions to support "swizzle" and "writemask""

[LLVMdev] Converting a i32 pointer to a vector of i32 ( C array to LLVM vector)

2013 Oct 17

Both the SLP vectorizer and the Loop vectorizer support vectorizing pointers. The attached code looks like a candidate for the SLP-vectorizer. Can you run the SLP-vectorizer with the flag -mllvm -debug-only=SLP and attach the log? I think that we are missing the pattern for the roots of the tree. Thanks, Nadav On Oct 16, 2013, at 5:28 PM, Tom Stellard <tom at stellard.net> wrote: >

[PATCH] nouveau: codegen: Take src swizzle into account on loads

2016 Apr 08

On Fri, Apr 8, 2016 at 11:28 AM, Hans de Goede <hdegoede at redhat.com> wrote: > When dealing with non vector variables the llvm register allocator > will use TEMP[0].x then TEMP[0].y, etc. > > When loading something from a global buffer it will calculate the > address to use, and store that in say TEMP[0].x, so it ends up > generating: > > LOAD TEMP[0].y, MEMORY[0],

[PATCH] nouveau: codegen: Take src swizzle into account on loads

2016 Apr 08

Hi, On 08-04-16 18:06, Hans de Goede wrote: > Hi, > > On 08-04-16 17:45, Ilia Mirkin wrote: >> On Fri, Apr 8, 2016 at 11:28 AM, Hans de Goede <hdegoede at redhat.com> wrote: >>> When dealing with non vector variables the llvm register allocator >>> will use TEMP[0].x then TEMP[0].y, etc. >>> >>> When loading something from a global buffer

[PATCH] nouveau: codegen: Take src swizzle into account on loads

2016 Apr 08

Hi, On 08-04-16 17:02, Ilia Mirkin wrote: > On Fri, Apr 8, 2016 at 5:27 AM, Hans de Goede <hdegoede at redhat.com> wrote: >> Hi, >> >> On 07-04-16 15:58, Ilia Mirkin wrote: >>> >>> That's wrong. >> >> >> It used to work with the old RES[] code and if one cannot specify >> a source swizzle, then how can I do something like

[PATCH] nouveau: codegen: Take src swizzle into account on loads

2016 Apr 08

On Fri, Apr 8, 2016 at 5:27 AM, Hans de Goede <hdegoede at redhat.com> wrote: > Hi, > > On 07-04-16 15:58, Ilia Mirkin wrote: >> >> That's wrong. > > > It used to work with the old RES[] code and if one cannot specify > a source swizzle, then how can I do something like > > LOAD TEMP[0].y, MEMORY[0], address > > And get the data at absolute

[PATCH] nouveau: codegen: Take src swizzle into account on loads

2016 Apr 08

Hi, On 08-04-16 17:45, Ilia Mirkin wrote: > On Fri, Apr 8, 2016 at 11:28 AM, Hans de Goede <hdegoede at redhat.com> wrote: >> When dealing with non vector variables the llvm register allocator >> will use TEMP[0].x then TEMP[0].y, etc. >> >> When loading something from a global buffer it will calculate the >> address to use, and store that in say TEMP[0].x,

[PATCH 02/13] nv50: add functions for swizzle resolution

2009 Sep 10

We're going to try to reorder the scalar ops of a vector instr to accommodate swizzles that would otherwise require us to emit to an additional TEMP first (like MOV R0.xy, R0.zx). --- src/gallium/drivers/nv50/nv50_program.c | 148 ++++++++++++++++++++++++------ 1 files changed, 118 insertions(+), 30 deletions(-) diff --git a/src/gallium/drivers/nv50/nv50_program.c

[PATCH] nouveau: codegen: Take src swizzle into account on loads

2016 Apr 08

Hi, On 07-04-16 15:58, Ilia Mirkin wrote: > That's wrong. It used to work with the old RES[] code and if one cannot specify a source swizzle, then how can I do something like LOAD TEMP[0].y, MEMORY[0], address and get the data at absolute global memory address "address" into TEMP[0].y? This is a must-have for llvm to be able to generate working TGSI code, I do not see any

[PATCH mesa v2 3/3] nouveau: codegen: LOAD: Take src swizzle into account

2016 Apr 22

On Fri, Apr 22, 2016 at 9:23 AM, Hans de Goede <hdegoede at redhat.com> wrote: > Hi, > > On 22-04-16 09:08, Marek Olšák wrote: >> >> On Thu, Apr 21, 2016 at 7:04 PM, Ilia Mirkin <imirkin at alum.mit.edu> wrote: >>> >>> [+radeon folk] >>> >>> Marek, Nicolai, Bas - please have a look at the doc change and let us >>> know

[LLVMdev] Vector swizzling and write masks code generation

2007 Sep 27

Hey, as some of you may know we're in the process of experimenting with LLVM in Gallium3D (Mesa's new driver model), where LLVM would be used both in the software only (by just JIT executing shaders) and hardware (drivers will implement LLVM code-generators) cases. While the software only case is pretty straightforward I just realized I missed something in my initial evaluation. That

[PATCH mesa v2 3/3] nouveau: codegen: LOAD: Take src swizzle into account

2016 Apr 21

The llvm TGSI backend uses pointers in registers and does things like: LOAD TEMP[0].y, MEMORY[0], TEMP[0] Expecting the data at address TEMP[0].x to get loaded to TEMP[0].y. But this will cause the data at TEMP[0].x + 4 to be loaded instead. This commit adds support for a swizzle suffix for the 1st source operand, which allows using: LOAD TEMP[0].y, MEMORY[0].xxxx, TEMP[0] And actually
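The behaviour described in this snippet can be illustrated with a toy model. The sketch below is not the nouveau or TGSI implementation; it assumes that a LOAD fetches four consecutive 32-bit words starting at the given byte address, applies an optional source swizzle to the fetched vector, and then writes only the components named in the destination writemask. The function name `load` and the memory layout are hypothetical.

```python
# Toy model of a vec4 LOAD with destination writemask and source swizzle.
# Assumption: memory is a list of 32-bit words and addresses are in bytes.
COMPONENT_INDEX = {"x": 0, "y": 1, "z": 2, "w": 3}

def load(memory, address, dst, writemask, src_swizzle="xyzw"):
    """Return a copy of dst with the masked components loaded from memory."""
    base = address // 4
    fetched = memory[base:base + 4]            # four consecutive words
    swizzled = [fetched[COMPONENT_INDEX[c]] for c in src_swizzle]
    out = list(dst)
    for c in writemask:                        # only masked components change
        i = COMPONENT_INDEX[c]
        out[i] = swizzled[i]
    return out

memory = [100, 101, 102, 103, 104, 105, 106, 107]
temp0 = [8, 0, 0, 0]                           # TEMP[0].x holds byte address 8

# LOAD TEMP[0].y, MEMORY[0], TEMP[0]: .y receives the word at address + 4
without_swizzle = load(memory, temp0[0], temp0, "y")

# LOAD TEMP[0].y, MEMORY[0].xxxx, TEMP[0]: .y receives the word at address
with_swizzle = load(memory, temp0[0], temp0, "y", src_swizzle="xxxx")
```

In this model `without_swizzle` ends up with the word stored one slot past the requested address in its `.y` component, while `with_swizzle` gets the word at the requested address, which is the effect the `.xxxx` suffix is meant to achieve.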

[PATCH mesa v2 3/3] nouveau: codegen: LOAD: Take src swizzle into account

2016 Apr 22

On Thu, Apr 21, 2016 at 7:04 PM, Ilia Mirkin <imirkin at alum.mit.edu> wrote: > [+radeon folk] > > Marek, Nicolai, Bas - please have a look at the doc change and let us > know if you think this will cause a problem for radeon. > > Hans is solving the issue that he wants to swizzle the data loaded > from the image/buffer/whatever before sticking it into the dst >

[PATCH mesa v2 3/3] nouveau: codegen: LOAD: Take src swizzle into account

2016 Apr 22

Hi, On 22-04-16 09:08, Marek Olšák wrote: > On Thu, Apr 21, 2016 at 7:04 PM, Ilia Mirkin <imirkin at alum.mit.edu> wrote: >> [+radeon folk] >> >> Marek, Nicolai, Bas - please have a look at the doc change and let us >> know if you think this will cause a problem for radeon. >> >> Hans is solving the issue that he wants to swizzle the data loaded

[LLVMdev] Do I need to add new intrinsic functions for the OpenGL shading language swizzle?

2008 Nov 18

OpenGL shading language (GLSL) is like a C subset language, but it contains some special features, ex: native vector type & swizzle. In GLSL, you can declare vector types: void main() { vec4 a; vec3 b; vec2 c; } You can access the element of vector by using .xyzw, it means the 1st, 2nd, 3rd, 4th element of the vector are x, y, z, w. Ex: void main() { float f; vec4 a = vec4(1.0,

[PATCH mesa v2 3/3] nouveau: codegen: LOAD: Take src swizzle into account

2016 Apr 21

[+radeon folk] Marek, Nicolai, Bas - please have a look at the doc change and let us know if you think this will cause a problem for radeon. Hans is solving the issue that he wants to swizzle the data loaded from the image/buffer/whatever before sticking it into the dst register. -ilia On Thu, Apr 21, 2016 at 8:39 AM, Hans de Goede <hdegoede at redhat.com> wrote: > The llvm TGSI

[LLVMdev] Vector swizzling and write masks code generation

2007 Sep 27

On Thu, 27 Sep 2007, Zack Rusin wrote: > as some of you may know we're in the process of experimenting with LLVM in > Gallium3D (Mesa's new driver model), where LLVM would be used both in the > software only (by just JIT executing shaders) and hardware (drivers will > implement LLVM code-generators) cases. Yep, nifty! > That is graphics hardware (basically every single

[LLVMdev] How to define complicated instruction in TableGen (Direct3D shader instruction)

2005 Jul 27

Each register is a 4-component (namely, r, g, b, a) vector register. They are actually defined as llvm packed [4xfloat]. The instruction: add_sat r0.a, r1_bias.xxyy, r3_x2.zzzz Explanation: '.a' is a writemask; only the specified component will be updated. '.xxyy' and '.zzzz' are swizzle masks, which specify the component permutation, similar to the Intel SSE permutation
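The swizzle and writemask semantics described in this snippet can be sketched in a few lines. This is a minimal illustration, not LLVM or Direct3D code: the helper names `swizzle`, `write_masked`, and `add_masked` are hypothetical, and the `_sat`, `_bias`, and `_x2` modifiers from the quoted instruction are deliberately left out. Component names x, y, z, w and r, g, b, a are treated as aliases for indices 0 to 3.

```python
# Minimal model of source swizzles and a destination writemask
# on 4-component registers (hypothetical helper names).
COMPONENT_INDEX = {"x": 0, "y": 1, "z": 2, "w": 3,
                   "r": 0, "g": 1, "b": 2, "a": 3}

def swizzle(reg, pattern):
    """Permute/replicate components: result[i] = reg[pattern[i]]."""
    return [reg[COMPONENT_INDEX[c]] for c in pattern]

def write_masked(dst, value, mask):
    """Copy only the components named in mask from value into a copy of dst."""
    out = list(dst)
    for c in mask:
        i = COMPONENT_INDEX[c]
        out[i] = value[i]
    return out

def add_masked(dst, mask, src0, swz0, src1, swz1):
    """Component-wise add of two swizzled sources, written through a mask."""
    a = swizzle(src0, swz0)
    b = swizzle(src1, swz1)
    total = [x + y for x, y in zip(a, b)]
    return write_masked(dst, total, mask)

r0 = [0.0, 0.0, 0.0, 0.0]
r1 = [1.0, 2.0, 3.0, 4.0]
r3 = [10.0, 20.0, 30.0, 40.0]

# add r0.a, r1.xxyy, r3.zzzz (modifiers omitted)
r0 = add_masked(r0, "a", r1, "xxyy", r3, "zzzz")
print(r0)  # [0.0, 0.0, 0.0, 32.0]
```

Only the fourth component changes: the swizzled sources are `[1, 1, 2, 2]` and `[30, 30, 30, 30]`, and the `.a` writemask keeps just the last lane of their sum.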

[LLVMdev] How to define complicated instruction in TableGen (Direct3D shader instruction)

2005 Jul 29

Actually the problems that Tzu-Chien Chiu is encountering are similar to what should be done for generating SSE code in the X86 backend and also other SIMD instruction sets. I think LLVM needs to add instructions for permuting components, extracting and injecting elements in packed types. If the architecture has instructions which can do permutations for each instruction (for example

[LLVMdev] Vector LLVM extension v.s. DirectX Shaders

2005 Dec 15

Dear all: To write a compiler for Microsoft Direct3D shaders from our hardware, I have a program which translates the Direct3D shader assembly to LLVM assembly. I added several intrinsics for this purpose. It's a vector ISA and has some special instructions like: * rcp (reciprocal) * frc (the fractional portion of each input component) * dp4 (dot product) * exp (exponential) * max, min These

[LLVMdev] avoid live range overlap of "vector" registers

2005 May 06

a "vector" register r0 is composed of four 32-bit floating scalar registers, r0.x, r0.y, r0.z, r0.w. each scalar reg can be assigned individually, e.g. mov r0.x, r1.y add r0.y, r1.x, r2.z or assigned simultaneously with vector instructions, e.g. add r0.xyzw, r1.xzyw, r2.xyzw My question is how to define the register in .td file to avoid the code generator overlaps the
