search for: writemasking

Displaying 20 results from an estimated 51 matches for "writemasking".

2005 Jul 27
3
[LLVMdev] How to define complicated instruction in TableGen (Direct3D shader instruction)
Each register is a 4-component (namely, r, g, b, a) vector register. They are actually defined as llvm packed [4xfloat]. The instruction: add_sat r0.a, r1_bias.xxyy, r3_x2.zzzz Explaination: '.a' is a writemask. only the specified component will be update '.xxyy' and '.zzzz' are swizzle masks, specify the component permutation, simliar to the Intel SSE permutation
2005 Jul 29
0
[LLVMdev] How to define complicated instruction in TableGen (Direct3D shader instruction)
Actually the problems that Tzu-Chien Chiu are encountering are similar to what should be done for generating SSE code in the X86 backend and also other SIMD instruction sets. I think LLVM neeeds to add instructions for permuting components, extracting and injecting elements in packed types. If the architecture has instructions which can do permutations for each instruction (for example
2005 Apr 20
1
[LLVMdev] adding new instructions to support "swizzle" and "writemask"
Hello, everyone: I am writing a compiler for a programmable graphics hardware. Each registers of the hardware has four channels, namely 'r', 'b', 'g', 'a', and each channel is a 32-bit floating point. It's similar to the high and low 8-bit of an x86 16-bit general purpose register "AX" can be individually referenced as "AH" and
2009 Feb 13
3
[LLVMdev] Modeling GPU vector registers, again (with my implementation)
It seems to me that LLVM sub-register is not for the following hardware architecture. All instructions of a hardware are vector instructions. All registers contains 4 32-bit FP sub-registers. They are called r0.x, r0.y, r0.z, r0.w. Most instructions write more than one elements in this way: mul r0.xyw, r1, r2 add r0.z, r3, r4 sub r5, r0, r1 Notice that the four elements of r0 are written
2009 Feb 13
0
[LLVMdev] Modeling GPU vector registers, again (with my implementation)
On Feb 13, 2009, at 9:47 AM, Alex wrote: > It seems to me that LLVM sub-register is not for the following > hardware architecture. > > All instructions of a hardware are vector instructions. All > registers contains > 4 32-bit FP sub-registers. They are called r0.x, r0.y, r0.z, r0.w. > > Most instructions write more than one elements in this way: > > mul
2013 Nov 26
2
[LLVMdev] R600/SI build failure on Leopard (Use of C++11)
Hi Christian, Ryan just reported to me that llvm-3.4 is no longer building on OS X Leopard (https://trac.macports.org/ticket/41548). It seems the issue is with a commit that you made back in April (referenced below) which added this to SIISelLowering.cpp: // Adjust the writemask in the node std::vector<SDValue> Ops; Ops.push_back(DAG.getTargetConstant(NewDmask, MVT::i32)); for
2013 Nov 26
0
[LLVMdev] R600/SI build failure on Leopard (Use of C++11)
Can't you just use &Ops[0] ? On Tue, Nov 26, 2013 at 12:03 PM, Jeremy Huddleston Sequoia <jeremyhu at apple.com> wrote: > Hi Christian, > > Ryan just reported to me that llvm-3.4 is no longer building on OS X Leopard (https://trac.macports.org/ticket/41548). It seems the issue is with a commit that you made back in April (referenced below) which added this to
2008 Dec 30
0
[LLVMdev] [Mesa3d-dev] Folding vector instructions
On Dec 30, 2008, at 6:39 AM, Corbin Simpson wrote: >> However, the special instrucions cannot directly be mapped to LLVM >> IR, like >> "min", the conversion involves in 'extract' the vector, create >> less-than-compare, create 'select' instruction, and create 'insert- >> element' >> instruction. Using scalar operations
2008 Dec 30
2
[LLVMdev] [Mesa3d-dev] Folding vector instructions
Alex wrote: > Hello. > > Sorry I am not sure this question should go to llvm or mesa3d-dev mailing > list, so I post it to both. > > I am writing a llvm backend for a modern graphics processor which has a ISA > very similar to that of Direct 3D. > > I am reading the code in Gallium-3D driver in a mesa3d branch, which > converts the shader programs (TGSI tokens) to
2016 Apr 08
2
[PATCH] nouveau: codegen: Take src swizzle into account on loads
Hi, On 08-04-16 17:45, Ilia Mirkin wrote: > On Fri, Apr 8, 2016 at 11:28 AM, Hans de Goede <hdegoede at redhat.com> wrote: >> When dealing with non vector variables the llvm register allocator >> will use TEMP[0].x then TEMP[0].y, etc. >> >> When loading something from a global buffer it will calculate the >> address to use, and store that in say TEMP[0].x,
2005 Dec 15
3
[LLVMdev] Vector LLVM extension v.s. DirectX Shaders
Dear all: To write a compiler for Microsoft Direct3D shaders from our hardware, I have a program which translates the Direct3D shader assembly to LLVM assembly. I added several intrinsics for this purpose. It's a vector ISA and has some special instructions like: * rcp (reciprocal) * frc (the fractional portion of each input component) * dp4 (dot product) * exp (exponential) * max, min These
2016 Apr 08
2
[PATCH] nouveau: codegen: Take src swizzle into account on loads
Hi, On 08-04-16 17:02, Ilia Mirkin wrote: > On Fri, Apr 8, 2016 at 5:27 AM, Hans de Goede <hdegoede at redhat.com> wrote: >> Hi, >> >> On 07-04-16 15:58, Ilia Mirkin wrote: >>> >>> That's wrong. >> >> >> It used to work with the old RES[] code and if one cannot specify >> a source swizzle, then how can I do something like
2008 Dec 30
2
[LLVMdev] Folding vector instructions
Hello. Sorry I am not sure this question should go to llvm or mesa3d-dev mailing list, so I post it to both. I am writing a llvm backend for a modern graphics processor which has a ISA very similar to that of Direct 3D. I am reading the code in Gallium-3D driver in a mesa3d branch, which converts the shader programs (TGSI tokens) to LLVM IR. For the shader instruction also found in LLVM IR,
2007 Sep 27
3
[LLVMdev] Vector swizzling and write masks code generation
Hey, as some of you may know we're in process of experimenting with LLVM in Gallium3D (Mesa's new driver model), where LLVM would be used both in the software only (by just JIT executing shaders) and hardware (drivers will implement LLVM code-generators) cases. While the software only case is pretty straight forward I just realized I missed something in my initial evaluation. That
2009 Sep 10
0
[PATCH 02/13] nv50: add functions for swizzle resolution
We're going to try to reorder the scalar ops of a vector instr to accomodate swizzles that would otherwise require us to emit to an additional TEMP first (like MOV R0.xy, R0.zx). --- src/gallium/drivers/nv50/nv50_program.c | 148 ++++++++++++++++++++++++------ 1 files changed, 118 insertions(+), 30 deletions(-) diff --git a/src/gallium/drivers/nv50/nv50_program.c
2017 Jun 11
0
[RFC 3/9] st/glsl_to_tgsi: handle precise modifier
all subexpression inside an ir_assignment needs to be tagged as precise. Signed-off-by: Karol Herbst <karolherbst at gmail.com> --- src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 80 ++++++++++++++++++++++++------ 1 file changed, 65 insertions(+), 15 deletions(-) diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp index c5d2e0fcd2..19f90f21fe
2005 Dec 15
0
[LLVMdev] Vector LLVM extension v.s. DirectX Shaders
On Thu, 15 Dec 2005, Tzu-Chien Chiu wrote: > To write a compiler for Microsoft Direct3D shaders from our hardware, > I have a program which translates the Direct3D shader assembly to LLVM > assembly. I added several intrinsics for this purpose. > It's a vector ISA and has some special instructions like: > * rcp (reciprocal) > * frc (the fractional portion of each input
2016 Apr 08
0
[PATCH] nouveau: codegen: Take src swizzle into account on loads
On Fri, Apr 8, 2016 at 11:28 AM, Hans de Goede <hdegoede at redhat.com> wrote: > When dealing with non vector variables the llvm register allocator > will use TEMP[0].x then TEMP[0].y, etc. > > When loading something from a global buffer it will calculate the > address to use, and store that in say TEMP[0].x, so it ends up > generating: > > LOAD TEMP[0].y, MEMORY[0],
2007 Sep 27
0
[LLVMdev] Vector swizzling and write masks code generation
On Thu, 27 Sep 2007, Zack Rusin wrote: > as some of you may know we're in process of experimenting with LLVM in > Gallium3D (Mesa's new driver model), where LLVM would be used both in the > software only (by just JIT executing shaders) and hardware (drivers will > implement LLVM code-generators) cases. Yep, nifty! > That is graphics hardware (basically every single
2009 Sep 10
0
[PATCH 01/13] nv50: extend insn src mask function
Extend its usage to avoiding e.g. emission of negation instructions in tx_insn for sources we don't need. --- src/gallium/drivers/nv50/nv50_program.c | 118 +++++++++++++++++++------------ 1 files changed, 72 insertions(+), 46 deletions(-) diff --git a/src/gallium/drivers/nv50/nv50_program.c b/src/gallium/drivers/nv50/nv50_program.c index 4a83852..a6c70ae 100644 ---