thr3ads.net - search: "writemasking"

Displaying 20 results from an estimated 51 matches for "writemasking".

[LLVMdev] How to define complicated instruction in TableGen (Direct3D shader instruction)

2005 Jul 27

Each register is a 4-component (namely r, g, b, a) vector register. They are actually defined as LLVM packed [4xfloat]. The instruction: add_sat r0.a, r1_bias.xxyy, r3_x2.zzzz Explanation: '.a' is a writemask; only the specified component will be updated. '.xxyy' and '.zzzz' are swizzle masks that specify the component permutation, similar to the Intel SSE permutation
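The writemask and swizzle semantics described in this snippet can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names (`swizzle`, `write_masked`, `add_masked`) are hypothetical, registers are modeled as plain 4-element lists, and the `_sat`, `_bias`, and `_x2` modifiers of the quoted instruction are left out.

```python
# Hypothetical sketch: registers are 4-element lists in x, y, z, w order
# (r, g, b, a are treated as aliases for the same four slots).
COMPONENTS = {"x": 0, "y": 1, "z": 2, "w": 3,
              "r": 0, "g": 1, "b": 2, "a": 3}


def swizzle(vec, pattern):
    """Return a new 4-vector whose i-th slot is vec[pattern[i]]."""
    return [vec[COMPONENTS[c]] for c in pattern]


def write_masked(dest, value, mask):
    """Return a copy of dest with only the components named in mask replaced."""
    out = list(dest)
    for c in mask:
        i = COMPONENTS[c]
        out[i] = value[i]
    return out


def add_masked(dest, src0, swz0, src1, swz1, mask):
    """Swizzle both sources, add per component, then apply the writemask."""
    a = swizzle(src0, swz0)
    b = swizzle(src1, swz1)
    total = [x + y for x, y in zip(a, b)]
    return write_masked(dest, total, mask)


# Roughly "add r0.a, r1.xxyy, r3.zzzz" without the modifiers.
r0 = [0.0, 0.0, 0.0, 0.0]
r1 = [1.0, 2.0, 3.0, 4.0]
r3 = [10.0, 20.0, 30.0, 40.0]
r0 = add_masked(r0, r1, "xxyy", r3, "zzzz", "a")
print(r0)  # [0.0, 0.0, 0.0, 32.0]
```

Only the alpha slot of r0 changes: the swizzles pick r1.y and r3.z for that slot, and the writemask discards the other three sums.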

[LLVMdev] How to define complicated instruction in TableGen (Direct3D shader instruction)

2005 Jul 29

Actually the problems that Tzu-Chien Chiu is encountering are similar to what should be done for generating SSE code in the X86 backend and also other SIMD instruction sets. I think LLVM needs to add instructions for permuting components, and for extracting and injecting elements in packed types. If the architecture has instructions which can do permutations for each instruction (for example

[LLVMdev] adding new instructions to support "swizzle" and "writemask"

2005 Apr 20

Hello, everyone: I am writing a compiler for programmable graphics hardware. Each register of the hardware has four channels, namely 'r', 'b', 'g', 'a', and each channel is a 32-bit floating-point value. It's similar to how the high and low 8 bits of an x86 16-bit general-purpose register "AX" can be individually referenced as "AH" and

[LLVMdev] Modeling GPU vector registers, again (with my implementation)

2009 Feb 13

It seems to me that LLVM sub-registers do not fit the following hardware architecture. All instructions of the hardware are vector instructions. All registers contain four 32-bit FP sub-registers. They are called r0.x, r0.y, r0.z, r0.w. Most instructions write more than one element in this way:
    mul r0.xyw, r1, r2
    add r0.z, r3, r4
    sub r5, r0, r1
Notice that the four elements of r0 are written

[LLVMdev] Modeling GPU vector registers, again (with my implementation)

2009 Feb 13

On Feb 13, 2009, at 9:47 AM, Alex wrote:
> It seems to me that LLVM sub-register is not for the following
> hardware architecture.
>
> All instructions of a hardware are vector instructions. All
> registers contains
> 4 32-bit FP sub-registers. They are called r0.x, r0.y, r0.z, r0.w.
>
> Most instructions write more than one elements in this way:
>
> mul

[LLVMdev] R600/SI build failure on Leopard (Use of C++11)

2013 Nov 26

Hi Christian,
Ryan just reported to me that llvm-3.4 is no longer building on OS X Leopard (https://trac.macports.org/ticket/41548). It seems the issue is with a commit that you made back in April (referenced below) which added this to SIISelLowering.cpp:
    // Adjust the writemask in the node
    std::vector<SDValue> Ops;
    Ops.push_back(DAG.getTargetConstant(NewDmask, MVT::i32));
    for

[LLVMdev] R600/SI build failure on Leopard (Use of C++11)

2013 Nov 26

Can't you just use &Ops[0]?
On Tue, Nov 26, 2013 at 12:03 PM, Jeremy Huddleston Sequoia <jeremyhu at apple.com> wrote:
> Hi Christian,
>
> Ryan just reported to me that llvm-3.4 is no longer building on OS X Leopard (https://trac.macports.org/ticket/41548). It seems the issue is with a commit that you made back in April (referenced below) which
> added this to

[LLVMdev] [Mesa3d-dev] Folding vector instructions

2008 Dec 30

On Dec 30, 2008, at 6:39 AM, Corbin Simpson wrote:
>> However, the special instrucions cannot directly be mapped to LLVM
>> IR, like
>> "min", the conversion involves in 'extract' the vector, create
>> less-than-compare, create 'select' instruction, and create 'insert-
>> element'
>> instruction. Using scalar operations

[LLVMdev] [Mesa3d-dev] Folding vector instructions

2008 Dec 30

Alex wrote:
> Hello.
>
> Sorry I am not sure this question should go to llvm or mesa3d-dev mailing
> list, so I post it to both.
>
> I am writing a llvm backend for a modern graphics processor which has a ISA
> very similar to that of Direct 3D.
>
> I am reading the code in Gallium-3D driver in a mesa3d branch, which
> converts the shader programs (TGSI tokens) to

[PATCH] nouveau: codegen: Take src swizzle into account on loads

2016 Apr 08

Hi,
On 08-04-16 17:45, Ilia Mirkin wrote:
> On Fri, Apr 8, 2016 at 11:28 AM, Hans de Goede <hdegoede at redhat.com> wrote:
>> When dealing with non vector variables the llvm register allocator
>> will use TEMP[0].x then TEMP[0].y, etc.
>>
>> When loading something from a global buffer it will calculate the
>> address to use, and store that in say TEMP[0].x,

[LLVMdev] Vector LLVM extension v.s. DirectX Shaders

2005 Dec 15

Dear all: To write a compiler for Microsoft Direct3D shaders from our hardware, I have a program which translates the Direct3D shader assembly to LLVM assembly. I added several intrinsics for this purpose. It's a vector ISA and has some special instructions like:
* rcp (reciprocal)
* frc (the fractional portion of each input component)
* dp4 (dot product)
* exp (exponential)
* max, min
These

[PATCH] nouveau: codegen: Take src swizzle into account on loads

2016 Apr 08

Hi,
On 08-04-16 17:02, Ilia Mirkin wrote:
> On Fri, Apr 8, 2016 at 5:27 AM, Hans de Goede <hdegoede at redhat.com> wrote:
>> Hi,
>>
>> On 07-04-16 15:58, Ilia Mirkin wrote:
>>>
>>> That's wrong.
>>
>>
>> It used to work with the old RES[] code and if one cannot specify
>> a source swizzle, then how can I do something like

[LLVMdev] Folding vector instructions

2008 Dec 30

Hello. Sorry, I am not sure whether this question should go to the llvm or the mesa3d-dev mailing list, so I am posting it to both. I am writing an LLVM backend for a modern graphics processor which has an ISA very similar to that of Direct3D. I am reading the code in the Gallium-3D driver in a mesa3d branch, which converts the shader programs (TGSI tokens) to LLVM IR. For the shader instruction also found in LLVM IR,

[LLVMdev] Vector swizzling and write masks code generation

2007 Sep 27

Hey, as some of you may know we're in the process of experimenting with LLVM in Gallium3D (Mesa's new driver model), where LLVM would be used both in the software-only (by just JIT executing shaders) and hardware (drivers will implement LLVM code-generators) cases. While the software-only case is pretty straightforward I just realized I missed something in my initial evaluation. That

[PATCH 02/13] nv50: add functions for swizzle resolution

2009 Sep 10

We're going to try to reorder the scalar ops of a vector instr to accommodate swizzles that would otherwise require us to emit to an additional TEMP first (like MOV R0.xy, R0.zx).
---
 src/gallium/drivers/nv50/nv50_program.c | 148 ++++++++++++++++++++++++------
 1 files changed, 118 insertions(+), 30 deletions(-)
diff --git a/src/gallium/drivers/nv50/nv50_program.c

[RFC 3/9] st/glsl_to_tgsi: handle precise modifier

2017 Jun 11

All subexpressions inside an ir_assignment need to be tagged as precise.
Signed-off-by: Karol Herbst <karolherbst at gmail.com>
---
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 80 ++++++++++++++++++++++++------
 1 file changed, 65 insertions(+), 15 deletions(-)
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index c5d2e0fcd2..19f90f21fe

[LLVMdev] Vector LLVM extension v.s. DirectX Shaders

2005 Dec 15

On Thu, 15 Dec 2005, Tzu-Chien Chiu wrote:
> To write a compiler for Microsoft Direct3D shaders from our hardware,
> I have a program which translates the Direct3D shader assembly to LLVM
> assembly. I added several intrinsics for this purpose.
> It's a vector ISA and has some special instructions like:
> * rcp (reciprocal)
> * frc (the fractional portion of each input

[PATCH] nouveau: codegen: Take src swizzle into account on loads

2016 Apr 08

On Fri, Apr 8, 2016 at 11:28 AM, Hans de Goede <hdegoede at redhat.com> wrote:
> When dealing with non vector variables the llvm register allocator
> will use TEMP[0].x then TEMP[0].y, etc.
>
> When loading something from a global buffer it will calculate the
> address to use, and store that in say TEMP[0].x, so it ends up
> generating:
>
> LOAD TEMP[0].y, MEMORY[0],

[LLVMdev] Vector swizzling and write masks code generation

2007 Sep 27

On Thu, 27 Sep 2007, Zack Rusin wrote:
> as some of you may know we're in process of experimenting with LLVM in
> Gallium3D (Mesa's new driver model), where LLVM would be used both in the
> software only (by just JIT executing shaders) and hardware (drivers will
> implement LLVM code-generators) cases.
Yep, nifty!
> That is graphics hardware (basically every single

[PATCH 01/13] nv50: extend insn src mask function

2009 Sep 10

Extend its usage to avoiding e.g. emission of negation instructions in tx_insn for sources we don't need.
---
 src/gallium/drivers/nv50/nv50_program.c | 118 +++++++++++++++++++------------
 1 files changed, 72 insertions(+), 46 deletions(-)
diff --git a/src/gallium/drivers/nv50/nv50_program.c b/src/gallium/drivers/nv50/nv50_program.c
index 4a83852..a6c70ae 100644
---
