thr3ads.net - similar to: "[LLVMdev] Vector swizzling and write masks code generation"

[LLVMdev] Vector swizzling and write masks code generation

2007 Sep 27

On Thu, 27 Sep 2007, Zack Rusin wrote:
> as some of you may know we're in process of experimenting with LLVM in Gallium3D (Mesa's new driver model), where LLVM would be used both in the software only (by just JIT executing shaders) and hardware (drivers will implement LLVM code-generators) cases.
Yep, nifty!
> That is graphics hardware (basically every single

Textures Twiddling/Swizzling

2018 Sep 19

Thanks for the last info, it was truly helpful. Anyway, I'm currently trying to implement 3D textures in yuzu; as far as I know, they are twiddled in a different manner from 2D textures. Could one of you guys point me in the right direction? I've been meddling around: https://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/drivers/nouveau/nv50/nv50_tex.c but I can't see where the

[LLVMdev] Vector instructions

2008 Jun 27

Hi Dan,
Thanks for your comments. I've responded inline below.
On 26-Jun-08, at 6:49 PM, Dan Gohman wrote:
> On Jun 26, 2008, at 1:56 PM, Stefanus Du Toit wrote:
>> ===
>> 1. Shufflevector only accepts vectors of the same type
>>
>> I would propose to change the syntax from:
>>
>>> <result> = shufflevector <n x <ty>>

[LLVMdev] Vector instructions

2008 Jun 27

On Jun 27, 2008, at 8:02 AM, Stefanus Du Toit wrote:
>>>> <result> = shufflevector <a x <ty>> <v1>, <b x <ty>> <v2>, <d x i32> <mask> ; yields <d x <ty>>
>>>
>>> With the requirement that the entries in the (still constant) mask are within
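
The generalized form quoted in this snippet lets the two inputs have different lengths a and b, with a mask of length d selecting the result. The following is only a rough Python sketch of that idea, not LLVM code. It assumes mask entries index into the concatenation of v1 and v2, as LLVM's existing shufflevector does. The snippet is cut off before it states the exact range requirement, so the bounds check is an assumption, and the function name is just an illustrative stand-in.

```python
def shufflevector(v1, v2, mask):
    """Model of a shuffle whose inputs may differ in length.

    Assumption: each mask entry indexes into v1 followed by v2,
    so valid indices are 0 .. len(v1) + len(v2) - 1.
    """
    combined = list(v1) + list(v2)
    for index in mask:
        if not 0 <= index < len(combined):
            raise IndexError(f"mask entry {index} is out of range")
    # The result has one element per mask entry (length d).
    return [combined[index] for index in mask]


v1 = [10, 20, 30, 40]   # a = 4
v2 = [50, 60]           # b = 2
result = shufflevector(v1, v2, [3, 4, 0])  # d = 3
```

Here the result length is set by the mask alone, which is the point of the proposed syntax: neither input has to match the output width.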

[LLVMdev] Extend SLPVectorizer to struct operations that are isomorphic to vector operations?

2014 Apr 17

While playing with SLPVectorizer, I noticed that it will happily vectorize cases involving extractelement/insertelement, but won't vectorize isomorphic cases involving extractvalue/insertvalue (such as the attached example). Is that something that could be straightforward to add to SLPVectorizer, or are there some hard issues? In particular, the transformation would seem to require casts of

[LLVMdev] NEON intrinsics preventing redundant load optimization?

2014 Dec 07

Hi all, I’m not sure if this is the right list, so apologies if not. Doing some profiling I noticed some of my hand-tuned matrix multiply code with NEON intrinsics was much slower through a C++ template wrapper vs calling the intrinsics function directly. It turned out clang/LLVM was unable to eliminate a temporary even though the case seemed quite straightforward. Unfortunately any loads

[LLVMdev] tablegen pattern

2014 Aug 11

Hi Guys, I have a target instruction which takes a vec4 and returns a vec4 (say, instruction "vec4:$dst mod(vec4:$src)"), and I want to use it to match an IR instruction/intrinsic function (say, "float:$dst llvm.irmod(vec4:$src)") which takes a vec4 and outputs a float. I think the procedure is: when I see the intrinsic llvm.irmod, I need to call "extractlt(

map two names into one

2012 Nov 08

Thanks. Yes, your approach can identify: Glaxy ace S 5830 and S 5830 Glaxy ace. But the same program cannot identify: Iphone 4S 16 G and Iphone 4S 16G. How should I solve both at the same time? Kind regards, Tammy

[LLVMdev] (no subject)

2011 Jul 01

I'm trying to debug a problem with our custom backend when using a tiered register allocation setup. Just a little background. My target uses vec4 32bit registers and I want to have three levels of registers set up. Each vec4 register can have two sub-regs of size vec2 32bit, and each sub-reg has its own two sub-regs of 32bit each. So it looks like this, xyzw -> {xy, zw} -> {x, y, z,
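
The three-level layout described here (one vec4, two vec2 halves, four 32-bit scalars) can be pictured as a small tree. The sketch below is a hypothetical Python model of that tree, not TableGen or LLVM code. The register names and the helpers `leaves` and `overlaps` are made up for illustration. It shows the aliasing relation an allocator has to respect: two registers conflict whenever they share a scalar leaf.

```python
# Hypothetical model of the hierarchy xyzw -> {xy, zw} -> scalars.
SUBREGS = {
    "xyzw": ["xy", "zw"],
    "xy": ["x", "y"],
    "zw": ["z", "w"],
}


def leaves(reg):
    """Return the 32-bit scalar registers covered by reg, in order."""
    children = SUBREGS.get(reg)
    if children is None:
        return [reg]  # already a scalar
    result = []
    for child in children:
        result.extend(leaves(child))
    return result


def overlaps(a, b):
    """Two registers alias when they share at least one scalar leaf."""
    return bool(set(leaves(a)) & set(leaves(b)))
```

For example, `overlaps("xy", "x")` is true while `overlaps("xy", "zw")` is false, so the two vec2 halves can hold independent values but neither can coexist with a live value in the full vec4.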

[LLVMdev] Expand vector type

2012 Feb 29

Hello, My input language has support for 3 and 4 element vectors but my target only has support for the latter. The language defines vec3 with the same storage space as vec4, so from a backend perspective they are both the same. I'd really like it if I could have LLVM treat vec3 as vec4 but I haven't found out how. Currently the target has emulated support for vec3 through LLVM. Loads are

[PATCH] nouveau: codegen: Take src swizzle into account on loads

2016 Apr 08

Hi,
On 07-04-16 15:58, Ilia Mirkin wrote:
> That's wrong.
It used to work with the old RES[] code and if one cannot specify a source swizzle, then how can I do something like
LOAD TEMP[0].y, MEMORY[0], address
And get the data at absolute global memory address "address" into TEMP[0].y ? This is a must-have for llvm to be able to generate working TGSI code, I do not see any

[LLVMdev] How to partition registers into different RegisterClass?

2005 Jul 23

On Sat, 23 Jul 2005, Tzu-Chien Chiu wrote:
> 2005/7/23, Chris Lattner <sabre at nondot.org>:
>> What does a 'read only' register mean? Is it a constant (e.g. returns 1.0)? Otherwise, how can it be a useful value?
>
> Yes, it's a constant register.
>
> Because the instruction cannot contain an immediate value, a constant value may be stored in

[LLVMdev] (no subject)

2011 Jul 01

On Jul 1, 2011, at 12:16 PM, Villmow, Micah wrote:
> I'm trying to debug a problem with our custom backend with using a tiered register allocation setup.
>
> Just a little background. My target uses vec4 32bit registers and I want to have three levels of registers setup.
> Each vec4 register can have two sub-regs of size vec2 32bit, and each sub-reg, has its own two sub-regs of

[LLVMdev] Do I need to add new intrinsic functions for the OpenGL shading language swizzle?

2008 Nov 18

OpenGL shading language (GLSL) is like a C subset language, but it contains some special features, e.g. native vector types and swizzles. In GLSL, you can declare vector types:
void main() { vec4 a; vec3 b; vec2 c; }
You can access the elements of a vector by using .xyzw, where x, y, z, w are the 1st, 2nd, 3rd, 4th elements of the vector. Ex:
void main() { float f; vec4 a = vec4(1.0,
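
The component-access rule in this snippet (x, y, z, w naming elements 1 to 4) is the basis of both swizzled reads and masked writes. The following is a minimal Python sketch of those semantics, offered only as an illustration. The helper names `swizzle` and `write_masked` are hypothetical and belong to neither GLSL nor LLVM, and vectors are modelled as plain lists.

```python
# Map GLSL-style component letters to element positions.
COMPONENTS = {"x": 0, "y": 1, "z": 2, "w": 3}


def swizzle(vec, pattern):
    """Read components of vec in the order named by pattern, e.g. 'wzyx'."""
    return [vec[COMPONENTS[c]] for c in pattern]


def write_masked(dst, src, mask):
    """Return a copy of dst in which only the components named in mask
    are overwritten, taking values from src in order (like dst.xz = src)."""
    out = list(dst)
    for i, c in enumerate(mask):
        out[COMPONENTS[c]] = src[i]
    return out


a = [1.0, 2.0, 3.0, 4.0]
f = swizzle(a, "x")[0]          # like: f = a.x
b = swizzle(a, "wzyx")          # like: b = a.wzyx
c = write_masked([0.0] * 4, [9.0, 8.0], "xz")  # like: c.xz = vec2(9.0, 8.0)
```

A swizzled read may repeat or reorder components, while a write mask only selects which destination components change; the other components keep their previous values.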

[LLVMdev] How to partition registers into different RegisterClass?

2005 Jul 25

Thanks, I think it can solve my problem. But please allow me to explain the hardware in detail. I hope there is a more elegant way to solve it. The hardware is a "stream processor". That is, it processes samples one by one. Each sample is associated with several 128-bit four-element vector registers, namely:
* input registers - the attributes of the sample, the values of the registers

[LLVMdev] How to partition registers into different RegisterClass?

2005 Jul 23

2005/7/23, Chris Lattner <sabre at nondot.org>:
> What does a 'read only' register mean? Is it a constant (e.g. returns 1.0)? Otherwise, how can it be a useful value?
Yes, it's a constant register. Because the instruction cannot contain an immediate value, a constant value may be stored in a constant register, and it's defined _before_ the program starts by

[LLVMdev] Expand vector type

2012 Feb 29

James, Thanks for your response. I'm working in LLVM 2.7 (I know, it's old) and the default behavior is already promote. This means that, for example, a call to DAGTypeLegalizer::getTypeAction(v3i32) in my case, and I presume in ARM NEON, returns TypeWidenVector. From here legalization calls WidenVectorOperand() to process the STORE node and follows the call chain I have in my original email

[PATCH] nouveau: codegen: Take src swizzle into account on loads

2016 Apr 08

Hi,
On 08-04-16 17:02, Ilia Mirkin wrote:
> On Fri, Apr 8, 2016 at 5:27 AM, Hans de Goede <hdegoede at redhat.com> wrote:
>> Hi,
>>
>> On 07-04-16 15:58, Ilia Mirkin wrote:
>>> That's wrong.
>>
>> It used to work with the old RES[] code and if one cannot specify a source swizzle, then how can I do something like

[LLVMdev] (no subject)

2011 Jul 01

From: Jakob Stoklund Olesen [mailto:stoklund at 2pi.dk]
Sent: Friday, July 01, 2011 2:56 PM
To: Villmow, Micah
Cc: llvmdev at cs.uiuc.edu
Subject: Re: [LLVMdev] (no subject)
On Jul 1, 2011, at 12:16 PM, Villmow, Micah wrote:
I'm trying to debug a problem with our custom backend with using a tiered register allocation setup.
Just a little background. My target uses vec4 32bit registers and

[LLVMdev] About JIT by LLVM 2.9 or later

2011 Nov 02

Hello guys, Thanks for your help when you are busy. I am working on an open source project. It supports a shader language and I want a JIT feature, so LLVM is used. But now I find that the ABI and calling convention do not work together with MSVC. For example, I have the following code:
struct float4 { float x, y, z, w; };
struct float4x4 { float4 x, y, z, w; };
float4 fetch_vs( float4x4* mat
