thr3ads.net - similar to: "[LLVMdev] Register based vector insert/extract"

Displaying 20 results from an estimated 7000 matches similar to: "[LLVMdev] Register based vector insert/extract"

[LLVMdev] Register based vector insert/extract

2007 Apr 23


On Apr 23, 2007, at 12:31 PM, Chris Lattner wrote:
> On Mon, 23 Apr 2007, Christopher Lamb wrote:
>> How can one let the back end know how to insert and extract elements of
>> a vector through sub-register copies? I'm at a loss how to do this...
>
> You probably want to custom lower the insertelement/extractelement
> operations for the cases you support.

[LLVMdev] Register based vector insert/extract

2007 Apr 23


On Mon, 23 Apr 2007, Christopher Lamb wrote:
> How can one let the back end know how to insert and extract elements of
> a vector through sub-register copies? I'm at a loss how to do this...

You probably want to custom lower the insertelement/extractelement operations for the cases you support. Take a look at X86TargetLowering::LowerEXTRACT_VECTOR_ELT for some examples of how to do

[LLVMdev] Register based vector insert/extract

2007 Apr 23


On Apr 23, 2007, at 1:17 PM, Christopher Lamb wrote:
>
> On Apr 23, 2007, at 12:31 PM, Chris Lattner wrote:
>
>> On Mon, 23 Apr 2007, Christopher Lamb wrote:
>>> How can one let the back end know how to insert and extract elements of
>>> a vector through sub-register copies? I'm at a loss how to do this...
>>
>> You

[LLVMdev] Register based vector insert/extract

2007 Apr 23


On Apr 23, 2007, at 1:43 PM, Christopher Lamb wrote:
> On Apr 23, 2007, at 1:17 PM, Christopher Lamb wrote:
>
>> On Apr 23, 2007, at 12:31 PM, Chris Lattner wrote:
>>
>>> On Mon, 23 Apr 2007, Christopher Lamb wrote:
>>>> How can one let the back end know how to insert and extract elements of
>>>> a vector through sub-register

[LLVMdev] loop vectorizer: Unexpected extract/insertelement

2013 Nov 06


The following IR implements the following nested loop:

for (int i = start ; i < end ; ++i )
  for (int p = 0 ; p < 4 ; ++p )
    a[i*4+p] = b[i*4+p] + c[i*4+p];

define void @main(i64 %arg0, i64 %arg1, i1 %arg2, i64 %arg3, float* noalias %arg4, float* noalias %arg5, float* noalias %arg6) {
entrypoint:
  br i1 %arg2, label %L0, label %L1
L0:
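As a hedged aside on the loop quoted in this snippet: the inner loop always runs exactly four times over consecutive indices, so the nest visits the same elements as a single flat loop over [start*4, end*4). The sketch below shows both forms in plain C++. The function names `add_nested` and `add_flat`, the use of `std::vector`, and the `std::size_t` index type are assumptions made for illustration; they do not come from the post, and nothing here claims to reproduce what the loop vectorizer actually emits.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Nested form, mirroring the loop quoted in the post:
// for each i in [start, end), add four consecutive elements.
void add_nested(std::vector<float>& a, const std::vector<float>& b,
                const std::vector<float>& c, std::size_t start, std::size_t end) {
    assert(end * 4 <= a.size() && end * 4 <= b.size() && end * 4 <= c.size());
    for (std::size_t i = start; i < end; ++i)
        for (std::size_t p = 0; p < 4; ++p)
            a[i * 4 + p] = b[i * 4 + p] + c[i * 4 + p];
}

// Flat form (hypothetical helper): the same indices visited as one
// contiguous range, which is the shape a 4-wide vector add would cover.
void add_flat(std::vector<float>& a, const std::vector<float>& b,
              const std::vector<float>& c, std::size_t start, std::size_t end) {
    assert(end * 4 <= a.size() && end * 4 <= b.size() && end * 4 <= c.size());
    for (std::size_t k = start * 4; k < end * 4; ++k)
        a[k] = b[k] + c[k];
}
```

Calling both functions with the same inputs and comparing the output arrays is a quick way to confirm that the two forms agree on a given range.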

[LLVMdev] loop vectorizer: Unexpected extract/insertelement

2013 Nov 06


The loop vectorizer relies on cleanup passes to be run after it. From Transforms/IPO/PassManagerBuilder.cpp:

// Add the various vectorization passes and relevant cleanup passes for
// them since we are no longer in the middle of the main scalar pipeline.
MPM.add(createLoopVectorizePass(DisableUnrollLoops));
MPM.add(createInstructionCombiningPass());

[LLVMdev] loop vectorizer: Unexpected extract/insertelement

2013 Nov 06


The instcombine pass cleans up a lot. Any idea why there are still shufflevector, insertelement, *and* bitcast (!!) etc. instructions left? The original loop is so clean, a textbook example I'd say. There is no need to shuffle anything. At least I don't see it.

Frank

vector.ph: ; preds = %L5
  %broadcast.splatinsert1 = insertelement <4 x

[LLVMdev] Register based vector insert/extract

2007 Apr 24


On Apr 23, 2007, at 8:22 PM, Evan Cheng wrote:
>
> On Apr 23, 2007, at 4:07 PM, Christopher Lamb wrote:
>
>> Thanks for the detailed response.
>>
>> On Apr 23, 2007, at 4:22 PM, Chris Lattner wrote:
>>
>>> Right. Evan is currently focusing on getting the late stages of the code
>>> generator (e.g. livevars) to be able to understand

[LLVMdev] Register based vector insert/extract

2007 Apr 24


On Apr 23, 2007, at 4:07 PM, Christopher Lamb wrote:
> Thanks for the detailed response.
>
> On Apr 23, 2007, at 4:22 PM, Chris Lattner wrote:
>
>> Right. Evan is currently focusing on getting the late stages of the code
>> generator (e.g. livevars) to be able to understand arbitrary machine
>> instrs in the face of physreg subregs. This lays the

[LLVMdev] Register based vector insert/extract

2007 Apr 23


Thanks for the detailed response.

On Apr 23, 2007, at 4:22 PM, Chris Lattner wrote:
> Right. Evan is currently focusing on getting the late stages of the code
> generator (e.g. livevars) to be able to understand arbitrary machine
> instrs in the face of physreg subregs. This lays the groundwork for
> handling vreg subregs, but won't solve it directly.

Is the work Evan

[LLVMdev] Two new 'llvmnotes'

2008 Apr 27


On Apr 27, 2008, at 12:49 PM, Nick Lewycky wrote:
> Chris Lattner wrote:
>> On Apr 27, 2008, at 10:58 AM, Talin wrote:
>>
>>> I would certainly make use of this in my frontend.
>>>
>>> I suggest the names "getfield" and "setfield" for the two operations,
>>>
>>
>> I agree that

[LLVMdev] Improving SLPVectorizer for Julia

2014 Mar 17


I'm working on some small improvements to SLPVectorizer.cpp so that it can deal with some tuple operations arising from Julia code. Being fairly new to LLVM, I could use some advice, particularly from those familiar with the internals of SLPVectorizer. The motivation can be found in the Julia discussion https://github.com/JuliaLang/julia/issues/5857 . Here is an example of the kind of LLVM

[LLVMdev] Vectorizer using Instruction, not opcodes

2013 Feb 04


Hi all, My take on this is that, as you state below, at the IR level we are only roughly estimating cost, at best (or we would have to lower the code and then estimate cost, something we don't want to do). I would propose estimating the "worst case costs" and seeing how far we get with this. My rationale here is that we don't want vectorization to decrease performance relative

[LLVMdev] Vectorizer using Instruction, not opcodes

2013 Feb 04


On 4 February 2013 18:25, Arnold Schwaighofer <aschwaighofer at apple.com> wrote:
> For cases where this approach breaks really badly we could consider adding
> a specialized api or parameters (like the type of a user/use). But we
> should do so only as a last resort and backed by actual code that would
> benefit from doing so.
>
Very sensible, more or less what I had in

[LLVMdev] Vectorizer using Instruction, not opcodes

2013 Feb 04


Hi folks, I've been thinking about how to implement some of the costs, and there are a lot of instructions whose cost depends on other instructions around them. Casts are one obvious case, since arithmetic and memory instructions can sometimes cast values for free. The cost model receives Opcodes, which lose the info on the history of the values being vectorized, and I thought we could pass the whole

[LLVMdev] Two new 'llvmnotes'

2008 Apr 27


On 2008-04-27, at 15:56, Chris Lattner wrote:
> On Apr 27, 2008, at 12:49 PM, Nick Lewycky wrote:
>
>> Chris Lattner wrote:
>>
>>> On Apr 27, 2008, at 10:58 AM, Talin wrote:
>>>
>>>> I would certainly make use of this in my frontend.
>>>>
>>>> I suggest the names "getfield" and "setfield" for the two

[LLVMdev] GVN miscompile debugging help

2012 Aug 10


I found a case where GVN seems to miscompile an OpenCL program. What I am trying to figure out is, given a bitcode file, how I can reduce it to a simpler case with bugpoint when I don't have a valid reference compiler available. Thanks for any tips, Micah

[LLVMdev] InsertElementInst and ExtractElementInst

2014 Jul 22


Hello, I am creating a <3 x i32> vector in LLVM IR. Then I insert 3 instructions and later on I try to load one instruction from the vector. The insertion seems to work; however, when I try to load a specific instruction from the vector, it seems that it does not work. This is the part of my IR:

%"ins or1" = insertelement <3 x i32> undef, i32 %38, i32 0
%"ins and2"
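For readers following this snippet, here is a small plain C++ model of how insertelement and extractelement behave as value operations: each insertelement yields a new vector value and leaves its operand unchanged, so an extractelement has to read from the last value in the chain to see all inserted lanes. This is only a sketch under assumptions: `Vec3`, `insert_element`, and `extract_element` are hypothetical names rather than LLVM API, and `undef` lanes are modeled as zero here, whereas in real IR their contents are unspecified. The truncated post does not show whether this is the cause of the reported problem.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-in for the IR type <3 x i32>.
using Vec3 = std::array<std::int32_t, 3>;

// Models insertelement: takes the vector by value and returns a NEW
// vector with one lane replaced. The caller's vector is not modified.
Vec3 insert_element(Vec3 v, std::int32_t value, std::size_t lane) {
    assert(lane < v.size());
    v[lane] = value;
    return v;
}

// Models extractelement: reads one lane of the given vector value.
std::int32_t extract_element(const Vec3& v, std::size_t lane) {
    assert(lane < v.size());
    return v[lane];
}
```

With a chain such as `v1 = insert_element(v0, x, 0)`, `v2 = insert_element(v1, y, 1)`, `v3 = insert_element(v2, z, 2)`, only `v3` holds all three lanes; extracting lane 1 from `v1` returns the placeholder value, not `y`.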

[LLVMdev] Vector swizzling and write masks code generation

2007 Sep 27


Hey, as some of you may know, we're in the process of experimenting with LLVM in Gallium3D (Mesa's new driver model), where LLVM would be used in both the software-only case (by just JIT-executing shaders) and the hardware case (drivers will implement LLVM code generators). While the software-only case is pretty straightforward, I just realized I missed something in my initial evaluation. That

[LLVMdev] How to vectorize a vector type cast?

2012 Feb 28


Since Clang does not seem to allow type casts, such as uchar4 to float4, between vector types, it seems it is necessary to write them as element by element conversions, such as

typedef float float4 __attribute__((ext_vector_type(4)));
typedef unsigned char uchar4 __attribute__((ext_vector_type(4)));
float4 to_float4(uchar4 in) { float4 out = {in.x, in.y, in.z, in.w}; return out; }

Running
