search for: extractelements

Displaying 20 results from an estimated 239 matches for "extractelements".

Did you mean: extractelement
2014 Mar 17
2
[LLVMdev] Improving SLPVectorizer for Julia
I'm working on some small improvements to SLPVectorizer.cpp so that it can deal with some tuple operations arising from Julia code. Being fairly new to LLVM, I could use some advice, particular from those familiar with the internals of SLPVectorizer. The motivation can be found in the Julia discussion https://github.com/JuliaLang/julia/issues/5857 . Here is an example of the kind of LLVM
2015 Jun 26
3
[LLVMdev] extractelement causes memory access violation - what to do?
Hi, Let's have a simple program: define i32 @main(i32 %n, i64 %idx) { %idxSafe = trunc i64 %idx to i5 %r = extractelement <4 x i32> <i32 -1, i32 -1, i32 -1, i32 -1>, i64 %idx ret i32 %r } The assembly of that would be: pcmpeqd %xmm0, %xmm0 movdqa %xmm0, -24(%rsp) movl -24(%rsp,%rsi,4), %eax retq The language reference states that the extractelement instruction produces
2015 Jun 30
2
[LLVMdev] extractelement causes memory access violation - what to do?
On Fri, Jun 26, 2015 at 5:42 PM David Majnemer <david.majnemer at gmail.com> wrote: > On Fri, Jun 26, 2015 at 7:00 AM, Paweł Bylica <chfast at gmail.com> wrote: > >> Hi, >> >> Let's have a simple program: >> define i32 @main(i32 %n, i64 %idx) { >> %idxSafe = trunc i64 %idx to i5 >> %r = extractelement <4 x i32> <i32 -1, i32
2015 Jun 26
2
[LLVMdev] extractelement causes memory access violation - what to do?
On 06/26/2015 08:42 AM, David Majnemer wrote: > > > On Fri, Jun 26, 2015 at 7:00 AM, Paweł Bylica <chfast at gmail.com > <mailto:chfast at gmail.com>> wrote: > > Hi, > > Let's have a simple program: > define i32 @main(i32 %n, i64 %idx) { > %idxSafe = trunc i64 %idx to i5 > %r = extractelement <4 x i32> <i32 -1, i32
2013 Feb 04
6
[LLVMdev] Vectorizer using Instruction, not opcodes
On 4 February 2013 18:25, Arnold Schwaighofer <aschwaighofer at apple.com>wrote: > For cases where this approach breaks really badly we could consider adding > a specialized api or parameters (like the type of a user/use). But we > should do so only as a last resort and backed by actual code that would > benefit from doing so. > Very sensible, more or less what I had in
2013 Nov 06
2
[LLVMdev] loop vectorizer: Unexpected extract/insertelement
The following IR implements the following nested loop: for (int i = start ; i < end ; ++i ) for (int p = 0 ; p < 4 ; ++p ) a[i*4+p] = b[i*4+p] + c[i*4+p]; define void @main(i64 %arg0, i64 %arg1, i1 %arg2, i64 %arg3, float* noalias %arg4, float* noalias %arg5, float* noalias %arg6) { entrypoint: br i1 %arg2, label %L0, label %L1 L0:
2013 Nov 06
0
[LLVMdev] loop vectorizer: Unexpected extract/insertelement
The loop vectorizer relies on cleanup passes to be run after it: from Transforms/IPO/PassManagerBuilder.cpp: // Add the various vectorization passes and relevant cleanup passes for // them since we are no longer in the middle of the main scalar pipeline. MPM.add(createLoopVectorizePass(DisableUnrollLoops)); MPM.add(createInstructionCombiningPass());
2015 Jun 30
2
[LLVMdev] extractelement causes memory access violation - what to do?
On Tue, Jun 30, 2015 at 11:03 PM Hal Finkel <hfinkel at anl.gov> wrote: > ----- Original Message ----- > > From: "Paweł Bylica" <chfast at gmail.com> > > To: "David Majnemer" <david.majnemer at gmail.com> > > Cc: "LLVMdev" <llvmdev at cs.uiuc.edu> > > Sent: Tuesday, June 30, 2015 5:42:24 AM > > Subject: Re:
2013 Feb 04
0
[LLVMdev] Vectorizer using Instruction, not opcodes
Hi all, My take on this is that, as you state below, at the IR level we are only roughly estimating cost, at best (or we would have to lower the code and then estimate cost - something we don't want to do). I would propose for estimating the "worst case costs" and see how far we get with this. My rational here is that we don't want vectorization to decrease performance relative
2013 Nov 06
2
[LLVMdev] loop vectorizer: Unexpected extract/insertelement
The instcombine pass cleans up a lot. Any idea why there are still shufflevector, insertelement, *and* bitcast (!!) etc. instructions left? The original loop is so clean, a textbook example I'd say. There is no need to shuffle anything.At least I don't see it. Frank vector.ph: ; preds = %L5 %broadcast.splatinsert1 = insertelement <4 x
2013 Feb 04
0
[LLVMdev] Vectorizer using Instruction, not opcodes
----- Original Message ----- > From: "Renato Golin" <renato.golin at linaro.org> > To: "Arnold Schwaighofer" <aschwaighofer at apple.com> > Cc: "LLVM Dev" <llvmdev at cs.uiuc.edu>, "Nadav Rotem" <nrotem at apple.com>, "Hal Finkel" <hfinkel at anl.gov> > Sent: Monday, February 4, 2013 1:38:03 PM >
2013 Feb 04
0
[LLVMdev] Vectorizer using Instruction, not opcodes
The loop vectorized does not estimate the cost of vectorization by looking at the IR you list below. It does not vectorize and then run the CostAnalysis pass. It estimates the cost itself before it even performs the vectorization. The way it works is that it looks at all the scalar instructions and asks: What is the cost if I execute the scalar instruction as a vector instruction. Therefore, it
2015 Jul 01
2
[LLVMdev] extractelement causes memory access violation - what to do?
----- Original Message ----- > From: "Pete Cooper" <peter_cooper at apple.com> > To: "Paweł Bylica" <chfast at gmail.com> > Cc: "Hal Finkel" <hfinkel at anl.gov>, "LLVMdev" <llvmdev at cs.uiuc.edu> > Sent: Wednesday, July 1, 2015 12:08:37 PM > Subject: Re: [LLVMdev] extractelement causes memory access violation - what
2015 Jul 02
2
[LLVMdev] extractelement causes memory access violation - what to do?
----- Original Message ----- > From: "David Majnemer" <david.majnemer at gmail.com> > To: "Hal Finkel" <hfinkel at anl.gov> > Cc: "Pete Cooper" <peter_cooper at apple.com>, "LLVMdev" <llvmdev at cs.uiuc.edu> > Sent: Wednesday, July 1, 2015 7:17:19 PM > Subject: Re: [LLVMdev] extractelement causes memory access violation
2013 Feb 04
2
[LLVMdev] Vectorizer using Instruction, not opcodes
Hi folks, I've been thinking on how to implement some of the costs and there is a lot of instructions which cost depend on other instructions around. Casts are one obvious case, since arithmetic and memory instructions can, sometimes, cast values for free. The cost model receives Opcodes, which lose the info on the history of the values being vectorized, and I thought we could pass the whole
2015 Jul 01
3
[LLVMdev] extractelement causes memory access violation - what to do?
----- Original Message ----- > From: "Pete Cooper" <peter_cooper at apple.com> > To: "Hal Finkel" <hfinkel at anl.gov> > Cc: "LLVMdev" <llvmdev at cs.uiuc.edu>, "Paweł Bylica" <chfast at gmail.com> > Sent: Wednesday, July 1, 2015 6:42:41 PM > Subject: Re: [LLVMdev] extractelement causes memory access violation - what to
2013 Nov 06
0
[LLVMdev] loop vectorizer: Unexpected extract/insertelement
Yes, you need the latest ToT version of llvm or you run -loop-vectorize -earlycse -instcombine -simplifycfg The bitcast essentially is a noop to satisfy the type system. This is how your example looks like for me: vector.body: ; preds = %vector.body, %vector.ph %index = phi i64 [ 0, %vector.ph ], [ %index.next, %vector.body ] %.lhs = shl i64 %6, 2
2012 Sep 02
2
[LLVMdev] branch on vector compare?
Hi all, llvm newbie here. I'm trying to branch based on a vector compare. I've found a slow way (below) which goes through memory. Is there some idiom I'm missing so that it would use for instance movmsk for SSE or vcmpgt & cr6 for altivec? Or do I need to resort to calling the intrinsic directly? Thanks, Stephen. %16 = fcmp ogt <4 x float> %15, %cr %17 =
2012 Jan 26
0
[LLVMdev] [llvm-commits] [PATCH] BasicBlock Autovectorization Pass
On Thu, Jan 26, 2012 at 3:19 PM, Hal Finkel <hfinkel at anl.gov> wrote: > On Thu, 2012-01-26 at 15:12 -0600, Sebastian Pop wrote: >> On Thu, Jan 26, 2012 at 2:49 PM, Hal Finkel <hfinkel at anl.gov> wrote: >> > Thanks! Did you compile with any non-default flags other than -mllvm >> > -vectorize? >> >> I used -O3 and -vectorize, no other non-default
2017 Mar 14
3
llvm-stress crash
Hi, Using llvm-stress, I got a crash after Post-RA pseudo expansion, with machine verifier. A 128 bit register %vreg233:subreg_l32<def,read-undef> = LLCRMux %vreg119; GR128Bit:%vreg233 GRX32Bit:%vreg119 gets spilled: %vreg265:subreg_l32<def,read-undef> = LLCRMux %vreg119; GR128Bit:%vreg265 GRX32Bit:%vreg119 ST128 %vreg265, <fi#10>, 0, %noreg;