thr3ads.net - search: "scevgep"

Displaying 20 results from an estimated 35 matches for "scevgep".

2010 May 14

[LLVMdev] type of the store operand

Hiya, I want to know what's the type of the operand (value) along with the store instruction. For example, the following instruction stores the content in an integer type pointer scevgep99.1 to the address another integer pointer points to. *"store i8* %scevgep99.1, i8** %scevgep92.1, align 4"* It seems there are no APIs in StoreInst that can do this. Is there any way to figure out the type of the operand of the store instruction? ----------------------------------- %scevgep...

2010 May 14

[LLVMdev] type of the store operand

Zheng Wang wrote: > Hiya, > > I want to know what's the type of the operand (value) along with the > store instruction. For example, the following instruction store the > content in an integer type pointer scevgep99.1 to the address another > integer pointer points to. > > *"store i8* %scevgep99.1, i8** %scevgep92.1, align 4"* > > It seems there is no APIs in StoreInst can do this. > > Is there any way to figure the type of the operand of the store instruction? > You can...

2010 Nov 23

[LLVMdev] Unrolling an arithmetic expression inside a loop

...(trunk 118238) Target: x86_64-unknown-linux-gnu Thread model: posix What I get from clang is: 1) In exec0 it recognizes that it doesn't need the tables and doesn't create them and also, it simplifies the expression, so that the generated code inside the loop is just a multiplication: %scevgep = getelementptr i32* %X, i64 %indvar %scevgep2 = getelementptr i32* %Y, i64 %indvar %scevgep9 = getelementptr i32* %res, i64 %indvar %3 = load i32* %scevgep, align 4, !tbaa !0 %4 = load i32* %scevgep2, align 4, !tbaa !0 %5 = mul nsw i32 %4, %3 store i32 %5, i32* %scevgep9, a...

2010 Jun 16

[LLVMdev] Strange pointer aliasing behaviour

...ptimized : ******** define i32 @_Z5func1v() nounwind readnone { entry: ret i32 4 } ******** func2() should give the exact same result as func1, however ... Alias Set Tracker: 1 alias sets for 2 pointer values. AliasSet[0x0x8ca140,2] may alias, Mod/Ref Pointers: (i32* %0, 4), (double* %scevgep.i, 8) A spurious alias comes up between the 2 fields of the struct (which should in theory not happen). So, it reloads _length at each iteration, thus no optimization takes place and the code below is generated : ******** define i32 @_Z5func2v() nounwind readnone { entry: %l = alloca %str...

2015 Apr 28

[LLVMdev] alias set collapse and LICM

On Mon, Apr 27, 2015 at 4:21 PM, Daniel Berlin <dberlin at dberlin.org> wrote: > You can't win here (believe me, i've tried, and better people than me have > tried, for years :P). > No matter what you do, the partitioning will never be 100% precise. The > only way to solve that in general is to pairwise query over the > partitioning. > > Your basic problem is

2012 Jan 23

[LLVMdev] How to identify the first BB in a loop of non-constant trip count?

...32 %n, 0 br i1 %1, label %.lr.ph, label %._crit_edge .lr.ph: ; preds = %0 %tmp = zext i32 %n to i64 br label %2 ; <label>:2 ; preds = %2, %.lr.ph %indvar = phi i64 [ 0, %.lr.ph ], [ %indvar.next, %2 ] %scevgep = getelementptr i32* %a, i64 %indvar %scevgep2 = getelementptr i32* %b, i64 %indvar %3 = load i32* %scevgep2, align 4, !tbaa !0 store i32 %3, i32* %scevgep, align 4, !tbaa !0 %indvar.next = add i64 %indvar, 1 %exitcond = icmp eq i64 %indvar.next, %tmp br i1 %exitcond, label %._crit_edge...

2015 Sep 03

[RFC] New pass: LoopExitValues

...quite see what > patterns it would be useful on. You've mentioned matrix multiply - how does > this pass alter the IR? Here's before and after IR for the matrix_mul example. Notice the two bitcasts %1 and %2 generated in the for.cond.cleanup block. The L.E.V pass converts these to scevgep values that already exist. *** Code after LSR *** ; Function Attrs: nounwind optsize define void @matrix_mul(i32 %Size, i32* nocapture %Dst, i32* nocapture readonly %Src, i32 %Val) #0 { entry: %cmp.25 = icmp eq i32 %Size, 0 br i1 %cmp.25, label %for.cond.cleanup, label %for.body.4.lr.ph.prehe...

2010 Apr 21

[LLVMdev] determining the number of iteration of a loop

In your example the number of iterations is known -- it is N. It is not known at compile time, but it's known at run-time before you enter the loop. So you can do transforms like if( N < threshold ) copy of loop optimized for small iteration counts; else copy of loop optimized for large iteration counts; But you are right, in general, the number of iterations is unknown. I think Khaled
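The run-time loop versioning described in this excerpt can be sketched in plain C++. This is only an illustration: the function name `add_arrays`, the cut-off `kSmallTripCount`, and the choice of a four-way unrolled loop as the "large" version are all assumptions made for the example, not code from the thread or from LLVM.

```cpp
#include <cstddef>

// Hypothetical cut-off between the "small" and "large" versions of the loop.
constexpr std::size_t kSmallTripCount = 8;

// Adds b into a element-wise. The trip count n is not known at compile time,
// but it is fixed before the loop is entered, so one of two copies is chosen.
void add_arrays(int *a, const int *b, std::size_t n) {
  if (n < kSmallTripCount) {
    // Copy intended for few iterations: plain loop, no extra setup.
    for (std::size_t i = 0; i < n; ++i)
      a[i] += b[i];
  } else {
    // Copy intended for many iterations: manually unrolled by four.
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
      a[i] += b[i];
      a[i + 1] += b[i + 1];
      a[i + 2] += b[i + 2];
      a[i + 3] += b[i + 3];
    }
    // Remainder loop for the last n % 4 elements.
    for (; i < n; ++i)
      a[i] += b[i];
  }
}
```

Both copies compute the same result; only the loop shape differs, which is what lets the check on N be made once, outside the loop.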

2010 Jun 29

[LLVMdev] Confuse on getSCEVAtScope

On Jun 29, 2010, at 7:08 AM, ether zhhb wrote: > > why computeSCEVAtScope not try to get the operands in the current > scope like the function do with SCEVCommutativeExpr, like: > > if (const SCEVAddRecExpr *AddRec = dyn_cast<SCEVAddRecExpr>(V)) { > if (!L || !AddRec->getLoop()->contains(L)) { > ... > // Then, evaluate the AddRec. >

2016 Sep 03

llc error

...attached LLVM assembly file fails to generate x86 code when compiled using llc. compilation command - ../llvm-build/bin/llc -filetype=asm -march=x86-64 -mcpu=core-avx2 ex4.ll The error message is, LLVM ERROR: Cannot select: t95: v8f32 = X86ISD::SUBV_BROADCAST t17 t17: v4f32,ch = load<LD16[%scevgep](tbaa=<0x4dbcd98>)> t0, t16, undef:i64 t16: i64 = add t2, Constant:i64<16> t2: i64,ch = CopyFromReg t0, Register:i64 %vreg5 t1: i64 = Register %vreg5 t15: i64 = Constant<16> t4: i64 = undef In function: _ZN10soundtouch12TDStretchSSE13calcCrossCorrEPK...

2010 Jun 17

[LLVMdev] Strange pointer aliasing behaviour

...nwind readnone { > entry: > ret i32 4 > } > > > ******** > func2() should give the exact same result as func1, however ... > > Alias Set Tracker: 1 alias sets for 2 pointer values. > AliasSet[0x0x8ca140,2] may alias, Mod/Ref Pointers: (i32* %0, 4), > (double* %scevgep.i, 8) > > A spurious alias comes up between the 2 fields of the struct (which should > in theory not happen). > So, it reloads _length at each iteration, thus no optimization takes place > and the code below is generated : > > ******** > > define i32 @_Z5func2v() nounwind...

2010 Jun 29

[LLVMdev] Confuse on getSCEVAtScope

hi all, i have SCEVAddRec {{(32 + @edge.8265),+,32}<Loop0>,+,4}<Loop1> where Loop0 and Loop1 are siblings (loops at the same level of the loopnest), and Loop0 has a computable backedge taken count. when i call getSCEVAtScope({{(32 + @edge.8265),+,32}<Loop0>,+,4}<Loop1> , Loop1), it just gives me {{(32 + @edge.8265),+,32}<Loop0>,+,4}<Loop1>, instead of

2011 May 03

[LLVMdev] Loop-Unroll optimization

...0; i++) { c[i] = a[i] + b[i]; } printf("%d\n", c[999]); ------------------------------------------------- and bit-code in *Hello4.bc* bb3: ; preds = %bb3, %bb3.preheader %i.17 = phi i32 [ %5, %bb3 ], [ 0, %bb3.preheader ] %scevgep11 = getelementptr [1000 x i32]* %b, i32 0, i32 %i.17 %scevgep10 = getelementptr [1000 x i32]* %a, i32 0, i32 %i.17 %scevgep = getelementptr [1000 x i32]* %c, i32 0, i32 %i.17 %2 = load i32* %scevgep10, align 4 %3 = load i32* %scevgep11, align 4 %4 = add nsw i32 %3, %2 store i32 %4, i32*...

2011 May 04

[LLVMdev] Loop-Unroll optimization

...} > > printf("%d\n", c[999]); > ------------------------------------------------- > > and bit-code in *Hello4.bc* > bb3: ; preds = %bb3, > %bb3.preheader > %i.17 = phi i32 [ %5, %bb3 ], [ 0, %bb3.preheader ] > %scevgep11 = getelementptr [1000 x i32]* %b, i32 0, i32 %i.17 > %scevgep10 = getelementptr [1000 x i32]* %a, i32 0, i32 %i.17 > %scevgep = getelementptr [1000 x i32]* %c, i32 0, i32 %i.17 > %2 = load i32* %scevgep10, align 4 > %3 = load i32* %scevgep11, align 4 > %4 = add nsw i32 %3...

2011 May 03

[LLVMdev] Loop-Unroll optimization

Hi, You might want to try running -loops -loop-simplify before loop unroll. From loop simplify.cpp: This pass performs several transformations to transform natural loops into a simpler form, which makes subsequent analyses and transformations simpler and more effective. Arushi On Tue, May 3, 2011 at 2:17 PM, Manish Gupta <mgupta.iitr at gmail.com> wrote: > You

2013 Jul 05

[LLVMdev] Enabling vectorization with LLVM 3.3 for a DSL emitting LLVM IR

On 07/04/2013 01:39 PM, Stéphane Letz wrote: > Hi, > > Our DSL can generate C or directly generate LLVM IR. With LLVM 3.3, we can vectorize the C produced code using clang with -O3, or clang with -O1 then opt -O3 -vectorize-loops. But the same program generating LLVM IR version cannot be vectorized with opt -O3 -vectorize-loops. So our guess is that our generated LLVM IR lacks some

2013 Jul 04

[LLVMdev] Enabling vectorization with LLVM 3.3 for a DSL emitting LLVM IR

Hi, Our DSL can generate C or directly generate LLVM IR. With LLVM 3.3, we can vectorize the C code it produces using clang with -O3, or clang with -O1 then opt -O3 -vectorize-loops. But the LLVM IR version generated from the same program cannot be vectorized with opt -O3 -vectorize-loops. So our guess is that our generated LLVM IR lacks some information that is needed by the vectorization passes to

2016 Sep 03

llc error

...using llc. >> >> compilation command - ../llvm-build/bin/llc -filetype=asm -march=x86-64 >> -mcpu=core-avx2 ex4.ll >> >> The error message is, >> >> LLVM ERROR: Cannot select: t95: v8f32 = X86ISD::SUBV_BROADCAST t17 >> t17: v4f32,ch = load<LD16[%scevgep](tbaa=<0x4dbcd98>)> t0, t16, >> undef:i64 >> t16: i64 = add t2, Constant:i64<16> >> t2: i64,ch = CopyFromReg t0, Register:i64 %vreg5 >> t1: i64 = Register %vreg5 >> t15: i64 = Constant<16> >> t4: i64 = undef >>...

2015 Dec 06

Objects of MemoryLocation class are created for ‘llvm.memset.*‘ intrinsics

...bug.cgi?id=23077) the AliasSetTracker constructs 128 alias sets for 0 pointer values, which contain only unknown instructions. In this case, all unknown instructions, which are added to new alias sets in the AliasSetTracker::addUnknown, have the following form: call void @llvm.memset.p0i8.i64(i8* %scevgep..., i8 0, i64 256, i32 8, i1 false) Furthermore, in this case, there aren’t any unknown instructions, which are added by AliasSetTracker::addUnknown to alias sets that are found by findAliasSetForUnknownInst. That’s why I would like to check objects of MemoryLocation class that are created for ‘l...

2015 Sep 10

[RFC] New pass: LoopExitValues

...You've mentioned matrix multiply - how > does > >> this pass alter the IR? > > > > Here's before and after IR for the matrix_mul example. Notice the two > > bitcasts %1 and %2 generated in the for.cond.cleanup block. The L.E.V > > pass converts these to scevgep values that already exist. > > > > *** Code after LSR *** > > > > ; Function Attrs: nounwind optsize > > define void @matrix_mul(i32 %Size, i32* nocapture %Dst, i32* nocapture > > readonly %Src, i32 %Val) #0 { > > entry: > > %cmp.25 = icmp eq i32 %S...
