thr3ads.net - search: "arrayidx4"

Displaying 20 results from an estimated 58 matches for "arrayidx4".

[LLVMdev] Avoiding load narrowing in DAGCombiner

2011 Jul 27

...t is known to be in the high-addressed 2 bytes of a word on my little-endian target, I emit an LD4 from the word-aligned address and an SRL 16 to shift the i16 into the LSbits of the register. DAGCombine visit()s an ISD::SRL node and notices that it is right-shifting the result of an LD4 from %arrayidx4 by 16 bits, and replaces it with an LD2 from %arrayidx+2. Replaces -------- 0x17f7070: i32,ch = load 0x17faa00, 0x17f7f70, 0x17f6a70<LD4[%arrayidx4]> 0x17f94c0: i32 = Constant<16> [ORD=9] [ID=10] 0x17f7470: i32 = srl 0x17f7070, 0x17f94c0 With ---- 0x17fceb0: i32,ch = load 0x17faa00,...

Legality of transformation

2020 Apr 04

...32 @main() local_unnamed_addr #0 {entry: %A = alloca [2048 x i32], align 16 %B = alloca [2048 x i32], align 16 %"reg2mem alloca point" = bitcast i32 0 to i32 %arrayidx3 = getelementptr inbounds [2048 x i32], [2048 x i32]* %A, i64 0, i64 1024 %0 = load i32, i32* %arrayidx3, align 16 %arrayidx4 = getelementptr inbounds [2048 x i32], [2048 x i32]* %B, i64 0, i64 1024 %1 = load i32, i32* %arrayidx4, align 16 %cmp5 = icmp eq i32 %0, %1 %conv = zext i1 %cmp5 to i32 %call = call i32 (i32, ...) bitcast (i32 (...)* @assert to i32 (i32, ...)*)(i32 %conv) #2 ret i32 0}* It is my understandin...

[LLVMdev] better code for IV

2014 Feb 19

...ext i32 %trunc to i64 %arrayidx = getelementptr inbounds float* %a, i64 %idxprom %0 = load float* %arrayidx, align 4 %arrayidx2 = getelementptr inbounds float* %b, i64 %idxprom %1 = load float* %arrayidx2, align 4 %add = fadd float %0, %1 %arrayidx4 = getelementptr inbounds float* %c, i64 %idxprom store float %add, float* %arrayidx4, align 4 %L_inc_ind_var = add nuw nsw i64 %L_ind_var, 1 %L_cmp.to.max = icmp eq i64 %L_inc_ind_var, %iNumElements %L_inc_tid = add nuw nsw i64 %L_tid, 1 br i1 %L_cm...

[LLVMdev] Avoiding load narrowing in DAGCombiner

2011 Jul 27

...he high-addressed 2 bytes of a word > on my little-endian target, I emit an LD4 from the word-aligned address > and an SRL 16 to shift the i16 into the LSbits of the register. > > DAGCombine visit()s an ISD::SRL node and notices that it is > right-shifting the result of an LD4 from %arrayidx4 by 16 bits, and > replaces it with an LD2 from %arrayidx+2. > > Replaces > -------- > 0x17f7070: i32,ch = load 0x17faa00, 0x17f7f70, 0x17f6a70<LD4[%arrayidx4]> > 0x17f94c0: i32 = Constant<16> [ORD=9] [ID=10] > 0x17f7470: i32 = srl 0x17f7070, 0x17f94c0 > > With...

[LLVMdev] Avoiding load narrowing in DAGCombiner

2011 Jul 27

...bytes of a word >> on my little-endian target, I emit an LD4 from the word-aligned address >> and an SRL 16 to shift the i16 into the LSbits of the register. >> >> DAGCombine visit()s an ISD::SRL node and notices that it is >> right-shifting the result of an LD4 from %arrayidx4 by 16 bits, and >> replaces it with an LD2 from %arrayidx+2. >> >> Replaces >> -------- >> 0x17f7070: i32,ch = load 0x17faa00, 0x17f7f70, 0x17f6a70<LD4[%arrayidx4]> >> 0x17f94c0: i32 = Constant<16> [ORD=9] [ID=10] >> 0x17f7470: i32 = srl 0x17f7...

[LLVMdev] alloca scalarization with dynamic indexing into vectors

2013 Feb 07

...%arrayidx2 = getelementptr inbounds <2 x i32>* %src, i64 1 %1 = load <2 x i32>* %arrayidx2, align 8, !tbaa !9 %arrayidx3 = getelementptr inbounds [3 x <2 x i32>]* %sPrivateStorage, i64 0, i64 1 store <2 x i32> %1, <2 x i32>* %arrayidx3, align 8, !tbaa !9 %arrayidx4 = getelementptr inbounds <2 x i32>* %src, i64 2 %2 = load <2 x i32>* %arrayidx4, align 8, !tbaa !9 %arrayidx5 = getelementptr inbounds [3 x <2 x i32>]* %sPrivateStorage, i64 0, i64 2 store <2 x i32> %2, <2 x i32>* %arrayidx5, align 8, !tbaa !9 %idx.ext = ze...

[LLVMdev] Vectorizing global struct pointers

2013 Feb 05

On 5 February 2013 17:28, Nadav Rotem <nrotem at apple.com> wrote: > We insert runtime overlap checks only for unidentified objects. The > problem here is that the vectorizer thinks that A,B,C are all pointers to > the same array, so it gives up. If A,B,C were different arrays then it > could have used runtime checks. > Yes, that is exactly the code that creates the

[LLVMdev] Avoiding load narrowing in DAGCombiner

2011 Jul 27

...>> on my little-endian target, I emit an LD4 from the word-aligned address >>> and an SRL 16 to shift the i16 into the LSbits of the register. >>> >>> DAGCombine visit()s an ISD::SRL node and notices that it is >>> right-shifting the result of an LD4 from %arrayidx4 by 16 bits, and >>> replaces it with an LD2 from %arrayidx+2. >>> >>> Replaces >>> -------- >>> 0x17f7070: i32,ch = load 0x17faa00, 0x17f7f70, 0x17f6a70<LD4[%arrayidx4]> >>> 0x17f94c0: i32 = Constant<16> [ORD=9] [ID=10] >>>...

[LLVMdev] loop vectorizer and storing to uniform addresses

2013 Nov 08

...%for.cond1 %5 = load i64* %i, align 8 %mul = mul nsw i64 %5, 4 %6 = load i64* %q, align 8 %add = add nsw i64 %mul, %6 %7 = load float** %A.addr, align 8 %arrayidx = getelementptr inbounds float* %7, i64 %add %8 = load float* %arrayidx, align 4 %9 = load i64* %q, align 8 %arrayidx4 = getelementptr inbounds [4 x float]* %sum, i32 0, i64 %9 %10 = load float* %arrayidx4, align 4 %add5 = fadd float %10, %8 store float %add5, float* %arrayidx4, align 4 br label %for.inc for.inc: ; preds = %for.body3 %11 = load i64* %q, align...

[LLVMdev] Vectorizing global struct pointers

2013 Feb 05

...initializer, align 8 ... %arrayidx = getelementptr inbounds %struct.anon* @Foo, i32 0, i32 1, i32 %idxprom %0 = load i64* %arrayidx, align 8 %arrayidx2 = getelementptr inbounds %struct.anon* @Foo, i32 0, i32 2, i32 %idxprom %1 = load i64* %arrayidx2, align 8 %mul = mul nsw i64 %1, %0 %arrayidx4 = getelementptr inbounds %struct.anon* @Foo, i32 0, i32 0, i32 %idxprom store i64 %mul, i64* %arrayidx4, align 8

Question about instcombine pass.

2018 Feb 27

...--------------------- IR.(Excerpt) ---------------------------------- for.body: ; preds = %for.body, %entry %store_forwarded = phi i32 [ %load_initial, %entry ], [ %mul10, %for.body ] %indvars.iv = phi i64 [ 1, %entry ], [ %indvars.iv.next, %for.body ] %arrayidx4 = getelementptr inbounds [10 x i32], [10 x i32]* @a, i64 0, i64 %indvars.iv %0 = load i32, i32* %arrayidx4, align 4, !tbaa !2 %mul10 = mul nsw i32 %store_forwarded, %store_forwarded %arrayidx12 = getelementptr inbounds [10 x i32], [10 x i32]* @X, i64 0, i64 %indvars.iv store i32 %mul10, i32...

[LLVMdev] SIV tests in LoopDependence Analysis, Sanjoy's patch

2012 Apr 23

Hi, When I write various test cases and explore how they're handled by the code in LoopDependenceAnalysis::analysePair, I'm surprised. This loop collects pairs of subscripts from the source and destination refs. * // Collect GEP operand pairs (FIXME: use GetGEPOperands from BasicAA), adding* * // trailing zeroes to the smaller GEP, if needed.* * GEPOpdsTy destOpds, srcOpds;* *

MemorySSA question

2017 Dec 19

...arrayidx = getelementptr inbounds i32, i32* %b, i64 %indvars.iv35 ; MemoryUse(3) %2 = load i32, i32* %arrayidx, align 4, !tbaa !2 %arrayidx2 = getelementptr inbounds i32, i32* %c, i64 %indvars.iv35 ; MemoryUse(3) %3 = load i32, i32* %arrayidx2, align 4, !tbaa !2 %add = add nsw i32 %3, %2 %arrayidx4 = getelementptr inbounds i32, i32* %a, i64 %indvars.iv35 *; 1 = MemoryDef(3)* store i32 %add, i32* %arrayidx4, align 4, !tbaa !2 %indvars.iv.next36 = add nuw nsw i64 %indvars.iv35, 5 %cmp = icmp slt i64 %indvars.iv.next36, %1 br i1 %cmp, label %for.body, label %for.end for.end:...

[LLVMdev] [Vectorization] Mis match in code generated

2014 Sep 18

...32* %a, > align 4, !tbaa !1 %arrayidx1 = getelementptr inbounds i32* %a, i32 1 %1 = > load i32* %arrayidx1, align 4, !tbaa !1 %add = add nsw i32 %1, %0 > %arrayidx2 = getelementptr inbounds i32* %a, i32 2 %2 = load i32* > %arrayidx2, align 4, !tbaa !1 %add3 = add nsw i32 %add, %2 %arrayidx4 = > getelementptr inbounds i32* %a, i32 3 %3 = load i32* %arrayidx4, align 4, > !tbaa !1 %add5 = add nsw i32 %add3, %3 %arrayidx6 = getelementptr > inbounds i32* %a, i32 4 %4 = load i32* %arrayidx6, align 4, !tbaa !1 > %add7 = add nsw i32 %add5, %4 %arrayidx8 = getelementptr inboun...

[LLVMdev] Vectorizing global struct pointers

2013 Feb 05

...; > %arrayidx = getelementptr inbounds %struct.anon* @Foo, i32 0, i32 1, i32 %idxprom > %0 = load i64* %arrayidx, align 8 > %arrayidx2 = getelementptr inbounds %struct.anon* @Foo, i32 0, i32 2, i32 %idxprom > %1 = load i64* %arrayidx2, align 8 > %mul = mul nsw i64 %1, %0 > %arrayidx4 = getelementptr inbounds %struct.anon* @Foo, i32 0, i32 0, i32 %idxprom > store i64 %mul, i64* %arrayidx4, align 8...

[LLVMdev] [Vectorization] Mis match in code generated

2014 Sep 19

...%0 = load i32* %a, align 4, !tbaa !1 %arrayidx1 = getelementptr inbounds i32* %a, i32 1 %1 = load i32* %arrayidx1, align 4, !tbaa !1 %add = add nsw i32 %1, %0 %arrayidx2 = getelementptr inbounds i32* %a, i32 2 %2 = load i32* %arrayidx2, align 4, !tbaa !1 %add3 = add nsw i32 %add, %2 %arrayidx4 = getelementptr inbounds i32* %a, i32 3 %3 = load i32* %arrayidx4, align 4, !tbaa !1 %add5 = add nsw i32 %add3, %3 %arrayidx6 = getelementptr inbounds i32* %a, i32 4 %4 = load i32* %arrayidx6, align 4, !tbaa !1 %add7 = add nsw i32 %add5, %4 %arrayidx8 = getelementptr inbounds i32* %a, i...

RFC phantom memory intrinsic

2017 Sep 13

...= insertelement <4 x double> undef, double %0, i32 0 %add = add i64 %i, 1 %arrayidx1 = getelementptr inbounds double, double* %ptr, i64 %add %1 = load double, double* %arrayidx1, align 8 %vecinit2 = insertelement <4 x double> %vecinit, double %1, i32 1 %add3 = add i64 %i, 2 %arrayidx4 = getelementptr inbounds double, double* %ptr, i64 %add3 %2 = load double, double* %arrayidx4, align 8 %vecinit5 = insertelement <4 x double> %vecinit2, double %2, i32 2 %add6 = add i64 %i, 3 %arrayidx7 = getelementptr inbounds double, double* %ptr, i64 %add6 %3 = load double, doubl...

[LLVMdev] [Vectorization] Mis match in code generated

2014 Sep 18

...{entry: %0 = load i32* %a, align 4, !tbaa !1 %arrayidx1 = getelementptr inbounds i32* %a, i32 1 %1 = load i32* %arrayidx1, align 4, !tbaa !1 %add = add nsw i32 %1, %0 %arrayidx2 = getelementptr inbounds i32* %a, i32 2 %2 = load i32* %arrayidx2, align 4, !tbaa !1 %add3 = add nsw i32 %add, %2 %arrayidx4 = getelementptr inbounds i32* %a, i32 3 %3 = load i32* %arrayidx4, align 4, !tbaa !1 %add5 = add nsw i32 %add3, %3 %arrayidx6 = getelementptr inbounds i32* %a, i32 4 %4 = load i32* %arrayidx6, align 4, !tbaa !1 %add7 = add nsw i32 %add5, %4 %arrayidx8 = getelementptr inbounds i32* %a, i32 5 %...

[LLVMdev] Pointer Context Metadata (was: Parallel Loop Metadata)

2013 Feb 19

On 02/19/2013 05:51 PM, Hal Finkel wrote: > Understood. If you have some time, it seems that there are several sub-tasks: > > - Update the language reference Document the additional optional iteration id argument to llvm.mem.parallel_loop_access? I'll do this. > - Update the loop vectorizer (to update the metadata when it unrolls) > - Update the regular unroller

[LLVMdev] RFC: [PATCH] parallel loop metadata

2013 Feb 04

...correct use of +both ``llvm.loop.parallel`` and ``llvm.mem.parallel_loop_access`` +metadata types that refer to the same loop identifier metadata. + +.. code-block:: llvm + + for.body: + ... + %0 = load i32* %arrayidx, align 4, !llvm.mem.parallel_loop_access !0 + ... + store i32 %0, i32* %arrayidx4, align 4, !llvm.mem.parallel_loop_access !0 + ... + br i1 %exitcond, label %for.end, label %for.body, !llvm.loop.parallel !0 + + for.end: + ... + !0 = metadata !{ metadata !0 } + +It is also possible to have nested parallel loops. In that case the +memory accesses refer to a list of loop...
