thr3ads.net - search: "arrayidx7"

Displaying 20 results from an estimated 30 matches for "arrayidx7".

2013 Oct 31

[LLVMdev] SCEV and GEP NSW flag

...er, label %if.end for.body.preheader: br label %for.body for.body: %indvars.iv = phi i64 [ %indvars.iv.next, %for.body ], [ 0, %for.body.preheader ] %add = add nsw i64 %indvars.iv, %k %arrayidx = getelementptr inbounds i32* %a, i64 %add %0 = load i32* %arrayidx, align 4, !tbaa !1 ... %arrayidx7 = getelementptr inbounds i32* %a, i64 %indvars.iv store i32 %add5, i32* %arrayidx7, align 4, !tbaa !1 ... My goal here is to detect that, within the loop, (&a[i] - &a[i + k]) is negative, as the explicit loop guard guarantees. On the path toward that goal, I've run into the following...

2013 Nov 13

[LLVMdev] SCEV getMulExpr() not propagating Wrap flags

...r.body ] %0 = shl nsw i64 %indvars.iv, 1 %arrayidx = getelementptr inbounds i32* %b, i64 %0 %1 = load i32* %arrayidx, align 4, !tbaa !1 %add = add nsw i32 %1, %I %arrayidx3 = getelementptr inbounds i32* %a, i64 %0 store i32 %add, i32* %arrayidx3, align 4, !tbaa !1 %2 = or i64 %0, 1 %arrayidx7 = getelementptr inbounds i32* %b, i64 %2 %3 = load i32* %arrayidx7, align 4, !tbaa !1 %add8 = add nsw i32 %3, %I %arrayidx12 = getelementptr inbounds i32* %a, i64 %2 store i32 %add8, i32* %arrayidx12, align 4, !tbaa !1 %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1 %exitcond = icmp e...

2013 Nov 02

[LLVMdev] SCEV and GEP NSW flag

...> br label %for.body > > for.body: > %indvars.iv = phi i64 [ %indvars.iv.next, %for.body ], [ 0, %for.body.preheader ] > %add = add nsw i64 %indvars.iv, %k > %arrayidx = getelementptr inbounds i32* %a, i64 %add > %0 = load i32* %arrayidx, align 4, !tbaa !1 > ... > %arrayidx7 = getelementptr inbounds i32* %a, i64 %indvars.iv > store i32 %add5, i32* %arrayidx7, align 4, !tbaa !1 > ... > > My goal here is to detect that, within the loop, (&a[i] - &a[i + k]) is negative, as the explicit loop guard guarantees. On the path toward that goal, I've run...

2017 Sep 13

RFC phantom memory intrinsic

...rtelement <4 x double> %vecinit, double %1, i32 1 %add3 = add i64 %i, 2 %arrayidx4 = getelementptr inbounds double, double* %ptr, i64 %add3 %2 = load double, double* %arrayidx4, align 8 %vecinit5 = insertelement <4 x double> %vecinit2, double %2, i32 2 %add6 = add i64 %i, 3 %arrayidx7 = getelementptr inbounds double, double* %ptr, i64 %add6 %3 = load double, double* %arrayidx7, align 8 %vecinit8 = insertelement <4 x double> %vecinit5, double %3, i32 3 %shuffle = shufflevector <4 x double> %vecinit8, <4 x double> %vecinit8, <4 x i32> <i32 3, i32 3...

2015 Feb 22

[LLVMdev] Question about shouldMergeGEPs in InstructionCombining

...If this GEP has only 0 indices, it is the same pointer as // Src. If Src is not a trivial GEP too, don't combine // the indices. if (GEP.hasAllZeroIndices() && !Src.hasAllZeroIndices() && !Src.hasOneUse()) return false; return true; } I have a case where GEP: %arrayidx7 = getelementptr inbounds i32* %arrayidx, i32 %shl6 Src: %arrayidx = getelementptr inbounds [4096 x i32]* @phasor_4096, i32 0, i32 %shl2 GEP.hasAllZeroIndices() will return false and the merge will occur Why do we want to combine these 2 getelementptr? On my out of tree target, combining these 2 G...

2013 Oct 28

[LLVMdev] loop vectorizer says Bad stride

...i64 %10 = load float** %c.addr, align 8 %arrayidx4 = getelementptr inbounds float* %10, i64 %idxprom3 store float %add, float* %arrayidx4, align 4 %11 = load i32* %i, align 4 %add5 = add nsw i32 256, %11 %idxprom6 = sext i32 %add5 to i64 %12 = load float** %a.addr, align 8 %arrayidx7 = getelementptr inbounds float* %12, i64 %idxprom6 %13 = load float* %arrayidx7, align 4 %14 = load i32* %i, align 4 %add8 = add nsw i32 256, %14 %idxprom9 = sext i32 %add8 to i64 %15 = load float** %b.addr, align 8 %arrayidx10 = getelementptr inbounds float* %15, i64 %idxprom9...

2017 Sep 13

RFC phantom memory intrinsic

...> %add3 = add i64 %i, 2 >> %arrayidx4 = getelementptr inbounds double, double* %ptr, i64 %add3 >> %2 = load double, double* %arrayidx4, align 8 >> %vecinit5 = insertelement <4 x double> %vecinit2, double %2, i32 2 >> %add6 = add i64 %i, 3 >> %arrayidx7 = getelementptr inbounds double, double* %ptr, i64 %add6 >> %3 = load double, double* %arrayidx7, align 8 >> %vecinit8 = insertelement <4 x double> %vecinit5, double %3, i32 3 >> %shuffle = shufflevector <4 x double> %vecinit8, <4 x double> >> %vec...

2013 Jul 05

[LLVMdev] Enabling vectorization with LLVM 3.3 for a DSL emitting LLVM IR

On 07/04/2013 01:39 PM, Stéphane Letz wrote: > Hi, > > Our DSL can generate C or directly generate LLVM IR. With LLVM 3.3, we can vectorize the C produced code using clang with -O3, or clang with -O1 then opt -O3 -vectorize-loops. But the same program generating LLVM IR version cannot be vectorized with opt -O3 -vectorize-loops. So our guess is that our generated LLVM IR lacks some

2013 Jul 04

[LLVMdev] Enabling vectorization with LLVM 3.3 for a DSL emitting LLVM IR

Hi, Our DSL can generate C or directly generate LLVM IR. With LLVM 3.3, we can vectorize the C produced code using clang with -O3, or clang with -O1 then opt -O3 -vectorize-loops. But the same program generating LLVM IR version cannot be vectorized with opt -O3 -vectorize-loops. So our guess is that our generated LLVM IR lacks some informations that are needed by the vectorization passes to

2015 Feb 26

[LLVMdev] RFC: Loop versioning for LICM

...y3: ; preds = %for.body3.lr.ph, %for.body3 %indvars.iv = phi i64 [ %indvars.iv.next, %for.body3 ], [ %2, %for.body3.lr.ph ] %arrayidx = getelementptr inbounds i32* %var1, i64 %indvars.iv store i32 %add, i32* %arrayidx, align 4, !tbaa !1 %8 = load i32* %arrayidx7, align 4, !tbaa !1 %add8 = add nsw i32 %8, %add store i32 %add8, i32* %arrayidx7, align 4, !tbaa !1 %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1 %lftr.wideiv = trunc i64 %indvars.iv to i32 %exitcond = icmp eq i32 %lftr.wideiv, %0 br i1 %exitcond, label %for.inc11, label %for.body3...

2013 Oct 28

[LLVMdev] Loop vectorizer dosen't find loop bounds

...runtime check ptr: %arrayidx14 = getelementptr inbounds float* %c, i64 %2 LV: Found a runtime check ptr: %arrayidx = getelementptr inbounds float* %a, i64 %indvars.iv LV: Found a runtime check ptr: %arrayidx2 = getelementptr inbounds float* %b, i64 %indvars.iv LV: Found a runtime check ptr: %arrayidx7 = getelementptr inbounds float* %a, i64 %2 LV: Found a runtime check ptr: %arrayidx10 = getelementptr inbounds float* %b, i64 %2 LV: We need to do 10 pointer comparisons. LV: We can't vectorize because we can't find the array bounds. LV: Can't vectorize due to memory conflicts LV: No...

2013 Nov 02

[LLVMdev] SCEV and GEP NSW flag

...ody: > > %indvars.iv = phi i64 [ %indvars.iv.next, %for.body ], [ 0, > > %for.body.preheader ] > > %add = add nsw i64 %indvars.iv, %k > > %arrayidx = getelementptr inbounds i32* %a, i64 %add > > %0 = load i32* %arrayidx, align 4, !tbaa !1 > > ... > > %arrayidx7 = getelementptr inbounds i32* %a, i64 %indvars.iv > > store i32 %add5, i32* %arrayidx7, align 4, !tbaa !1 > > ... > > > > My goal here is to detect that, within the loop, (&a[i] - &a[i + > > k]) is negative, as the explicit loop guard guarantees. On the >...

2013 Oct 28

[LLVMdev] loop vectorizer says Bad stride

...oat** %c.addr, align 8 > %arrayidx4 = getelementptr inbounds float* %10, i64 %idxprom3 > store float %add, float* %arrayidx4, align 4 > %11 = load i32* %i, align 4 > %add5 = add nsw i32 256, %11 > %idxprom6 = sext i32 %add5 to i64 > %12 = load float** %a.addr, align 8 > %arrayidx7 = getelementptr inbounds float* %12, i64 %idxprom6 > %13 = load float* %arrayidx7, align 4 > %14 = load i32* %i, align 4 > %add8 = add nsw i32 256, %14 > %idxprom9 = sext i32 %add8 to i64 > %15 = load float** %b.addr, align 8 > %arrayidx10 = getelementptr inbounds float* %15...

2017 Sep 26

RFC phantom memory intrinsic

...i, 2 >>> %arrayidx4 = getelementptr inbounds double, double* %ptr, i64 %add3 >>> %2 = load double, double* %arrayidx4, align 8 >>> %vecinit5 = insertelement <4 x double> %vecinit2, double %2, i32 2 >>> %add6 = add i64 %i, 3 >>> %arrayidx7 = getelementptr inbounds double, double* %ptr, i64 %add6 >>> %3 = load double, double* %arrayidx7, align 8 >>> %vecinit8 = insertelement <4 x double> %vecinit5, double %3, i32 3 >>> %shuffle = shufflevector <4 x double> %vecinit8, <4 x double>...

2013 Nov 16

[LLVMdev] SCEV getMulExpr() not propagating Wrap flags

...s.iv, 1 > %arrayidx = getelementptr inbounds i32* %b, i64 %0 > %1 = load i32* %arrayidx, align 4, !tbaa !1 > %add = add nsw i32 %1, %I > %arrayidx3 = getelementptr inbounds i32* %a, i64 %0 > store i32 %add, i32* %arrayidx3, align 4, !tbaa !1 > %2 = or i64 %0, 1 > %arrayidx7 = getelementptr inbounds i32* %b, i64 %2 > %3 = load i32* %arrayidx7, align 4, !tbaa !1 > %add8 = add nsw i32 %3, %I > %arrayidx12 = getelementptr inbounds i32* %a, i64 %2 > store i32 %add8, i32* %arrayidx12, align 4, !tbaa !1 > %indvars.iv.next = add nuw nsw i64 %indvars.i...

2017 Sep 26

RFC phantom memory intrinsic

...> %arrayidx4 = getelementptr inbounds double, double* %ptr, i64 %add3 >>>> %2 = load double, double* %arrayidx4, align 8 >>>> %vecinit5 = insertelement <4 x double> %vecinit2, double %2, i32 2 >>>> %add6 = add i64 %i, 3 >>>> %arrayidx7 = getelementptr inbounds double, double* %ptr, i64 %add6 >>>> %3 = load double, double* %arrayidx7, align 8 >>>> %vecinit8 = insertelement <4 x double> %vecinit5, double %3, i32 3 >>>> %shuffle = shufflevector <4 x double> %vecinit8, <4...

2009 Feb 13

[LLVMdev] 16bit loads being promoted to 32bit?

...br i1 %cmp, label %if.then, label %if.end if.end: ; preds = %entry ret void if.then: ; preds = %entry %arrayidx = getelementptr i32 addrspace(11)* %result, i32 %call ; <i32 addrspace(11)*> [#uses=1] %arrayidx7 = getelementptr i16 addrspace(11)* %input, i32 %call ; <i16 addrspace(11)*> [#uses=1] %tmp8 = load i16 addrspace(11)* %arrayidx7 ; <i16> [#uses=1] %conv9 = sext i16 %tmp8 to i32 ; <i32> [#uses=1] store i32 %conv9, i32...

2013 Oct 28

[LLVMdev] Loop vectorizer dosen't find loop bounds

...dx14 = getelementptr inbounds > float* %c, i64 %2 > LV: Found a runtime check ptr: %arrayidx = getelementptr inbounds > float* %a, i64 %indvars.iv > LV: Found a runtime check ptr: %arrayidx2 = getelementptr inbounds > float* %b, i64 %indvars.iv > LV: Found a runtime check ptr: %arrayidx7 = getelementptr inbounds > float* %a, i64 %2 > LV: Found a runtime check ptr: %arrayidx10 = getelementptr inbounds > float* %b, i64 %2 > LV: We need to do 10 pointer comparisons. > LV: We can't vectorize because we can't find the array bounds. > LV: Can't vectorize du...

2015 Feb 26

[LLVMdev] RFC: Loop versioning for LICM

...; preds = %for.body3.lr.ph, %for.body3 > %indvars.iv = phi i64 [ %indvars.iv.next, %for.body3 ], [ %2, %for.body3.lr.ph ] > %arrayidx = getelementptr inbounds i32* %var1, i64 %indvars.iv > store i32 %add, i32* %arrayidx, align 4, !tbaa !1 > %8 = load i32* %arrayidx7, align 4, !tbaa !1 > %add8 = add nsw i32 %8, %add > store i32 %add8, i32* %arrayidx7, align 4, !tbaa !1 > %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1 > %lftr.wideiv = trunc i64 %indvars.iv to i32 > %exitcond = icmp eq i32 %lftr.wideiv, %0 > br i1 %exitcond, label...

2017 Sep 26

RFC phantom memory intrinsic

...etelementptr inbounds double, double* %ptr, i64 %add3 >>>>> %2 = load double, double* %arrayidx4, align 8 >>>>> %vecinit5 = insertelement <4 x double> %vecinit2, double %2, i32 2 >>>>> %add6 = add i64 %i, 3 >>>>> %arrayidx7 = getelementptr inbounds double, double* %ptr, i64 %add6 >>>>> %3 = load double, double* %arrayidx7, align 8 >>>>> %vecinit8 = insertelement <4 x double> %vecinit5, double %3, i32 3 >>>>> %shuffle = shufflevector <4 x double> %...
