Displaying 20 results from an estimated 55 matches for "add8".

2013 Oct 30 (2 replies) [LLVMdev] loop vectorizer
...ter comparisons.
LV: Checking memory dependencies
LV: Bad stride - Not an AddRecExpr pointer %arrayidx11 = getelementptr inbounds float* %c, i64 %add2 SCEV: ((4 * %add2)<nsw> + %c)<nsw>
LV: Bad stride - Not an AddRecExpr pointer %arrayidx15 = getelementptr inbounds float* %c, i64 %add8 SCEV: ((4 * %add8)<nsw> + %c)<nsw>
LV: Src Scev: ((4 * %add2)<nsw> + %c)<nsw>Sink Scev: ((4 * %add8)<nsw> + %c)<nsw>(Induction step: 0)
LV: Distance for store float %add10, float* %arrayidx11, align 4 to store float %add14, float* %arrayidx15, align 4: ((4...

2011 Jul 05 (3 replies) [LLVMdev] optimizer returning wrong variable?
...%Entry
store i32 0, i32* %y
br label %Repeat2
Repeat2: ; preds = %Until2, %Repeat1
%x3 = load i32* %x
%y4 = load i32* %y
%add = add i32 %x3, %y4
%c5 = load i32* %c
%add6 = add i32 %c5, %add
store i32 %add6, i32* %c
%y7 = load i32* %y
%add8 = add i32 %y7, 1
store i32 %add8, i32* %y
br label %Until2
Until2: ; preds = %Repeat2
%y9 = load i32* %y
%b10 = load i32* %b2
%cond = icmp sge i32 %y9, %b10
br i1 %cond, label %Repeat2, label %Untilcmp2
Untilcmp2:...

2013 Oct 30 (3 replies) [LLVMdev] loop vectorizer
...oremerge10 = phi i64 [ %inc, %for.body ], [ %start, %entry ]
> %div = lshr i64 %storemerge10, 2
> %mul1 = shl i64 %div, 3
> %rem = and i64 %storemerge10, 3
> %add2 = or i64 %mul1, %rem
> %0 = lshr i64 %storemerge10, 1
> %add51 = shl i64 %0, 2
> %mul6 = or i64 %rem, %add51
> %add8 = or i64 %mul6, 4
> %arrayidx = getelementptr inbounds float* %a, i64 %add2
> %1 = load float* %arrayidx, align 4
> %arrayidx9 = getelementptr inbounds float* %b, i64 %add2
> %2 = load float* %arrayidx9, align 4
> %add10 = fadd float %1, %2
> %arrayidx11 = getelementptr inbounds f...

2013 Oct 30 (0 replies) [LLVMdev] loop vectorizer
...i1 %cmp14, label %for.body, label %for.end
for.body: ; preds = %entry, %for.body
%i.015 = phi i64 [ %inc, %for.body ], [ %start, %entry ]
%div = lshr i64 %i.015, 2
%mul = shl i64 %div, 3
%rem = and i64 %i.015, 3
%add2 = or i64 %mul, %rem
%add8 = or i64 %add2, 4
%arrayidx = getelementptr inbounds float* %a, i64 %add2
%0 = load float* %arrayidx, align 4
%arrayidx9 = getelementptr inbounds float* %b, i64 %add2
%1 = load float* %arrayidx9, align 4
%add10 = fadd float %0, %1
%arrayidx11 = getelementptr inbounds float* %c, i6...

2011 Jul 05 (0 replies) [LLVMdev] optimizer returning wrong variable?
...il1, %Entry
> store i32 0, i32* %y
> br label %Repeat2
>
> Repeat2: ; preds = %Until2, %Repeat1
> %x3 = load i32* %x
> %y4 = load i32* %y
> %add = add i32 %x3, %y4
> %c5 = load i32* %c
> %add6 = add i32 %c5, %add
> store i32 %add6, i32* %c
> %y7 = load i32* %y
> %add8 = add i32 %y7, 1
> store i32 %add8, i32* %y
> br label %Until2
>
> Until2: ; preds = %Repeat2
> %y9 = load i32* %y
> %b10 = load i32* %b2
> %cond = icmp sge i32 %y9, %b10
> br i1 %cond, label %Repeat2, label %Untilcmp2
>
> Untilcmp2: ; preds = %Until2
> %x11 = load...

2013 Oct 30 (0 replies) [LLVMdev] loop vectorizer
...ns.
> LV: Checking memory dependencies
> LV: Bad stride - Not an AddRecExpr pointer %arrayidx11 = getelementptr inbounds float* %c, i64 %add2 SCEV: ((4 * %add2)<nsw> + %c)<nsw>
> LV: Bad stride - Not an AddRecExpr pointer %arrayidx15 = getelementptr inbounds float* %c, i64 %add8 SCEV: ((4 * %add8)<nsw> + %c)<nsw>
> LV: Src Scev: ((4 * %add2)<nsw> + %c)<nsw>Sink Scev: ((4 * %add8)<nsw> + %c)<nsw>(Induction step: 0)
> LV: Distance for store float %add10, float* %arrayidx11, align 4 to store float %add14, float* %arrayidx15, align...

2013 Oct 30 (2 replies) [LLVMdev] loop vectorizer
..., label %for.end
>
> for.body: ; preds = %entry, %for.body
> %i.015 = phi i64 [ %inc, %for.body ], [ %start, %entry ]
> %div = lshr i64 %i.015, 2
> %mul = shl i64 %div, 3
> %rem = and i64 %i.015, 3
> %add2 = or i64 %mul, %rem
> %add8 = or i64 %add2, 4
> %arrayidx = getelementptr inbounds float* %a, i64 %add2
> %0 = load float* %arrayidx, align 4
> %arrayidx9 = getelementptr inbounds float* %b, i64 %add2
> %1 = load float* %arrayidx9, align 4
> %add10 = fadd float %0, %1
> %arrayidx11 = getelementptr inbo...

2013 Oct 30 (0 replies) [LLVMdev] loop vectorizer
...%for.body
%storemerge10 = phi i64 [ %inc, %for.body ], [ %start, %entry ]
%div = lshr i64 %storemerge10, 2
%mul1 = shl i64 %div, 3
%rem = and i64 %storemerge10, 3
%add2 = or i64 %mul1, %rem
%0 = lshr i64 %storemerge10, 1
%add51 = shl i64 %0, 2
%mul6 = or i64 %rem, %add51
%add8 = or i64 %mul6, 4
%arrayidx = getelementptr inbounds float* %a, i64 %add2
%1 = load float* %arrayidx, align 4
%arrayidx9 = getelementptr inbounds float* %b, i64 %add2
%2 = load float* %arrayidx9, align 4
%add10 = fadd float %1, %2
%arrayidx11 = getelementptr inbounds float* %c, i6...

2013 Oct 30 (3 replies) [LLVMdev] loop vectorizer
On 30 October 2013 09:25, Nadav Rotem <nrotem at apple.com> wrote:
> The access pattern to arrays a and b is non-linear. Unrolled loops are
> usually handled by the SLP-vectorizer. Are ir0 and ir1 consecutive for all
> values of i?
>
Based on his list of values, it seems that the induction stride is linear
within each block of 4 iterations, but it's not a clear
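
To make the discussion concrete, here is a hedged C reconstruction of the loop from the IR snippets above (the function name, types, and bounds are assumptions): ir0 walks 0,1,2,3, 8,9,10,11, ... and ir1 = ir0 + 4, so the stride is 1 inside each block of 4 iterations but jumps by 5 across block boundaries, which is why SCEV cannot express either pointer as a single AddRecExpr.

    /* Hypothetical reconstruction, not the poster's source. */
    void foo(float *a, float *b, float *c, long start, long n) {
        for (long i = start; i < n; i++) {
            long ir0 = (i >> 2) * 8 + (i & 3); /* %div, %mul, %rem, %add2 */
            long ir1 = ir0 | 4;                /* %add8 (== ir0 + 4 here,
                                                  since bit 2 of ir0 is 0) */
            c[ir0] = a[ir0] + b[ir0];          /* %add10 */
            c[ir1] = a[ir1] + b[ir1];          /* %add14 */
        }
    }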

2014 Nov 28 (2 replies) [LLVMdev] ScalarEvolution: Suboptimal handling of globals
...0;
    int newInc = incr + 1;
    for (int i = 0; i < 1000; i++) {
        for (int j = 0; j < 1000; j += incr) {
            x += (Arr[i] + Arr[j]);
        }
    }
    return x;
}
=================================
The SCEV expression computed for the variable "j" is:
%j.0 = phi i32 [ 0, %for.body ], [ %add8, %for.inc ]  -->  %j.0  Exits: <<Unknown>>
As is evident, this isn't a useful computation. Whereas if I use the variable newInc as the increment for "j", i.e., "j += newInc" in the inner loop, the computed SCEV is %j.0 = phi i32 [ 0, %for.body...
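
For reference, a self-contained version of the truncated example; declaring incr as a global (per the subject line) and the Arr definition are my assumptions:

    int incr;        /* assumed global increment (nonzero when run) */
    int Arr[1000];   /* assumed array definition */

    int test(void) {
        int x = 0;
        int newInc = incr + 1;
        for (int i = 0; i < 1000; i++) {
            /* With "j += incr", SCEV reports the <<Unknown>> expression
               quoted above; with "j += newInc" it reportedly computes a
               usable recurrence. */
            for (int j = 0; j < 1000; j += incr) {
                x += (Arr[i] + Arr[j]);
            }
        }
        return x;
    }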

2013 Oct 30 (0 replies) [LLVMdev] loop vectorizer
...r.body: ; preds = %entry, %for.body
>> %i.015 = phi i64 [ %inc, %for.body ], [ %start, %entry ]
>> %div = lshr i64 %i.015, 2
>> %mul = shl i64 %div, 3
>> %rem = and i64 %i.015, 3
>> %add2 = or i64 %mul, %rem
>> %add8 = or i64 %add2, 4
>> %arrayidx = getelementptr inbounds float* %a, i64 %add2
>> %0 = load float* %arrayidx, align 4
>> %arrayidx9 = getelementptr inbounds float* %b, i64 %add2
>> %1 = load float* %arrayidx9, align 4
>> %add10 = fadd float %0, %1
>> %a...

2008 Jul 08 (3 replies) [LLVMdev] Implementing llvm.atomic.cmp.swap.i32 on PowerPC
...> The "let Uses = [R0]" is not needed. The pseudo instruction will be
> expanded like this later:
>
> + BuildMI(BB, TII->get(is64bit ? PPC::LDARX : PPC::LWARX), dest)
> + .addReg(ptrA).addReg(ptrB);
> + BuildMI(BB, TII->get(is64bit ? PPC::ADD4 : PPC::ADD8), PPC::R0)
> + .addReg(incr).addReg(dest);
> + BuildMI(BB, TII->get(is64bit ? PPC::STDCX : PPC::STWCX))
> + .addReg(PPC::R0).addReg(ptrA).addReg(ptrB);
>
> The second instruction defines R0 and the 3rd reads R0 which is
> enough to tell the register allocator what...

2013 Nov 13 (2 replies) [LLVMdev] SCEV getMulExpr() not propagating Wrap flags
...= load i32* %arrayidx, align 4, !tbaa !1
%add = add nsw i32 %1, %I
%arrayidx3 = getelementptr inbounds i32* %a, i64 %0
store i32 %add, i32* %arrayidx3, align 4, !tbaa !1
%2 = or i64 %0, 1
%arrayidx7 = getelementptr inbounds i32* %b, i64 %2
%3 = load i32* %arrayidx7, align 4, !tbaa !1
%add8 = add nsw i32 %3, %I
%arrayidx12 = getelementptr inbounds i32* %a, i64 %2
store i32 %add8, i32* %arrayidx12, align 4, !tbaa !1
%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
%exitcond = icmp eq i64 %indvars.iv.next, 512
br i1 %exitcond, label %for.end, label %for.body
And, when going...
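
The snippet reads like a loop unrolled by two over an even/odd index pair; a hedged C equivalent follows (the even-index definition of %0 is an inference, since the snippet cuts off before it):

    void addI(int *a, int *b, int I) {
        for (long iv = 0; iv < 512; iv++) {   /* %indvars.iv, %exitcond */
            long even = 2 * iv;               /* assumed definition of %0 */
            long odd  = even | 1;             /* %2 = or i64 %0, 1 */
            a[even] = b[even] + I;            /* %add */
            a[odd]  = b[odd]  + I;            /* %add8 */
        }
    }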

2010 Jun 05 (0 replies) [LLVMdev] Converting into SSA form
...; preds = %if.else, %if.then
%a.0 = phi i32 [ 6, %if.then ], [ 2, %if.else ] ; <i32> [#uses=1]
%b.0 = phi i32 [ 3, %if.then ], [ 4, %if.else ] ; <i32> [#uses=1]
%c.0 = phi i32 [ %add, %if.then ], [ %mul, %if.else ] ; <i32> [#uses=1]
%add8 = add nsw i32 %c.0, %a.0 ; <i32> [#uses=1]
%add11 = add nsw i32 %add8, %b.0 ; <i32> [#uses=1]
ret i32 %add11
}
In order to preserve the constants' names from the original program,
mem2reg would have to insert operations like
%a.0 = bitcast i32 2...

2010 Jun 05 (2 replies) [LLVMdev] Converting into SSA form
Suppose my input function is like:

myfunc(int x, int y) {
    int a = 2, b = 3, c = 5;
    if (x > y) {
        c = a + b;
        a = 6;
    }
    else {
        c = a * b;
        b = 4;
    }
    a = c + a;
    c = a + b;
}
and the output should be:

myfunc(int x, int y) {
    int a.0 = 2, b.0 = 3, c.0 = 5;
    if (x > y) {
        c.1 = a.0 + b.0;
        a.1 = 6;
    }
    else {
        c.2 = a.0 * b.0;
        b.1 = 4;
    }

2008 Jul 08 (0 replies) [LLVMdev] Implementing llvm.atomic.cmp.swap.i32 on PowerPC
...et Uses = [R0]" is not needed. The pseudo instruction will be
>> expanded like this later:
>>
>> + BuildMI(BB, TII->get(is64bit ? PPC::LDARX : PPC::LWARX), dest)
>> + .addReg(ptrA).addReg(ptrB);
>> + BuildMI(BB, TII->get(is64bit ? PPC::ADD4 : PPC::ADD8), PPC::R0)
>> + .addReg(incr).addReg(dest);
>> + BuildMI(BB, TII->get(is64bit ? PPC::STDCX : PPC::STWCX))
>> + .addReg(PPC::R0).addReg(ptrA).addReg(ptrB);
>>
>> The second instruction defines R0 and the 3rd reads R0 which is
>> enough to tell the...

2013 Oct 28 (2 replies) [LLVMdev] loop vectorizer says Bad stride
...%arrayidx4, align 4
%11 = load i32* %i, align 4
%add5 = add nsw i32 256, %11
%idxprom6 = sext i32 %add5 to i64
%12 = load float** %a.addr, align 8
%arrayidx7 = getelementptr inbounds float* %12, i64 %idxprom6
%13 = load float* %arrayidx7, align 4
%14 = load i32* %i, align 4
%add8 = add nsw i32 256, %14
%idxprom9 = sext i32 %add8 to i64
%15 = load float** %b.addr, align 8
%arrayidx10 = getelementptr inbounds float* %15, i64 %idxprom9
%16 = load float* %arrayidx10, align 4
%add11 = fadd float %13, %16
%17 = load i32* %i, align 4
%add12 = add nsw i32 256,...

2015 Feb 26 (6 replies) [LLVMdev] RFC: Loop versioning for LICM
...; preds = %for.body3.lr.ph, %for.body3
%indvars.iv = phi i64 [ %indvars.iv.next, %for.body3 ], [ %2, %for.body3.lr.ph ]
%arrayidx = getelementptr inbounds i32* %var1, i64 %indvars.iv
store i32 %add, i32* %arrayidx, align 4, !tbaa !1
%8 = load i32* %arrayidx7, align 4, !tbaa !1
%add8 = add nsw i32 %8, %add
store i32 %add8, i32* %arrayidx7, align 4, !tbaa !1
%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
%lftr.wideiv = trunc i64 %indvars.iv to i32
%exitcond = icmp eq i32 %lftr.wideiv, %0
br i1 %exitcond, label %for.inc11, label %for.body3
In versioned loop differen...
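
As a sketch of the idea in C (my illustration, not code from the RFC): the loop is versioned on a runtime no-alias check, so that LICM can promote the loop-invariant memory access to a register in the checked copy.

    /* Original: *p may alias var1[], so the load/store of *p must stay
       inside the loop. */
    void plain(int *var1, int *p, int n, int add) {
        for (int i = 0; i < n; i++) {
            var1[i] = add;
            *p += add;
        }
    }

    /* Versioned: if p provably does not alias var1[0..n), run a fast
       copy where *p is kept in a register. */
    void versioned(int *var1, int *p, int n, int add) {
        if (p < var1 || p >= var1 + n) {   /* runtime alias check */
            int t = *p;                    /* promoted by LICM */
            for (int i = 0; i < n; i++) {
                var1[i] = add;
                t += add;
            }
            *p = t;                        /* store sunk out of the loop */
        } else {
            plain(var1, p, n, add);        /* conservative fallback */
        }
    }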

2008 Jul 04 (0 replies) [LLVMdev] Implementing llvm.atomic.cmp.swap.i32 on PowerPC
...MIC_LOAD_ADD_I32 : Pseudo<
The "let Uses = [R0]" is not needed. The pseudo instruction will be
expanded like this later:
+ BuildMI(BB, TII->get(is64bit ? PPC::LDARX : PPC::LWARX), dest)
+ .addReg(ptrA).addReg(ptrB);
+ BuildMI(BB, TII->get(is64bit ? PPC::ADD4 : PPC::ADD8), PPC::R0)
+ .addReg(incr).addReg(dest);
+ BuildMI(BB, TII->get(is64bit ? PPC::STDCX : PPC::STWCX))
+ .addReg(PPC::R0).addReg(ptrA).addReg(ptrB);
The second instruction defines R0 and the 3rd reads R0 which is enough
to tell the register allocator what to do.
I do have a questio...
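
For context, the LWARX/ADD4/STWCX (or LDARX/ADD8/STDCX on 64-bit) sequence is the load-reserved/store-conditional retry loop behind an atomic fetch-add. In portable C11 terms the semantics are roughly the following (a sketch of the semantics, not the actual PPC lowering):

    #include <stdatomic.h>

    int atomic_load_add_i32(_Atomic int *p, int incr) {
        int old = atomic_load_explicit(p, memory_order_relaxed);
        /* Like stwcx., the weak compare-exchange may fail and retry;
           on failure, `old` is refreshed with the current value. */
        while (!atomic_compare_exchange_weak(p, &old, old + incr))
            ;
        return old;   /* value before the add, as loaded into dest */
    }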

2013 Oct 28 (0 replies) [LLVMdev] loop vectorizer says Bad stride
...%11 = load i32* %i, align 4
> %add5 = add nsw i32 256, %11
> %idxprom6 = sext i32 %add5 to i64
> %12 = load float** %a.addr, align 8
> %arrayidx7 = getelementptr inbounds float* %12, i64 %idxprom6
> %13 = load float* %arrayidx7, align 4
> %14 = load i32* %i, align 4
> %add8 = add nsw i32 256, %14
> %idxprom9 = sext i32 %add8 to i64
> %15 = load float** %b.addr, align 8
> %arrayidx10 = getelementptr inbounds float* %15, i64 %idxprom9
> %16 = load float* %arrayidx10, align 4
> %add11 = fadd float %13, %16
> %17 = load i32* %i, align 4
> %add1...