Displaying 10 results from an estimated 10 matches for "add29".
2012 Jan 26
[LLVMdev] [llvm-commits] [PATCH] BasicBlock Autovectorization Pass
...zext i8 %0 to i32 <-> %conv15 = zext i8 %1 to i32
BBV: selected pair: %add26 = add i32 %mul25, %mul23 <-> %add36 = add i32 %mul35, %mul33
BBV: selected pair: %mul = mul nsw i32 %conv14, 123 <-> %mul16 = mul nsw i32 %conv15, 321
BBV: selected pair: %conv30 = trunc i32 %add29 to i8 <-> %conv40 = trunc i32 %add39 to i8
BBV: selected pair: %mul25 = mul nsw i32 %conv15, 432 <-> %mul33 = mul nsw i32 %conv14, 345
BBV: selected pair: %add29 = add i32 %add26, %mul28 <-> %add39 = add i32 %add36, %mul38
BBV: selected pair: store i8 %conv30, i8* %inc...
2012 Jan 26
[LLVMdev] [llvm-commits] [PATCH] BasicBlock Autovectorization Pass
On Thu, 2012-01-26 at 15:12 -0600, Sebastian Pop wrote:
> On Thu, Jan 26, 2012 at 2:49 PM, Hal Finkel <hfinkel at anl.gov> wrote:
> > Thanks! Did you compile with any non-default flags other than -mllvm
> > -vectorize?
>
> I used -O3 and -vectorize, no other non-default flags.
If I run clang -O3 -mllvm -vectorize -S -emit-llvm -o test.ll test.c
then I get no
2012 Jan 26
[LLVMdev] [llvm-commits] [PATCH] BasicBlock Autovectorization Pass
On Thu, Jan 26, 2012 at 3:41 PM, Hal Finkel <hfinkel at anl.gov> wrote:
> On Thu, 2012-01-26 at 15:36 -0600, Sebastian Pop wrote:
>> arm-none-linux-gnueabi
>
> Indeed, adding -ccc-host-triple arm-none-linux-gnueabi I also get
Minor remark: please use -target instead of -ccc-host-triple, which is
now deprecated.
Thanks for looking at this testcase.
Sebastian
--
Qualcomm
2012 Jan 26
[LLVMdev] [llvm-commits] [PATCH] BasicBlock Autovectorization Pass
On Thu, 2012-01-26 at 15:36 -0600, Sebastian Pop wrote:
> arm-none-linux-gnueabi
Indeed, adding -ccc-host-triple arm-none-linux-gnueabi I also get
vectorization (even though I don't get vectorization when targeting
x86_64). I'll let you know what I find.
-Hal
--
Hal Finkel
Postdoctoral Appointee
Leadership Computing Facility
Argonne National Laboratory
2014 Sep 18
[LLVMdev] [Vectorization] Mis match in code generated
...tbaa !1
> %add25 = add nsw i32 %add23, %13
> %arrayidx26 = getelementptr inbounds i32* %a, i32 14
> %14 = load i32* %arrayidx26, align 4, !tbaa !1
> %add27 = add nsw i32 %add25, %14
> %arrayidx28 = getelementptr inbounds i32* %a, i32 15
> %15 = load i32* %arrayidx28, align 4, !tbaa !1
> %add29 = add nsw i32 %add27, %15
> ret i32 %add29
> }*
> $ opt -S -slp-vectorizer -slp-vectorize-hor test.ll -debug -o test2.ll
>
> Features:+64bit,+sse2
> CPU:generic
>
> Subtarget features: SSELevel 3, 3DNowLevel 0, 64bit 1
> SLP: Analyzing blocks in foo.
>
> *test2.ll (IR af...
2014 Sep 19
[LLVMdev] [Vectorization] Mis match in code generated
...align 4, !tbaa !1
%add25 = add nsw i32 %add23, %13
%arrayidx26 = getelementptr inbounds i32* %a, i32 14
%14 = load i32* %arrayidx26, align 4, !tbaa !1
%add27 = add nsw i32 %add25, %14
%arrayidx28 = getelementptr inbounds i32* %a, i32 15
%15 = load i32* %arrayidx28, align 4, !tbaa !1
%add29 = add nsw i32 %add27, %15
store i32 %add29, i32* %sum, align 4, !tbaa !1
ret void
}
*IR after SLP vectorization with appropriate flags :*
$ opt -S -slp-vectorizer -slp-vectorize-hor=1 -slp-vectorize-hor-store=1 test.ll -debug
(I hope I am passing the args correctly to opt)
Subtarget featur...
2012 Jul 11
[LLVMdev] [NVPTX] llc -march=nvptx64 -mcpu=sm_20 generates invalid zero align for device function params
...float* %x24, align 4
%12 = tail call float @llvm.nvvm.mul.rn.f(float %0, float %11) nounwind
%x26 = getelementptr inbounds %struct.float2* %x, i64 0, i32 0
%13 = load float* %x26, align 4
%14 = tail call float @llvm.nvvm.mul.rn.f(float %13, float %3) nounwind
%add = fadd float %12, %14
%add29 = fadd float %10, %add
%15 = tail call float @llvm.nvvm.add.rn.f(float %6, float %add29) nounwind
%sub32 = fsub float %6, %15
%16 = tail call float @llvm.nvvm.add.rn.f(float %sub32, float %add29) nounwind
%agg.result.0 = getelementptr inbounds %struct.float2* %agg.result, i64 0, i32 0
sto...
2014 Sep 18
[LLVMdev] [Vectorization] Mis match in code generated
...rayidx24, align 4, !tbaa !1
%add25 = add nsw i32 %add23, %13
%arrayidx26 = getelementptr inbounds i32* %a, i32 14
%14 = load i32* %arrayidx26, align 4, !tbaa !1
%add27 = add nsw i32 %add25, %14
%arrayidx28 = getelementptr inbounds i32* %a, i32 15
%15 = load i32* %arrayidx28, align 4, !tbaa !1
%add29 = add nsw i32 %add27, %15
ret i32 %add29
}*
$ opt -S -slp-vectorizer -slp-vectorize-hor test.ll -debug -o test2.ll
Features:+64bit,+sse2
CPU:generic
Subtarget features: SSELevel 3, 3DNowLevel 0, 64bit 1
SLP: Analyzing blocks in foo.
*test2.ll (IR after SLP vectorization) :*...
2014 Nov 10
[LLVMdev] [Vectorization] Mis match in code generated
...ementptr inbounds i32* %a, i32 14
> > > %14 = load i32* %arrayidx26, align 4, !tbaa !1
> > > %add27 = add nsw i32 %add25, %14
> > > %arrayidx28 = getelementptr inbounds i32* %a, i32 15
> > > %15 = load i32* %arrayidx28, align 4, !tbaa !1
> > > %add29 = add nsw i32 %add27, %15
> > >
> > >
> > > store i32 %add29, i32* %sum, align 4, !tbaa !1
> > > ret void
> > > }
> > >
> > >
> > >
> > > IR after SLP vectorization with appropriate flags :
> > >
> >...
2012 Nov 09
[LLVMdev] [NVPTX] llc -march=nvptx64 -mcpu=sm_20 generates invalid zero align for device function params
...2 = tail call float @llvm.nvvm.mul.rn.f(float %0, float %11) nounwind
> %x26 = getelementptr inbounds %struct.float2* %x, i64 0, i32 0
> %13 = load float* %x26, align 4
> %14 = tail call float @llvm.nvvm.mul.rn.f(float %13, float %3) nounwind
> %add = fadd float %12, %14
> %add29 = fadd float %10, %add
> %15 = tail call float @llvm.nvvm.add.rn.f(float %6, float %add29) nounwind
> %sub32 = fsub float %6, %15
> %16 = tail call float @llvm.nvvm.add.rn.f(float %sub32, float %add29) nounwind
> %agg.result.0 = getelementptr inbounds %struct.float2* %agg.r...