Displaying 12 results from an estimated 12 matches for "mul6".
2010 Mar 03
5
[LLVMdev] folding x * 0 = 0
...p mode) and renaming _ZSt3sinf to sin
and _ZSt3cosf to cos I get the following:
define float @_Z3fooff(float %a, float %b) nounwind {
entry:
%mul = fmul float %a, %b ; <float> [#uses=1]
%mul2 = fmul float %mul, 0.000000e+000 ; <float> [#uses=1]
%mul6 = fmul float 0x3FDAED54A0000000, %mul2 ; <float> [#uses=1]
ret float %mul6
}
the sin and cos calls are folded, but not the mul by zero.
Maybe this is missing in llvm::ConstantFoldInstOperands in
ConstantFolding.cpp?
I would expect the following optimizations, but didn't find them...
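For reference, the x * 0.0 fold is not valid under strict IEEE-754 semantics: NaN * 0.0 and Inf * 0.0 are NaN, and a negative finite value times +0.0 is -0.0, so a strict constant folder has to leave the fmul alone. (In newer LLVM releases, fast-math flags such as nnan and nsz on the fmul are what permit the fold.) A standalone C++ check, not from the thread, showing the problem cases:

#include <cmath>
#include <cstdio>

int main() {
  // x * 0.0f is only +0.0f for finite, non-negative x; the remaining
  // cases below print -0 and nan, which is why the multiply cannot be
  // folded to the constant +0.0f without relaxed FP semantics.
  float inputs[] = {2.0f, -3.0f, NAN, INFINITY};
  for (float x : inputs)
    std::printf("%g * 0.0f = %g\n", x, x * 0.0f);
  return 0;
}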
2013 Oct 30
0
[LLVMdev] loop vectorizer
...; preds = %entry, %for.body
%storemerge10 = phi i64 [ %inc, %for.body ], [ %start, %entry ]
%div = lshr i64 %storemerge10, 2
%mul1 = shl i64 %div, 3
%rem = and i64 %storemerge10, 3
%add2 = or i64 %mul1, %rem
%0 = lshr i64 %storemerge10, 1
%add51 = shl i64 %0, 2
%mul6 = or i64 %rem, %add51
%add8 = or i64 %mul6, 4
%arrayidx = getelementptr inbounds float* %a, i64 %add2
%1 = load float* %arrayidx, align 4
%arrayidx9 = getelementptr inbounds float* %b, i64 %add2
%2 = load float* %arrayidx9, align 4
%add10 = fadd float %1, %2
%arrayidx11 = getel...
2013 Oct 30
3
[LLVMdev] loop vectorizer
On 30 October 2013 09:25, Nadav Rotem <nrotem at apple.com> wrote:
> The access pattern to arrays a and b is non-linear. Unrolled loops are
> usually handled by the SLP-vectorizer. Are ir0 and ir1 consecutive for all
> values of i?
>
Based on his list of values, it seems that the induction stride is linear
within each block of 4 iterations, but it's not a clear
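Reading the index arithmetic out of the quoted IR makes that concrete: the load index %add2 is ((i >> 2) << 3) | (i & 3), which gives 0,1,2,3, 8,9,10,11, ... consecutive within each group of four iterations, then a jump at every group boundary. A small C++ reconstruction of just that arithmetic (hypothetical driver; the variable names mirror the IR values):

#include <cstdint>
#include <cstdio>

int main() {
  for (uint64_t i = 0; i < 8; ++i) {
    uint64_t div  = i >> 2;                 // %div  = lshr i64 %storemerge10, 2
    uint64_t mul1 = div << 3;               // %mul1 = shl i64 %div, 3
    uint64_t rem  = i & 3;                  // %rem  = and i64 %storemerge10, 3
    uint64_t add2 = mul1 | rem;             // %add2 = or i64 %mul1, %rem
    uint64_t mul6 = rem | ((i >> 1) << 2);  // %add51 = shl (lshr i, 1), 2; %mul6 = or %rem, %add51
    uint64_t add8 = mul6 | 4;               // %add8 = or i64 %mul6, 4
    std::printf("i=%2llu  add2=%2llu  add8=%2llu\n",
                (unsigned long long)i,
                (unsigned long long)add2,
                (unsigned long long)add8);
  }
  // add2 comes out as 0,1,2,3,8,9,10,11 and add8 as 4,5,6,7,12,13,14,15:
  // linear inside each block of four, with a gap between blocks.
  return 0;
}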
2013 Oct 30
3
[LLVMdev] loop vectorizer
...reds = %entry, %for.body
> %storemerge10 = phi i64 [ %inc, %for.body ], [ %start, %entry ]
> %div = lshr i64 %storemerge10, 2
> %mul1 = shl i64 %div, 3
> %rem = and i64 %storemerge10, 3
> %add2 = or i64 %mul1, %rem
> %0 = lshr i64 %storemerge10, 1
> %add51 = shl i64 %0, 2
> %mul6 = or i64 %rem, %add51
> %add8 = or i64 %mul6, 4
> %arrayidx = getelementptr inbounds float* %a, i64 %add2
> %1 = load float* %arrayidx, align 4
> %arrayidx9 = getelementptr inbounds float* %b, i64 %add2
> %2 = load float* %arrayidx9, align 4
> %add10 = fadd float %1, %2
> %arra...
2010 Mar 01
2
[LLVMdev] constant folding for standard math functions
Hi!
I'd like to replace all calls to standard math functions (e.g. sin(0.5)) by
their result.
What strategy do you recommend?
Should I write a pass that does only this or should I copy and
modify the SCCP pass?
A problem with an extra pass could be that I need to alternate between
my pass and SCCP several times, since the results of the math functions
could be folded further.
-Jochen
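One way to picture the requested transformation is a helper that walks a function and evaluates calls such as sinf/cosf whose argument is already a constant. The sketch below is written against a recent LLVM C++ API (iteration helpers and call accessors differ from the 2010 tree) and ignores errno, rounding mode, and fast-math questions; it is an illustration, not the existing folder:

#include "llvm/ADT/STLExtras.h"
#include "llvm/ADT/StringRef.h"
#include "llvm/IR/Constants.h"
#include "llvm/IR/Function.h"
#include "llvm/IR/Instructions.h"
#include <cmath>

using namespace llvm;

// Fold calls to sinf/cosf with a constant float argument by evaluating
// them with the host libm and substituting the result.
static bool foldConstantMathCalls(Function &F) {
  bool Changed = false;
  for (BasicBlock &BB : F) {
    for (Instruction &I : make_early_inc_range(BB)) {
      auto *CI = dyn_cast<CallInst>(&I);
      if (!CI || !CI->getCalledFunction() || CI->arg_size() != 1)
        continue;
      auto *Arg = dyn_cast<ConstantFP>(CI->getArgOperand(0));
      if (!Arg || !Arg->getType()->isFloatTy())
        continue;
      StringRef Name = CI->getCalledFunction()->getName();
      float In = Arg->getValueAPF().convertToFloat();
      float Out;
      if (Name == "sinf")
        Out = std::sin(In);
      else if (Name == "cosf")
        Out = std::cos(In);
      else
        continue;
      CI->replaceAllUsesWith(ConstantFP::get(CI->getType(), Out));
      CI->eraseFromParent();
      Changed = true;
    }
  }
  return Changed;
}

Note that LLVM's own constant folder (ConstantFoldCall in lib/Analysis/ConstantFolding.cpp) already evaluates a number of libm functions with constant arguments, and SCCP and the instruction combiner call into it, so hand-interleaving a separate pass with SCCP is usually unnecessary.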
2010 Mar 01
0
[LLVMdev] constant folding for standard math functions
On Mar 1, 2010, at 9:44 AM, Jochen Wilhelmy wrote:
> Hi!
>
> I'd like to replace all calls to standard math functions (e.g. sin(0.5)) by
> their result.
> What strategy do you recommend?
> Should I write a pass that does only this or should I copy and
> modify the SCCP pass?
>
> A problem with an extra pass could be that I need to alternate
> my pass and
2010 Mar 05
0
[LLVMdev] folding x * 0 = 0
...> and _ZSt3cosf to cos I get the following:
>
> define float @_Z3fooff(float %a, float %b) nounwind {
> entry:
> %mul = fmul float %a, %b ;<float> [#uses=1]
> %mul2 = fmul float %mul, 0.000000e+000 ;<float> [#uses=1]
> %mul6 = fmul float 0x3FDAED54A0000000, %mul2 ;<float> [#uses=1]
> ret float %mul6
> }
>
> the sin and cos calls are folded, but not the mul by zero.
> Maybe this is missing in llvm::ConstantFoldInstOperands in
> ConstantFolding.cpp?
>
> I would expect the following o...
2013 Nov 11
2
[LLVMdev] What Alias Analysis does clang use?
...align 4
%conv = fpext float %8 to double
%mul = fmul double %conv, 6.700000e-01
%9 = load float* %y, align 4
%conv3 = fpext float %9 to double
%mul4 = fmul double %conv3, 1.700000e-01
%add = fadd double %mul, %mul4
%10 = load float* %z, align 4
%conv5 = fpext float %10 to double
%mul6 = fmul double %conv5, 1.600000e-01
%add7 = fadd double %add, %mul6
%conv8 = fptrunc double %add7 to float
store float %conv8, float* %res, align 4
%11 = load float* %res, align 4
%12 = load i32* %i, align 4
%idxprom = sext i32 %12 to i64
%arrayidx9 = getelementptr inbounds float* %3,...
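The snippet looks like unoptimized output in which every local still lives in a stack slot, which is why %res is loaded back immediately after it is stored and why %x, %y, %z and %i are each reloaded through pointers. A hypothetical source shape that lowers to this pattern (a guess for illustration, not the original test case; the double-precision literals are what produce the fpext / fmul double / fptrunc sequence):

// Names mirror the IR values; the final store through %arrayidx9 is
// assumed to be out[i] = res, which the truncated snippet does not show.
void store_weighted(float x, float y, float z, float *out, int i) {
  float res = x * 0.67 + y * 0.17 + z * 0.16;
  out[i] = res;
}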
2013 Oct 30
0
[LLVMdev] loop vectorizer
...; %storemerge10 = phi i64 [ %inc, %for.body ], [ %start, %entry ]
>> %div = lshr i64 %storemerge10, 2
>> %mul1 = shl i64 %div, 3
>> %rem = and i64 %storemerge10, 3
>> %add2 = or i64 %mul1, %rem
>> %0 = lshr i64 %storemerge10, 1
>> %add51 = shl i64 %0, 2
>> %mul6 = or i64 %rem, %add51
>> %add8 = or i64 %mul6, 4
>> %arrayidx = getelementptr inbounds float* %a, i64 %add2
>> %1 = load float* %arrayidx, align 4
>> %arrayidx9 = getelementptr inbounds float* %b, i64 %add2
>> %2 = load float* %arrayidx9, align 4
>> %add10 = fadd...
2013 Nov 12
0
[LLVMdev] What Alias Analysis does clang use?
...float %8 to double
> %mul = fmul double %conv, 6.700000e-01
> %9 = load float* %y, align 4
> %conv3 = fpext float %9 to double
> %mul4 = fmul double %conv3, 1.700000e-01
> %add = fadd double %mul, %mul4
> %10 = load float* %z, align 4
> %conv5 = fpext float %10 to double
> %mul6 = fmul double %conv5, 1.600000e-01
> %add7 = fadd double %add, %mul6
> %conv8 = fptrunc double %add7 to float
> store float %conv8, float* %res, align 4
> %11 = load float* %res, align 4
> %12 = load i32* %i, align 4
> %idxprom = sext i32 %12 to i64
> %arrayidx9 = getelementptr...
2013 Oct 30
2
[LLVMdev] loop vectorizer
...%inc, %for.body ], [ %start, %entry ]
>>> %div = lshr i64 %storemerge10, 2
>>> %mul1 = shl i64 %div, 3
>>> %rem = and i64 %storemerge10, 3
>>> %add2 = or i64 %mul1, %rem
>>> %0 = lshr i64 %storemerge10, 1
>>> %add51 = shl i64 %0, 2
>>> %mul6 = or i64 %rem, %add51
>>> %add8 = or i64 %mul6, 4
>>> %arrayidx = getelementptr inbounds float* %a, i64 %add2
>>> %1 = load float* %arrayidx, align 4
>>> %arrayidx9 = getelementptr inbounds float* %b, i64 %add2
>>> %2 = load float* %arrayidx9, align 4
...
2013 Oct 30
0
[LLVMdev] loop vectorizer
...%entry ]
>>>> %div = lshr i64 %storemerge10, 2
>>>> %mul1 = shl i64 %div, 3
>>>> %rem = and i64 %storemerge10, 3
>>>> %add2 = or i64 %mul1, %rem
>>>> %0 = lshr i64 %storemerge10, 1
>>>> %add51 = shl i64 %0, 2
>>>> %mul6 = or i64 %rem, %add51
>>>> %add8 = or i64 %mul6, 4
>>>> %arrayidx = getelementptr inbounds float* %a, i64 %add2
>>>> %1 = load float* %arrayidx, align 4
>>>> %arrayidx9 = getelementptr inbounds float* %b, i64 %add2
>>>> %2 = load float* %a...