search for: mul6

Displaying 12 results from an estimated 12 matches for "mul6".

2010 Mar 03 · 5 · [LLVMdev] folding x * 0 = 0
...p mode) and renaming _ZSt3sinf to sin and _ZSt3cosf to cos I get the following:

define float @_Z3fooff(float %a, float %b) nounwind {
entry:
  %mul = fmul float %a, %b                         ; <float> [#uses=1]
  %mul2 = fmul float %mul, 0.000000e+000           ; <float> [#uses=1]
  %mul6 = fmul float 0x3FDAED54A0000000, %mul2     ; <float> [#uses=1]
  ret float %mul6
}

the sin and cos calls are folded, but not the mul by zero. Maybe this is missing in llvm::ConstantFoldInstOperands in ConstantFolding.cpp? I would expect the following optimizations, but didn't find them...
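For context on why that fold is not automatic: under strict IEEE-754 semantics x * 0.0 is not always +0.0, so the compiler may only drop the multiply when fast-math style flags (nnan/ninf/nsz) are present. A minimal host-side C++ illustration (not from the thread):

// Why x * 0.0f cannot be folded to +0.0f unconditionally: NaN and infinity
// propagate, and negative finite values produce -0.0f.
#include <cmath>
#include <cstdio>

int main() {
  const float cases[] = {1.0f, -1.0f, NAN, INFINITY};
  for (float x : cases)
    std::printf("%g * 0.0f = %g\n", x, x * 0.0f);
  // Typical output: 0, -0, nan, nan (the exact NaN spelling varies by libc).
  return 0;
}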
2013 Oct 30 · 0 · [LLVMdev] loop vectorizer
...; preds = %entry, %for.body
  %storemerge10 = phi i64 [ %inc, %for.body ], [ %start, %entry ]
  %div = lshr i64 %storemerge10, 2
  %mul1 = shl i64 %div, 3
  %rem = and i64 %storemerge10, 3
  %add2 = or i64 %mul1, %rem
  %0 = lshr i64 %storemerge10, 1
  %add51 = shl i64 %0, 2
  %mul6 = or i64 %rem, %add51
  %add8 = or i64 %mul6, 4
  %arrayidx = getelementptr inbounds float* %a, i64 %add2
  %1 = load float* %arrayidx, align 4
  %arrayidx9 = getelementptr inbounds float* %b, i64 %add2
  %2 = load float* %arrayidx9, align 4
  %add10 = fadd float %1, %2
  %arrayidx11 = getel...
2013 Oct 30 · 3 · [LLVMdev] loop vectorizer
On 30 October 2013 09:25, Nadav Rotem <nrotem at apple.com> wrote:
> The access pattern to arrays a and b is non-linear. Unrolled loops are
> usually handled by the SLP-vectorizer. Are ir0 and ir1 consecutive for all
> values for i ?
>
Based on his list of values, it seems that the induction stride is linear within each block of 4 iterations, but it's not a clear
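The source loop itself is not shown in these excerpts. A hypothetical C++ reconstruction of the index arithmetic (the array names, loop bounds, and the use of %add8 as the store index are assumptions) shows why the accesses advance linearly inside each block of four iterations yet are not unit-stride overall:

// Hypothetical reconstruction of the index arithmetic in the excerpted IR.
// ir0 steps 0,1,2,3, then jumps to 8,9,10,11, ...: consecutive within a block
// of four iterations, but with a gap at every block boundary, so the loop
// vectorizer cannot treat the loads and stores as simple consecutive accesses.
void blocked_kernel(float *a, float *b, float *c, long start, long n) {
  for (long i = start; i < n; ++i) {
    long ir0 = 8 * (i / 4) + (i % 4);        // %add2 in the excerpted IR
    long ir1 = (4 * (i / 2) + (i % 4)) | 4;  // %mul6 | 4, i.e. %add8
    c[ir1] = a[ir0] + b[ir0];                // store target is an assumption
  }
}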
2013 Oct 30 · 3 · [LLVMdev] loop vectorizer
...reds = %entry, %for.body > %storemerge10 = phi i64 [ %inc, %for.body ], [ %start, %entry ] > %div = lshr i64 %storemerge10, 2 > %mul1 = shl i64 %div, 3 > %rem = and i64 %storemerge10, 3 > %add2 = or i64 %mul1, %rem > %0 = lshr i64 %storemerge10, 1 > %add51 = shl i64 %0, 2 > %mul6 = or i64 %rem, %add51 > %add8 = or i64 %mul6, 4 > %arrayidx = getelementptr inbounds float* %a, i64 %add2 > %1 = load float* %arrayidx, align 4 > %arrayidx9 = getelementptr inbounds float* %b, i64 %add2 > %2 = load float* %arrayidx9, align 4 > %add10 = fadd float %1, %2 > %arra...
2010 Mar 01 · 2 · [LLVMdev] constant folding for standard math functions
Hi! I'd like to replace all calls to standard math functions (e.g. sin(0.5)) by their result. What strategy do you recommend? Should I write a pass that does only this or should I copy and modify the SCCP pass? A problem with an extra pass could be that I need to alternate my pass and SCCP several times since the results of the math functions could be folded again. -Jochen
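A minimal sketch of what such a dedicated folding helper could look like, assuming a recent LLVM tree (this is not code from the thread; the function name, the restriction to double-typed sin/cos, and evaluating with the host libm are assumptions, and a real pass would also consult TargetLibraryInfo and consider errno and cross-compilation precision):

// Fold calls to double-precision sin/cos whose argument is already a
// constant, evaluating them with the host libm. Returns true on change.
#include <cmath>

#include "llvm/ADT/STLExtras.h"
#include "llvm/IR/Constants.h"
#include "llvm/IR/Function.h"
#include "llvm/IR/Instructions.h"

using namespace llvm;

static bool foldConstMathCalls(Function &F) {
  bool Changed = false;
  for (BasicBlock &BB : F) {
    for (Instruction &I : make_early_inc_range(BB)) {
      auto *CI = dyn_cast<CallInst>(&I);
      if (!CI || !CI->getCalledFunction() || CI->arg_size() != 1)
        continue;
      auto *Arg = dyn_cast<ConstantFP>(CI->getArgOperand(0));
      if (!Arg || !Arg->getType()->isDoubleTy())
        continue;
      StringRef Name = CI->getCalledFunction()->getName();
      double X = Arg->getValueAPF().convertToDouble();
      double R;
      if (Name == "sin")
        R = std::sin(X);
      else if (Name == "cos")
        R = std::cos(X);
      else
        continue;
      // Replace the call with the computed constant and delete it.
      CI->replaceAllUsesWith(ConstantFP::get(CI->getType(), R));
      CI->eraseFromParent();
      Changed = true;
    }
  }
  return Changed;
}

Because later simplification can expose new constant arguments, a helper like this would normally be rerun as part of the regular pass pipeline rather than alternated with SCCP by hand.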
2010 Mar 01 · 0 · [LLVMdev] constant folding for standard math functions
On Mar 1, 2010, at 9:44 AM, Jochen Wilhelmy wrote:
> Hi!
>
> I'd like to replace all calls to standard math functions (e.g. sin(0.5)) by
> their result.
> What strategy do you recommend?
> Should I write a pass that does only this or should I copy and
> modify the SCCP pass?
>
> A problem with an extra pass could be that I need to alternate
> my pass and
2010 Mar 05 · 0 · [LLVMdev] folding x * 0 = 0
...> and _ZSt3cosf to cos I get the following: > > define float @_Z3fooff(float %a, float %b) nounwind { > entry: > %mul = fmul float %a, %b ;<float> [#uses=1] > %mul2 = fmul float %mul, 0.000000e+000 ;<float> [#uses=1] > %mul6 = fmul float 0x3FDAED54A0000000, %mul2 ;<float> [#uses=1] > ret float %mul6 > } > > the sin and cos calls are folded, but not the mul by zero. > Maybe this is missing in llvm::ConstantFoldInstOperands in > ConstantFolding.cpp? > > I would expect the following o...
2013 Nov 11 · 2 · [LLVMdev] What's the Alias Analysis does clang use ?
...align 4
  %conv = fpext float %8 to double
  %mul = fmul double %conv, 6.700000e-01
  %9 = load float* %y, align 4
  %conv3 = fpext float %9 to double
  %mul4 = fmul double %conv3, 1.700000e-01
  %add = fadd double %mul, %mul4
  %10 = load float* %z, align 4
  %conv5 = fpext float %10 to double
  %mul6 = fmul double %conv5, 1.600000e-01
  %add7 = fadd double %add, %mul6
  %conv8 = fptrunc double %add7 to float
  store float %conv8, float* %res, align 4
  %11 = load float* %res, align 4
  %12 = load i32* %i, align 4
  %idxprom = sext i32 %12 to i64
  %arrayidx9 = getelementptr inbounds float* %3,...
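Read back into source terms, the excerpt corresponds to roughly the following C++ (a hypothetical reconstruction: the scalar names come from the IR, but the array name and the final store are assumptions, since the snippet ends mid-GEP):

// Hypothetical source-level equivalent of the excerpted IR: each float is
// widened to double (fpext) because 0.67/0.17/0.16 are double literals, the
// weighted sum is truncated back to float (fptrunc) and stored to res, and
// res is then stored into an array element indexed by i (sext + GEP).
void weighted_store(float x, float y, float z, float *a, int i) {
  float res = static_cast<float>(x * 0.67 + y * 0.17 + z * 0.16);
  a[i] = res;  // assumed: the snippet ends just after computing &a[i]
}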
2013 Oct 30 · 0 · [LLVMdev] loop vectorizer
...; %storemerge10 = phi i64 [ %inc, %for.body ], [ %start, %entry ] >> %div = lshr i64 %storemerge10, 2 >> %mul1 = shl i64 %div, 3 >> %rem = and i64 %storemerge10, 3 >> %add2 = or i64 %mul1, %rem >> %0 = lshr i64 %storemerge10, 1 >> %add51 = shl i64 %0, 2 >> %mul6 = or i64 %rem, %add51 >> %add8 = or i64 %mul6, 4 >> %arrayidx = getelementptr inbounds float* %a, i64 %add2 >> %1 = load float* %arrayidx, align 4 >> %arrayidx9 = getelementptr inbounds float* %b, i64 %add2 >> %2 = load float* %arrayidx9, align 4 >> %add10 = fadd...
2013 Nov 12 · 0 · [LLVMdev] What's the Alias Analysis does clang use ?
...float %8 to double > %mul = fmul double %conv, 6.700000e-01 > %9 = load float* %y, align 4 > %conv3 = fpext float %9 to double > %mul4 = fmul double %conv3, 1.700000e-01 > %add = fadd double %mul, %mul4 > %10 = load float* %z, align 4 > %conv5 = fpext float %10 to double > %mul6 = fmul double %conv5, 1.600000e-01 > %add7 = fadd double %add, %mul6 > %conv8 = fptrunc double %add7 to float > store float %conv8, float* %res, align 4 > %11 = load float* %res, align 4 > %12 = load i32* %i, align 4 > %idxprom = sext i32 %12 to i64 > %arrayidx9 = getelementptr...
2013 Oct 30 · 2 · [LLVMdev] loop vectorizer
...%inc, %for.body ], [ %start, %entry ] >>> %div = lshr i64 %storemerge10, 2 >>> %mul1 = shl i64 %div, 3 >>> %rem = and i64 %storemerge10, 3 >>> %add2 = or i64 %mul1, %rem >>> %0 = lshr i64 %storemerge10, 1 >>> %add51 = shl i64 %0, 2 >>> %mul6 = or i64 %rem, %add51 >>> %add8 = or i64 %mul6, 4 >>> %arrayidx = getelementptr inbounds float* %a, i64 %add2 >>> %1 = load float* %arrayidx, align 4 >>> %arrayidx9 = getelementptr inbounds float* %b, i64 %add2 >>> %2 = load float* %arrayidx9, align 4 &g...
2013 Oct 30 · 0 · [LLVMdev] loop vectorizer
...%entry ] >>>> %div = lshr i64 %storemerge10, 2 >>>> %mul1 = shl i64 %div, 3 >>>> %rem = and i64 %storemerge10, 3 >>>> %add2 = or i64 %mul1, %rem >>>> %0 = lshr i64 %storemerge10, 1 >>>> %add51 = shl i64 %0, 2 >>>> %mul6 = or i64 %rem, %add51 >>>> %add8 = or i64 %mul6, 4 >>>> %arrayidx = getelementptr inbounds float* %a, i64 %add2 >>>> %1 = load float* %arrayidx, align 4 >>>> %arrayidx9 = getelementptr inbounds float* %b, i64 %add2 >>>> %2 = load float* %a...