Hi Roel,

The code you list below is precisely what I expect to get (of course
the stores must happen, but the constant folding should happen as
well).

It all looks very strange. LLVM is behaving as if the __restrict__
keyword was not used at all. Even more strange is the fact that for
this function:

double f(double *__restrict__ x, double *__restrict__ y, double *__restrict__ z)
{
    *x = 1.0;
    *y = *x + 2;
    *z = *x + 3;

    return *x + *y + *z;
}

everything works as expected:

define double @_Z1fPdS_S_(double* noalias nocapture %x, double* noalias nocapture %y, double* noalias nocapture %z) nounwind uwtable {
  store double 1.000000e+00, double* %x, align 8, !tbaa !0
  store double 3.000000e+00, double* %y, align 8, !tbaa !0
  store double 4.000000e+00, double* %z, align 8, !tbaa !0
  ret double 8.000000e+00
}

!0 = metadata !{metadata !"double", metadata !1}
!1 = metadata !{metadata !"omnipotent char", metadata !2}
!2 = metadata !{metadata !"Simple C/C++ TBAA", null}

So I can't figure out what the mechanism is for telling LLVM that two
pointers that are local variables (and not input arguments) to a
function do not alias each other. I hope one of the developers can
shed some light on this.

Thank you in advance for any help,
Brent

On Tue, Jan 24, 2012 at 9:01 PM, Roel Jordans <r.jordans at tue.nl> wrote:
> Hi Brent,
>
> Looking at your code I can see at least one reason why some of the
> store operations remain in the output: you are writing (through x, y,
> and z) to memory which exists outside of your function (p).
>
> Constant propagation also seems to work in the first few lines:
> *y = *x + 2 (%3) is stored directly.
>
> The strange thing to me is that the same doesn't happen for
> *z = *x + 3. Here *x is loaded again and the addition is still
> performed...
>
> From this point on, constant propagation seems to stop working
> completely. Looking at the IR I would have expected something like the
> following:
>
> define double @f(double** nocapture %p) nounwind uwtable {
>   %1 = load double** %p, align 8, !tbaa !0
>   %2 = getelementptr inbounds double** %p, i64 1
>   %3 = load double** %2, align 8, !tbaa !0
>   %4 = getelementptr inbounds double** %p, i64 2
>   %5 = load double** %4, align 8, !tbaa !0
>   store double 1.000000e+00, double* %1, align 8, !tbaa !3
>   store double 3.000000e+00, double* %3, align 8, !tbaa !3
>   store double 4.000000e+00, double* %5, align 8, !tbaa !3
>   ret double 8.000000e+00
> }
>
> !0 = metadata !{metadata !"any pointer", metadata !1}
> !1 = metadata !{metadata !"omnipotent char", metadata !2}
> !2 = metadata !{metadata !"Simple C/C++ TBAA", null}
> !3 = metadata !{metadata !"double", metadata !1}
>
> I am afraid I can't really help you in telling what went wrong here
> and caused the missing optimizations, but I agree with you that the
> result is not what I would have expected.
>
> Cheers,
> Roel
>
> On 01/23/2012 03:31 AM, Brent Walker wrote:
>> Hi LLVMers,
>>
>> I would like to ask a question regarding aliasing. Suppose I have the
>> following program:
>>
>> double f(double** p)
>> {
>>     double a, b, c;
>>     double *x = &a;
>>     double *y = &b;
>>     double *z = &c;
>>
>>     *x = 1;
>>     *y = *x + 2;
>>     *z = *x + 3;
>>
>>     return *x + *y + *z;
>> }
>>
>> LLVM can tell that the three pointers do not alias each other, so it
>> can perform the constant folding at compile time.
>>
>> define double @f(double** nocapture %p) nounwind uwtable readnone {
>>   ret double 8.000000e+00
>> }
>>
>> Now consider the function below. I know (in my particular case) that
>> the pointers in the p array do not alias each other. I tried to
>> communicate this information to llvm/clang via the __restrict__
>> qualifier, but it does not seem to have an effect. Can you please
>> suggest what is wrong? How can I achieve what I want?
>>
>> double f(double** p)
>> {
>>     double *__restrict__ x = p[0];
>>     double *__restrict__ y = p[1];
>>     double *__restrict__ z = p[2];
>>
>>     *x = 1;
>>     *y = *x + 2;
>>     *z = *x + 3;
>>
>>     return *x + *y + *z;
>> }
>>
>> define double @f(double** nocapture %p) nounwind uwtable {
>>   %1 = load double** %p, align 8, !tbaa !0
>>   %2 = getelementptr inbounds double** %p, i64 1
>>   %3 = load double** %2, align 8, !tbaa !0
>>   %4 = getelementptr inbounds double** %p, i64 2
>>   %5 = load double** %4, align 8, !tbaa !0
>>   store double 1.000000e+00, double* %1, align 8, !tbaa !3
>>   store double 3.000000e+00, double* %3, align 8, !tbaa !3
>>   %6 = load double* %1, align 8, !tbaa !3
>>   %7 = fadd double %6, 3.000000e+00
>>   store double %7, double* %5, align 8, !tbaa !3
>>   %8 = load double* %1, align 8, !tbaa !3
>>   %9 = load double* %3, align 8, !tbaa !3
>>   %10 = fadd double %8, %9
>>   %11 = fadd double %10, %7
>>   ret double %11
>> }
>>
>> !0 = metadata !{metadata !"any pointer", metadata !1}
>> !1 = metadata !{metadata !"omnipotent char", metadata !2}
>> !2 = metadata !{metadata !"Simple C/C++ TBAA", null}
>> !3 = metadata !{metadata !"double", metadata !1}
>>
>> Thank you for any help.
>>
>> Brent
>> _______________________________________________
>> LLVM Developers mailing list
>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
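
While the question about the qualifier is open, a source-level workaround may be possible when the caller can guarantee the non-aliasing itself: compute the values in local variables and only write them out through the pointers, so that nothing has to be reloaded. The following is a minimal sketch, assuming p[0], p[1] and p[2] point to three distinct doubles; the name f_locals is hypothetical and does not appear in the thread.

```cpp
// Sketch only: valid under the assumption that p[0], p[1] and p[2]
// refer to three distinct doubles. If they did alias, this function
// would NOT behave like the original f.
double f_locals(double** p) {
    double* x = p[0];
    double* y = p[1];
    double* z = p[2];

    // Keep the intermediate values in locals instead of re-reading *x.
    const double xv = 1.0;
    const double yv = xv + 2;
    const double zv = xv + 3;

    // The stores still happen, since the memory lives outside f_locals.
    *x = xv;
    *y = yv;
    *z = zv;

    return xv + yv + zv;
}
```

Because no value is read back through a pointer, the compiler does not need any aliasing information to fold the arithmetic. The cost is that the non-aliasing promise now lives in the programmer's head rather than in the type system.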
Hi Brent,

I think this is a problem in the EarlyCSE transform. In this transform,
load operations can be replaced by their subexpression (in this case
the propagated constant), based on the value of the 'CurrentGeneration'
of memory writes. This implies that any store operation invalidates the
knowledge about previously stored subexpressions.

In general this is a safe assumption, but in this case it misses quite
some optimization potential.

The effect can be shown by moving the line %6 one up, to before the
previous store operation. This doesn't change the program behaviour but
does influence the optimization.

More info on this is in lib/Transforms/Scalar/EarlyCSE.cpp (line 415 is
a good start).

I don't have time to improve this at this moment, so I'll leave that to
you (or anyone else that feels inspired).

Cheers,
Roel

On 01/24/2012 03:59 PM, Brent Walker wrote:
> Hi Roel,
> the code you list below is precisely what I expect to get [...]
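
A toy model may make the generation check concrete. The sketch below is not LLVM's EarlyCSE code; the class ToyLoadTable and both helper functions are hypothetical names. It assumes only what is described in the message above: every store starts a new generation, and a load can be replaced only if the table entry for its pointer was recorded in the current generation.

```cpp
#include <map>
#include <string>

// Toy model of a generation-based load table. NOT the real EarlyCSE
// implementation; all names are made up for illustration.
class ToyLoadTable {
public:
    // A store may clobber any location, so it starts a new generation
    // and then records the value it has just written.
    void store(const std::string& ptr, double value) {
        ++generation_;
        available_[ptr] = Entry{value, generation_};
    }

    // Returns true and sets `out` if the value for `ptr` is known in the
    // current generation; returns false if a real load is still needed.
    bool load(const std::string& ptr, double& out) const {
        std::map<std::string, Entry>::const_iterator it = available_.find(ptr);
        if (it != available_.end() && it->second.generation == generation_) {
            out = it->second.value;
            return true;
        }
        return false;
    }

private:
    struct Entry {
        double value;
        unsigned generation;
    };
    std::map<std::string, Entry> available_;
    unsigned generation_ = 0;
};

// Order as in the emitted IR: store x, store y, then load x.
bool forwardedInOriginalOrder() {
    ToyLoadTable table;
    double v = 0.0;
    table.store("x", 1.0);
    table.store("y", 3.0);  // new generation: the entry for x is now stale
    return table.load("x", v);
}

// Load of x moved up, to before the store to y.
bool forwardedWhenLoadMovedUp() {
    ToyLoadTable table;
    double v = 0.0;
    table.store("x", 1.0);
    const bool hit = table.load("x", v);  // same generation as the store
    table.store("y", 3.0);
    return hit;
}
```

In this model the first helper reports that the load of x cannot be forwarded, while the second reports that it can, which mirrors the effect of moving %6 above the store to %3. The model deliberately ignores alias information, which is exactly the limitation being discussed.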
Can you please explain why it works for this version of the function:

double f(double *__restrict__ x, double *__restrict__ y, double *__restrict__ z);

What is different here? There are stores here as well.

Brent

On Wed, Jan 25, 2012 at 12:34 AM, Roel Jordans <r.jordans at tue.nl> wrote:
> Hi Brent,
>
> I think this is a problem in the EarlyCSE transform. [...]
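
One difference is visible in the IR quoted earlier in the thread: in the version that takes the pointers as parameters, every argument carries the noalias attribute, whereas the pointers loaded from p in the other version carry no such marking. Building on that observation, a possible workaround is to route the loaded pointers through a helper whose parameters are __restrict__-qualified. This is only a sketch: the names f_impl and f_wrapped are hypothetical, and whether the aliasing information survives once the helper is inlined is an assumption that depends on the compiler version, so the emitted IR should be checked.

```cpp
#include <cassert>

// Helper with __restrict__ parameters (a GCC/Clang extension in C++),
// mirroring the signature that optimized as expected in the thread.
static double f_impl(double* __restrict__ x,
                     double* __restrict__ y,
                     double* __restrict__ z) {
    *x = 1.0;
    *y = *x + 2;
    *z = *x + 3;
    return *x + *y + *z;
}

// Hypothetical wrapper: loads the pointers from p and forwards them.
// Precondition (the caller's promise): the three pointers are distinct.
double f_wrapped(double** p) {
    assert(p[0] != p[1] && p[1] != p[2] && p[0] != p[2]);
    return f_impl(p[0], p[1], p[2]);
}
```

If the optimization is lost after inlining, keeping the helper out of line (for example with a compiler-specific noinline attribute) is one thing to try, at the cost of a call.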