I think the problem here is that the IR doesn't have any way to attach restrict information to loads/stores/pointers.

It works on arguments because they can be given the 'noalias' attribute, and then the alias analyzer must understand what that means.

Pete

On Jan 24, 2012, at 7:47 AM, Roel Jordans wrote:

> I have no clue, I didn't have time to look into that example yet.
>
> How does the IR (before optimization) differ from the other version?
>
> Roel
>
> On 01/24/2012 04:45 PM, Brent Walker wrote:
>> Can you explain please why it works for this version of the function:
>>
>> double f(double *__restrict__ x, double *__restrict__ y, double *__restrict__ z);
>>
>> What is different here? There are stores here as well.
>>
>> Brent
>>
>>
>> On Wed, Jan 25, 2012 at 12:34 AM, Roel Jordans <r.jordans at tue.nl> wrote:
>>> Hi Brent,
>>>
>>> I think this is a problem in the early-cse transform. In this transform load
>>> operations can be replaced by their subexpression, in this case the
>>> propagated constant, based on the value of the 'CurrentGeneration' of memory
>>> writes. This implies that any store operation invalidates the knowledge
>>> about previously stored subexpressions.
>>>
>>> In general, this is a safe assumption, but in this case it misses quite
>>> some optimization potential.
>>>
>>> The effect of this can be shown by moving the line %6 one up, to before the
>>> previous store operation. This doesn't change the program behaviour but does
>>> influence the optimization.
>>>
>>> More info on this is in lib/Transforms/Scalar/EarlyCSE.cpp (line 415 is a
>>> good start).
>>>
>>> I don't have time to improve this at this moment so I'll leave that to you
>>> (or anyone else that feels inspired).
>>>
>>> Cheers,
>>> Roel
>>>
>>>
>>> On 01/24/2012 03:59 PM, Brent Walker wrote:
>>>>
>>>> Hi Roel,
>>>> the code you list below is precisely what I expect to get (of course
>>>> the stores must happen but the constant folding should happen as
>>>> well).
>>>>
>>>> It all looks very strange. LLVM is behaving as if the __restrict__
>>>> keyword was not used at all. Even more strange is the fact that for
>>>> this function:
>>>>
>>>> double f(double *__restrict__ x, double *__restrict__ y, double *__restrict__ z)
>>>> {
>>>>     *x = 1.0;
>>>>     *y = *x + 2;
>>>>     *z = *x + 3;
>>>>
>>>>     return *x + *y + *z;
>>>> }
>>>>
>>>> everything works as expected:
>>>>
>>>> define double @_Z1fPdS_S_(double* noalias nocapture %x, double* noalias nocapture %y, double* noalias nocapture %z) nounwind uwtable {
>>>>   store double 1.000000e+00, double* %x, align 8, !tbaa !0
>>>>   store double 3.000000e+00, double* %y, align 8, !tbaa !0
>>>>   store double 4.000000e+00, double* %z, align 8, !tbaa !0
>>>>   ret double 8.000000e+00
>>>> }
>>>>
>>>> !0 = metadata !{metadata !"double", metadata !1}
>>>> !1 = metadata !{metadata !"omnipotent char", metadata !2}
>>>> !2 = metadata !{metadata !"Simple C/C++ TBAA", null}
>>>>
>>>> So I can't figure out what the mechanism is for telling llvm that two
>>>> pointers that are local variables (and not input arguments) to a
>>>> function do not alias each other. I hope one of the developers can
>>>> shed some light on this.
>>>>
>>>> Thank you in advance for any help,
>>>> Brent
>>>>
>>>>
>>>> On Tue, Jan 24, 2012 at 9:01 PM, Roel Jordans <r.jordans at tue.nl> wrote:
>>>>>
>>>>> Hi Brent,
>>>>>
>>>>> Looking at your code I can see at least one reason why some of the store
>>>>> operations remain in the output: you are (through x, y, and z)
>>>>> writing to memory which exists outside of your function (p).
>>>>>
>>>>> Constant propagation also seems to work in the first few lines: *y = *x
>>>>> + 2 (%3) is stored directly.
>>>>>
>>>>> The strange thing to me is that the same doesn't happen for *z = *x + 3.
>>>>> Here *x is loaded again and the addition is still performed...
>>>>>
>>>>> From this point on, constant propagation seems to stop working
>>>>> completely.
>>>>> Looking at the IR I would have expected something like the
>>>>> following:
>>>>>
>>>>> define double @f(double** nocapture %p) nounwind uwtable {
>>>>>   %1 = load double** %p, align 8, !tbaa !0
>>>>>   %2 = getelementptr inbounds double** %p, i64 1
>>>>>   %3 = load double** %2, align 8, !tbaa !0
>>>>>   %4 = getelementptr inbounds double** %p, i64 2
>>>>>   %5 = load double** %4, align 8, !tbaa !0
>>>>>   store double 1.000000e+00, double* %1, align 8, !tbaa !3
>>>>>   store double 3.000000e+00, double* %3, align 8, !tbaa !3
>>>>>   store double 4.000000e+00, double* %5, align 8, !tbaa !3
>>>>>   ret double 8.000000e+00
>>>>> }
>>>>>
>>>>> !0 = metadata !{metadata !"any pointer", metadata !1}
>>>>> !1 = metadata !{metadata !"omnipotent char", metadata !2}
>>>>> !2 = metadata !{metadata !"Simple C/C++ TBAA", null}
>>>>> !3 = metadata !{metadata !"double", metadata !1}
>>>>>
>>>>> I am afraid I can't really help you in telling what went wrong here and
>>>>> caused the missing optimizations, but I agree with you that the result
>>>>> is not what I would have expected.
>>>>>
>>>>> Cheers,
>>>>> Roel
>>>>>
>>>>> On 01/23/2012 03:31 AM, Brent Walker wrote:
>>>>>>
>>>>>> Hi LLVMers,
>>>>>> I would like to ask a question regarding aliasing. Suppose I have the
>>>>>> following program:
>>>>>>
>>>>>> double f(double** p)
>>>>>> {
>>>>>>     double a, b, c;
>>>>>>     double *x = &a;
>>>>>>     double *y = &b;
>>>>>>     double *z = &c;
>>>>>>
>>>>>>     *x = 1;
>>>>>>     *y = *x + 2;
>>>>>>     *z = *x + 3;
>>>>>>
>>>>>>     return *x + *y + *z;
>>>>>> }
>>>>>>
>>>>>> LLVM can tell that the three pointers do not alias each other, so it can
>>>>>> perform the constant folding at compile time.
>>>>>>
>>>>>> define double @f(double** nocapture %p) nounwind uwtable readnone {
>>>>>>   ret double 8.000000e+00
>>>>>> }
>>>>>>
>>>>>> Now consider the function below. I know (in my particular case) that
>>>>>> the pointers in the p array do not alias each other.
>>>>>> I tried to
>>>>>> communicate this information to llvm/clang via the __restrict__
>>>>>> qualifier but it does not seem to have an effect. Can you please
>>>>>> suggest what is wrong? How can I achieve what I want?
>>>>>>
>>>>>> double f(double** p)
>>>>>> {
>>>>>>     double *__restrict__ x = p[0];
>>>>>>     double *__restrict__ y = p[1];
>>>>>>     double *__restrict__ z = p[2];
>>>>>>
>>>>>>     *x = 1;
>>>>>>     *y = *x + 2;
>>>>>>     *z = *x + 3;
>>>>>>
>>>>>>     return *x + *y + *z;
>>>>>> }
>>>>>>
>>>>>> define double @f(double** nocapture %p) nounwind uwtable {
>>>>>>   %1 = load double** %p, align 8, !tbaa !0
>>>>>>   %2 = getelementptr inbounds double** %p, i64 1
>>>>>>   %3 = load double** %2, align 8, !tbaa !0
>>>>>>   %4 = getelementptr inbounds double** %p, i64 2
>>>>>>   %5 = load double** %4, align 8, !tbaa !0
>>>>>>   store double 1.000000e+00, double* %1, align 8, !tbaa !3
>>>>>>   store double 3.000000e+00, double* %3, align 8, !tbaa !3
>>>>>>   %6 = load double* %1, align 8, !tbaa !3
>>>>>>   %7 = fadd double %6, 3.000000e+00
>>>>>>   store double %7, double* %5, align 8, !tbaa !3
>>>>>>   %8 = load double* %1, align 8, !tbaa !3
>>>>>>   %9 = load double* %3, align 8, !tbaa !3
>>>>>>   %10 = fadd double %8, %9
>>>>>>   %11 = fadd double %10, %7
>>>>>>   ret double %11
>>>>>> }
>>>>>>
>>>>>> !0 = metadata !{metadata !"any pointer", metadata !1}
>>>>>> !1 = metadata !{metadata !"omnipotent char", metadata !2}
>>>>>> !2 = metadata !{metadata !"Simple C/C++ TBAA", null}
>>>>>> !3 = metadata !{metadata !"double", metadata !1}
>>>>>>
>>>>>> Thank you for any help.
>>>>>>
>>>>>> Brent
>>>>>> _______________________________________________
>>>>>> LLVM Developers mailing list
>>>>>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
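Roel's description of the 'CurrentGeneration' mechanism in the thread above can be pictured with a small model. The following is only a simplified sketch written for illustration, not the code in EarlyCSE.cpp: the names `model_store`, `model_load_is_known`, `MAX_SLOTS` and the slot-indexed table are all hypothetical. It assumes, as Roel describes, that every store bumps a single generation counter and that a remembered value may be reused only if it was recorded in the current generation.

```c
#include <assert.h>

/* Hypothetical model: each slot stands for one pointer (x, y, z, ...). */
#define MAX_SLOTS 8

struct available {
    double   value; /* value last stored through this pointer          */
    unsigned gen;   /* memory generation in which it was recorded      */
    int      valid; /* nonzero once something has been recorded        */
};

static struct available table[MAX_SLOTS];
static unsigned current_generation = 0;

/* Any store starts a new generation, then records its own value. */
static void model_store(int slot, double value) {
    current_generation++;
    table[slot].value = value;
    table[slot].gen   = current_generation;
    table[slot].valid = 1;
}

/* A load may reuse the recorded value only if no store of any kind
   has happened since that value was recorded. */
static int model_load_is_known(int slot) {
    return table[slot].valid && table[slot].gen == current_generation;
}
```

In this model, a store to slot 0 (x) followed directly by a load of slot 0 is "known", which corresponds to `*y = *x + 2` being folded to 3.0. After the store to slot 1 (y), the value of slot 0 is no longer "known", even though y cannot overlap x, which corresponds to the reload `%6` in Brent's IR.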
Peter Cooper wrote:
> I think the problem here is that the IR doesn't have any way to attach restrict information to loads/stores/pointers.

I think we do now, actually. Now that the loads and stores have TBAA metadata, I think the restrict attribute can go there. It needs to be attached to every use of a restrict pointer, but that's similar to how TBAA already works.

> It works on arguments because they can be given the 'noalias' attribute, and then the alias analyzer must understand what that means.

Yep, I concur.

Nick
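Since the thread agrees that restrict is honoured on function arguments (via 'noalias'), one possible way to express Brent's intent is to move the body into a helper whose parameters carry the qualifier. This is a sketch only: the helper name `f_impl` is hypothetical, and whether the optimizer keeps the no-alias information once the helper is inlined is not something this thread establishes, so the generated IR should be checked before relying on it.

```c
#include <assert.h>

/* Hypothetical helper: restrict now sits on parameters, which the
   front end can lower to 'noalias' arguments. */
static double f_impl(double *__restrict__ x,
                     double *__restrict__ y,
                     double *__restrict__ z)
{
    *x = 1.0;
    *y = *x + 2;
    *z = *x + 3;

    return *x + *y + *z;
}

/* Same signature as Brent's original function; it only forwards the
   three pointers loaded from p. The caller must guarantee that
   p[0], p[1] and p[2] really point to distinct objects. */
double f(double **p)
{
    return f_impl(p[0], p[1], p[2]);
}
```

Called with three distinct doubles, `f` stores 1.0, 3.0 and 4.0 through the pointers and returns 8.0, exactly as the original function does.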
On Jan 24, 2012, at 11:00 AM, Nick Lewycky wrote:

> Peter Cooper wrote:
>> I think the problem here is that the IR doesn't have any way to attach restrict information to loads/stores/pointers.
>
> I think we do now, actually. Now that the loads and stores have TBAA metadata, I think the restrict attribute can go there. It needs to be attached to every use of a restrict pointer, but that's similar to how TBAA already works.

Yeah, metadata would work between two restrict pointers, but I'm not sure if it's safe to use between restrict and non-restrict pointers. For example,

float* __restrict p = …
float* q = p;
*p = ...
*q = …

The two stores would be given different metadata, which would make them not alias, but I'm not sure if that's what the spec says.

>> It works on arguments because they can be given the 'noalias' attribute, and then the alias analyzer must understand what that means.
>
> Yep, I concur.
>
> Nick
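On Pete's question about the spec: as far as I can tell, the formal definition of restrict in C99 (6.7.3.1) is written in terms of pointers "based on" the restrict-qualified pointer, and a plain copy such as `q = p` is based on `p`. Under that reading, the two stores in Pete's example are allowed to touch the same object, so treating them as non-aliasing would be wrong. The sketch below fills in Pete's fragment with hypothetical values (the function name `store_through_both` and the constants are mine, not from the thread) to make the expected behaviour concrete.

```c
#include <assert.h>

/* Stores through p, then through q (a copy of p), and reads back
   through p. Because q is based on p, both stores hit the same
   object and the read must observe the second store. */
static float store_through_both(float *buf)
{
    float *__restrict__ p = buf;
    float *q = p;   /* non-restrict pointer based on p */

    *p = 1.0f;
    *q = 2.0f;      /* overwrites the value stored through p */

    return *p;      /* expected to be 2.0f */
}
```

A metadata scheme that tagged the store through `p` and the store through `q` as independent could let the optimizer forward 1.0f to the final read, which would change the result of this function.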