Thank you for your reply. Is the compromise you describe below a compromise in the LLVM back end or in clang? I ran into this while building a compiler for a small DSL. The functions I generate receive a context from which they extract a bunch of pointers to doubles, and the inputs are passed to the function through those pointers (I just used C/clang in my examples to illustrate the problem).

Unless I can mark these extracted pointers as non-aliasing, performance suffers a lot. It's not just CSE that suffers -- LLVM does not do copy propagation either, so we keep loading the same values from memory over and over again. See the example here:

    double f(double a, double * x, double *y, double * z)
    {
      *x = a;
      *y = a+1;
      *z = *x + 3;

      return *x + *y + *z;
    }

    define double @f(double %a, double* nocapture %x, double* nocapture %y, double* nocapture %z) nounwind uwtable {
      store double %a, double* %x, align 8, !tbaa !0
      %1 = fadd double %a, 1.000000e+00
      store double %1, double* %y, align 8, !tbaa !0
      %2 = load double* %x, align 8, !tbaa !0
      %3 = fadd double %2, 3.000000e+00
      store double %3, double* %z, align 8, !tbaa !0
      %4 = load double* %x, align 8, !tbaa !0
      %5 = load double* %y, align 8, !tbaa !0
      %6 = fadd double %4, %5
      %7 = fadd double %6, %3
      ret double %7
    }

    !0 = metadata !{metadata !"double", metadata !1}
    !1 = metadata !{metadata !"omnipotent char", metadata !2}
    !2 = metadata !{metadata !"Simple C/C++ TBAA", null}

(I used arguments here without __restrict__, which has the same effect as loading my pointers from the context as locals.) As you can see, we keep loading the value of x from memory, even though we just stored a local into it. Given that I am generating LLVM IR directly (via the C++ interface), can you suggest some way I could pass the noalias attribute onto the locals?
One workaround is of course to generate two functions as follows:

    double f1( struct ctx* ctx )
    {
      return f2(ctx->a, ctx->x, ctx->y, ctx->z);
    }

    double f2( double a, double *__restrict__ x, double *__restrict__ y,
               double *__restrict__ z)
    {
      *x = a;
      *y = a+1;
      *z = *x + 3;

      return *x + *y + *z;
    }

but if at all possible I would like to avoid such acrobatics.

Thank you in advance for any help.
Brent

On Wed, Jan 25, 2012 at 4:52 AM, Dan Gohman <gohman at apple.com> wrote:
> On Jan 24, 2012, at 7:45 AM, Brent Walker wrote:
>
>> Can you explain please why it works for this version of the function:
>>
>> double f(double *__restrict__ x, double *__restrict__ y, double
>> *__restrict__ z);
>>
>> What is different here? There are stores here as well.
>
> LLVM ignores restrict everywhere except function parameters. This is a
> compromise aimed at a sweet spot in the balance of compiler complexity
> vs. optimization opportunity.
>
> - Many analysis and optimization techniques naturally apply to whole
>   functions. When restrict appears on a local variable inside a
>   function, its special aliasing property applies to only a subset of
>   the function. It's awkward to teach such code to understand and
>   respect local scope boundaries, in general.
>
> - Function boundaries are often the boundaries of analysis.
>   Interprocedural analysis can be expensive and complex, so many
>   optimization passes are limited to thinking about one function
>   at a time. And even interprocedural analysis passes are
>   bounded by shared library boundaries. While local variables can
>   often be analyzed automatically (as in your first example),
>   function parameters are often incoming mystery values, so they
>   are where restrict is most often interesting.
>
> This compromise does mean that some opportunities are lost (as in
> your second example), but from clang's perspective these cases are
> rare.
>
> Dan
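The two-function workaround above, combined with Dan's point that restrict is honored only on function parameters, can be sketched as one complete C file. This is only an illustration: the layout of `struct ctx` is an assumption (the thread shows just the field names `a`, `x`, `y`, `z`), and `self_check` is a hypothetical helper that is not part of the original code.

```c
#include <assert.h>

/* Assumed layout: the thread only names the fields a, x, y and z. */
struct ctx {
    double a;
    double *x, *y, *z;
};

/* restrict on the parameters promises that x, y and z refer to distinct
   objects, so the optimizer is allowed to reuse the stored values
   instead of reloading them. */
static double f2(double a, double *__restrict__ x, double *__restrict__ y,
                 double *__restrict__ z)
{
    *x = a;
    *y = a + 1;
    *z = *x + 3;
    return *x + *y + *z;
}

/* Thin wrapper: pulls the pointers out of the context and forwards them. */
double f1(struct ctx *ctx)
{
    return f2(ctx->a, ctx->x, ctx->y, ctx->z);
}

/* Hypothetical helper: exercises f1 with three distinct doubles. */
void self_check(void)
{
    double x = 0, y = 0, z = 0;
    struct ctx c = { 1.0, &x, &y, &z };
    assert(f1(&c) == 7.0);                      /* 1 + 2 + 4 */
    assert(x == 1.0 && y == 2.0 && z == 4.0);
}
```

The promise made by `__restrict__` is the caller's responsibility: passing a context whose pointers overlap would make the call undefined behavior, so this only fits when the generator can guarantee distinct storage.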
Hi Brent,

> Unless I can mark these extracted pointers as non-aliasing, performance
> suffers a lot. It's not just CSE that suffers -- LLVM does not do
> copy propagation either so we keep loading the same values from memory
> over and over again. See the example here:
>
> double f(double a, double * x, double *y, double * z)
> {
>   *x = a;
>   *y = a+1;
>   *z = *x + 3;
>
>   return *x + *y + *z;
> }

Here you are obliged to reload the values, since some of the pointers might be equal. For example, the store to *y will change the value of *x if x and y are the same pointer.

Ciao, Duncan.
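Duncan's point can be checked directly with a small C sketch. The function `f` is the one from the thread; `self_check_aliasing` is a hypothetical helper added here to call it once with distinct pointers and once with x and y equal.

```c
#include <assert.h>

double f(double a, double *x, double *y, double *z)
{
    *x = a;
    *y = a + 1;      /* overwrites *x when y == x */
    *z = *x + 3;     /* so this load of *x cannot simply reuse a */
    return *x + *y + *z;
}

/* Hypothetical helper: shows that the result depends on aliasing. */
void self_check_aliasing(void)
{
    double p, q, r;
    assert(f(1.0, &p, &q, &r) == 7.0);   /* distinct: 1 + 2 + 4 */

    double v, w;
    assert(f(1.0, &v, &v, &w) == 9.0);   /* x == y:   2 + 2 + 5 */
}
```

Because both calls are valid C, a compiler that forwarded `a` into the later loads of `*x` would return 7 for the second call as well, which would be wrong. That is why the reloads are mandatory without some no-alias guarantee.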
_______________________________________________
LLVM Developers mailing list
LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Hi Duncan,

You have misunderstood me, I think. Of course in this case you have to reload the doubles since, as you say, the pointers may alias each other. I just wrote it this way to show the kind of code one gets if one cannot specify that pointers do not alias -- not adding __restrict__ to the function arguments gives the same code as in my example with the local variables (with or without the restrict).

If you could answer the question in my email I would be grateful. Given that I am generating IR directly, is it possible for me to mark local pointers as non-aliasing? When you said LLVM cannot do it, did you mean clang cannot or the backend cannot? Is the two-function solution I outlined in my previous email my only option?

Thanks,
Brent

On Thu, Jan 26, 2012 at 12:45 AM, Duncan Sands <baldrick at free.fr> wrote:
> here you are obliged to reload the values since some of the pointers might
> be equal. For example the store to *y will change the value of *x if x and
> y are the same pointer.
>
> Ciao, Duncan.
On Jan 24, 2012, at 8:58 PM, Brent Walker wrote:

> Thank you for your reply. The compromise you describe below, is it a
> compromise in the LLVM back end or in clang? I run into this while
> building a compiler for a small DSL language for which I generate
> functions that receive a context from which they extract a bunch of
> pointers to doubles from which inputs are passed to the function (I
> just used C/clang in my examples to illustrate the problem).

For restrict-style alias information, the limitation is in LLVM.

Since you're not using clang, you might be able to use a custom TBAA type tree instead. TBAA works differently from restrict; it applies to loads and stores, rather than to pointers. But if you can frame your aliasing property as a type-oriented property, saying for example that each array element is a pointer to a distinct type, it may suffice.

Dan
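As a rough C-level analogue of the type-based reasoning Dan describes, the sketch below stores through a `double *` and a `long *`. It is not a custom TBAA tree, only an illustration of the idea that the type of an access, rather than a property of the pointer, is what lets the optimizer rule out aliasing. The names `store_both` and `self_check_tbaa` are hypothetical.

```c
#include <assert.h>

double store_both(double *x, long *n)
{
    *x = 1.0;
    *n = 2;       /* a long store: under C's aliasing rules it cannot
                     legally modify an object of type double */
    return *x;    /* so a TBAA-aware optimizer may return 1.0 here
                     without reloading from memory */
}

/* Hypothetical helper: the call is well defined for distinct objects. */
void self_check_tbaa(void)
{
    double d = 0.0;
    long n = 0;
    assert(store_both(&d, &n) == 1.0);
    assert(n == 2);
}
```

A front end that emits IR directly could aim for a similar effect by tagging the loads and stores made through each extracted pointer with its own type node, as Dan suggests. Whether a given compiler actually removes the reload depends on the optimization level and on type-based alias analysis being enabled.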
Hi Dan and others,

I'm new to LLVM and Clang, but I have experience with compiler optimization and VMs. Everyone in my organisation is talking about LLVM, so I thought I would take a peek at it, and this discussion is where I stopped. I tried to reproduce the problem discussed here.

vi sample.c

    double f(double** p )
    {
      double a,b,c;
      double * x = &a;
      double * y = &b;
      double * z = &c;

      *x = 1;
      *y = *x + 2;
      *z = *x + 3;

      return *x+*y+*z;
    }

    double ff(double** p )
    {
      double a,b,c;
      double * x = &a;
      double * y = &b;
      double * z = &c;

      *x = 1;
      *y = *x + 2;
      *z = *x + 3;

      return *x+*y+*z;
    }

I compiled sample.c with:

    clang sample.c -S -O3 -emit-llvm

cat sample.s

    target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-f80:32:32-n8:16:32-S32"
    target triple = "i386-pc-cygwin"

    define double @f(double** nocapture %p) nounwind readnone {
      ret double 8.000000e+00
    }

    define double @ff(double** nocapture %p) nounwind readnone {
      ret double 8.000000e+00
    }

Boom... BasicAA or TBAA is doing what it is supposed to do :) for the restrict qualifier too. Can anyone shed some light on this?

FYI:

    $ lli --version
    Low Level Virtual Machine (http://llvm.org/):
      llvm version 3.0
      Optimized build.
      Built Jan 24 2012 (05:48:10).
      Host: i386-pc-cygwin
      Host CPU: penryn

Thanks
~Umesh

On Fri, Jan 27, 2012 at 12:37 AM, Dan Gohman <gohman at apple.com> wrote:
> On Jan 24, 2012, at 8:58 PM, Brent Walker wrote:
>
> > Thank you for your reply. The compromise you describe below, is it a
> > compromise in the LLVM back end or in clang? I run into this while
> > building a compiler for a small DSL language for which I generate
> > functions that receive a context from which they extract a bunch of
> > pointers to doubles from which inputs are passed to the function (I
> > just used C/clang in my examples to illustrate the problem).
>
> For restrict-style alias information, the limitation is in LLVM.
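A plausible reading of the sample.c result, not confirmed anywhere in the thread: in that test x, y and z point at three distinct local variables whose addresses never leave the function, so the optimizer can see for itself that they do not alias and fold everything to 8. The case Brent describes is different, because the pointers come from memory supplied by the caller. The sketch below contrasts the two; `f_locals`, `f_table` and `self_check_table` are hypothetical names, and `f_table` assumes the pointers are read from the `p` argument that sample.c leaves unused.

```c
#include <assert.h>

/* As in sample.c: the pointers refer to distinct, non-escaping locals. */
double f_locals(void)
{
    double a, b, c;
    double *x = &a, *y = &b, *z = &c;
    *x = 1;
    *y = *x + 2;
    *z = *x + 3;
    return *x + *y + *z;     /* always 1 + 3 + 4 */
}

/* Assumed variant: the pointers are loaded from a caller-supplied table,
   so nothing in this function proves that they are distinct. */
double f_table(double **p)
{
    double *x = p[0], *y = p[1], *z = p[2];
    *x = 1;
    *y = *x + 2;
    *z = *x + 3;
    return *x + *y + *z;
}

/* Hypothetical helper: same code, different answers depending on aliasing. */
void self_check_table(void)
{
    assert(f_locals() == 8.0);

    double a, b, c;
    double *distinct[3] = { &a, &b, &c };
    assert(f_table(distinct) == 8.0);    /* 1 + 3 + 4 */

    double v;
    double *same[3] = { &v, &v, &v };
    assert(f_table(same) == 18.0);       /* v ends as 6: 6 + 6 + 6 */
}
```

Since `f_table` must return 18 when all three entries are the same pointer, it cannot be folded to a constant the way `f_locals` can, which matches the reloads in Brent's IR.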