As simple as

void foo (int n, double *p, int *q)
{
    for (int i = 0; i < n; i++)
        *p += *q;
}

clang -O2 -fstrict-aliasing -emit-llvm -o foo.bc -c foo.c
llc -enable-tbaa -O2 -filetype=asm -o foo.s foo.bc

Memory accesses remain in the loop.

The following works fine:

void foo(int n, double *restrict p, int *restrict q)
{
    ...
}

By the way, is there a performance category in the llvm bug database?

Thanks,

David

On Thu, Oct 28, 2010 at 5:59 PM, Dan Gohman <gohman at apple.com> wrote:

> On Oct 28, 2010, at 2:43 PM, Xinliang David Li wrote:
>
> > 2010/10/27 Rafael Espíndola <rafael.espindola at gmail.com>
> > 2010/10/27 Xinliang David Li <xinliangli at gmail.com>:
> > > Thanks. Just built clang and saw the meta data and annotations on the memory
> > > accesses -- is any opt pass consuming the information?
> >
> > The tests in test/Analysis/TypeBasedAliasAnalysis suggest that at
> > least licm is using it. Also note that
> > lib/Analysis/TypeBasedAliasAnalysis.cpp defines an enable-tbaa option
> > that is off by default.
>
> LICM, GVN, and DSE are the major consumers right now. That said, the
> current TBAA implementation is not very advanced yet.
>
> > I tried the option -- not much difference in the generated code.
>
> Can you give an example of code you'd expect to be optimized which isn't?
>
> Dan
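For context on why the accesses stay in the loop unless the optimizer may assume that p and q do not alias, here is a minimal sketch. The names add_in_loop, add_hoisted and check_aliasing_case are hypothetical and not from the thread. Both pointers are int * here, so they may legally point at the same object, and in that case reloading on every iteration and loading once before the loop give different answers. Hoisting is therefore only valid with extra no-alias information, such as type-based reasoning for double versus int, or restrict.

```c
#include <assert.h>

/* Same shape as foo in the message above, but both pointers are int *,
   so they may legally alias and the loads have to stay in the loop. */
static void add_in_loop(int n, int *p, int *q)
{
    for (int i = 0; i < n; i++)
        *p += *q;
}

/* Hand-hoisted variant: *q is read once and *p is kept in a local.
   It matches add_in_loop only when p and q do not alias. */
static void add_hoisted(int n, int *p, int *q)
{
    int addend = *q;
    int acc = *p;
    for (int i = 0; i < n; i++)
        acc += addend;
    *p = acc;
}

/* Hypothetical self-check: with p == q the two versions disagree. */
static void check_aliasing_case(void)
{
    int a = 1, b = 1;
    add_in_loop(3, &a, &a);   /* a doubles each iteration: 2, 4, 8 */
    add_hoisted(3, &b, &b);   /* b becomes 1 + 3 * 1 = 4 */
    assert(a == 8);
    assert(b == 4);
}
```

With distinct objects the two functions agree, which is the situation the compiler has to prove before it can move the loads.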
Xinliang David Li wrote:
> As simple as
>
> void foo (int n, double *p, int *q)
> {
>     for (int i = 0; i < n; i++)
>         *p += *q;
> }
>
> clang -O2 -fstrict-aliasing -emit-llvm -o foo.bc -c foo.c
> llc -enable-tbaa -O2 -filetype=asm -o foo.s foo.bc

There are a couple of things interacting here:

 * clang -fstrict-aliasing -O2 does generate the TBAA info, but it runs the optimizers without enabling the -enable-tbaa flag, so the optimizers never look at it. Oops.
 * clang -fstrict-aliasing -O0 does *not* generate the TBAA info in the resulting .bc file. This is probably intended to speed up -O0 builds even if -fstrict-aliasing is set, but is annoying for debugging what's going on under the hood.
 * If clang -O2 worked by running 'opt' and 'llc' under the hood, we could tell it to pass a flag along to them, but it doesn't. As it stands, you can't turn -enable-tbaa on when running clang.

So, putting that together, one way to do it is:

  clang -O2 -fstrict-aliasing foo.c -flto -c -o foo.bc
  opt -O2 -enable-tbaa foo.bc foo2.bc
  llc -O2 -enable-tbaa foo2.bc -o foo2.s

at which point the opt run will hoist the loads into a loop preheader. Sadly this runs the LLVM optimizers twice (once in clang -O2 and once in opt), which could skew results.

I think the right thing to do is to teach the clang driver to remove -fstrict-aliasing from the cc1 invocation when optimizations are off. This would let us force the flag through with "-Xclang -fstrict-aliasing".

> Memory accesses remain in the loop.
>
> The following works fine:
>
> void foo(int n, double *restrict p, int *restrict q)
> {
>     ...
> }
>
> By the way, is there a performance category in the llvm bug database?

Nope, we file bugs based on the type of optimization that ought to solve it (e.g., there's a Scalar optimizations category, a Loop optimizer category, Backend: X86, etc.). Many miscellaneous performance improvements actually live in lib/Target/README.txt (and subdirs of there) instead of the bug tracker.
Nick

_______________________________________________
LLVM Developers mailing list
LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
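To make "hoist the loads into a loop preheader" concrete, the following is a hand-written approximation in C of what the loop from this thread could look like after such a transformation. It is a sketch, not compiler output: the name foo_hoisted is hypothetical, and keeping the running sum in a local and storing it once after the loop is an assumption about how far the optimizer goes. The early return stands in for the guard in front of the preheader, so nothing is loaded when the loop would not run.

```c
#include <assert.h>

/* Original loop from the thread: *p and *q are accessed on every iteration. */
static void foo(int n, double *p, int *q)
{
    for (int i = 0; i < n; i++)
        *p += *q;
}

/* Hand-written approximation of the hoisted form. Under strict aliasing
   a store through a double * cannot modify an int object, so *q is
   loop-invariant and can be read once before the loop. */
static void foo_hoisted(int n, double *p, int *q)
{
    if (n <= 0)
        return;              /* loop would not execute: touch no memory */
    int addend = *q;         /* loaded once, in the "preheader" */
    double acc = *p;
    for (int i = 0; i < n; i++)
        acc += addend;       /* same additions, in the same order */
    *p = acc;                /* single store after the loop */
}

/* Hypothetical self-check: both versions produce the same sum. */
static void check_same_result(void)
{
    double x = 1.5, y = 1.5;
    int k = 2;
    foo(4, &x, &k);
    foo_hoisted(4, &y, &k);
    assert(x == y);
}
```

Because the additions happen in the same order in both functions, the floating-point result is identical, not merely close.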
On 29.10.2010, at 09:26, Nick Lewycky wrote:
> * If clang -O2 worked by running 'opt' and 'llc' under the hood, we
> could tell it to pass a flag along to them, but it doesn't. As it
> stands, you can't turn -enable-tbaa on when running clang.
>
> So, putting that together, one way to do it is:
>
>   clang -O2 -fstrict-aliasing foo.c -flto -c -o foo.bc
>   opt -O2 -enable-tbaa foo.bc foo2.bc
>   llc -O2 -enable-tbaa foo2.bc -o foo2.s

clang -O2 foo.c -S -o foo.s -mllvm -enable-tbaa
> * clang -fstrict-aliasing -O0 does *not* generate the TBAA info in the
> resulting .bc file. This is probably intended to speed up -O0 builds
> even if -fstrict-aliasing is set, but is annoying for debugging what's
> going on under the hood.

I would expect -O0 to turn off strict-aliasing, so this seems like correct behavior, applying the usual "last flag wins" rule for conflicts. -O0 -fstrict-aliasing should do what you want.
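The "last flag wins" expectation above can be illustrated with a toy resolver. This is a hypothetical model and not clang's driver code: the function name strict_aliasing_enabled, the default of "off", and the choice to treat -O0 as switching strict aliasing off are all assumptions made only to mirror the behavior this message expects.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Toy "last flag wins" resolver (hypothetical, not clang's logic):
   flags are scanned left to right and each relevant flag overrides
   whatever came before it. */
static bool strict_aliasing_enabled(int argc, const char *const argv[])
{
    bool enabled = false;    /* assumed default for this sketch */
    for (int i = 0; i < argc; i++) {
        if (strcmp(argv[i], "-O0") == 0)
            enabled = false; /* assumption: -O0 implies no strict aliasing */
        else if (strcmp(argv[i], "-fstrict-aliasing") == 0)
            enabled = true;
        else if (strcmp(argv[i], "-fno-strict-aliasing") == 0)
            enabled = false;
    }
    return enabled;
}

/* Hypothetical self-check of the two orderings discussed in the thread. */
static void check_flag_order(void)
{
    const char *const strict_then_o0[] = { "-fstrict-aliasing", "-O0" };
    const char *const o0_then_strict[] = { "-O0", "-fstrict-aliasing" };
    assert(!strict_aliasing_enabled(2, strict_then_o0));
    assert(strict_aliasing_enabled(2, o0_then_strict));
}
```

Under this model, "-fstrict-aliasing -O0" ends with strict aliasing off and "-O0 -fstrict-aliasing" ends with it on, which is the distinction the message draws.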
On Oct 29, 2010, at 12:26 AM, Nick Lewycky wrote:

> Xinliang David Li wrote:
>> As simple as
>>
>> void foo (int n, double *p, int *q)
>> {
>>     for (int i = 0; i < n; i++)
>>         *p += *q;
>> }
>>
>> clang -O2 -fstrict-aliasing -emit-llvm -o foo.bc -c foo.c
>> llc -enable-tbaa -O2 -filetype=asm -o foo.s foo.bc
>
> There's a couple things interacting here:
> * clang -fstrict-aliasing -O2 does generate the TBAA info, but it runs the optimizers without enabling the -enable-tbaa flag, so the optimizers never look at it. Oops.
> * clang -fstrict-aliasing -O0 does *not* generate the TBAA info in the resulting .bc file. This is probably intended to speed up -O0 builds even if -fstrict-aliasing is set, but is annoying for debugging what's going on under the hood.
> * If clang -O2 worked by running 'opt' and 'llc' under the hood, we could tell it to pass a flag along to them, but it doesn't. As it stands, you can't turn -enable-tbaa on when running clang.

In case there is any confusion, the -enable-tbaa option is temporary. TBAA is a new feature which is still under development.

Dan
On Fri, Oct 29, 2010 at 12:26 AM, Nick Lewycky <nicholas at mxc.ca> wrote:

> Xinliang David Li wrote:
>
>> As simple as
>>
>> void foo (int n, double *p, int *q)
>> {
>>     for (int i = 0; i < n; i++)
>>         *p += *q;
>> }
>>
>> clang -O2 -fstrict-aliasing -emit-llvm -o foo.bc -c foo.c
>> llc -enable-tbaa -O2 -filetype=asm -o foo.s foo.bc
>
> There's a couple things interacting here:
> * clang -fstrict-aliasing -O2 does generate the TBAA info, but it runs the
> optimizers without enabling the -enable-tbaa flag, so the optimizers never
> look at it. Oops.
> * clang -fstrict-aliasing -O0 does *not* generate the TBAA info in the
> resulting .bc file. This is probably intended to speed up -O0 builds even if
> -fstrict-aliasing is set, but is annoying for debugging what's going on
> under the hood.
> * If clang -O2 worked by running 'opt' and 'llc' under the hood, we could
> tell it to pass a flag along to them, but it doesn't. As it stands, you
> can't turn -enable-tbaa on when running clang.
>
> So, putting that together, one way to do it is:
>
> clang -O2 -fstrict-aliasing foo.c -flto -c -o foo.bc
> opt -O2 -enable-tbaa foo.bc foo2.bc

-o foo2.bc (missing from the opt line above)

> llc -O2 -enable-tbaa foo2.bc -o foo2.s
>
> at which point the opt run will hoist the loads into a loop preheader.
> Sadly this runs the LLVM optimizers twice (once in clang -O2 and once in
> opt) which could skew results.

Yes, I verified these steps work, but my head is spinning:

1) Does -flto have the same effect as -emit-llvm? Does the FE emit LLVM bitcode and exit without invoking the LLVM backend?
2) Why do you need to invoke both opt and llc? I verified that invoking just llc is also fine.
3) A more general question: is opt just a barebones llc without invoking any llvm passes? So why is there a need for two opt drivers?

Thanks,

David

> I think the right thing to do is to teach the clang driver to remove
> -fstrict-aliasing from the cc1 invocation when optimizations are off.
> This would let us force the flag through with "-Xclang -fstrict-aliasing".