On Fri, Oct 29, 2010 at 12:26 AM, Nick Lewycky <nicholas at mxc.ca> wrote:
> Xinliang David Li wrote:
>> As simple as
>>
>> void foo (int n, double *p, int *q)
>> {
>>   for (int i = 0; i < n; i++)
>>     *p += *q;
>> }
>>
>> clang -O2 -fstrict-aliasing -emit-llvm -o foo.bc -c foo.c
>> llc -enable-tbaa -O2 -filetype=asm -o foo.s foo.bc
>
> There are a couple of things interacting here:
> * clang -fstrict-aliasing -O2 does generate the TBAA info, but it runs the
>   optimizers without enabling the -enable-tbaa flag, so the optimizers never
>   look at it. Oops.
> * clang -fstrict-aliasing -O0 does *not* generate the TBAA info in the
>   resulting .bc file. This is probably intended to speed up -O0 builds even
>   if -fstrict-aliasing is set, but is annoying for debugging what's going on
>   under the hood.
> * If clang -O2 worked by running 'opt' and 'llc' under the hood, we could
>   tell it to pass a flag along to them, but it doesn't. As it stands, you
>   can't turn -enable-tbaa on when running clang.
>
> So, putting that together, one way to do it is:
>
> clang -O2 -fstrict-aliasing foo.c -flto -c -o foo.bc
> opt -O2 -enable-tbaa foo.bc -o foo2.bc
> llc -O2 -enable-tbaa foo2.bc -o foo2.s
>
> at which point the opt run will hoist the loads into a loop preheader.
> Sadly this runs the LLVM optimizers twice (once in clang -O2 and once in
> opt), which could skew results.

Yes, I verified these steps work, but my head is spinning:

1) Does -flto have the same effect as -emit-llvm, i.e. the FE emits llvm
bitcode and exits without invoking the llvm backend?

2) Why do you need to invoke both opt and llc? I verified that invoking just
llc is also fine.

3) A more general question: is opt just a bare-bones llc that does not invoke
any llvm passes? If so, why is there a need for two driver tools?

Thanks,

David

> I think the right thing to do is to teach the clang driver to remove
> -fstrict-aliasing from the cc1 invocation when optimizations are off. This
> would let us force the flag through with "-Xclang -fstrict-aliasing".
>
>> Memory accesses remain in the loop.
>>
>> The following works fine:
>>
>> void foo(int n, double * restrict p, int * restrict q)
>> {
>>   ...
>> }
>>
>> By the way, is there a performance category in the llvm bug database?
>
> Nope, we file bugs based on the type of optimization that ought to solve it
> (i.e., there's a Scalar optimizations category, a Loop optimizer category,
> Backend: X86, etc.). Many miscellaneous performance improvements actually
> live in lib/Target/README.txt (and subdirs of there) instead of the bug
> tracker.
>
> Nick
>
>> Thanks,
>>
>> David
>>
>> On Thu, Oct 28, 2010 at 5:59 PM, Dan Gohman <gohman at apple.com> wrote:
>>
>>   On Oct 28, 2010, at 2:43 PM, Xinliang David Li wrote:
>>
>>   > 2010/10/27 Rafael Espíndola <rafael.espindola at gmail.com>
>>   > 2010/10/27 Xinliang David Li <xinliangli at gmail.com>:
>>   > > Thanks. Just built clang and saw the meta data and annotations on
>>   > > the memory accesses -- is any opt pass consuming the information?
>>   >
>>   > The tests in test/Analysis/TypeBasedAliasAnalysis suggest that at
>>   > least licm is using it. Also note that
>>   > lib/Analysis/TypeBasedAliasAnalysis.cpp defines an enable-tbaa option
>>   > that is off by default.
>>
>>   LICM, GVN, and DSE are the major consumers right now. That said, the
>>   current TBAA implementation is not very advanced yet.
>>   > I tried the option -- not much difference in the generated code.
>>
>>   Can you give an example of code you'd expect to be optimized which
>>   isn't?
>>
>>   Dan
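For illustration, here is a rough C-level sketch (not from the thread; the
function name and the n > 0 guard are made up) of what the hoisting Nick
describes amounts to: once TBAA tells LICM that the double and int accesses
cannot alias, both loads and the store can move out of the loop.

    void foo_hoisted(int n, double *p, int *q)
    {
        if (n > 0) {               /* loop preheader: runs once           */
            double sum = *p;       /* load of *p hoisted out of the loop  */
            int    val = *q;       /* load of *q hoisted out of the loop  */
            for (int i = 0; i < n; i++)
                sum += val;
            *p = sum;              /* store sunk to after the loop        */
        }
    }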
> Yes, I verified these steps work, but my head is spinning:
>
> 1) Does -flto have the same effect as -emit-llvm, i.e. the FE emits llvm
> bitcode and exits without invoking the llvm backend?

I think they are the same, but maybe -flto also affects which passes are run.
Not sure. I always used -emit-llvm...

> 2) Why do you need to invoke both opt and llc? I verified that invoking just
> llc is also fine.
>
> 3) A more general question: is opt just a bare-bones llc that does not
> invoke any llvm passes? If so, why is there a need for two driver tools?

opt contains the IL -> IL optimizations. It is what you want to use, for
example, for testing that a loop is unrolled. llc does the IL -> .s (or .o)
transformation. It will also run low-level passes. I guess they could be
merged; I never thought about the trade-offs.

> Thanks,
> David

Cheers,
Rafael
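Concretely, the division of labour Rafael describes looks roughly like this
(the clang/opt/llc commands are the ones from earlier in the thread; llvm-dis
is used here only as an assumed convenience for inspecting the IR between the
two steps):

    clang -O2 -fstrict-aliasing -emit-llvm -c foo.c -o foo.bc  # FE: C -> bitcode, with TBAA metadata
    opt -O2 -enable-tbaa foo.bc -o foo.opt.bc                   # IR -> IR optimization passes
    llvm-dis foo.opt.bc -o foo.opt.ll                           # check whether the loads were hoisted
    llc -O2 -enable-tbaa foo.opt.bc -o foo.s                    # IR -> assembly, codegen-level passes only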
Xinliang David Li wrote:
> On Fri, Oct 29, 2010 at 12:26 AM, Nick Lewycky <nicholas at mxc.ca> wrote:
> [...]
>
> Yes, I verified these steps work, but my head is spinning:
>
> 1) Does -flto have the same effect as -emit-llvm, i.e. the FE emits llvm
> bitcode and exits without invoking the llvm backend?

Yes, -flto and -emit-llvm are synonyms.

> 2) Why do you need to invoke both opt and llc? I verified that invoking
> just llc is also fine.

"llc" is really just codegen; the only optimizations it does are ones that
are naturally part of lowering from llvm IR to assembly. For example, that
includes another run of loop-invariant code motion, because some loads may
have been added -- such as a load of the GOT pointer -- which weren't there
in the IR to be hoisted.

"opt" runs any IR pass. You can ask it to run a single optimization, for
example "opt -licm", or you can run an analysis pass like scalar evolution
with "opt -analyze -scalar-evolution". This is where the bulk of LLVM's
optimizations live.

> 3) A more general question: is opt just a bare-bones llc that does not
> invoke any llvm passes? If so, why is there a need for two driver tools?

I think of it as: opt transforms .bc -> .bc, and llc transforms .bc -> .s.

Nick

> Thanks,
>
> David
>
> [...]
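To make Nick's examples concrete, a couple of sketch invocations (flag
spellings as of the LLVM 2.8 era are assumed; depending on the release the
TBAA analysis may also need to be scheduled explicitly, e.g. with -tbaa):

    opt -licm -enable-tbaa -S foo.bc -o foo.licm.ll   # run only loop-invariant code motion, emit readable IR
    opt -analyze -scalar-evolution foo.bc             # print scalar-evolution results; no transformation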
On Sat, Oct 30, 2010 at 1:44 AM, Nick Lewycky <nicholas at mxc.ca> wrote:
> Xinliang David Li wrote:
>> [...]
>>
>> 1) Does -flto have the same effect as -emit-llvm, i.e. the FE emits llvm
>> bitcode and exits without invoking the llvm backend?
>
> Yes, -flto and -emit-llvm are synonyms.
>
>> 2) Why do you need to invoke both opt and llc? I verified that invoking
>> just llc is also fine.
>
> "llc" is really just codegen; the only optimizations it does are ones that
> are naturally part of lowering from llvm IR to assembly. For example, that
> includes another run of loop-invariant code motion, because some loads may
> have been added -- such as a load of the GOT pointer -- which weren't there
> in the IR to be hoisted.
>
> "opt" runs any IR pass. You can ask it to run a single optimization, for
> example "opt -licm", or you can run an analysis pass like scalar evolution
> with "opt -analyze -scalar-evolution". This is where the bulk of LLVM's
> optimizations live.

I thought llc included all the standard optimization passes (equivalent to
opt -std-compile-opts plus machine code generation) -- it seems more natural
(to me) to have a single optimization driver that takes llvm bitcode as
input.

David

>> 3) A more general question: is opt just a bare-bones llc that does not
>> invoke any llvm passes? If so, why is there a need for two driver tools?
>
> I think of it as: opt transforms .bc -> .bc, and llc transforms .bc -> .s.
>
> Nick
>
> [...]
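For reference, the metadata David mentions seeing on the memory accesses
looks roughly like the following in the generated IR. This is an illustrative
sketch only; the exact metadata node layout and root-node name differ between
LLVM releases:

    %0 = load double* %p, align 8, !tbaa !2    ; access tagged as "double"
    %1 = load i32* %q, align 4, !tbaa !3       ; access tagged as "int"

    !0 = metadata !{metadata !"Simple C/C++ TBAA"}
    !1 = metadata !{metadata !"omnipotent char", metadata !0}
    !2 = metadata !{metadata !"double", metadata !1}
    !3 = metadata !{metadata !"int", metadata !1}

Because "double" and "int" live on disjoint branches of this type tree, TBAA
can tell consumers such as LICM, GVN, and DSE that *p and *q do not alias,
which is what permits the loads in foo() to be hoisted out of the loop.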