On Fri, Oct 29, 2010 at 12:26 AM, Nick Lewycky <nicholas at mxc.ca> wrote:
> Xinliang David Li wrote:
>> As simple as
>>
>> void foo (int n, double *p, int *q)
>> {
>>   for (int i = 0; i < n; i++)
>>     *p += *q;
>> }
>>
>> clang -O2 -fstrict-aliasing -emit-llvm -o foo.bc -c foo.c
>> llc -enable-tbaa -O2 -filetype=asm -o foo.s foo.bc
>
> There are a couple of things interacting here:
> * clang -fstrict-aliasing -O2 does generate the TBAA info, but it runs the
>   optimizers without enabling the -enable-tbaa flag, so the optimizers never
>   look at it. Oops.
> * clang -fstrict-aliasing -O0 does *not* generate the TBAA info in the
>   resulting .bc file. This is probably intended to speed up -O0 builds even
>   if -fstrict-aliasing is set, but is annoying for debugging what's going on
>   under the hood.
> * If clang -O2 worked by running 'opt' and 'llc' under the hood, we could
>   tell it to pass a flag along to them, but it doesn't. As it stands, you
>   can't turn -enable-tbaa on when running clang.
>
> So, putting that together, one way to do it is:
>
> clang -O2 -fstrict-aliasing foo.c -flto -c -o foo.bc
> opt -O2 -enable-tbaa foo.bc -o foo2.bc
> llc -O2 -enable-tbaa foo2.bc -o foo2.s
>
> at which point the opt run will hoist the loads into a loop preheader.
> Sadly this runs the LLVM optimizers twice (once in clang -O2 and once in
> opt), which could skew results.

Yes, I verified these steps work, but my head is spinning:

1) Does -flto have the same effect as -emit-llvm, i.e. the FE emits llvm
bitcode and exits without invoking the llvm backend?

2) Why do you need to invoke both opt and llc? I verified that invoking just
llc is also fine.

3) A more general question: is opt just a bare-bones llc that does not invoke
any llvm passes? If so, why is there a need for two driver tools?

Thanks,

David

> I think the right thing to do is to teach the clang driver to remove
> -fstrict-aliasing from the cc1 invocation when optimizations are off. This
> would let us force the flag through with "-Xclang -fstrict-aliasing".
>
>> Memory accesses remain in the loop.
>>
>> The following works fine:
>>
>> void foo(int n, double * restrict p, int * restrict q)
>> {
>>   ...
>> }
>>
>> By the way, is there a performance category in the llvm bug database?
>
> Nope, we file bugs based on the type of optimization that ought to solve it
> (i.e., there's a Scalar optimizations category, a Loop optimizer category,
> Backend: X86, etc.). Many miscellaneous performance improvements actually
> live in lib/Target/README.txt (and subdirs of there) instead of the bug
> tracker.
>
> Nick
>
>> Thanks,
>>
>> David
>>
>> On Thu, Oct 28, 2010 at 5:59 PM, Dan Gohman <gohman at apple.com> wrote:
>>
>>   On Oct 28, 2010, at 2:43 PM, Xinliang David Li wrote:
>>
>>   > 2010/10/27 Rafael Espíndola <rafael.espindola at gmail.com>
>>   > 2010/10/27 Xinliang David Li <xinliangli at gmail.com>:
>>   > > Thanks. Just built clang and saw the meta data and annotations on
>>   > > the memory accesses -- is any opt pass consuming the information?
>>   >
>>   > The tests in test/Analysis/TypeBasedAliasAnalysis suggest that at
>>   > least licm is using it. Also note that
>>   > lib/Analysis/TypeBasedAliasAnalysis.cpp defines an enable-tbaa option
>>   > that is off by default.
>>
>>   LICM, GVN, and DSE are the major consumers right now. That said, the
>>   current TBAA implementation is not very advanced yet.
>>   > I tried the option -- not much difference in the generated code.
>>
>>   Can you give an example of code you'd expect to be optimized which
>>   isn't?
>>
>>   Dan
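For illustration, here is a rough C-level sketch (not from the thread; the
function name and the n > 0 guard are made up) of what the hoisting Nick
describes amounts to: once TBAA tells LICM that the double and int accesses
cannot alias, both loads and the store can move out of the loop.

    void foo_hoisted(int n, double *p, int *q)
    {
        if (n > 0) {               /* loop preheader: runs once           */
            double sum = *p;       /* load of *p hoisted out of the loop  */
            int    val = *q;       /* load of *q hoisted out of the loop  */
            for (int i = 0; i < n; i++)
                sum += val;
            *p = sum;              /* store sunk to after the loop        */
        }
    }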
> Yes, I verified these steps work, but my head is spinning:
>
> 1) Does -flto have the same effect as -emit-llvm, i.e. the FE emits llvm
> bitcode and exits without invoking the llvm backend?

I think they are the same, but maybe -flto also affects which passes are run.
Not sure. I always used -emit-llvm...

> 2) Why do you need to invoke both opt and llc? I verified that invoking just
> llc is also fine.
>
> 3) A more general question: is opt just a bare-bones llc that does not
> invoke any llvm passes? If so, why is there a need for two driver tools?

opt contains the IL -> IL optimizations. It is what you want to use, for
example, for testing that a loop is unrolled. llc does the IL -> .s (or .o)
transformation. It will also run low-level passes. I guess they could be
merged; I never thought about the trade-offs.

> Thanks,
> David

Cheers,
Rafael
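Concretely, the division of labour Rafael describes looks roughly like this
(the clang/opt/llc commands are the ones from earlier in the thread; llvm-dis
is used here only as an assumed convenience for inspecting the IR between the
two steps):

    clang -O2 -fstrict-aliasing -emit-llvm -c foo.c -o foo.bc  # FE: C -> bitcode, with TBAA metadata
    opt -O2 -enable-tbaa foo.bc -o foo.opt.bc                   # IR -> IR optimization passes
    llvm-dis foo.opt.bc -o foo.opt.ll                           # check whether the loads were hoisted
    llc -O2 -enable-tbaa foo.opt.bc -o foo.s                    # IR -> assembly, codegen-level passes only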
Xinliang David Li wrote:
> On Fri, Oct 29, 2010 at 12:26 AM, Nick Lewycky <nicholas at mxc.ca> wrote:
> [...]
>
> Yes, I verified these steps work, but my head is spinning:
>
> 1) Does -flto have the same effect as -emit-llvm, i.e. the FE emits llvm
> bitcode and exits without invoking the llvm backend?

Yes, -flto and -emit-llvm are synonyms.

> 2) Why do you need to invoke both opt and llc? I verified that invoking
> just llc is also fine.

"llc" is really just codegen; the only optimizations it does are ones that
are naturally part of lowering from llvm IR to assembly. For example, that
includes another run of loop-invariant code motion, because some loads may
have been added -- such as a load of the GOT pointer -- which weren't there
in the IR to be hoisted.

"opt" runs any IR pass. You can ask it to run a single optimization, for
example "opt -licm", or you can run an analysis pass like scalar evolution
with "opt -analyze -scalar-evolution". This is where the bulk of LLVM's
optimizations live.

> 3) A more general question: is opt just a bare-bones llc that does not
> invoke any llvm passes? If so, why is there a need for two driver tools?

I think of it as: opt transforms .bc -> .bc, and llc transforms .bc -> .s.

Nick

> Thanks,
>
> David
>
> [...]
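To make Nick's examples concrete, a couple of sketch invocations (flag
spellings as of the LLVM 2.8 era are assumed; depending on the release the
TBAA analysis may also need to be scheduled explicitly, e.g. with -tbaa):

    opt -licm -enable-tbaa -S foo.bc -o foo.licm.ll   # run only loop-invariant code motion, emit readable IR
    opt -analyze -scalar-evolution foo.bc             # print scalar-evolution results; no transformation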
On Sat, Oct 30, 2010 at 1:44 AM, Nick Lewycky <nicholas at mxc.ca> wrote:
> Xinliang David Li wrote:
>> [...]
>>
>> 1) Does -flto have the same effect as -emit-llvm, i.e. the FE emits llvm
>> bitcode and exits without invoking the llvm backend?
>
> Yes, -flto and -emit-llvm are synonyms.
>
>> 2) Why do you need to invoke both opt and llc? I verified that invoking
>> just llc is also fine.
>
> "llc" is really just codegen; the only optimizations it does are ones that
> are naturally part of lowering from llvm IR to assembly. For example, that
> includes another run of loop-invariant code motion, because some loads may
> have been added -- such as a load of the GOT pointer -- which weren't there
> in the IR to be hoisted.
>
> "opt" runs any IR pass. You can ask it to run a single optimization, for
> example "opt -licm", or you can run an analysis pass like scalar evolution
> with "opt -analyze -scalar-evolution". This is where the bulk of LLVM's
> optimizations live.

I thought llc included all the standard optimization passes (equivalent to
opt -std-compile-opts plus machine code generation) -- it seems more natural
(to me) to have a single optimization driver that takes llvm bitcode as
input.

David

>> 3) A more general question: is opt just a bare-bones llc that does not
>> invoke any llvm passes? If so, why is there a need for two driver tools?
>
> I think of it as: opt transforms .bc -> .bc, and llc transforms .bc -> .s.
>
> Nick
>
> [...]
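For reference, the metadata David mentions seeing on the memory accesses
looks roughly like the following in the generated IR. This is an illustrative
sketch only; the exact metadata node layout and root-node name differ between
LLVM releases:

    %0 = load double* %p, align 8, !tbaa !2    ; access tagged as "double"
    %1 = load i32* %q, align 4, !tbaa !3       ; access tagged as "int"

    !0 = metadata !{metadata !"Simple C/C++ TBAA"}
    !1 = metadata !{metadata !"omnipotent char", metadata !0}
    !2 = metadata !{metadata !"double", metadata !1}
    !3 = metadata !{metadata !"int", metadata !1}

Because "double" and "int" live on disjoint branches of this type tree, TBAA
can tell consumers such as LICM, GVN, and DSE that *p and *q do not alias,
which is what permits the loads in foo() to be hoisted out of the loop.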