thr3ads.net - llvm dev - [LLVMdev] some llvm/clang missed optimizations [Jan 2010]

John Regehr

2010-Jan-26 20:36 UTC

[LLVMdev] some llvm/clang missed optimizations

A few random observations:

1.

Clang could do better with large but boring switches like this:

http://embed.cs.utah.edu/embarrassing/jan_10/harvest/source/E8/E88C5111.shtml

Performance of clang's output will be fine, but this is a major code size
loss.
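
To make the size concern concrete, here is a minimal sketch of what a "large but boring" switch might look like. It assumes the switch is dense and every arm only returns a constant; the names `decode_switch`, `decode_lookup` and `decode_table` are hypothetical and are not taken from the linked test case. The second function is the table-lookup form one would hope a compiler could generate for size.

```c
/* Hypothetical example: a dense switch whose arms only return constants. */
int decode_switch(unsigned x)
{
    switch (x) {
    case 0: return 17;
    case 1: return 4;
    case 2: return 23;
    case 3: return 8;
    case 4: return 15;
    case 5: return 42;
    case 6: return 16;
    case 7: return 9;
    default: return -1;
    }
}

/* The same mapping written as a table lookup with a range check. */
static const int decode_table[8] = { 17, 4, 23, 8, 15, 42, 16, 9 };

int decode_lookup(unsigned x)
{
    return x < 8 ? decode_table[x] : -1;
}
```

Both functions compute the same result for every input; the difference is only in how much code the first form can expand into if each case gets its own block.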

2.

Destruction of stupid loops is incomplete, sometimes due to phase 
ordering problems:

http://embed.cs.utah.edu/embarrassing/jan_10/harvest/source/FC/FCADC848.shtml

Sometimes not:

http://embed.cs.utah.edu/embarrassing/jan_10/harvest/source/EC/ECC74C0C.shtml

This is both a speed and size issue.  Probably this kind of code most 
often appears in machine-generated C or where loops contain logging code 
that is conditionally compiled away.
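
A minimal sketch of the second situation, assuming a logging macro that has been compiled away. The names `LOG`, `count_up` and `count_up_folded` are hypothetical and do not come from the linked test cases. Once the macro expands to nothing, the loop only advances its counter, so the whole thing could ideally be replaced by the closed form.

```c
/* Logging disabled at compile time: the macro expands to a no-op. */
#define LOG(i) ((void)0)

/* The loop body is empty after preprocessing; only the counter changes. */
int count_up(int n)
{
    int i;
    for (i = 0; i < n; i++) {
        LOG(i);
    }
    return i;
}

/* What an optimizer could reduce the loop to. */
int count_up_folded(int n)
{
    return n > 0 ? n : 0;
}
```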

3.

Repetitive code with lots of bitwise operations is compiled by LLVM into 
much larger code than the other compilers:

http://embed.cs.utah.edu/embarrassing/jan_10/harvest/source/ED/ED37DAF5.shtml
http://embed.cs.utah.edu/embarrassing/jan_10/harvest/source/1F/1F4003C7.shtml

Note that this is straight-line code, so LLVM's output will run 4-5 
times longer than everyone else's.

I'll be interested to learn the source of this one.

4.

It seems possible to do a better job recognizing that the current stack 
frame can be used unmodified by a new call:

http://embed.cs.utah.edu/embarrassing/jan_10/harvest/source/0A/0A6CDE2D.shtml

This is a speed loss as well as a size loss.  This pattern seems quite
common in real code, due to layered APIs.  Of course when IPO is on, most of
these calls should be destroyed.
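
For illustration, a minimal sketch of the layered-API pattern, assuming the simplest case where the wrapper forwards its own arguments unchanged and in the same order. The names `checksum3` and `layered_checksum3` are hypothetical and are not taken from the linked test case. Because the callee expects exactly the arguments the wrapper received, the incoming frame is already laid out correctly and the call can in principle become a plain jump.

```c
/* Lower layer: does the actual work. */
int checksum3(int a, int b, int c)
{
    return a * 31 + b * 7 + c;
}

/* Upper layer: forwards the same arguments in the same order.
   The call is in tail position, so no new frame is strictly needed. */
int layered_checksum3(int a, int b, int c)
{
    return checksum3(a, b, c);
}
```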

5.

Sometimes a function modifies globals but even so has no net effect:

http://embed.cs.utah.edu/embarrassing/jan_10/harvest/source/8A/8AB0B238.shtml
http://embed.cs.utah.edu/embarrassing/jan_10/harvest/source/14/14157FE8.shtml

Somehow gcc3 sees these but everyone else including gcc4 fails.
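
A minimal sketch of the general idea, with the caveat that it is a hypothetical example and much simpler than the linked test cases; I have not checked which compilers fold this particular one. The names `g_flags` and `toggle_twice` are made up. The function writes to a global twice, but the second write undoes the first, so the body could be empty.

```c
/* Hypothetical global state. */
unsigned g_flags;

/* Flips one bit and then flips it back: two stores, no net effect. */
void toggle_twice(void)
{
    g_flags ^= 0x10u;
    g_flags ^= 0x10u;
}
```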

6.

Here llvm-gcc and gcc, but not clang, exploit undefinedness of integer 
overflow to eliminate most of the code in a function:

http://embed.cs.utah.edu/embarrassing/jan_10/harvest/source/82/82A5CC31.shtml

Most likely this is not what the authors of the code intended, but the 
compilers are correct.
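
A minimal sketch of the kind of code where this happens, assuming the common "check for wraparound after the fact" idiom; it is a hypothetical example, not the code behind the link, and the function names are made up. Since signed overflow is undefined in C, a compiler may assume `x + 1` never wraps and fold the first test to 0. The second function expresses the intended check without relying on overflow.

```c
#include <limits.h>

/* Relies on signed wraparound, which is undefined behavior in C.
   A compiler may legally reduce this to "return 0". */
int next_would_wrap_ub(int x)
{
    return x + 1 < x;
}

/* Well-defined version of the same check. */
int next_would_wrap(int x)
{
    return x == INT_MAX;
}
```

The two functions agree for every input on which the first one is defined, which is exactly why the compilers are allowed to do this.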

7.

Cute elimination of useless varargs code:

http://embed.cs.utah.edu/embarrassing/jan_10/harvest/source/3A/3A235937.shtml

John

Eli Friedman

2010-Jan-26 22:57 UTC

[LLVMdev] some llvm/clang missed optimizations

On Tue, Jan 26, 2010 at 12:36 PM, John Regehr <regehr at cs.utah.edu> wrote:
> 2.
>
> Sometimes not:
>
> http://embed.cs.utah.edu/embarrassing/jan_10/harvest/source/EC/ECC74C0C.shtml

The primary issue here is that scalar evolution doesn't know how to
deal with loops using "sle" for the exit condition.  Shouldn't be too
hard to fix now that we have overflow flags for addition.
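
For readers unfamiliar with the term, a minimal sketch of a loop of this shape. It is a hypothetical example, not the linked test case, and the names are made up. A signed `<=` exit test like the one below would typically be lowered to an `icmp sle` in LLVM IR. The difficulty is that if `n` were INT_MAX, `i <= n` could only become false after `i` overflows; a no-signed-wrap flag on the increment lets the analysis rule that out and treat the trip count as n + 1 for non-negative n.

```c
/* Signed "<=" exit condition: the loop runs n + 1 times when n >= 0,
   assuming the increment of i never overflows. */
int sum_to(int n)
{
    int s = 0;
    for (int i = 0; i <= n; i++) {
        s += i;
    }
    return s;
}

/* Closed form that a known trip count would make reachable
   (valid while n * (n + 1) fits in an int). */
int sum_to_closed(int n)
{
    return n >= 0 ? n * (n + 1) / 2 : 0;
}
```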

> 3.
>
> Repetitive code with lots of bitwise operations is compiled by LLVM into
> much larger code than the other compilers:
>
> http://embed.cs.utah.edu/embarrassing/jan_10/harvest/source/ED/ED37DAF5.shtml
> http://embed.cs.utah.edu/embarrassing/jan_10/harvest/source/1F/1F4003C7.shtml
>
> Note that this is straight-line code, so LLVM's output will run 4-5
> times longer than everyone else's.
>
> I'll be interested to learn the source of this one.

This looks like a one-off case; instcombine destroys the symmetry of
the code that the test harness duplicated by reducing the masking
constants.  Probably too complicated for too little gain to be worth
pursuing.

> 5.
>
> Sometimes a function modifies globals but even so has no net effect:
>
> http://embed.cs.utah.edu/embarrassing/jan_10/harvest/source/8A/8AB0B238.shtml
> http://embed.cs.utah.edu/embarrassing/jan_10/harvest/source/14/14157FE8.shtml
>
> Somehow gcc3 sees these but everyone else including gcc4 fails.
>
> 6.
>
> Here llvm-gcc and gcc, but not clang, exploit undefinedness of integer
> overflow to eliminate most of the code in a function:
>
> http://embed.cs.utah.edu/embarrassing/jan_10/harvest/source/82/82A5CC31.shtml
>
> Most likely this is not what the authors of the code intended, but the
> compilers are correct.

LLVM doesn't handle correlated expressions at the moment.

-Eli

John Regehr

2010-Jan-27 01:55 UTC

[LLVMdev] some llvm/clang missed optimizations

>> Repetitive code with lots of bitwise operations is compiled by LLVM into
>> much larger code than the other compilers:
>>
>> http://embed.cs.utah.edu/embarrassing/jan_10/harvest/source/ED/ED37DAF5.shtml
>> http://embed.cs.utah.edu/embarrassing/jan_10/harvest/source/1F/1F4003C7.shtml
>>
>> Note that this is straight-line code, so LLVM's output will run 4-5
>> times longer than everyone else's.
>>
>> I'll be interested to learn the source of this one.
>
> This looks like a one-off case; instcombine destroys the symmetry of
> the code that the test harness duplicated by reducing the masking
> constants.  Probably too complicated for too little gain to be worth
> pursuing.

There are a bunch of these actually, I can try to make a list...

John
