On Wed, Oct 12, 2011 at 09:40:53AM +0200, Duncan Sands wrote:
> Hi Chris,
>
>>> PS: With -fplugin-arg-dragonegg-enable-gcc-optzns the LLVM optimizers
>>> are run at the following levels:
>>>
>>>    Command line option    LLVM optimizers run at
>>>    -------------------    ----------------------
>>>    -O1                    tiny amount of optimization
>>>    -O2 or -O3             -O1
>>>    -O4 or -O5             -O2
>>>    -O6 or better          -O3
>>
>> Hi Duncan,
>>
>> Out of curiosity, why do you follow this approach? People generally use
>> -O2 or -O3. I'd recommend switching dragonegg to line those up with
>> whatever you want people to use.
>
> Note that this is done only when the GCC optimizers are run. The basic
> observation is that running the LLVM optimizers at -O3 after running the
> GCC optimizers (at -O3) results in slower code! I mean slower than what
> you get by running the LLVM optimizers at -O1 or -O2. I haven't found
> time to analyse this curiosity yet. It might simply be that the LLVM
> inlining level is too high given that inlining has already been done by
> GCC. In any case, I didn't want to run LLVM at -O3 because of this. The
> next question was: which is better, LLVM at -O1 or at -O2? My first
> experiments showed that code quality was essentially the same. Since at
> -O1 you get a nice compile-time speedup, I settled on using -O1. Also,
> -O1 makes some sense if the GCC optimizers did a good job and all that
> is needed is to clean up the mess that converting to LLVM IR can
> produce. However, later experiments showed that -O2 does seem to
> consistently result in slightly better code, so I've been thinking of
> using -O2 instead. This is one reason I encouraged Jack to use -O4 in
> his benchmarks (i.e. GCC at -O3, LLVM at -O2): to see if they show the
> same thing.

Duncan,

My preliminary runs of the pb05 benchmarks at -O4, -O5 and -O6 using
-fplugin-arg-dragonegg-enable-gcc-optzns didn't show any significant
run-time performance changes compared to
-fplugin-arg-dragonegg-enable-gcc-optzns -O3.
I'll rerun those and post the tabulated results this weekend. I am using
-ffast-math -funroll-loops as well in the optimization flags; perhaps I
should repeat the benchmarks without those flags. IMHO, the more important
thing is to fish out the remaining regressions in the llvm vectorization
code by defaulting -fplugin-arg-dragonegg-enable-gcc-optzns on in
dragonegg svn once llvm 3.0 has branched. Hopefully this will get us wider
testing of the llvm vectorization support and some additional smaller test
cases that expose the remaining bugs in that code.

Jack

> Ciao, Duncan.
>
> PS: Dragonegg is a nice platform for understanding what the GCC
> optimizers do better than LLVM. It's a pity no-one seems to have used it
> for this.
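[Editorial aside: the level mapping Duncan describes can be sketched as
below. This is a hypothetical illustration of the table, not dragonegg's
actual implementation, which lives in C++ inside the plugin.]

```python
def llvm_opt_level(gcc_level):
    """Map the GCC -O level to the LLVM optimizer level used when
    -fplugin-arg-dragonegg-enable-gcc-optzns is given, per Duncan's table.
    Illustrative only; not dragonegg's real code."""
    if gcc_level <= 1:
        return 0          # tiny amount of optimization
    elif gcc_level <= 3:  # -O2 or -O3
        return 1
    elif gcc_level <= 5:  # -O4 or -O5
        return 2
    else:                 # -O6 or better
        return 3
```

So "-O4" in this thread means: GCC optimizers at their usual level, plus
the LLVM IR optimizers at -O2.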
Hi Jack,

> IMHO, the more important thing is to fish out the remaining regressions
> in the llvm vectorization code by defaulting
> -fplugin-arg-dragonegg-enable-gcc-optzns on in dragonegg svn once llvm
> 3.0 has branched. Hopefully this will get us wider testing of the llvm
> vectorization support and some additional smaller test cases that expose
> the remaining bugs in that code.

Turning on the GCC optimizers by default essentially means giving up on
the LLVM IR optimizers: one way of reading your benchmark results is that
the LLVM IR optimizers don't do anything useful that the GCC optimizers
haven't done already. The fact that LLVM -O3 and -O2 don't produce better
code than -O1 suggests that all that is needed is a little bit of
optimization to clean up the inevitable messy bits produced by the
gimple -> LLVM IR conversion, but that otherwise GCC already did all the
interesting transforms. Should this be considered an LLVM bug or a
dragonegg feature?

An LLVM bug: if the GCC optimizers work better than LLVM's, then LLVM
should be improved until LLVM's are better. Turning on the GCC optimizers
by default just hides the weaknesses of LLVM's optimizers and reduces the
pressure to improve things.

A dragonegg feature: users want their code to run fast. Turning on the GCC
optimizers results in faster code, ergo the GCC optimizers should be
turned on by default. That way you get faster compile times and fast code.

I have some sympathy for both viewpoints...

Ciao, Duncan.
On Thu, Oct 13, 2011 at 02:37:54PM +0200, Duncan Sands wrote:
> Hi Jack,
>
>> IMHO, the more important thing is to fish out the remaining regressions
>> in the llvm vectorization code by defaulting
>> -fplugin-arg-dragonegg-enable-gcc-optzns on in dragonegg svn once llvm
>> 3.0 has branched. Hopefully this will get us wider testing of the llvm
>> vectorization support and some additional smaller test cases that
>> expose the remaining bugs in that code.
>
> turning on the GCC optimizers by default essentially means giving up on
> the LLVM IR optimizers: one way of reading your benchmark results is
> that the LLVM IR optimizers don't do anything useful that the GCC
> optimizers haven't done already. The fact that LLVM -O3 and -O2 don't
> produce better code than -O1 suggests that all that is needed is a
> little bit of optimization to clean up the inevitable messy bits
> produced by the gimple -> LLVM IR conversion, but that otherwise GCC
> already did all the interesting transforms. Should this be considered an
> LLVM bug or a dragonegg feature?
>
> An LLVM bug: if the GCC optimizers work better than LLVM's then LLVM
> should be improved until LLVM's are better. Turning on the GCC
> optimizers by default just hides the weaknesses of LLVM's optimizers,
> and reduces the pressure to improve things.
>
> A dragonegg feature: users want their code to run fast. Turning on the
> GCC optimizers results in faster code, ergo the GCC optimizers should be
> turned on by default. That way you get faster compile times and fast
> code.

Duncan,

My main concern is that we test the vectorization support in llvm as hard
as possible post llvm 3.0. Considering that llvm is unlikely to get
autovectorization support in the near term, it seems that FSF
gcc/dragonegg is the best approach to hunt for vectorization issues in
llvm.
Might we be able to split the difference here and create a variant of
-fplugin-arg-dragonegg-enable-gcc-optzns which only enables a limited set
of FSF gcc optimizations (such as -ftree-vectorize) required to enable FSF
gcc's autovectorization under dragonegg? For instance, couldn't dragonegg
just honor -ftree-vectorize when it or -O3 is passed as a compiler flag?

Jack

> I have some sympathy for both viewpoints...
>
> Ciao, Duncan.
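[Editorial aside: Jack's proposal could be sketched roughly as below. The
function and pass names are invented for illustration and do not
correspond to any actual dragonegg option or GCC pass identifier.]

```python
def gcc_optimizations_to_enable(flags):
    """Hypothetical sketch of the proposal: rather than enabling all of
    GCC's optimizers, run only tree vectorization when the user passes
    -ftree-vectorize explicitly or implies it via -O3 (which enables
    -ftree-vectorize in GCC of this era). The returned pass name is
    purely illustrative."""
    if "-ftree-vectorize" in flags or "-O3" in flags:
        return ["tree-vectorize"]
    return []
```

Under this scheme, the rest of GCC's optimization pipeline would stay off,
so the LLVM IR optimizers would still do the bulk of the work and keep
getting exercised.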