Reid Spencer
2003-Nov-18 14:19 UTC
[LLVMdev] [Fwd: Optimization: Conclusions from Evolutionary Analysis]
I'm cross-posting the message below (from the GCC list) because I believe
it would (at some point) be very beneficial to build an evolutionary
optimization pass into LLVM. The idea would be to discover the best set of
optimizations for a given program by trying them all and analyzing the
execution times of each. This would be somewhat like profile-driven
optimization, except the profile is easier to gather: it's the
whole-program timing. I imagine a "mega pass" could be written that tries
all other passes in all their combinations to (eventually) arrive at the
right set for a given program.

There's one big issue with this: workload. Unlike profile-driven
optimization, which can find parts of the program to optimize based on
data collected from a run or several runs of the program, this
evolutionary pass would need to run the exact same workload multiple
times to discover which optimizations to apply. Obtaining a useful/typical
workload for a given program will be difficult. It's easy for compilers
and the like, but not all programs work that way (consider optimizing a
web server this way). Probably the right approach is to punt on the
workload issue, leaving it to the programmer to invoke/drive the program
in such a way that it executes a consistent workload.

I know it's early in LLVM's life for something like this, but I thought
I'd mention it so it can be tucked away in the "think about this later"
file.

Reid.

-----Forwarded Message-----
> From: Scott Robert Ladd <coyote at coyotegulch.com>
> To: gcc mailing list <gcc at gcc.gnu.org>
> Subject: Optimization: Conclusions from Evolutionary Analysis
> Date: Tue, 18 Nov 2003 10:11:55 -0500
>
> For those who haven't the time to read my full article, I thought I'd
> pass along the conclusions section. One of my goals with Acovea was to
> provide some empirical evidence of how optimizations behave, and to
> dispel some common misconceptions about optimization.
>
> The settings for -O1, -O2, and -O3 are valid. Every option implied by
> -O3 may not be applicable to every program -- but enabling all those
> options does not seem to be detrimental, except in the case of
> huffbench. Several of the recently added flags (-ftracer in particular)
> may be candidates for inclusion in -O2 or -O3 in future versions of the
> compiler.
>
> Processor-specific options do not appear to be a major factor in
> performance on these benchmarks. I don't know if this is due to the
> nature of the processor, or if GCC can't take advantage of
> processor-specific instructions. I have double-checked my results;
> adding -mfpmath=sse (or any of its variants, or -msse) to a compile of
> almabench does not make the code run any faster. The only ia32-specific
> option that showed consistent value was -momit-leaf-frame-pointer.
>
> The genetic algorithm was able to find sets of flags that produced
> faster code than that emitted by the default -O1, -O2, and -O3 options
> (with the exception of almabench, which is largely unoptimizable by
> nature). In many cases, this was due to the inclusion of "new" flags
> introduced in more recent versions of GCC. Acovea discovers additional
> options that improve performance over that provided by the default
> settings; in essence, the genetic algorithm determines which of the
> non-standard and processor-specific options are most effective for a
> given algorithm.
>
> A well-designed program will encapsulate algorithms.
> For most programs, performance is predicated on specific algorithms
> that can be identified via profiling; Acovea can then be applied to
> those algorithms to find the set of optimizations that produces the
> fastest code. Only rarely does an entire program need to be fully
> optimized; instead, optimization should be applied to specific,
> critical routines that perform concise tasks.
>
> The full article can be found at:
>
> http://www.coyotegulch.com/acovea/index.html

_______________________
Reid Spencer
President & CTO
eXtensible Systems, Inc.
rspencer at x10sys.com
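Reid's "mega pass" is essentially what Acovea already does at the level of
command-line flags: a genetic search that compiles the program with
candidate flag sets, runs a fixed workload, keeps the fastest candidates,
and mutates them. Below is a minimal sketch of such a driver; the candidate
flag list, bench.c, and the ./a.out workload command are illustrative
assumptions only, not part of Acovea, GCC, or LLVM.

    import random
    import subprocess
    import time

    # Illustrative candidates only; Acovea's real flag list is much larger.
    CANDIDATE_FLAGS = ["-ftracer", "-funroll-loops", "-fomit-frame-pointer",
                       "-momit-leaf-frame-pointer", "-msse", "-mfpmath=sse"]

    def compile_and_time(flags, source="bench.c", workload=("./a.out",)):
        """Compile with the given flag set, then time one run of a fixed workload."""
        subprocess.run(["gcc", "-O2", *flags, source, "-o", "a.out"], check=True)
        start = time.perf_counter()
        subprocess.run(list(workload), check=True)
        return time.perf_counter() - start

    def mutate(flag_set):
        """Toggle one randomly chosen candidate flag on or off."""
        return flag_set ^ {random.choice(CANDIDATE_FLAGS)}

    def evolve(generations=20, population=8):
        """Evolve flag sets toward the fastest observed runtime."""
        pop = [set(random.sample(CANDIDATE_FLAGS, 2)) for _ in range(population)]
        for _ in range(generations):
            ranked = sorted(pop, key=lambda s: compile_and_time(sorted(s)))
            survivors = ranked[:population // 2]        # keep the fastest half
            pop = survivors + [mutate(s) for s in survivors]
        return min(pop, key=lambda s: compile_and_time(sorted(s)))

    if __name__ == "__main__":
        print("best flag set found:", sorted(evolve()))

The same loop could in principle search over pass orderings instead of
flags, which is the workload-sensitivity problem Reid raises above.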
Vikram Adve
2003-Nov-18 17:12 UTC
[LLVMdev] [Fwd: Optimization: Conclusions from Evolutionary Analysis]
This is a hot topic in the compiler research community, but the focus
there is on

(a) choosing the right optimization sequences internally and
transparently, rather than through combinations of options,

(b) performance prediction techniques so you don't actually have to run a
gazillion different choices, and perhaps can even avoid the problem of
choosing representative inputs, as you talked about, and

(c) developing flexible internal representations so that you can
dynamically generate appropriate code sequences.

We are actually involved in a project with IBM that aims to look at all of
these issues. Needless to say, I'm hoping that they will use LLVM as one
of the components of the infrastructure for this work :) They do seem
interested.

--Vikram
http://www.cs.uiuc.edu/~vadve
http://llvm.cs.uiuc.edu/
Reid Spencer
2003-Nov-19 14:06 UTC
[LLVMdev] [Fwd: Optimization: Conclusions from Evolutionary Analysis]
On Tue, 2003-11-18 at 15:11, Vikram Adve wrote:
> This is a hot topic in the compiler research community, but the focus
> there is on
> (a) choosing the right optimization sequences internally and
> transparently, rather than through combinations of options,
> (b) performance prediction techniques so you don't actually have to run
> a gazillion different choices, and perhaps can even avoid the problem of
> choosing representative inputs, as you talked about, and

My experience with performance prediction (over a 20-year career) is that
it's useless. The simulation world has long sought to be able to predict
outcomes, performance, scalability, etc. I haven't seen a model yet that
gets below a +/-50% error rate. In the general case, the only *reliable*
way to do it is through collecting actual data. Of course, in the specific
case of compiler output, it might be possible to get higher levels of
accuracy. It'll be interesting to follow this research.

> (c) developing flexible internal representations so that you can
> dynamically generate appropriate code sequences.

Now THAT I agree with :)

> We are actually involved in a project with IBM that aims to look at all
> of these issues. Needless to say, I'm hoping that they will use LLVM as
> one of the components of the infrastructure for this work :) They do
> seem interested.

I hope so too ... great validation.

Reid.
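Reid's point about collecting actual data suggests the simplest workable
harness: run the same workload several times per build and compare a
robust statistic such as the median, rather than trusting a single run or
a predictive model. A minimal sketch, assuming two pre-built binaries with
placeholder names:

    import statistics
    import subprocess
    import time

    def median_runtime(cmd, runs=9):
        """Run `cmd` repeatedly and return the median wall-clock time in seconds."""
        samples = []
        for _ in range(runs):
            start = time.perf_counter()
            subprocess.run(cmd, check=True)
            samples.append(time.perf_counter() - start)
        return statistics.median(samples)

    if __name__ == "__main__":
        baseline = median_runtime(["./bench-O2"])      # placeholder baseline build
        tuned = median_runtime(["./bench-tuned"])      # placeholder candidate build
        print(f"speedup over baseline: {baseline / tuned:.2f}x")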