thr3ads.net - llvm dev - [LLVMdev] RFC: Codifying (but not formalizing) the optimization levels in LLVM and Clang [Jan 2013]


Chandler Carruth

2013-Jan-14 09:09 UTC

[LLVMdev] RFC: Codifying (but not formalizing) the optimization levels in LLVM and Clang

This has been an idea floating around in my head for a while and after
several discussions with others it continues to hold up so I thought I
would mail it out. Sorry for cross posting to both lists, but this is an
issue that would significantly impact both LLVM and Clang.

Essentially, LLVM provides canned optimization "levels" for frontends to
re-use. This is nothing new. However, we don't have good names for them, we
don't expose them at the IR level, and we have a hard time figuring out
which optimizations belong in which levels. I'd like to try addressing that
by coming up with names and a description of the basic intended goal of each
level. I would like, if folks are happy with these ideas, to add these
types of descriptions alongside these attributes to the langref. Ideas on
other (better?) places to document this would be welcome. Certainly,
Clang's documentation would need to be updated to reflect this.

Hopefully we can minimally debate this until the bikeshed is a tolerable
shade. Note that I'm absolutely biased based on the behavior of Clang and
GCC with these optimization levels, and the relevant history there.
However, I'm adding to and deviating from the purely historical differences
to try and better model the latest developments in LLVM's optimizer... So
here goes:


1) The easiest: 'Minimize Size' or '-Oz'
- Attribute: minsize (we already have it, nothing to do here)
- Goal: minimize the size of the resulting binary, at (nearly) any cost.


2) Optimize for size or '-Os'
- Attribute: optsize (we already have it, nothing to do here)
- Goal: Optimize the execution of the binary without unreasonably[1]
increasing the binary size.
This one is a bit fuzzy, but usually people don't have a hard time figuring
out where the line is. The primary difference between minsize and optsize
is that with minsize a pass is free to *hurt* performance to shrink the
size.

[1] The definition of 'unreasonable' is of course subjective, but here is
at least one strong indicator: any code size growth which is inherently
*speculative* (that is, there isn't a known, demonstrable performance
benefit, but rather it is "often" or "maybe" a benefit) is unlikely to be a
good fit in optsize. The canonical example IMO is a vectorizer -- while it
is reasonable to vectorize a loop, if the vector version might not be
executed, and thus the scalar loop remains as well, then it is a poor fit
for optsize.


3) Optimize quickly or '-O1'
- Attribute: quickopt (this would be a new attribute)
- Goal: Perform basic optimizations to improve both performance and
simplicity of the code, but perform them *quickly*.
This level is all about compile time, but in a holistic sense. It tries to
perform basic optimizations to get reasonably efficient code, and get it
very quickly.


4) Good, well-balanced optimizations, or '-O2'
- Attribute: opt (new attribute)
- Goal: produce a well optimized binary trading off compile time, space,
and runtime efficiency.
This should be an excellent default for general purpose programs. The idea
is to do as much optimization as we can, in as reasonable a time frame,
and with as reasonable a code size impact as possible. This level should
always produce binaries at least as fast as optsize, but they might be both
bigger and faster. This level should always produce binaries at least as
fast as quickopt, but they might be slower to compile.


5) Optimize to the max or '-O3'
- Attribute: maxopt (new attribute)
- Goal: produce the fastest binary possible.
This level has historically been almost exclusively about trading off more
binary size for speed than '-O2', but I would propose we change it to be
more about trading off either binary size or compilation time to achieve a
better performing binary. This level should always produce binaries at
least as fast as opt, but they might be faster at the cost of them being
larger and taking more time to compile. This would in some cases be a
change for LLVM and is definitely a deviation from GCC where O3 will in
many cases produce *slower* binaries due to code size increases that are
not accompanied by corresponding performance increases.
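
To make the five levels easier to compare, here is a minimal C++ sketch that tabulates the flag and attribute spelling of each one and encodes the two size-related rules stated above. It is only an illustration: the type and function names (OptLevel, LevelInfo, findLevel, and the two predicates) are made up for this sketch and are not LLVM or Clang APIs, and treating minsize like optsize for speculative growth is an assumption.

```cpp
#include <cassert>
#include <string>

// The five levels from the proposal, ordered from most size-focused to most
// speed-focused. The enumerator names are made up for this sketch.
enum class OptLevel { MinSize, OptSize, QuickOpt, Opt, MaxOpt };

// One row per level: driver flag, proposed attribute spelling, and level.
struct LevelInfo {
  const char *flag;
  const char *attribute;
  OptLevel level;
};

const LevelInfo kLevels[] = {
    {"-Oz", "minsize", OptLevel::MinSize},
    {"-Os", "optsize", OptLevel::OptSize},
    {"-O1", "quickopt", OptLevel::QuickOpt},
    {"-O2", "opt", OptLevel::Opt},
    {"-O3", "maxopt", OptLevel::MaxOpt},
};

// Returns the row for a flag, or nullptr for flags outside the proposal
// (for example "-O4", which the proposal would remove).
const LevelInfo *findLevel(const std::string &flag) {
  for (const LevelInfo &info : kLevels)
    if (flag == info.flag)
      return &info;
  return nullptr;
}

// Only minsize is described as free to hurt performance to shrink the code.
bool mayHurtPerformanceToShrink(OptLevel level) {
  return level == OptLevel::MinSize;
}

// Footnote [1]: speculative code growth is a poor fit for optsize. Treating
// minsize the same way is an assumption of this sketch.
bool rejectsSpeculativeGrowth(OptLevel level) {
  return level == OptLevel::MinSize || level == OptLevel::OptSize;
}
```

A pass that duplicates code on a guess (the vectorizer case in the footnote) could consult rejectsSpeculativeGrowth before growing a function, while a pass that shrinks code at a runtime cost would check mayHurtPerformanceToShrink.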


To go with these LLVM attributes I'd like to add support for adding
attributes in Clang, both compatible with GCC and with the names above for
clarity. The goal is to allow a specific function to override the
optimization level set on the command line.
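
As a rough illustration of a per-function override in source code, the sketch below uses attribute spellings that, as far as I know, exist today: Clang's minsize and GCC's optimize("..."). The proposed spellings (quickopt, opt, maxopt) are deliberately not used, since nothing in this thread establishes that they are implemented. The SKETCH_SMALL and SKETCH_FAST macros and both functions are hypothetical names for this example, and the macros expand to nothing on compilers that lack the attribute.

```cpp
#include <cassert>
#include <cstddef>

// Pick whichever per-function attribute the current compiler understands.
// Both macros expand to nothing when no suitable attribute is available.
#if defined(__has_attribute)
#  if __has_attribute(minsize)
#    define SKETCH_SMALL __attribute__((minsize))        // Clang spelling
#  elif __has_attribute(optimize)
#    define SKETCH_SMALL __attribute__((optimize("Os"))) // GCC spelling
#  endif
#  if __has_attribute(optimize)
#    define SKETCH_FAST __attribute__((optimize("O3")))  // GCC spelling
#  endif
#endif
#ifndef SKETCH_SMALL
#  define SKETCH_SMALL
#endif
#ifndef SKETCH_FAST
#  define SKETCH_FAST
#endif

// Rarely executed helper: ask for small code regardless of the -O flag.
SKETCH_SMALL unsigned checksum(const unsigned char *data, std::size_t n) {
  unsigned sum = 0;
  for (std::size_t i = 0; i < n; ++i)
    sum += data[i];
  return sum;
}

// Hot kernel: ask for aggressive optimization where the compiler allows it.
SKETCH_FAST long dot(const int *a, const int *b, std::size_t n) {
  long acc = 0;
  for (std::size_t i = 0; i < n; ++i)
    acc += static_cast<long>(a[i]) * b[i];
  return acc;
}
```

The attributes change only how each function is compiled, not what it computes, so the rest of the translation unit still follows the command-line level.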


A final note: I would like to remove all other variations on the '-O' flag.
That includes the really strange '-O4' behavior. Whether the compilation is
LTO should be an orthogonal decision to the particular level of
optimization, and we have -flto to achieve this.

-Chandler

henry miller

2013-Jan-14 12:27 UTC


[LLVMdev] [cfe-dev] RFC: Codifying (but not formalizing) the optimization levels in LLVM and Clang

As a user I like this: it is hard to understand what each level does.  I know
projects that are using O3 because 'more must be better'.  I don't
know how to explain that it might be slower, and measuring performance is
tricky. (Many programs do multiple things, and spend most of their time waiting
on the event loop.)

Would it be unreasonable to ask for a new/separate set of optimizations:
optimize debug.  This would apply aggressive optimizations, but without
"significantly" changing the order of the code.

I don't know the optimizer, but I know as a user of compilers that minimal
optimization is often the difference between painfully slow program execution
and okay performance. However debugging optimized programs can be difficult
because the debugger jumps all over making the problem hard to understand.

I'll leave it to experts to debate shades.



Justin Holewinski

2013-Jan-14 12:46 UTC


[LLVMdev] RFC: Codifying (but not formalizing) the optimization levels in LLVM and Clang

If I understand the attributes correctly, they would be function-level
attributes applied to IR functions, correct?  I'm curious what the
semantics would be for cross-function optimization.  For example, consider
a function "foo" defined with maxopt and a function "bar" defined with
optsize.  If foo() calls bar() and the inliner wants to inline bar() into
foo(), is that legal?  If so, that may cause odd effects as you may perform
expensive optimizations later on the inlined version of bar(), even though
the original function is marked optsize.
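
The concern can be made concrete with a small C++ model. This is only a sketch under stated assumptions: FunctionInfo, levelGoverningInlinedBody and conservativeMayInline are hypothetical names, not LLVM APIs, and the policy shown is just one possible answer to the question, not something proposed in this thread.

```cpp
#include <cassert>
#include <string>

// Level names follow the proposal; the enum itself is made up for the sketch.
enum class OptLevel { MinSize, OptSize, QuickOpt, Opt, MaxOpt };

struct FunctionInfo {
  std::string name;
  OptLevel level;
};

// Once a call is inlined, the callee's body is part of the caller, so later
// passes only see the caller's level. This is the "odd effect" in question.
OptLevel levelGoverningInlinedBody(const FunctionInfo &caller,
                                   const FunctionInfo &callee) {
  (void)callee; // the callee's own attribute no longer applies to the copy
  return caller.level;
}

bool isSizeFocused(OptLevel level) {
  return level == OptLevel::MinSize || level == OptLevel::OptSize;
}

// Hypothetical conservative policy: never inline a size-focused callee into
// a caller that is not size-focused.
bool conservativeMayInline(const FunctionInfo &caller,
                           const FunctionInfo &callee) {
  return !isSizeFocused(callee.level) || isSizeFocused(caller.level);
}
```

With foo marked maxopt and bar marked optsize, the inlined copy of bar would be governed by maxopt, and this particular policy would refuse the inlining. A less strict policy could instead allow it and accept that the copy is optimized for speed.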

Also, a nit-pick:  can we make the naming consistent?  It feels a bit weird
to have maxOPT and OPTsize.  Perhaps use sizeopt and minsizeopt, or optmax
and optquick?


-- 

Thanks,

Justin Holewinski

James Molloy

2013-Jan-14 13:37 UTC


[LLVMdev] [cfe-dev] RFC: Codifying (but not formalizing) the optimization levels in LLVM and Clang

Would it be possible to have a parameterised attribute? opt(min),
opt(size), opt(max) ?
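
For illustration, a frontend could recognize such a spelling with a few lines of C++. This is a hedged sketch only: parseOptAttribute and OptParam are hypothetical names, the accepted arguments are just the three listed above, and how they would map onto the five proposed levels is left open here.

```cpp
#include <cassert>
#include <string>

// The three parameters suggested above, plus a value for anything else.
enum class OptParam { Min, Size, Max, Invalid };

// Parses text of the form "opt(<arg>)". Anything else yields Invalid.
OptParam parseOptAttribute(const std::string &text) {
  const std::string prefix = "opt(";
  if (text.size() < prefix.size() + 1 ||
      text.compare(0, prefix.size(), prefix) != 0 || text.back() != ')')
    return OptParam::Invalid;

  // Extract what sits between "opt(" and the closing parenthesis.
  const std::string arg =
      text.substr(prefix.size(), text.size() - prefix.size() - 1);
  if (arg == "min")
    return OptParam::Min;
  if (arg == "size")
    return OptParam::Size;
  if (arg == "max")
    return OptParam::Max;
  return OptParam::Invalid;
}
```

One attribute name with a parameter would keep the naming consistent and leave room for further values without new keywords.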



Renato Golin Linaro

2013-Jan-14 14:41 UTC


[LLVMdev] [cfe-dev] RFC: Codifying (but not formalizing) the optimization levels in LLVM and Clang

On 14 January 2013 12:27, henry miller <hank at millerfarm.com> wrote:
> Would it be unreasonable to ask for a new/separate set of optimizations:
> optimize debug. This would apply aggressive optimizations, but not
> "significantly" changing the order of the code.

This will be interesting, but knowing how LLVM can segfault when one pass or
another is commented out of a sequence, I'd think that it'd take a long
time before we could test *every* combination of these "nice to have"
optimization profiles. Basically, you take all the tests we have today and
duplicate them for each profile.

> I don't know the optimizer, but I know as a user of compilers that minimal
> optimization is often the difference between painfully slow program
> execution and okay performance. However debugging optimized programs can be
> difficult because the debugger jumps all over making the problem hard to
> understand.

I don't think you really need a well-performing debug image, though. Debuggers
tend to be so slow that the performance of the image is irrelevant.
Normally, the extra time comes from the number of breakpoints or watchpoints
you have set, and not the image itself... ;)

--renato

Krzysztof Parzyszek

2013-Jan-14 16:00 UTC


[LLVMdev] RFC: Codifying (but not formalizing) the optimization levels in LLVM and Clang

On 1/14/2013 3:09 AM, Chandler Carruth wrote:
> [...] This level should always produce
> binaries at least as fast as quickopt, but they might be both slower to
> compile.

The "always" part cannot really be guaranteed or enforced.  I'd state it
in terms of intention, i.e. "this level is intended to produce binaries
at least as fast as quickopt".  Otherwise, the wording may imply that it
is a compiler bug if there exists a binary that runs slower at -O2 than
at -O1.

-Krzysztof


-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, 
hosted by The Linux Foundation

Hal Finkel

2013-Jan-14 18:12 UTC


[LLVMdev] RFC: Codifying (but not formalizing) the optimization levels in LLVM and Clang

----- Original Message -----
> From: "Chandler Carruth" <chandlerc at gmail.com>
> To: "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu>, "clang-dev Developers" <cfe-dev at cs.uiuc.edu>
> Sent: Monday, January 14, 2013 3:09:01 AM
> Subject: [LLVMdev] RFC: Codifying (but not formalizing) the optimization levels in LLVM and Clang
>
> [...]
>
> A final note: I would like to remove all other variations on the '-O'
> flag. That includes the really strange '-O4' behavior. Whether the
> compilation is LTO should be an orthogonal decision to the
> particular level of optimization, and we have -flto to achieve this.

I agree that -O4 and LTO should be separated. In general, we need an easy way
for passes (like the vectorizer, loop unroller, etc.) to know not only the
current optimization level (which should be easy if these are all function
attributes) but also whether they are executing before or during the
"final" optimization phase (because there are some things you
don't want to do if you'll be later using LTO).

Nevertheless, I think that we definitely do want optimization levels above -O3.
None of these should make code slower, but provide a rough indication of how
long the user is willing to wait for the compilation process to complete. For
example, currently when -vectorize is given, not only is basic-block
vectorization enabled, but so is an extra invocation of EarlyCSE and
InstCombine. These latter two passes run even if BBVectorizer does nothing
(which indicates another needed infrastructure improvement, but that's
another story), and sometimes produce a small speedup independent of the
vectorizer. Because of the compile-time impact, there might not be sufficient
justification for enabling these passes by default at -O3, but it would make
sense to do so at a -O4 level: The user wants the compiler to 'try
harder'.

There are several parts of the compiler that are essentially solving
combinatorial optimization problems. Basic-block vectorization is an obvious
example, but instruction scheduling can also fall into this category, as can
some forms of reassociation, inlining, etc. We currently use heuristics and
cutoffs to deal with this, but given the asymptotic scaling of each algorithm,
we could set these cutoffs, and choose the approximation algorithm to use, based
on the current optimization level. I'd recommend something like this: -O4
can take twice as long as -O3, -O5 can take twice as long as -O4, etc.
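
A rough sketch of how the doubling rule could translate into per-pass cutoffs, assuming a pass whose cost grows like n**exponent in the size of the region it considers. The function names, the base cutoff of 200, and the quadratic default are all hypothetical; nothing like this exists in LLVM as written here.

```python
def compile_time_budget(opt_level: int, o3_budget: float = 1.0) -> float:
    """Relative compile-time budget: doubles for each level above -O3."""
    if opt_level < 3:
        raise ValueError("this sketch only models -O3 and above")
    return o3_budget * 2 ** (opt_level - 3)


def search_cutoff(opt_level: int, o3_cutoff: int = 200, exponent: float = 2.0) -> int:
    """Largest problem size a pass costing ~n**exponent may consider.

    If cost scales as n**exponent, multiplying the budget by b lets the
    cutoff grow by b**(1/exponent).
    """
    budget = compile_time_budget(opt_level)
    return int(o3_cutoff * budget ** (1.0 / exponent))
```

For a quadratic algorithm the cutoff grows by only a factor of sqrt(2) per level, so doubling the time budget buys a fairly modest increase in how much code a combinatorial pass can look at.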

 -Hal
> 
> 
> -Chandler
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> 
-- 
Hal Finkel
Postdoctoral Appointee
Leadership Computing Facility
Argonne National Laboratory

dag at cray.com

2013-Jan-14 18:34 UTC

[LLVMdev] [cfe-dev] RFC: Codifying (but not formalizing) the optimization levels in LLVM and Clang

Chandler Carruth <chandlerc at gmail.com> writes:
> 4) Good, well-balanced optimizations, or '-O2'
> - Attribute: opt (new attribute)
Since all other levels have a qualifier, I'd suggest we do that here as
well.  Perhaps balancedopt?
> 5) Optimize to the max or '-O3'
> - Attribute: maxopt (new attribute)
> - Goal: produce the fastest binary possible.
> This level should always produce binaries at least as fast as opt, but
> they might be faster at the cost of them being larger and taking more
> time to compile.
That's almost impossible to achieve if you include things like
vectorization.  It is often difficult to know statically whether
vectorization will help or harm.  One can do runtime code selection but
that has its own costs and can be slower than O2 in some cases.

I simply don't think it's a useful or achievable guarantee.  It's a
good goal but I would be against making it a requirement as it will
severely limit what we do at O3 over O2.

FWIW, the Cray compiler considers O3 to be "try to make it as fast as
possible without unreasonably increasing compile time" but does not
guarantee it absolutely will be faster than -O2.  We have -Oaggress for
"pull out all the stops, take days to compile the code and you get what
you get."  :)
> A final note: I would like to remove all other variations on the '-O'
> flag. That includes the really strange '-O4' behavior. Whether the
> compilation is LTO should be an orthogonal decision to the particular
> level of optimization, and we have -flto to achieve this.
Agreed.

                                     -David

Sean Silva

2013-Jan-14 19:54 UTC

[LLVMdev] RFC: Codifying (but not formalizing) the optimization levels in LLVM and Clang

On Mon, Jan 14, 2013 at 4:09 AM, Chandler Carruth <chandlerc at gmail.com> wrote:
> - Attribute: minsize (we already have it, nothing to do here)
This doesn't appear to be documented
<http://llvm.org/docs/LangRef.html#function-attributes>.

-- Sean Silva

Chandler Carruth

2013-Jan-14 21:09 UTC

[LLVMdev] [cfe-dev] RFC: Codifying (but not formalizing) the optimization levels in LLVM and Clang

FYI

On Mon, Jan 14, 2013 at 4:27 AM, henry miller <hank at millerfarm.com>
wrote:
> Would it be unreasonable to ask for a new/separate set of optimizations:
> optimize debug. This would apply aggressive optimizations, but without
> "significantly" changing the order of the code.

I don't think it's unreasonable necessarily, but it should be a separate
discussion, and it's not something I'm planning to pursue.

Jim Grosbach

2013-Jan-14 21:14 UTC

[LLVMdev] RFC: Codifying (but not formalizing) the optimization levels in LLVM and Clang

Hi Chandler,

All of the below sounds completely reasonable to me. In particular, I'm in
favor of the clarifications/adjustments to O3 and the removal of O4.

-Jim

Chandler Carruth

2013-Jan-14 21:23 UTC

[LLVMdev] [cfe-dev] RFC: Codifying (but not formalizing) the optimization levels in LLVM and Clang

On Mon, Jan 14, 2013 at 4:46 AM, Justin Holewinski <
justin.holewinski at gmail.com> wrote:
> If I understand the attributes correctly, they would be function-level
> attributes applied to IR functions, correct?  I'm curious what the
> semantics would be for cross-function optimization.  For example, consider
> a function "foo" defined with maxopt and a function "bar" defined with
> optsize.  If foo() calls bar() and the inliner wants to inline bar() into
> foo(), is that legal?  If so, that may cause odd effects as you may perform
> expensive optimizations later on the inlined version of bar(), even though
> the original function is marked optsize.
>
This is a great question. My plan would be: inlining doesn't impact the
attributes. The inliner will be free to look at both the caller and the
callee's attributes to choose the best inlining decision.
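
A minimal sketch of that plan, using the "foo"/"bar" example from the question. The threshold numbers and the "more conservative attribute wins" policy are assumptions made for illustration; the thread only says the inliner may consult both attributes and that inlining leaves them unchanged.

```python
from dataclasses import dataclass

# Hypothetical inline-cost thresholds per attribute; the numbers are invented.
THRESHOLDS = {"minsize": 25, "optsize": 75, "quickopt": 150, "opt": 225, "maxopt": 275}


@dataclass(frozen=True)
class Function:
    name: str
    level: str  # one of the attribute names in THRESHOLDS


def inline_threshold(caller: Function, callee: Function) -> int:
    """Consult both attributes; in this sketch the more conservative one wins."""
    return min(THRESHOLDS[caller.level], THRESHOLDS[callee.level])


def should_inline(caller: Function, callee: Function, callee_cost: int) -> bool:
    """Inline only if the callee's estimated cost fits under the threshold."""
    return callee_cost <= inline_threshold(caller, callee)


def level_after_inlining(caller: Function, callee: Function) -> str:
    """Inlining does not change attributes: the merged body keeps the caller's."""
    return caller.level
```

With `foo = Function("foo", "maxopt")` and `bar = Function("bar", "optsize")`, a small `bar` can still be inlined into `foo`, and the inlined code is then optimized under `foo`'s maxopt attribute, which is the effect the question describes.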

>
> Also, a nit-pick:  can we make the naming consistent?  It feels a bit
> weird to have maxOPT and OPTsize.  Perhaps use sizeopt and minsizeopt, or
> optmax and optquick?

Meh. I don't care really. It would require changing existing attributes,
but we can do that. I think the most readable structure is the first:

minsizeopt
sizeopt
quickopt
opt
maxopt

I'd like to hear some support for one or the other of these before deciding.
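
For reference, the two naming schemes can be laid side by side as a small lookup table. The flag-to-attribute pairs come from the RFC; the helper names `attribute_for` and `renamed_attributes` are hypothetical.

```python
# (flag, name used in the RFC, name under the uniform "...opt" suffix scheme)
LEVELS = [
    ("-Oz", "minsize", "minsizeopt"),
    ("-Os", "optsize", "sizeopt"),
    ("-O1", "quickopt", "quickopt"),
    ("-O2", "opt", "opt"),
    ("-O3", "maxopt", "maxopt"),
]


def attribute_for(flag: str, consistent_naming: bool = False) -> str:
    """Return the attribute name for a -O flag under either scheme."""
    for known_flag, rfc_name, suffix_name in LEVELS:
        if known_flag == flag:
            return suffix_name if consistent_naming else rfc_name
    raise KeyError(flag)


def renamed_attributes() -> dict:
    """Attributes whose spelling would change under the suffix scheme."""
    return {old: new for _, old, new in LEVELS if old != new}
```

Only the two existing attributes, minsize and optsize, would need renaming; the three new ones already end in "opt".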

Christopher Jefferson

2013-Jan-14 22:07 UTC

[LLVMdev] [cfe-dev] RFC: Codifying (but not formalizing) the optimization levels in LLVM and Clang

On 14 Jan 2013 09:10, "Chandler Carruth" <chandlerc at gmail.com> wrote:
>
> This has been an idea floating around in my head for a while and after
> several discussions with others it continues to hold up so I thought I
> would mail it out. Sorry for cross posting to both lists, but this is an
> issue that would significantly impact both LLVM and Clang.
>
>
> 3) Optimize quickly or '-O1'
> - Attribute: quickopt (this would be a new attribute)
> - Goal: Perform basic optimizations to improve both performance and
> simplicity of the code, but perform them *quickly*.
> This level is all about compile time, but in a holistic sense. It tries
> to perform basic optimizations to get reasonably efficient code, and get it
> very quickly.
>
GCC tries to make -O1 be compatible with debugging. I find having debugging
functional with some level of optimisation very useful, particularly in
template heavy C++ code. I don't know if there is an optimisation level
with similar behaviour in clang.

Chandler Carruth

2013-Jan-14 22:43 UTC

[LLVMdev] RFC: Codifying (but not formalizing) the optimization levels in LLVM and Clang

On Mon, Jan 14, 2013 at 8:00 AM, Krzysztof Parzyszek <
kparzysz at codeaurora.org> wrote:
> On 1/14/2013 3:09 AM, Chandler Carruth wrote:
>
>> [...] This level should always produce binaries at least as fast as
>> quickopt, but they might be both slower to compile.
>
> The "always" part cannot really be guaranteed or enforced.  I'd state it
> in terms of intention, i.e. "this level is intended to produce binaries at
> least as fast as quickopt".  Otherwise, the wording may imply that it is a
> compiler bug if there exists a binary that runs slower at -O2 than at -O1.

The use of 'should' is not accidental in my proposal. I don't read 'should'
as implying any more or less of a guarantee (much less an enforcement) than
yours.

But clearly we cannot make guarantees or enforce things. That's not really
the point. What matters is that when at some point someone shows up with
data that shows "here is a way in which -O2 is faster than -O3", that does
get classified as a bug, not a feature. =] That doesn't mean there will be
an easy or obvious solution, but it does give us the right perspective on
finding a solution.
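
As a sketch of what "classified as a bug" could look like in practice: a check over benchmark timings that flags any pair of levels where the stated goal is missed. The expected orderings are the ones in the RFC; the function name, the data layout, and the 2% noise tolerance are assumptions.

```python
# Pairs from the RFC: (level that should be at least as fast, baseline level).
EXPECTED = [("-O2", "-O1"), ("-O2", "-Os"), ("-O3", "-O2")]


def ordering_violations(runtimes: dict, tolerance: float = 0.02) -> list:
    """Return (level, baseline, slowdown) for each missed goal.

    runtimes maps a flag to measured seconds for one benchmark; tolerance
    is the relative slowdown treated as measurement noise.
    """
    bugs = []
    for higher, lower in EXPECTED:
        if higher in runtimes and lower in runtimes:
            slowdown = runtimes[higher] / runtimes[lower] - 1.0
            if slowdown > tolerance:
                bugs.append((higher, lower, round(slowdown, 3)))
    return bugs
```

A benchmark measured at 1.00s under -O2 and 1.08s under -O3 would be reported as `("-O3", "-O2", 0.08)`: not a guarantee violated, but a data point to file and investigate.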

Chandler Carruth

2013-Jan-14 22:50 UTC

[LLVMdev] [cfe-dev] RFC: Codifying (but not formalizing) the optimization levels in LLVM and Clang

On Mon, Jan 14, 2013 at 10:34 AM, <dag at cray.com> wrote:
> Chandler Carruth <chandlerc at gmail.com> writes:
> > 5) Optimize to the max or '-O3'
> > - Attribute: maxopt (new attribute)
> > - Goal: produce the fastest binary possible.
>
> > This level should always produce binaries at least as fast as opt, but
> > they might be faster at the cost of them being larger and taking more
> > time to compile.
>
> That's almost impossible to achieve if you include things like
> vectorization.  It is often difficult to know statically whether
> vectorization will help or harm.  One can do runtime code selection but
> that has its own costs and can be slower than O2 in some cases.
>
> I simply don't think it's a useful or achievable guarantee.  It's a
> good goal but I would be against making it a requirement as it will
> severely limit what we do at O3 over O2.
>
Note that this isn't a guarantee or a requirement, but a goal. That said,
we should push hard to achieve that goal, including turning off
optimizations until we get them *right* if absolutely necessary.

I specifically agree with you that this makes vectorization really hard to
get right. I think that is because vectorization is really hard to get
right! ;] Unless passes and optimizations can be set up to speed up code at
least as much (and hopefully more... ;]) than they slow down code, I don't
think they should be on by default.

None of this precludes researching new optimization passes, trying
different things, or exposing flags for users with deeply domain specific
needs to turn on domain targeted optimization strategies (quite aside from
any optimization level). This is just about creating reasonable
expectations for the default collections.

Chandler Carruth

2013-Jan-14 22:58 UTC

[LLVMdev] [cfe-dev] RFC: Codifying (but not formalizing) the optimization levels in LLVM and Clang

On Mon, Jan 14, 2013 at 2:07 PM, Christopher Jefferson <
chris at bubblescope.net> wrote:
> > 3) Optimize quickly or '-O1'
> > - Attribute: quickopt (this would be a new attribute)
> > - Goal: Perform basic optimizations to improve both performance and
> simplicity of the code, but perform them *quickly*.
> > This level is all about compile time, but in a holistic sense. It tries
> to perform basic optimizations to get reasonably efficient code, and get it
> very quickly.
> >
>
> GCC tries to make -O1 be compatible with debugging. I find having
> debugging functional with some level of optimisation very useful,
> particularly in template heavy C++ code. I don't know if there is an
> optimisation level with similar behaviour in clang.
>

I'm happy to have an added goal here of code that is reasonable to debug,
but I think before we can have that we need to have at least *some* idea
what that constitutes. Currently, I have no really good ideas....

Evan Cheng

2013-Jan-15 06:48 UTC

[LLVMdev] RFC: Codifying (but not formalizing) the optimization levels in LLVM and Clang

On Jan 14, 2013, at 1:09 AM, Chandler Carruth <chandlerc at gmail.com>
wrote:
> 2) Optimize for size or '-Os'
> - Attribute: optsize (we already have it, nothing to do here)
> - Goal: Optimize the execution of the binary without unreasonably[1]
> increasing the binary size.
> This one is a bit fuzzy, but usually people don't have a hard time
> figuring out where the line is. The primary difference between minsize and
> optsize is that with minsize a pass is free to *hurt* performance to shrink
> the size.
I'd like to point out that -Os is currently the same level of optimization
but with extra attention on code size. It would have significant impact on a lot
of clients if we were to change its definition.

Evan

Chandler Carruth

2013-Jan-15 07:07 UTC

[LLVMdev] RFC: Codifying (but not formalizing) the optimization levels in LLVM and Clang

On Mon, Jan 14, 2013 at 10:48 PM, Evan Cheng <evan.cheng at apple.com>
wrote:
>
>
> Sent from my iPad
>
> On Jan 14, 2013, at 1:09 AM, Chandler Carruth <chandlerc at gmail.com> wrote:
>
> > 2) Optimize for size or '-Os'
> > - Attribute: optsize (we already have it, nothing to do here)
> > - Goal: Optimize the execution of the binary without unreasonably[1]
> increasing the binary size.
> > This one is a bit fuzzy, but usually people don't have a hard time
> figuring out where the line is. The primary difference between minsize and
> optsize is that with minsize a pass is free to *hurt* performance to shrink
> the size.
>
> I'd like to point out that -Os is currently the same level of optimization
> but with extra attention on code size. It would have significant impact on
> a lot of clients if we were to change its definition.
>
My intent was not to change the behavior of any of these flags from their
current behavior, and mostly to clarify the existing ideas behind them...
The most changed / clarified is probably -O3, the rest I think are already
working exactly in line with my email... At least, that was my intent!

Is there something about my proposed wording that makes you think it
differs from the status quo, or would lead to differences?

-Chandler
