thr3ads.net - llvm dev - [LLVMdev] [RFC] Encoding Compile Flags into the IR [Apr 2012]

Bill Wendling

2012-Apr-29 22:44 UTC

[LLVMdev] [RFC] Encoding Compile Flags into the IR

Hi,

Link-Time Optimization has a problem. We need to preserve some of the flags with
which the modules were compiled so that the semantics of the resulting program
are correct. For example, a module compiled with `-msoft-float' should use
library calls for floating point. And that's only the tip of the proverbial
iceberg.

Goals
====
My goals for whichever solution we come up with are that it be:

1) Flexible enough to differentiate between flags which affect a module as a
whole and those which affect individual functions.
2) Quick to query for specific flags.
3) Easily extensible, preferably without changing the IR for each new flag.

Proposed Solution
================
My solution to this is to use a mixture of module-level flags and named
metadata. This gives us the flexibility asked for in (1), is relatively quick to
query (after being read in, the module flags could be placed into an efficient
data structure), and can be extended by updating the LangRef.html doc.

- Module-level flags would be used for those options which affect the whole
module and which prevent two modules that do not have that flag set from being
merged together. For example, `-msoft-float' changes the calling convention
in the output file. Therefore, it's only useful if the whole program is
compiled with it. It would be a module-level IR flag:

	!0 = metadata !{ i32 1, metadata !"-msoft-float", i1 true }
	!llvm.module.flags = !{ !0 }

- Named metadata would be used for those options which affect code generation
for the functions, but which don't prevent two modules from being merged
together. For example, `-fomit-frame-pointer' applies to individual
functions, but it doesn't prevent a module compiled with
`-fno-omit-frame-pointer' from being merged with one compiled with
`-fomit-frame-pointer'. We would use a named metadata flag:

	define void @foo () { ... }
	define double @bar(i32 %a) { ... }

	; A list of the functions affected by `-fno-omit-frame-pointer' within the
	; Module.
	!0 = metadata !{ void ()* @foo, double (i32)* @bar }
	!fno.omit.frame.pointer = !{ !0 }

And so on.
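
As a rough illustration of the "efficient data structure" mentioned above, here is a minimal C++ sketch of how the two kinds of flags could be queried once they have been read in. `ModuleFlagTable` and its members are hypothetical names, not LLVM classes, and the sketch assumes the flags have already been extracted from `!llvm.module.flags` and the per-flag named metadata into plain strings.

```cpp
#include <string>
#include <unordered_map>
#include <unordered_set>

// Hypothetical stand-in for flag information read out of a module.
// Nothing here is an existing LLVM API.
struct ModuleFlagTable {
  // Module-wide flags, e.g. "-msoft-float" -> true.
  std::unordered_map<std::string, bool> moduleFlags;

  // Per-function flags: flag name -> names of the functions it applies to,
  // e.g. "fno.omit.frame.pointer" -> {"foo", "bar"}.
  std::unordered_map<std::string, std::unordered_set<std::string>> functionFlags;

  // Stores the flag's value in `value` and returns true if the module-level
  // flag was recorded; returns false (leaving `value` untouched) otherwise.
  bool lookupModuleFlag(const std::string &name, bool &value) const {
    auto it = moduleFlags.find(name);
    if (it == moduleFlags.end())
      return false;
    value = it->second;
    return true;
  }

  // Returns true if `function` is listed under the named per-function flag.
  bool functionHasFlag(const std::string &flag,
                       const std::string &function) const {
    auto it = functionFlags.find(flag);
    return it != functionFlags.end() && it->second.count(function) != 0;
  }
};
```

Both lookups are average-case constant time, which is one way to meet goal (2); adding a new flag only adds a new string key, in line with goal (3).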

The second part of the solution (using named metadata) could be replaced by
function attributes. However, I see a couple of problems with that. First, the
number of flags could be quite large. We would soon run out of space in the
Attributes structure, not to mention that printing the function becomes
unwieldy. Second, it's not very extensible. Adding a new flag requires
changing the IR. While it's relatively easy to change the IR in this case,
it is less desirable to me than simply adding a note in LangRef.html about the
flag and its semantics.

Code Generation
==============
The part of the compiler most affected by this change will be the back-end. (The
front-end already has mechanisms in place to handle merging metadata.) Several
different flags are currently specified as function-level attributes, which is
fine, and those need not change. But other flags, which are specified in an ad
hoc manner, will need to change to be queried for by the back-end. This is good
as it will consolidate these flags into one interface, but it will take some
work to complete.

Conclusion
=========
LTO definitely needs this, or an equivalent, solution to the current problem.
Without it, we cannot claim that LTO is "ready for prime-time."

No part of this proposal is set in stone, and I'm open to modifications and
other ideas. (Dons asbestos suit.) ;-)

Share and enjoy!
-bw

Rafael Espíndola

2012-Apr-30 00:39 UTC

[LLVMdev] [cfe-dev] [RFC] Encoding Compile Flags into the IR

On 29 April 2012 18:44, Bill Wendling <wendling at apple.com> wrote:
> Hi,
>
> Link-Time Optimization has a problem. We need to preserve some of the flags
> with which the modules were compiled so that the semantics of the resulting
> program are correct. For example, a module compiled with `-msoft-float'
> should use library calls for floating point. And that's only the tip of the
> proverbial iceberg.
>
> Goals
> ====
>
> My goals for whichever solution we come up with are to be:
>
> 1) Flexible enough to differentiate between flags which affect a module as
> a whole and those which affect individual functions.
> 2) Quick to query for specific flags.
> 3) Easily extensible, preferably without changing the IR for each new flag.
>
> Proposed Solution
> ================
>
> My solution to this is to use a mixture of module-level flags and named
> metadata. It gives us the flexibility asked for in (1), they are relatively
> quick to query (after being read in, the module flags could be placed into an
> efficient data structure), and can be extended by updating the LangRef.html doc.
>
> - Module-level flags would be used for those options which affect the whole
> module and which prevent two modules that do not have that flag set from being
> merged together. For example, `-msoft-float' changes the calling convention
> in the output file. Therefore, it's only useful if the whole program is
> compiled with it. It would be a module-level IR flag:
>
>        !0 = metadata !{ i32 1, metadata !"-msoft-float", i1 true }
>        !llvm.module.flags = !{ !0 }

So the objective here is to diagnose cases where the program would
already be broken even without LTO, correct? If so, I think this is going
in the right direction; I am just not sure that a 1:1 mapping with
command line options is the best solution. These are basic "ABI
options".

> - Named metadata would be used for those options which affect code
> generation for the functions, but which doesn't prevent two modules from
> being merged together. For example, `-fomit-frame-pointer' applies to
> individual functions, but it doesn't prevent a module compiled with
> `-fno-omit-frame-pointer' from being merged with one compiled with
> `-fomit-frame-pointer'. We would use a named metadata flag:
>
>        define void @foo () { ... }
>        define double @bar(i32 %a) { ... }
>
>        ; A list of the functions affected by `-fno-omit-frame-pointer'
>        ; within the Module.
>        !0 = metadata !{ void ()* @foo, double (i32)* @bar }
>        !fno.omit.frame.pointer = !{ !0 }
>
> And so on.

This part I am not so sure about. I fixed a similar problem for unwind
tables by adding an attribute. It could be safely done with a metadata
with the opposite meaning (nouwtable). The things I am uncomfortable
with in this part of the proposal are:

* Why not use metadata attached directly to the functions? They are
the closest thing to an easy to add attribute.

* The recent discussion about metadata points out that it is not
really safe to add information that only part of the compiler reasons
about. Metadata adds the nice property that it is safe to drop, but
that is it. Passes have to know about it, it should be documented in
the language ref and the verifier should check it. I am afraid that
this part of the proposal would again create a feeling that we have a
magic bullet for passing semantic info from the FE to some passes.

* As you mention, this is probably the tip of the iceberg. Maybe we
should explore it a bit more with the tools we have before declaring
them insufficient. Duncan is working on fp precision, you can probably
add a no_frame_pointer metadata to functions and from there we will
have a better idea of how things are going.

Cheers,
Rafael

Bill Wendling

2012-Apr-30 01:25 UTC

[LLVMdev] [cfe-dev] [RFC] Encoding Compile Flags into the IR

On Apr 29, 2012, at 5:39 PM, Rafael Espíndola wrote:
> On 29 April 2012 18:44, Bill Wendling <wendling at apple.com> wrote:
>> Hi,
>> 
>> Link-Time Optimization has a problem. We need to preserve some of the
>> flags with which the modules were compiled so that the semantics of the
>> resulting program are correct. For example, a module compiled with
>> `-msoft-float' should use library calls for floating point. And that's
>> only the tip of the proverbial iceberg.
>>
>> Goals
>> ====
>>
>> My goals for whichever solution we come up with are to be:
>>
>> 1) Flexible enough to differentiate between flags which affect a module
>> as a whole and those which affect individual functions.
>> 2) Quick to query for specific flags.
>> 3) Easily extensible, preferably without changing the IR for each new
>> flag.
>>
>> Proposed Solution
>> ================
>>
>> My solution to this is to use a mixture of module-level flags and named
>> metadata. It gives us the flexibility asked for in (1), they are relatively
>> quick to query (after being read in, the module flags could be placed into an
>> efficient data structure), and can be extended by updating the LangRef.html doc.
>>
>> - Module-level flags would be used for those options which affect the
>> whole module and which prevent two modules that do not have that flag set from
>> being merged together. For example, `-msoft-float' changes the calling
>> convention in the output file. Therefore, it's only useful if the whole
>> program is compiled with it. It would be a module-level IR flag:
>>
>>        !0 = metadata !{ i32 1, metadata !"-msoft-float", i1 true }
>>        !llvm.module.flags = !{ !0 }
> 
> So the objective in here is to diagnose cases where the program would
> already be broken even without LTO, correct? If so I think this going
> on the right direction, I am just not sure if a 1:1 mapping with
> command line options is the best solution. These are basic "abi
> options".

Diagnosis is only one use of this proposal. The other, more important, use is to
generate the correct code.

>> - Named metadata would be used for those options which affect code
>> generation for the functions, but which doesn't prevent two modules from
>> being merged together. For example, `-fomit-frame-pointer' applies to
>> individual functions, but it doesn't prevent a module compiled with
>> `-fno-omit-frame-pointer' from being merged with one compiled with
>> `-fomit-frame-pointer'. We would use a named metadata flag:
>>
>>        define void @foo () { ... }
>>        define double @bar(i32 %a) { ... }
>>
>>        ; A list of the functions affected by `-fno-omit-frame-pointer'
>>        ; within the Module.
>>        !0 = metadata !{ void ()* @foo, double (i32)* @bar }
>>        !fno.omit.frame.pointer = !{ !0 }
>>
>> And so on.
> 
> This part I am not so sure about. I fixed a similar problem for unwind
> tables by adding an attribute. It could be safely done with a metadata
> with the oposite meaning (nouwtable). The things I am uncomfortable
> with this part of the proposal are:
> 
> * Why not use metadata attached directly to the functions? They are
> the closest thing to an easy to add attribute.

Possible, but the problem with metadata is that it should be possible to remove
them from the object and not affect the semantics of the program.

> * The recent discussion about metadata points out that it is not
> really safe to add information that only part of the compiler reasons
> about. Metadata adds the nice property that it is safe to drop, but
> that is it. Passes have to know about it, it should be documented in
> the language ref and the verifier should check it. I am afraid that
> this part of the proposal would again create a feeling that we have a
> magic bullet for passing semantic info from the FE to some passes.

I'm not sure I understand your meaning here. The point of making this named
metadata is that it cannot be stripped from the module via normal methods. And
it's inevitable that we will need for passes to know about the metadata and
modify their behavior accordingly. That's the whole point, of course. :)
(They will, at least, be able to query the Module object for the information
they care about. The Module is the one which knows about the metadata.)

> * As you mention, this is probably the tip of the iceberg. Maybe we
> should explore it a bit more with the tools we have before declaring
> them insufficient. Duncan is working on fp precision, you can probably
> add a no_frame_pointer metadata to functions and from there we will
> have a better idea of how things are going.

I'd rather not start coding before we can agree on a concrete
implementation.

-bw

Renato Golin

2012-Apr-30 08:03 UTC

[LLVMdev] [RFC] Encoding Compile Flags into the IR

On 29 April 2012 23:44, Bill Wendling <wendling at apple.com> wrote:
> Hi,
>
> Link-Time Optimization has a problem. We need to preserve some of the flags
> with which the modules were compiled so that the semantics of the resulting
> program are correct. For example, a module compiled with `-msoft-float'
> should use library calls for floating point. And that's only the tip of the
> proverbial iceberg.

Hi Bill,

While it's true that knowing compiler flags will help you with linking
problems (including optimisations), I don't think they're 1:1 with
link issues, nor do I think storing all compilation options on all
modules every time is a fair price to pay for something that specific.

You have a goal to correct link-time optimisations, or as we discussed
earlier in the fp-math thread, even code generation could be broken
without the knowledge of the user's intent. That can be accomplished
now by putting "-msoft-float" as a global metadata, yes, but does that
fit a general solution for the general problem? I literally don't
know.

What you need to do, if your intent is to create a long-lasting framework
and not just a quick fix for LTO, is to analyse the biggest problems
and the information you need. If you have problems in multiple domains
(I'm guessing fp is not the only one), and could get information from
multiple sources (again, guessing compile options is not the only
one), then your solution is lacking.

I'm guessing linker scripts could have a lot to say about link-time
issues, as well as environment, ABI, chipset, ISA and so on. If you
put all compiler flags in metadata now, we'll end up putting all
options of all sources in global metadata, and well, that's far from
desirable.

I propose a more general scheme of global metadata, similar to yours
(one global for each big problem, multiple options inside for each
user intent), but generated from cherry-picked sources and put into
specific global metadata baskets (duplication could occur, if the
semantics is different). So each further step reads its own basket
(LTO reads @llvm.metadata.lto {...}) and so on. Of course LTO could
read other baskets, but it'll have to be for a precise reason, with a
precise meaning.

While merging modules (inlining included) with different metadata, you
have to have a specific well defined merge rule, with warnings and
errors in case they mismatch. We were discussing the merge semantics
for fp models earlier, that kind of analysis should happen for every
new flag you put in.
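
To make the merge-rule idea concrete, here is a minimal C++ sketch under stated assumptions. `OnMismatch`, `Flag`, `MergeResult` and `mergeFlags` are hypothetical names for illustration only, not anything in LLVM. It assumes each flag carries its own mismatch behaviour (error or warning), and that a flag present in only one module is simply copied over; a stricter rule could reject that case instead.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical per-flag behaviour when two modules disagree.
enum class OnMismatch { Error, Warning };

struct Flag {
  OnMismatch behavior;
  bool value;
};

using FlagMap = std::map<std::string, Flag>;

struct MergeResult {
  bool ok = true;                     // false if any mismatch is an error
  std::vector<std::string> warnings;  // names of flags that only warn
  std::vector<std::string> errors;    // names of flags that block the merge
  FlagMap merged;                     // dst flags plus flags only in src
};

// Merges the flags of `src` into a copy of `dst`, recording every mismatch.
MergeResult mergeFlags(const FlagMap &dst, const FlagMap &src) {
  MergeResult r;
  r.merged = dst;
  for (const auto &entry : src) {
    auto it = r.merged.find(entry.first);
    if (it == r.merged.end()) {
      // Assumption: a flag seen in only one module is adopted as-is.
      r.merged.insert(entry);
      continue;
    }
    if (it->second.value == entry.second.value)
      continue;  // both modules agree
    if (it->second.behavior == OnMismatch::Error ||
        entry.second.behavior == OnMismatch::Error) {
      r.ok = false;
      r.errors.push_back(entry.first);
    } else {
      r.warnings.push_back(entry.first);
    }
  }
  return r;
}
```

With something like this, a disagreement on `-msoft-float` marked as `Error` would stop the merge, while a disagreement on a flag marked `Warning` would only be reported.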

Though you have to take my proposal with a pinch of salt, because
that's remarkably similar to ARM's build attributes, and I'm not sure
that's the best idea either. There is probably a smarter way of doing
this, I just didn't think hard enough to find it... ;)

But either way, you will need some sort of guidelines on how passes
should treat metadata with stronger guarantees than today, or your LTO
will still not see the info it needs...

-- 
cheers,
--renato

http://systemcall.org/

David Chisnall

2012-Apr-30 09:41 UTC

[LLVMdev] [cfe-dev] [RFC] Encoding Compile Flags into the IR

On 29 Apr 2012, at 23:44, Bill Wendling wrote:
> - Module-level flags would be used for those options which affect the whole
> module and which prevent two modules that do not have that flag set from being
> merged together. For example, `-msoft-float' changes the calling convention
> in the output file. Therefore, it's only useful if the whole program is
> compiled with it. It would be a module-level IR flag:

It seems that the correct solution for this would be to make the softfloat and
hardfloat calling conventions into... calling conventions.  This would require
sinking some logic for defining calling conventions down into LLVM, rather than
requiring every single front end to duplicate the same logic, but reduced code
duplication might just be a price worth paying for making writing front ends and
optimisation passes easier...

David

Renato Golin

2012-Apr-30 09:50 UTC

[LLVMdev] [cfe-dev] [RFC] Encoding Compile Flags into the IR

On 30 April 2012 10:41, David Chisnall <csdavec at swan.ac.uk> wrote:
> It seems that the correct solution for this would be to make the softfloat
> and hardfloat calling conventions into... calling conventions.  This would
> require sinking some logic for defining calling conventions down into LLVM,
> rather than requiring every single front end to duplicate the same logic, but
> reduced code duplication might just be a price worth paying for making writing
> front ends and optimisation passes easier...

There is already a function attribute for soft/hard float (on ARM it is
AAPCS_VFP), which is generally (or should be) produced when the
compiler specifies hard-float.


-- 
cheers,
--renato

http://systemcall.org/

dag at cray.com

2012-Apr-30 20:00 UTC

[LLVMdev] [RFC] Encoding Compile Flags into the IR

Bill Wendling <wendling at apple.com> writes:
> Link-Time Optimization has a problem. We need to preserve some of the
> flags with which the modules were compiled so that the semantics of
> the resulting program are correct. For example, a module compiled with
> `-msoft-float' should use library calls for floating point. And that's
> only the tip of the proverbial iceberg.

This is an important missing feature.

> - Named metadata would be used for those options which affect code
> generation for the functions, but which doesn't prevent two modules
> from being merged together. For example, `-fomit-frame-pointer'
> applies to individual functions, but it doesn't prevent a module
> compiled with `-fno-omit-frame-pointer' from being merged with one
> compiled with `-fomit-frame-pointer'. We would use a named metadata
> flag:

Doesn't this violate the "no semantics" requirement of metadata?  What
happens if the metadata gets dropped?

                           -Dave

Bill Wendling

2012-Apr-30 23:10 UTC

[LLVMdev] [RFC] Encoding Compile Flags into the IR

On Apr 30, 2012, at 1:03 AM, Renato Golin wrote:
> On 29 April 2012 23:44, Bill Wendling <wendling at apple.com> wrote:
>> Hi,
>> 
>> Link-Time Optimization has a problem. We need to preserve some of the
>> flags with which the modules were compiled so that the semantics of the
>> resulting program are correct. For example, a module compiled with
>> `-msoft-float' should use library calls for floating point. And that's
>> only the tip of the proverbial iceberg.
> 
> Hi Bill,
> 
> While it's true that knowing compiler flags will help you with linking
> problems (including optimisations), I don't think they're 1:1 with
> link issues, nor I think storing all compilation options on all
> modules every time is a fair price to pay for something that specific.
> 
> You have a goal to correct link-time optimisations, or as we discussed
> earlier in the fp-math thread, even code generation could be broken
> without the knowledge of the user's intent. That can be accomplished
> now by putting "-msoft-float" as a global metadata, yes, but does that
> fit a general solution for the general problem? I literally don't
> know.

I'm not familiar with the fp-math thread. Could you summarize it for me?

> What you need to do, if your intent to create a long-lasting framework
> - not just a quick fix for the LTO, is to analyse the biggest problems
> and the information you need. If you have problems in multiple domains
> (I'm guessing fp is not the only one), and could get information from
> multiple sources (again, guessing compile options is not the only
> one), then your solution is lacking.

Nothing about the proposal is meant to be a quick fix. The process of adding new
flags that we care about would be a formal process, just not one that modifies
the IR every time. (Yes, I'm fixated on that one aspect of it. I find it too
heavyweight for the problem at hand.)

> I'm guessing linker scripts could have a lot to say about link-time
> issues, as well as environment, ABI, chipset, ISA and so on. If you
> put all compiler flags in metadata now, we'll end up putting all
> options of all sources in global metadata, and well, that's far from
> desirable.

The information is important for correct code generation and linking. I need
alternatives to putting it in metadata. :)

> I propose a more general scheme of global metadata, similar to yours
> (one global for each big problem, multiple options inside for each
> user intent), but generated from cherry-picked sources and put into
> specific global metadata baskets (duplication could occur, if the
> semantics is different). So each further step reads its own basket
> (LTO reads @llvm.metadata.lto {...}) and so on. Of course LTO could
> read other baskets, but it'll have to be for a precise reason, with a
> precise meaning.
> 
> While merging modules (inlining included) with different metadata, you
> have to have a specific well defined merge rule, with warnings and
> errors in case they mismatch. We were discussing the merge semantics
> for fp models earlier, that kind of analysis should happen for every
> new flag you put in.

Yup! The module-level flags have these abilities. :-)

> Though you have to take my proposal with a pinch of salt, because
> that's remarkably similar to ARM's build attributes, and I'm not sure
> that's the best idea either. There is probably a smarter way of doing
> this, I just didn't think hard enough to find it... ;)
> 
> But either way, you will need some sort of guidelines on how passes
> should treat metadata with stronger guarantees than today, or your LTO
> will still not see the info it needs...
> 
Could you give an example of what yours would look like in a sample Module?

-bw

Bill Wendling

2012-Apr-30 23:12 UTC

[LLVMdev] [RFC] Encoding Compile Flags into the IR

On Apr 30, 2012, at 1:00 PM, dag at cray.com wrote:
> Bill Wendling <wendling at apple.com> writes:
> 
>> Link-Time Optimization has a problem. We need to preserve some of the
>> flags with which the modules were compiled so that the semantics of
>> the resulting program are correct. For example, a module compiled with
>> `-msoft-float' should use library calls for floating point. And that's
>> only the tip of the proverbial iceberg.
> 
> This is an important missing feature.
> 
>> - Named metadata would be used for those options which affect code
>> generation for the functions, but which doesn't prevent two modules
>> from being merged together. For example, `-fomit-frame-pointer'
>> applies to individual functions, but it doesn't prevent a module
>> compiled with `-fno-omit-frame-pointer' from being merged with one
>> compiled with `-fomit-frame-pointer'. We would use a named metadata
>> flag:
> 
> Doesn't this violate the "no semantics" requirement of metadata?  What
> happens if the metadata gets dropped?

Named metadata cannot be stripped by normal methods.

-bw
