thr3ads.net - llvm dev - [LLVMdev] [RFC] Embedding command line options in bitcode (PR21471) [Dec 2014]

If this information is useful, please help other people find it:
Share via:

Bob Wilson

2014-Nov-20 02:24 UTC

[LLVMdev] [RFC] Embedding command line options in bitcode (PR21471)

> On Nov 19, 2014, at 4:52 PM, Eric Christopher <echristo at gmail.com>
wrote:
> 
> 
> 
> On Wed Nov 19 2014 at 4:39:42 PM Akira Hatanaka <ahatanak at gmail.com
<mailto:ahatanak at gmail.com>> wrote:
> On Wed, Nov 19, 2014 at 3:28 PM, Eric Christopher <echristo at gmail.com
<mailto:echristo at gmail.com>> wrote:
> So my general concern here is that lots of command line options that
don't need to be or shouldn't be IR level constructs become oddly named
string options. It's bad enough that we're doing that with the target
cpu and target feature string let alone the rest of it.
> 
> If we're talking about things that end up changing the subtarget used
we definitely need to make this part of the lookup key - at which point
we're going to want a unified interface to all of them. I haven't
applied your patch to see what the IR changes look like here.
> 
> 
> With the patches applied, the attributes clang embeds in the IR look like
this:
> 
> $ clang -cc1 ... -mllvm -arm-long-calls -mllvm -arm-reserve-r9
-menable-no-nans
> attributes #0 = { nounwind readonly ssp "NoNaNsFPMath"
"arm-long-calls"="1"
"arm-reserve-r9"="1"
"less-precise-fpmad"="false"
"no-frame-pointer-elim"="true"
"no-frame-pointer-elim-non-leaf"
"no-infs-fp-math"="false"
"stack-protector-buffer-size"="8"
"target-cpu"="arm7tdmi"
"target-features"="-neon,-crypto,+apcs"
"unsafe-fp-math"="false"
"use-soft-float"="true" }
> 
> I'm guessing you want the front-end to convert command line options
like "arm-long-calls" or "arm-reserve-r9" to something that
doesn't look as odd?
> 
> 
> Right, that was part of the idea, the rest is that we define things that
actually make sense as front end command line options and use those to
communicate via IR constructs or api calls I.e. reserved registers don't
need to be a specific back end option, but rather can be passed along when
creating a target machine, etc.
> 
> -eric
No, reserved registers (I assume you’re referring to “arm-reserve-r9” in Akira’s
example) don’t necessarily need to be backend options. But, those kinds of
options do need to be recorded in the IR and the target machine needs to reflect
the relevant IR settings. Otherwise, we’ll still be left with a broken LTO
model. E.G., I build one LTO object with “arm-reserve-r9” and another without
it, but the LTO compilation still needs to respect those options for the code in
each object. The details of how a particular option should be handled will vary
depending on the option, and we need to discuss each one separately.

It’s pretty clear, though, that there is a class of backend options that
translate nicely to function attributes. This RFC is about the mechanism that we
use to handle those (without specifying which specific options will be handled
that way).
>  
> From looking at the patch it seems that the command line option for NaNs
should be a general attribute on the function rather than a TargetMachine option
anyhow, so it seems that should be migrated down. Whether or not it is exposed
as an llvm attribute or a string based key value pair is an interesting
discussion. Also, would we want to, instead, expose the backend options as clang
IR level options? Seems to make sense from that perspective and then they can
add the appropriate attributes to the functions themselves. (-mlongcalls for the
long calls option for example).
> 
> Thanks!
> 
> -eric
Again, I think we can evaluate specific options on a case-by-case basis. For
some of them, it might make sense to expose them as clang options. Like you, I
had previously assumed we would teach the front-end about all the backend
options. I was a little surprised by the approach Akira suggested here, but
after giving it some thought, I think it’s a great idea. There are a LOT of
target-specific backend options. Some of them do not need to be IR, but many do.
Teaching clang about all of them would be a lot of work, and I don’t see any
clear benefit from that. This approach lets clang remain relatively independent
of the backend options. We still need to think carefully about which backend
options use this mechanism, of course.

Are there other reasons why you think we should add all those options to clang?
I just don’t see much to be gained by that.

Do you have objections to the use of string-based key/value pairs? I believe the
intention all along, going back to Bill’s work on this a few years ago, was that
we would add new attributes for target-independent things and use strings for
the target-specific options. That seems like a reasonable starting point to me.
> 
> On Tue Nov 18 2014 at 12:26:49 PM Akira Hatanaka <ahatanak at gmail.com
<mailto:ahatanak at gmail.com>> wrote:
> Updated patch is attached. Note this is just a work-in-progress patch and I
plan to address the feedback comments later if this patch is in the right
direction.
> 
> This is how the command line options are parsed and used by the backend
passes:
> 
> 1. Tools such as clang and llc call cl::ParseCommandLineOptions. Any of the
options that should be written to the bitcode as function or module attributes
(these options are declared as CodeGenOpt, which is a subclass of cl::opt) are
saved to Function or ModuleIntAttrs in CodeGenOptions.cpp.
> 2. setFunctionAttribute is called to move the function and module options
saved in step 1 to the bitcode. Also, options stored in TargetOptions are moved
to the bitcode too. Pre-existing attributes in the bitcode are overwritten by
the options in the command line.
> 3. The backend passes and subtarget classes call getFunctionAttribute to
read the attributes stored to the bitcode in step 2. Function::hasFnAttribute
can be called directly (for example, NoNaNsFPMath in the patch), if the default
value is known to be "false".
> 
> I also made the following changes in the patch:
> 
> 1. In the constructor of MachineFunction, call the version of
getSubtargetImpl that takes a Function parameter, so that it gets the Subtarget
specific to the Function being compiled.
> 2. Change ARMTargetMachine::SubtargetMap to be a DenseMap<Function*,
unique_ptr<ARMSubtarget>>. This is just a temporary change to ease
debugging and should be reverted to a StringMap or changed to another type of
map later.
> 
> 
> On Mon, Nov 17, 2014 at 11:40 AM, Eric Christopher <echristo at
gmail.com <mailto:echristo at gmail.com>> wrote:
> 
> 
> On Mon Nov 17 2014 at 9:56:40 AM Bob Wilson <bob.wilson at apple.com
<mailto:bob.wilson at apple.com>> wrote:
> 
> > On Nov 14, 2014, at 2:44 PM, Duncan P. N. Exon Smith <dexonsmith at
apple.com <mailto:dexonsmith at apple.com>> wrote:
> >
> > +chrisb
> >
> >> On 2014-Nov-13, at 16:33, Akira Hatanaka <ahatanak at gmail.com
<mailto:ahatanak at gmail.com>> wrote:
> >>
> >> I'm working on fixing PR21471, which is about embedding
codegen command line options into the bitcode as function or module-level
attributes so that they don't get ignored when doing LTO.
> >>
> >> http://llvm.org/bugs/show_bug.cgi?id=21471
<http://llvm.org/bugs/show_bug.cgi?id=21471>
> >>
> >> I have an initial patch (attached to this email) which enables
clang/llvm to recognize one command line option, write it to the IR, and read it
out in a backend pass. I'm looking to get feedback from the community on
whether I'm headed in the right direction or whether there are alternate
ideas before I go all the way on fixing the PR. Specifically, I'd like to
know the answers to the following questions:
> >>
> >> 1. How do we make sure we continue to be able to use the command
line options we've been using for llc and other tools?
> >> 2. How to handle cases where two functions in a module have
different sets of command line options?
> >> 3. Where should the command line options or module/function
attributes be stored once they are read out from the IR?
> >>
> >> The short description of the approach I took in my patch is that
command line options that are important to codegen are collected by
cl::ParseCommandLineOptions, written to the bitcode as function or module
attributes, and read out directly by the optimization passes that need them.
cl::opt options are replaced with CodeGenOpt options (subclass of cl::opt),
which are needed only to parse the command line and provide the default value
when the corresponding options are not in the bitcode.
> >
> > I like this approach, since it means the frontend doesn't have to
understand
> > options in order to pass them on to the backend.
> 
> I agree. It’s not the approach that was I was expecting, but after reading
the patches, I like it.
> 
> If we can get consensus on the approach, I suggest that we stage the work
to first adopt Akira’s patches and then we can incrementally convert various
options over to use it. There may be complications for some options, so I’d like
to see separate patches for each one (possibly with a few similar options
combined in one patch).
> 
> 
> I'm not sold quite yet :)
> 
> Akira: Can you show an example of how you expect this to work in the IR and
how the backend is going to process?
> 
> Thanks.
> 
> -eric
>  
> >
> > The static variables should be straightforward to migrate to an
LLVMContext
> > once ParseCommandLineOptions stores things there instead of in
globals.
> >
> >> diff --git a/lib/CodeGen/CodeGenOption.cpp
b/lib/CodeGen/CodeGenOption.cpp
> >> new file mode 100644
> >> index 0000000..2d74c2f
> >> --- /dev/null
> >> +++ b/lib/CodeGen/CodeGenOption.cpp
> >> @@ -0,0 +1,59 @@
> >> +//===- CodeGen/CodeGenOptions.cpp - Code-gen option.          
--*- C++ -*-===//
> >> +//
> >> +//                     The LLVM Compiler Infrastructure
> >> +//
> >> +// This file is distributed under the University of Illinois Open
Source
> >> +// License. See LICENSE.TXT for details.
> >> +//
> >>
+//===----------------------------------------------------------------------===//
> >> +//
> >>
+//===----------------------------------------------------------------------===//
> >> +
> >> +#include "llvm/CodeGen/CodeGenOption.h"
> >> +#include "llvm/IR/Attributes.h"
> >> +#include "llvm/IR/Module.h"
> >> +
> >> +using namespace llvm;
> >> +
> >> +static std::map<std::string, bool> FunctionBoolAttrs;
> >> +static std::map<std::string, bool> ModuleBoolAttrs;
> >> +
> >
> > @Chris, should these be using ManagedStatic?
> >
> > _______________________________________________
> > LLVM Developers mailing list
> > LLVMdev at cs.uiuc.edu <mailto:LLVMdev at cs.uiuc.edu>        
http://llvm.cs.uiuc.edu <http://llvm.cs.uiuc.edu/>
> > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
<http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev>
> 
> 
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu <mailto:LLVMdev at cs.uiuc.edu>        
http://llvm.cs.uiuc.edu <http://llvm.cs.uiuc.edu/>
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
<http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev>
> 
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu <mailto:LLVMdev at cs.uiuc.edu>        
http://llvm.cs.uiuc.edu <http://llvm.cs.uiuc.edu/>
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
<http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20141119/f920794a/attachment.html>

Eric Christopher

2014-Dec-01 18:53 UTC

head link

[LLVMdev] [RFC] Embedding command line options in bitcode (PR21471)

Hi Bob,

I've been giving this some thought, comments inline:

On Wed Nov 19 2014 at 6:24:18 PM Bob Wilson <bob.wilson at apple.com>
wrote:
> On Nov 19, 2014, at 4:52 PM, Eric Christopher <echristo at gmail.com>
wrote:
>
>
>
> On Wed Nov 19 2014 at 4:39:42 PM Akira Hatanaka <ahatanak at
gmail.com>
> wrote:
>
>> On Wed, Nov 19, 2014 at 3:28 PM, Eric Christopher <echristo at
gmail.com>
>> wrote:
>>
>>> So my general concern here is that lots of command line options
that
>>> don't need to be or shouldn't be IR level constructs become
oddly named
>>> string options. It's bad enough that we're doing that with
the target cpu
>>> and target feature string let alone the rest of it.
>>>
>>> If we're talking about things that end up changing the
subtarget used we
>>> definitely need to make this part of the lookup key - at which
point we're
>>> going to want a unified interface to all of them. I haven't
applied your
>>> patch to see what the IR changes look like here.
>>>
>>>
>> With the patches applied, the attributes clang embeds in the IR look
like
>> this:
>>
>> $ clang -cc1 ... -mllvm -arm-long-calls -mllvm -arm-reserve-r9
>> -menable-no-nans
>>
>> attributes #0 = { nounwind readonly ssp "NoNaNsFPMath"
>> "arm-long-calls"="1"
"arm-reserve-r9"="1"
"less-precise-fpmad"="false"
>> "no-frame-pointer-elim"="true"
"no-frame-pointer-elim-non-leaf"
>> "no-infs-fp-math"="false"
"stack-protector-buffer-size"="8"
>> "target-cpu"="arm7tdmi"
"target-features"="-neon,-crypto,+apcs"
>> "unsafe-fp-math"="false"
"use-soft-float"="true" }
>>
>> I'm guessing you want the front-end to convert command line options
like
>> "arm-long-calls" or "arm-reserve-r9" to something
that doesn't look as odd?
>>
>
> Right, that was part of the idea, the rest is that we define things that
> actually make sense as front end command line options and use those to
> communicate via IR constructs or api calls I.e. reserved registers
don't
> need to be a specific back end option, but rather can be passed along when
> creating a target machine, etc.
>
> -eric
>
>
> No, reserved registers (I assume you’re referring to “arm-reserve-r9” in
> Akira’s example) don’t necessarily need to be backend options. But, those
> kinds of options do need to be recorded in the IR and the target machine
> needs to reflect the relevant IR settings. Otherwise, we’ll still be left
> with a broken LTO model. E.G., I build one LTO object with “arm-reserve-r9”
> and another without it, but the LTO compilation still needs to respect
> those options for the code in each object. The details of how a particular
> option should be handled will vary depending on the option, and we need to
> discuss each one separately.
>
>ABI breaking options should probably be considered part of
TargetMachine/Subtarget creation and handled accordingly (if it's subtarget
creation it can be on the function as a feature string attribute, if it
affects global code generation it needs to be at TargetMachine creation
time).

> It’s pretty clear, though, that there is a class of backend options that
> translate nicely to function attributes. This RFC is about the mechanism
> that we use to handle those (without specifying which specific options will
> be handled that way).
>
>Fair enough. For function attributes I can see a desire to have (as Chris
posted in an email from someone at synopsys) a generic target specific
dictionary mapping on functions for backend specific uses. I just haven't
seen anything in the thread that would be applicable for this rather than
an actual function attribute. That said, I would like some sort of
delineation for target function attributes as well. Thoughts on that?
Perhaps it could look like the string dictionary we've mentioned, maybe not.

>
>
>> From looking at the patch it seems that the command line option for
NaNs
>>> should be a general attribute on the function rather than a
TargetMachine
>>> option anyhow, so it seems that should be migrated down. Whether or
not it
>>> is exposed as an llvm attribute or a string based key value pair is
an
>>> interesting discussion. Also, would we want to, instead, expose the
backend
>>> options as clang IR level options? Seems to make sense from that
>>> perspective and then they can add the appropriate attributes to the
>>> functions themselves. (-mlongcalls for the long calls option for
example).
>>>
>>> Thanks!
>>>
>>> -eric
>>>
>>
> Again, I think we can evaluate specific options on a case-by-case basis.
> For some of them, it might make sense to expose them as clang options. Like
> you, I had previously assumed we would teach the front-end about all the
> backend options. I was a little surprised by the approach Akira suggested
> here, but after giving it some thought, I think it’s a great idea. There
> are a LOT of target-specific backend options. Some of them do not need to
> be IR, but many do. Teaching clang about all of them would be a lot of
> work, and I don’t see any clear benefit from that. This approach lets clang
> remain relatively independent of the backend options. We still need to
> think carefully about which backend options use this mechanism, of course.
>
>I'm unconvinced here...

> Are there other reasons why you think we should add all those options to
> clang? I just don’t see much to be gained by that.
>
>.... I think that anything you'd like to expose should probably be a front
end option. If nothing else I don't see how you plan on annotating the IR.
Basically you're constructing a parallel option hierarchy and that seems...
redundant. I agree that it's nice for tools to be able to override things
in the backend - for testing if nothing else. We do this a lot by checking
options and values in the code. A way of streamlining this would be nice -
but I don't think that's the goal of these patches no?

> Do you have objections to the use of string-based key/value pairs? I
> believe the intention all along, going back to Bill’s work on this a few
> years ago, was that we would add new attributes for target-independent
> things and use strings for the target-specific options. That seems like a
> reasonable starting point to me.
>
>Only some as I mentioned above - I haven't seen anything that needs to be
target specific in this way, but I'm open to the idea :)

As a note, this is all for function level things. We need a solution for
non-function/module level attributes and I think that's going to look like
API calls into the backend for TargetMachine creation - there may be a list
of things that would turn into module level attributes, but that is a
direction fraught with danger and I'd like to avoid that as much as
possible.

(As a note, the particular patch that Akira has is unacceptable because of
the changes to the subtarget lookup, but that's a separate issue :)

-eric

>
>>> On Tue Nov 18 2014 at 12:26:49 PM Akira Hatanaka <ahatanak at
gmail.com>
>>> wrote:
>>>
>>>> Updated patch is attached. Note this is just a work-in-progress
patch
>>>> and I plan to address the feedback comments later if this patch
is in the
>>>> right direction.
>>>>
>>>> This is how the command line options are parsed and used by the
backend
>>>> passes:
>>>>
>>>> 1. Tools such as clang and llc call
cl::ParseCommandLineOptions. Any of
>>>> the options that should be written to the bitcode as function
or module
>>>> attributes (these options are declared as CodeGenOpt, which is
a subclass
>>>> of cl::opt) are saved to Function or ModuleIntAttrs in
CodeGenOptions.cpp.
>>>> 2. setFunctionAttribute is called to move the function and
module
>>>> options saved in step 1 to the bitcode. Also, options stored in
>>>> TargetOptions are moved to the bitcode too. Pre-existing
attributes in the
>>>> bitcode are overwritten by the options in the command line.
>>>> 3. The backend passes and subtarget classes call
getFunctionAttribute
>>>> to read the attributes stored to the bitcode in step 2.
>>>> Function::hasFnAttribute can be called directly (for example,
NoNaNsFPMath
>>>> in the patch), if the default value is known to be
"false".
>>>>
>>>> I also made the following changes in the patch:
>>>>
>>>> 1. In the constructor of MachineFunction, call the version of
>>>> getSubtargetImpl that takes a Function parameter, so that it
gets the
>>>> Subtarget specific to the Function being compiled.
>>>> 2. Change ARMTargetMachine::SubtargetMap to be a
DenseMap<Function*,
>>>> unique_ptr<ARMSubtarget>>. This is just a temporary
change to ease
>>>> debugging and should be reverted to a StringMap or changed to
another type
>>>> of map later.
>>>>
>>>>
>>>> On Mon, Nov 17, 2014 at 11:40 AM, Eric Christopher <echristo
at gmail.com>
>>>>  wrote:
>>>>
>>>>>
>>>>>
>>>>> On Mon Nov 17 2014 at 9:56:40 AM Bob Wilson <bob.wilson
at apple.com>
>>>>> wrote:
>>>>>
>>>>>>
>>>>>> > On Nov 14, 2014, at 2:44 PM, Duncan P. N. Exon
Smith <
>>>>>> dexonsmith at apple.com> wrote:
>>>>>> >
>>>>>> > +chrisb
>>>>>> >
>>>>>> >> On 2014-Nov-13, at 16:33, Akira Hatanaka
<ahatanak at gmail.com>
>>>>>> wrote:
>>>>>> >>
>>>>>> >> I'm working on fixing PR21471, which is
about embedding codegen
>>>>>> command line options into the bitcode as function or
module-level
>>>>>> attributes so that they don't get ignored when
doing LTO.
>>>>>> >>
>>>>>> >> http://llvm.org/bugs/show_bug.cgi?id=21471
>>>>>> >>
>>>>>> >> I have an initial patch (attached to this
email) which enables
>>>>>> clang/llvm to recognize one command line option, write
it to the IR, and
>>>>>> read it out in a backend pass. I'm looking to get
feedback from the
>>>>>> community on whether I'm headed in the right
direction or whether there are
>>>>>> alternate ideas before I go all the way on fixing the
PR. Specifically, I'd
>>>>>> like to know the answers to the following questions:
>>>>>> >>
>>>>>> >> 1. How do we make sure we continue to be able
to use the command
>>>>>> line options we've been using for llc and other
tools?
>>>>>> >> 2. How to handle cases where two functions in
a module have
>>>>>> different sets of command line options?
>>>>>> >> 3. Where should the command line options or
module/function
>>>>>> attributes be stored once they are read out from the
IR?
>>>>>> >>
>>>>>> >> The short description of the approach I took
in my patch is that
>>>>>> command line options that are important to codegen are
collected by
>>>>>> cl::ParseCommandLineOptions, written to the bitcode as
function or module
>>>>>> attributes, and read out directly by the optimization
passes that need
>>>>>> them. cl::opt options are replaced with CodeGenOpt
options (subclass of
>>>>>> cl::opt), which are needed only to parse the command
line and provide the
>>>>>> default value when the corresponding options are not in
the bitcode.
>>>>>> >
>>>>>> > I like this approach, since it means the frontend
doesn't have to
>>>>>> understand
>>>>>> > options in order to pass them on to the backend.
>>>>>>
>>>>>> I agree. It’s not the approach that was I was
expecting, but after
>>>>>> reading the patches, I like it.
>>>>>>
>>>>>> If we can get consensus on the approach, I suggest that
we stage the
>>>>>> work to first adopt Akira’s patches and then we can
incrementally convert
>>>>>> various options over to use it. There may be
complications for some
>>>>>> options, so I’d like to see separate patches for each
one (possibly with a
>>>>>> few similar options combined in one patch).
>>>>>>
>>>>>>
>>>>> I'm not sold quite yet :)
>>>>>
>>>>> Akira: Can you show an example of how you expect this to
work in the
>>>>> IR and how the backend is going to process?
>>>>>
>>>>> Thanks.
>>>>>
>>>>> -eric
>>>>>
>>>>>
>>>>>> >
>>>>>> > The static variables should be straightforward to
migrate to an
>>>>>> LLVMContext
>>>>>> > once ParseCommandLineOptions stores things there
instead of in
>>>>>> globals.
>>>>>> >
>>>>>> >> diff --git a/lib/CodeGen/CodeGenOption.cpp
>>>>>> b/lib/CodeGen/CodeGenOption.cpp
>>>>>> >> new file mode 100644
>>>>>> >> index 0000000..2d74c2f
>>>>>> >> --- /dev/null
>>>>>> >> +++ b/lib/CodeGen/CodeGenOption.cpp
>>>>>> >> @@ -0,0 +1,59 @@
>>>>>> >> +//===- CodeGen/CodeGenOptions.cpp - Code-gen
option.
>>>>>>  --*- C++ -*-===//
>>>>>> >> +//
>>>>>> >> +//                     The LLVM Compiler
Infrastructure
>>>>>> >> +//
>>>>>> >> +// This file is distributed under the
University of Illinois Open
>>>>>> Source
>>>>>> >> +// License. See LICENSE.TXT for details.
>>>>>> >> +//
>>>>>> >>
+//===------------------------------------------------------
>>>>>> ----------------===//
>>>>>> >> +//
>>>>>> >>
+//===------------------------------------------------------
>>>>>> ----------------===//
>>>>>> >> +
>>>>>> >> +#include
"llvm/CodeGen/CodeGenOption.h"
>>>>>> >> +#include "llvm/IR/Attributes.h"
>>>>>> >> +#include "llvm/IR/Module.h"
>>>>>> >> +
>>>>>> >> +using namespace llvm;
>>>>>> >> +
>>>>>> >> +static std::map<std::string, bool>
FunctionBoolAttrs;
>>>>>> >> +static std::map<std::string, bool>
ModuleBoolAttrs;
>>>>>> >> +
>>>>>> >
>>>>>> > @Chris, should these be using ManagedStatic?
>>>>>> >
>>>>>> > _______________________________________________
>>>>>> > LLVM Developers mailing list
>>>>>> > LLVMdev at cs.uiuc.edu        
http://llvm.cs.uiuc.edu
>>>>>> > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> LLVM Developers mailing list
>>>>>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> LLVM Developers mailing list
>>>>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>>
>>>>
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20141201/58e6b1cb/attachment.html>

Bob Wilson

2014-Dec-02 00:22 UTC

head link

[LLVMdev] [RFC] Embedding command line options in bitcode (PR21471)

Thanks for your feedback, Eric.

I still think we may be talking past each other a little bit, but rather than
delving further into the details right now, I’ve suggested to Akira that he look
into how we should handle other kinds of options. I’m hoping that as we look at
more of them, we will gain some insight into the approach that we want to take
here. This patch really only deals with the easy cases, and whether we use
Akira’s approach or not, I’m not especially worried about this kind of option.

Let’s come back to this in a little while after we’ve looked at what to do for
some other options…..
> On Dec 1, 2014, at 10:53 AM, Eric Christopher <echristo at gmail.com>
wrote:
> 
> Hi Bob,
> 
> I've been giving this some thought, comments inline:
> 
> On Wed Nov 19 2014 at 6:24:18 PM Bob Wilson <bob.wilson at apple.com
<mailto:bob.wilson at apple.com>> wrote:
>> On Nov 19, 2014, at 4:52 PM, Eric Christopher <echristo at gmail.com
<mailto:echristo at gmail.com>> wrote:
>> 
>> 
>> 
>> On Wed Nov 19 2014 at 4:39:42 PM Akira Hatanaka <ahatanak at
gmail.com <mailto:ahatanak at gmail.com>> wrote:
>> On Wed, Nov 19, 2014 at 3:28 PM, Eric Christopher <echristo at
gmail.com <mailto:echristo at gmail.com>> wrote:
>> So my general concern here is that lots of command line options that
don't need to be or shouldn't be IR level constructs become oddly named
string options. It's bad enough that we're doing that with the target
cpu and target feature string let alone the rest of it.
>> 
>> If we're talking about things that end up changing the subtarget
used we definitely need to make this part of the lookup key - at which point
we're going to want a unified interface to all of them. I haven't
applied your patch to see what the IR changes look like here.
>> 
>> 
>> With the patches applied, the attributes clang embeds in the IR look
like this:
>> 
>> $ clang -cc1 ... -mllvm -arm-long-calls -mllvm -arm-reserve-r9
-menable-no-nans
>> attributes #0 = { nounwind readonly ssp "NoNaNsFPMath"
"arm-long-calls"="1"
"arm-reserve-r9"="1"
"less-precise-fpmad"="false"
"no-frame-pointer-elim"="true"
"no-frame-pointer-elim-non-leaf"
"no-infs-fp-math"="false"
"stack-protector-buffer-size"="8"
"target-cpu"="arm7tdmi"
"target-features"="-neon,-crypto,+apcs"
"unsafe-fp-math"="false"
"use-soft-float"="true" }
>> 
>> I'm guessing you want the front-end to convert command line options
like "arm-long-calls" or "arm-reserve-r9" to something that
doesn't look as odd?
>> 
>> 
>> Right, that was part of the idea, the rest is that we define things
that actually make sense as front end command line options and use those to
communicate via IR constructs or api calls I.e. reserved registers don't
need to be a specific back end option, but rather can be passed along when
creating a target machine, etc.
>> 
>> -eric
> 
> No, reserved registers (I assume you’re referring to “arm-reserve-r9” in
Akira’s example) don’t necessarily need to be backend options. But, those kinds
of options do need to be recorded in the IR and the target machine needs to
reflect the relevant IR settings. Otherwise, we’ll still be left with a broken
LTO model. E.G., I build one LTO object with “arm-reserve-r9” and another
without it, but the LTO compilation still needs to respect those options for the
code in each object. The details of how a particular option should be handled
will vary depending on the option, and we need to discuss each one separately.
> 
> 
> ABI breaking options should probably be considered part of
TargetMachine/Subtarget creation and handled accordingly (if it's subtarget
creation it can be on the function as a feature string attribute, if it affects
global code generation it needs to be at TargetMachine creation time).
>  
> It’s pretty clear, though, that there is a class of backend options that
translate nicely to function attributes. This RFC is about the mechanism that we
use to handle those (without specifying which specific options will be handled
that way).
> 
> 
> Fair enough. For function attributes I can see a desire to have (as Chris
posted in an email from someone at synopsys) a generic target specific
dictionary mapping on functions for backend specific uses. I just haven't
seen anything in the thread that would be applicable for this rather than an
actual function attribute. That said, I would like some sort of delineation for
target function attributes as well. Thoughts on that? Perhaps it could look like
the string dictionary we've mentioned, maybe not.
>  
>>  
>> From looking at the patch it seems that the command line option for
NaNs should be a general attribute on the function rather than a TargetMachine
option anyhow, so it seems that should be migrated down. Whether or not it is
exposed as an llvm attribute or a string based key value pair is an interesting
discussion. Also, would we want to, instead, expose the backend options as clang
IR level options? Seems to make sense from that perspective and then they can
add the appropriate attributes to the functions themselves. (-mlongcalls for the
long calls option for example).
>> 
>> Thanks!
>> 
>> -eric
> 
> Again, I think we can evaluate specific options on a case-by-case basis.
For some of them, it might make sense to expose them as clang options. Like you,
I had previously assumed we would teach the front-end about all the backend
options. I was a little surprised by the approach Akira suggested here, but
after giving it some thought, I think it’s a great idea. There are a LOT of
target-specific backend options. Some of them do not need to be IR, but many do.
Teaching clang about all of them would be a lot of work, and I don’t see any
clear benefit from that. This approach lets clang remain relatively independent
of the backend options. We still need to think carefully about which backend
options use this mechanism, of course.
> 
> 
> I'm unconvinced here...
>  
> Are there other reasons why you think we should add all those options to
clang? I just don’t see much to be gained by that.
> 
> 
> .... I think that anything you'd like to expose should probably be a
front end option. If nothing else I don't see how you plan on annotating the
IR. Basically you're constructing a parallel option hierarchy and that
seems... redundant. I agree that it's nice for tools to be able to override
things in the backend - for testing if nothing else. We do this a lot by
checking options and values in the code. A way of streamlining this would be
nice - but I don't think that's the goal of these patches no?
>  
> Do you have objections to the use of string-based key/value pairs? I
believe the intention all along, going back to Bill’s work on this a few years
ago, was that we would add new attributes for target-independent things and use
strings for the target-specific options. That seems like a reasonable starting
point to me.
> 
> 
> Only some as I mentioned above - I haven't seen anything that needs to
be target specific in this way, but I'm open to the idea :)
> 
> As a note, this is all for function level things. We need a solution for
non-function/module level attributes and I think that's going to look like
API calls into the backend for TargetMachine creation - there may be a list of
things that would turn into module level attributes, but that is a direction
fraught with danger and I'd like to avoid that as much as possible.
> 
> (As a note, the particular patch that Akira has is unacceptable because of
the changes to the subtarget lookup, but that's a separate issue :)
> 
> -eric
>  
>> 
>> On Tue Nov 18 2014 at 12:26:49 PM Akira Hatanaka <ahatanak at
gmail.com <mailto:ahatanak at gmail.com>> wrote:
>> Updated patch is attached. Note this is just a work-in-progress patch
and I plan to address the feedback comments later if this patch is in the right
direction.
>> 
>> This is how the command line options are parsed and used by the backend
passes:
>> 
>> 1. Tools such as clang and llc call cl::ParseCommandLineOptions. Any of
the options that should be written to the bitcode as function or module
attributes (these options are declared as CodeGenOpt, which is a subclass of
cl::opt) are saved to Function or ModuleIntAttrs in CodeGenOptions.cpp.
>> 2. setFunctionAttribute is called to move the function and module
options saved in step 1 to the bitcode. Also, options stored in TargetOptions
are moved to the bitcode too. Pre-existing attributes in the bitcode are
overwritten by the options in the command line.
>> 3. The backend passes and subtarget classes call getFunctionAttribute
to read the attributes stored to the bitcode in step 2. Function::hasFnAttribute
can be called directly (for example, NoNaNsFPMath in the patch), if the default
value is known to be "false".
>> 
>> I also made the following changes in the patch:
>> 
>> 1. In the constructor of MachineFunction, call the version of
getSubtargetImpl that takes a Function parameter, so that it gets the Subtarget
specific to the Function being compiled.
>> 2. Change ARMTargetMachine::SubtargetMap to be a DenseMap<Function*,
unique_ptr<ARMSubtarget>>. This is just a temporary change to ease
debugging and should be reverted to a StringMap or changed to another type of
map later.
>> 
>> 
>> On Mon, Nov 17, 2014 at 11:40 AM, Eric Christopher <echristo at
gmail.com <mailto:echristo at gmail.com>> wrote:
>> 
>> 
>> On Mon Nov 17 2014 at 9:56:40 AM Bob Wilson <bob.wilson at apple.com
<mailto:bob.wilson at apple.com>> wrote:
>> 
>> > On Nov 14, 2014, at 2:44 PM, Duncan P. N. Exon Smith
<dexonsmith at apple.com <mailto:dexonsmith at apple.com>> wrote:
>> >
>> > +chrisb
>> >
>> >> On 2014-Nov-13, at 16:33, Akira Hatanaka <ahatanak at
gmail.com <mailto:ahatanak at gmail.com>> wrote:
>> >>
>> >> I'm working on fixing PR21471, which is about embedding
codegen command line options into the bitcode as function or module-level
attributes so that they don't get ignored when doing LTO.
>> >>
>> >> http://llvm.org/bugs/show_bug.cgi?id=21471
<http://llvm.org/bugs/show_bug.cgi?id=21471>
>> >>
>> >> I have an initial patch (attached to this email) which enables
clang/llvm to recognize one command line option, write it to the IR, and read it
out in a backend pass. I'm looking to get feedback from the community on
whether I'm headed in the right direction or whether there are alternate
ideas before I go all the way on fixing the PR. Specifically, I'd like to
know the answers to the following questions:
>> >>
>> >> 1. How do we make sure we continue to be able to use the
command line options we've been using for llc and other tools?
>> >> 2. How to handle cases where two functions in a module have
different sets of command line options?
>> >> 3. Where should the command line options or module/function
attributes be stored once they are read out from the IR?
>> >>
>> >> The short description of the approach I took in my patch is
that command line options that are important to codegen are collected by
cl::ParseCommandLineOptions, written to the bitcode as function or module
attributes, and read out directly by the optimization passes that need them.
cl::opt options are replaced with CodeGenOpt options (subclass of cl::opt),
which are needed only to parse the command line and provide the default value
when the corresponding options are not in the bitcode.
>> >
>> > I like this approach, since it means the frontend doesn't have
to understand
>> > options in order to pass them on to the backend.
>> 
>> I agree. It’s not the approach that was I was expecting, but after
reading the patches, I like it.
>> 
>> If we can get consensus on the approach, I suggest that we stage the
work to first adopt Akira’s patches and then we can incrementally convert
various options over to use it. There may be complications for some options, so
I’d like to see separate patches for each one (possibly with a few similar
options combined in one patch).
>> 
>> 
>> I'm not sold quite yet :)
>> 
>> Akira: Can you show an example of how you expect this to work in the IR
and how the backend is going to process?
>> 
>> Thanks.
>> 
>> -eric
>>  
>> >
>> > The static variables should be straightforward to migrate to an
LLVMContext
>> > once ParseCommandLineOptions stores things there instead of in
globals.
>> >
>> >> diff --git a/lib/CodeGen/CodeGenOption.cpp
b/lib/CodeGen/CodeGenOption.cpp
>> >> new file mode 100644
>> >> index 0000000..2d74c2f
>> >> --- /dev/null
>> >> +++ b/lib/CodeGen/CodeGenOption.cpp
>> >> @@ -0,0 +1,59 @@
>> >> +//===- CodeGen/CodeGenOptions.cpp - Code-gen option.         
--*- C++ -*-===//
>> >> +//
>> >> +//                     The LLVM Compiler Infrastructure
>> >> +//
>> >> +// This file is distributed under the University of Illinois
Open Source
>> >> +// License. See LICENSE.TXT for details.
>> >> +//
>> >>
+//===----------------------------------------------------------------------===//
>> >> +//
>> >>
+//===----------------------------------------------------------------------===//
>> >> +
>> >> +#include "llvm/CodeGen/CodeGenOption.h"
>> >> +#include "llvm/IR/Attributes.h"
>> >> +#include "llvm/IR/Module.h"
>> >> +
>> >> +using namespace llvm;
>> >> +
>> >> +static std::map<std::string, bool> FunctionBoolAttrs;
>> >> +static std::map<std::string, bool> ModuleBoolAttrs;
>> >> +
>> >
>> > @Chris, should these be using ManagedStatic?
>> >
>> > _______________________________________________
>> > LLVM Developers mailing list
>> > LLVMdev at cs.uiuc.edu <mailto:LLVMdev at cs.uiuc.edu>      
http://llvm.cs.uiuc.edu <http://llvm.cs.uiuc.edu/>
>> > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
<http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev>
>> 
>> 
>> _______________________________________________
>> LLVM Developers mailing list
>> LLVMdev at cs.uiuc.edu <mailto:LLVMdev at cs.uiuc.edu>        
http://llvm.cs.uiuc.edu <http://llvm.cs.uiuc.edu/>
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
<http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev>
>> 
>> _______________________________________________
>> LLVM Developers mailing list
>> LLVMdev at cs.uiuc.edu <mailto:LLVMdev at cs.uiuc.edu>        
http://llvm.cs.uiuc.edu <http://llvm.cs.uiuc.edu/>
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
<http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20141201/29654a34/attachment.html>

llvm dev - Dec 2014 - [LLVMdev] [RFC] Embedding command line options in bitcode (PR21471)

[LLVMdev] [RFC] Embedding command line options in bitcode (PR21471)

[LLVMdev] [RFC] Embedding command line options in bitcode (PR21471)

[LLVMdev] [RFC] Embedding command line options in bitcode (PR21471)