thr3ads.net - llvm dev - [LLVMdev] [RFC] Passing Options to Different Parts of the Compiler Using Attributes [Dec 2012]

Bill Wendling

2012-Nov-26 21:20 UTC

[LLVMdev] [RFC] Passing Options to Different Parts of the Compiler Using Attributes

On Nov 20, 2012, at 11:03 AM, Meador Inge <meadori at codesourcery.com> wrote:
> On Nov 13, 2012, at 12:20 AM, Bill Wendling wrote:
> 
>> IR Changes
>> ----------
>> 
>> The attributes will be specified within the IR. This allows us to generate code
>> that the user wants. This also has the advantage that it will no longer be
>> necessary to specify all of the command line options when compiling the bit code
>> (via 'llc' or 'clang'). E.g., '-mcpu=cortex-a8' will be an attribute and won't
>> be required on llc's command line. However, explicit flags (like `-mcpu') on the
>> llc command line will override flags specified in the module.
>> 
>> The core of this proposal is the idea of an "attribute group". As the name
>> implies, it's a group of attributes that are then referenced by objects within
>> the IR. An attribute group is a module-level object. The BNF of the syntax is:
>> 
>> attribute_group := attrgroup <attrgroup_id> = { <attribute_list> }
>> attrgroup_id    := #<number>
>> attribute_list  := <attribute> (, <attribute>)*
>> attribute       := <name> (= <value>)?
>> 
>> To use an attribute group, an object references the attribute group's ID:
>> 
>> attribute_group_ref := attrgroup(<attrgroup_id>)
>> 
>> This is an example of an attribute group for a function that should always be
>> inlined, has stack alignment of 4, and doesn't unwind:
>> 
>> attrgroup #1 = { alwaysinline, nounwind, alignstack=4 }
>> 
>> void @foo() attrgroup(#1) { ret void }
>> 
>> An object may refer to more than one attribute group. In that situation, the
>> attributes are merged.
>> 
>> Attribute groups are important for keeping `.ll' files readable, because a lot
>> of functions will use the same attributes. In the degenerate case of a `.ll'
>> file that corresponds to a single `.c' file, the single `attrgroup' will capture
>> the command line flags used to build that file.
> 
> A few comments on the new syntax:
> 
>   1. I think most folks will understand what 'attrgroup' means, but it is a little cryptic.
>      How about just 'attributes'?  The following reads easier to my eyes:
> 
>         attributes #1 = { alwaysinline, nounwind, alignstack=4 }
>         void @foo() attributes(#1) { ret void }

I don't have a very strong opinion on this.

>   2. Are group references allowed in all attribute contexts (parameter, return value, function)?
>      I think the answer should be yes.

It would seem a natural expansion of the attribute groups concept. But I want to
make these changes incrementally. So at the beginning this won't happen.

>      Also, it might be worth considering using the same attribute
>      list syntax in the current context and the new attribute group definition (i.e. comma-separated
>      vs. space-separated).  This way we have a consistent syntax for groups of attributes and the
>      main addition this proposal adds is to give a name to those attributes for later reference.

I also prefer comma separated lists of things. But this could cause some
confusion if we expand the concept to parameter attributes. But see below for a
potential alternative syntax for the attribute groups.

>   3. Can attribute groups and single attributes be inter-mixed?
>      For example:
> 
>         void @foo attrgroup(#1) alwaysinline attrgroup(#2) nounwind

This will be necessary for backwards compatibility. However, running this
through this sequence:

	$ llvm-as < foo.ll | llvm-dis

would produce:

	attrgroup #1 = { ... }
	attrgroup #2 = { ... }
	attrgroup #3 = { alwaysinline, nounwind }

	void @foo() attrgroup(#1) attrgroup(#2) attrgroup(#3)

This is because of how the attributes will be represented internally to LLVM.
Let me know if you have strong objections to this.

>   4. Do we really want the attribute references limited to a number?  Code will be more readable
>      if you can use actual names that indicate the intent.  For example:
> 
>         attrgroup #compile_options = { … }
>         void @foo attrgroup(#compile_options)

The problem with this is it limits the number of attribute groups to a specific
set -- compile options, non-compile options, etc. There could be many different
attribute groups involved, especially during LTO. I realize that the names will
be uniqued. But that just adds a number to the existing name. I also want to
avoid partitioning of the attributes into arbitrary groups -- i.e., groups with
specific names which imply their usage or type.

>   5. Can attributes be nested?  For example:
> 
>         attrgroup #1 = { foo, bar }
>         attrgroup #2 = { #1, baz }
> 
>      Might be nice.

I'm not a big fan of this idea. This could open it up to circular attribute
groups:

	attrgroup #1 = { foo, #2 }
	attrgroup #2 = { #1, bar }

which I'm opposed to on moral groups. ;-) A less compelling (but IMHO valid)
argument is that it makes the internal representation of attributes that much
more complex.
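
To make the circularity concern concrete, here is a minimal C++ sketch (not LLVM code) of the check a reader or verifier would need if groups were allowed to nest. The `GroupRefs` map and both function names are hypothetical, and only group-to-group references are modeled.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <vector>

// Hypothetical model: each group ID maps to the IDs of the groups it nests.
// Plain attributes are left out because they cannot form a cycle.
using GroupRefs = std::map<int, std::vector<int>>;

// Depth-first search that reports whether a cycle is reachable from `id`.
// `onPath` holds the groups on the current search path; `done` holds
// groups already shown to be cycle-free.
static bool reachesCycle(const GroupRefs &refs, int id,
                         std::set<int> &onPath, std::set<int> &done) {
  if (done.count(id))
    return false;
  if (!onPath.insert(id).second)
    return true;  // `id` is already on the current path: circular reference
  auto it = refs.find(id);
  if (it != refs.end())
    for (int next : it->second)
      if (reachesCycle(refs, next, onPath, done))
        return true;
  onPath.erase(id);
  done.insert(id);
  return false;
}

// Returns true if any group references itself, directly or transitively.
bool hasCircularGroups(const GroupRefs &refs) {
  std::set<int> onPath, done;
  for (const auto &entry : refs)
    if (reachesCycle(refs, entry.first, onPath, done))
      return true;
  return false;
}
```

With flat groups (no nesting), none of this bookkeeping is needed, which is the simplification argued for above.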

>   6. Do we really need to specify the attrgroup keyword twice? (Once in the group definition and once in the use)
>      ISTM, that the hash-mark is enough to announce a group reference in the use.  For example:
> 
>         void @foo #1 alwaysinline #2 nounwind

Looking at my example above, my syntax can get a bit wordy. How about this
alternative representation?

	define void @foo() attrgroup(#1, #2, #3) { ret void }

I don't have a strong opinion though. You're correct that the
hash-number combo unambiguously defines an attribute group's use. If others
are amenable to this, I can drop the keyword here.

> In other words, I think something like the following might be nicer:
> 
> attribute_group := attributes <attrgroup_id> = { <attribute_list> }
> attrgroup_id    := #<id>
> attribute_list  := <attribute> ( <attribute>)*
> attribute       := <name> (= <value>)?
>                 | <attrgroup_id>
> 
> …
> 
> function_def    := <attribute_list> <result_type> @<id> ([argument_list]) <attribute_list>

So something like this (no references inside of the 'attributes'
statement allowed, cf. above)?

	attributes #1 = { noinline, alignstack=4 }
	attributes #2 = { "no-sse" }

	define void @foo() #1 #2 { ret void }

This seems reasonable to me.

>> Target-Dependent Attributes in IR
>> ---------------------------------
>> 
>> The front-end is responsible for knowing which target-dependent options are
>> interesting to the target. Target-dependent attributes are specified as strings,
>> which are understood by the target's back-end. E.g.:
>> 
>> attrgroup #0 = { "long-calls", "cpu=cortex-a8", "thumb" }
>> 
>> define void @func() attrgroup(#0) { ret void }
>> 
>> The ARM back-end is the only target that knows about these options and what to
>> do with them.
>> 
>> Some of the `cl::opt' options in the backend could move into attribute groups.
>> This will clean up the compiler.
>> 
> 
> Isn't calling these "target-dependent" a little artificial?  Surely there are many uses
> for string attributes one of which is for target-specific data.  I think organizing the
> proposal to add these new arbitrary string attributes and using the target-specific bits
> as examples will be clearer.

It's a bit artificial. I basically want to make a small distinction here
where anything not target-specific will be defined inside of LangRef.html. So
anything that could be used by all targets should be defined there.

>> Updating IR
>> -----------
>> 
>> The current attributes that are specified on functions will be moved into an
>> attribute group. The LLVM assembly reader will still honor those but when the
>> assembly file is emitted, those attributes will be output as an attribute group
>> by the assembly writer. As usual, LLVM 3.3 will be able to read and auto-upgrade
>> previous bitcode and `.ll' files.
>> 
>> Querying
>> --------
>> 
>> The attributes are attached to the function. It's therefore trivial to access
>> the attributes within the middle- and the back-ends. Here's an example of how
>> attributes are queried:
>> 
>> Attributes &A = F.getAttributes();
>> 
>> // Target-independent attribute query.
>> A.hasAttribute(Attributes::NoInline);
>> 
>> // Target-dependent attribute query.
>> A.hasAttribute("no-sse");
>> 
>> // Retrieving value of a target-independent attribute.
>> int Alignment = A.getIntValue(Attributes::Alignment);
>> 
>> // Retrieving value of a target-dependent attribute.
>> StringRef CPU = A.getStringValue("cpu");
> 
> Maybe some set attribute examples too?

That would be done through the current AttrBuilder class:

	AttrBuilder B;

	// Add a target-independent attribute.
	B.addAttribute(Attributes::NoInline);

	// Add a target-dependent attribute.
	B.addAttribute("no-sse");

	// Create the attribute object.
	Attributes A = Attributes::get(Context, B);

> Overall, I think this is a nice addition!

Thanks!

-bw

Bill Wendling

2012-Nov-26 21:41 UTC

On Nov 26, 2012, at 1:20 PM, Bill Wendling <wendling at apple.com> wrote:
> On Nov 20, 2012, at 11:03 AM, Meador Inge <meadori at codesourcery.com> wrote:
> 
>>  3. Can attribute groups and single attributes be inter-mixed?
>>     For example:
>> 
>>        void @foo attrgroup(#1) alwaysinline attrgroup(#2) nounwind
>> 
> This will be necessary for backwards compatibility. However, running this
> through this sequence:
> 
> 	$ llvm-as < foo.ll | llvm-dis
> 
> would produce:
> 
> 	attrgroup #1 = { ... }
> 	attrgroup #2 = { ... }
> 	attrgroup #3 = { alwaysinline, nounwind }
> 
> 	void @foo() attrgroup(#1) attrgroup(#2) attrgroup(#3)
> 
> This is because of how the attributes will be represented internally to
> LLVM. Let me know if you have strong objections to this.

Now that I think about it, this isn't the output. Here's what it would
look like:

a.ll:
	attributes #1 = { "no-sse" }
	attributes #2 = { noredzone }

	define void @foo() #1 #2 alwaysinline nounwind { ret void }

Here's the output:

	$ llvm-as < a.ll | llvm-dis

	attributes #1 = { "no-sse" }
	attributes #2 = { noredzone }
	attributes #3 = { "no-sse", noredzone, alwaysinline, nounwind }

	define void @foo() #3 { ret void }

This is because all of the attribute groups that a function references will be
merged into one attribute object. When we output the attribute object, we
don't know that the original function referred to two attribute groups and
had a couple of extra attributes defined.
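
As a rough illustration of why the round trip loses the original grouping, here is a small C++ sketch. It is not LLVM's data structure: `AttrSet` and `mergeForFunction` are hypothetical names, and keeping the first value seen for a duplicate name is an arbitrary choice, since the thread does not say how conflicts are resolved.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for an attribute set: name -> value
// ("" when the attribute carries no value).
using AttrSet = std::map<std::string, std::string>;

// Merge every referenced group plus the function's own attributes into one
// set. After this step nothing records which group an attribute came from,
// so a writer can only emit the merged set as a single group.
AttrSet mergeForFunction(const std::map<int, AttrSet> &groups,
                         const std::vector<int> &refs,
                         const AttrSet &inlineAttrs) {
  AttrSet merged;
  for (int id : refs) {
    auto it = groups.find(id);
    if (it != groups.end())
      merged.insert(it->second.begin(), it->second.end());  // first value wins
  }
  merged.insert(inlineAttrs.begin(), inlineAttrs.end());
  return merged;
}
```

For the a.ll example above, merging groups #1 and #2 with `alwaysinline` and `nounwind` yields one four-element set, which corresponds to the single `#3` group in the output.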

In practice, I expect this to happen rarely in non-LTO mode.

-bw

Meador Inge

2012-Dec-05 04:12 UTC

On Nov 26, 2012, at 3:20 PM, Bill Wendling wrote:
>>  4. Do we really want the attribute references limited to a number?  Code will be more readable
>>     if you can use actual names that indicate the intent.  For example:
>> 
>>        attrgroup #compile_options = { … }
>>        void @foo attrgroup(#compile_options)
>> 
> The problem with this is it limits the number of attribute groups to a
> specific set -- compile options, non-compile options, etc. There could be many
> different attribute groups involved, especially during LTO. I realize that the
> names will be uniqued. But that just adds a number to the existing name. I also
> want to avoid partitioning of the attributes into arbitrary groups -- i.e.,
> groups with specific names which imply their usage or type.

My main concern is that I see no reason to limit the id to just numbers in the
*language definition*.
That doesn't mean you can't always generate #<number> (in the same
way that we do for variable names).
This way it leaves open the possibility of hand-writing nice names.
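
A sketch of what accepting both forms could look like in a reader, written as standalone C++ rather than against the actual LLVM lexer. `GroupId` and `parseGroupId` are hypothetical names, and the identifier rule (`[A-Za-z_][A-Za-z0-9_]*`) is an assumption, not something the proposal specifies.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical result of parsing a group reference such as "#3" or
// "#compile_options".
struct GroupId {
  bool isNumeric;
  std::string text;  // the ID without the leading '#'
};

// Accepts '#' followed by either all digits or an identifier. Returns false
// and leaves `out` untouched for anything else.
bool parseGroupId(const std::string &tok, GroupId &out) {
  if (tok.size() < 2 || tok[0] != '#')
    return false;
  const std::string body = tok.substr(1);
  bool allDigits = true;
  bool allIdent = true;
  for (unsigned char c : body) {
    allDigits = allDigits && std::isdigit(c) != 0;
    allIdent = allIdent && (std::isalnum(c) != 0 || c == '_');
  }
  const bool startsWithDigit =
      std::isdigit(static_cast<unsigned char>(body[0])) != 0;
  if (!allDigits && (!allIdent || startsWithDigit))
    return false;
  out.isNumeric = allDigits;
  out.text = body;
  return true;
}
```

A writer could still emit only numeric IDs, while hand-written files use names.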

>>  5. Can attributes be nested?  For example:
>> 
>>        attrgroup #1 = { foo, bar }
>>        attrgroup #2 = { #1, baz }
>> 
>>     Might be nice.
>> 
> I'm not a big fan of this idea. This could open it up to circular
> attribute groups:
> 
> 	attrgroup #1 = { foo, #2 }
> 	attrgroup #2 = { #1, bar }
> 
> which I'm opposed to on moral groups. ;-) A less compelling (but IMHO
> valid) argument is that it makes the internal representation of attributes that
> much more complex.

Fair enough.

>> In other words, I think something like the following might be nicer:
>> 
>> attribute_group := attributes <attrgroup_id> = { <attribute_list> }
>> attrgroup_id    := #<id>
>> attribute_list  := <attribute> ( <attribute>)*
>> attribute       := <name> (= <value>)?
>>                | <attrgroup_id>
>> 
>> …
>> 
>> function_def    := <attribute_list> <result_type> @<id> ([argument_list]) <attribute_list>
>> 
> So something like this (no references inside of the 'attributes'
> statement allowed, cf. above)?
> 
> 	attributes #1 = { noinline, alignstack=4 }
> 	attributes #2 = { "no-sse" }
> 
> 	define void @foo() #1 #2 { ret void }
> 
> This seems reasonable to me.

Me too.  This seems pretty close to what was implemented in the patches posted on
llvm-commits.  I'll review those in a bit.

--
Meador Inge
CodeSourcery / Mentor Embedded

Bill Wendling

2012-Dec-05 06:05 UTC

On Dec 4, 2012, at 8:12 PM, Meador Inge <meadori at codesourcery.com> wrote:
> On Nov 26, 2012, at 3:20 PM, Bill Wendling wrote:
> 
>>> 4. Do we really want the attribute references limited to a number?  Code will be more readable
>>>    if you can use actual names that indicate the intent.  For example:
>>> 
>>>       attrgroup #compile_options = { … }
>>>       void @foo attrgroup(#compile_options)
>>> 
>> The problem with this is it limits the number of attribute groups to a
>> specific set -- compile options, non-compile options, etc. There could be many
>> different attribute groups involved, especially during LTO. I realize that the
>> names will be uniqued. But that just adds a number to the existing name. I also
>> want to avoid partitioning of the attributes into arbitrary groups -- i.e.,
>> groups with specific names which imply their usage or type.
> 
> My main concern is that I see no reason to limit the id to just numbers in
> the *language definition*.
> That doesn't mean you can't always generate #<number> (in the
> same way that we do for variable names).
> This way it leaves open the possibility of hand-writing nice names.

Okay. It shouldn't be too difficult to do that.

>>> In other words, I think something like the following might be nicer:
>>> 
>>> attribute_group := attributes <attrgroup_id> = { <attribute_list> }
>>> attrgroup_id    := #<id>
>>> attribute_list  := <attribute> ( <attribute>)*
>>> attribute       := <name> (= <value>)?
>>>               | <attrgroup_id>
>>> 
>>> …
>>> 
>>> function_def    := <attribute_list> <result_type> @<id> ([argument_list]) <attribute_list>
>>> 
>> So something like this (no references inside of the
>> 'attributes' statement allowed, cf. above)?
>> 
>> 	attributes #1 = { noinline, alignstack=4 }
>> 	attributes #2 = { "no-sse" }
>> 
>> 	define void @foo() #1 #2 { ret void }
>> 
>> This seems reasonable to me.
> 
> Me too.  This seems pretty close to what was implemented in the patches posted on
> llvm-commits.  I'll review those in a bit.

I made one change. I made the attributes in the attribute groups non-comma
separated. Otherwise, it's pretty much the same.
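
For illustration, a whitespace-separated list is simple to split. The sketch below is standalone C++, not the LLVM assembly parser; `Attr` and `parseAttrList` are hypothetical names, and it assumes quoted strings contain no spaces and are kept whole rather than split on '='.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// One parsed attribute: name plus value ("" if the attribute has none).
using Attr = std::pair<std::string, std::string>;

// Parses the text between '{' and '}' of a group definition, e.g.
//   noinline alignstack=4 "no-sse"
std::vector<Attr> parseAttrList(const std::string &body) {
  std::vector<Attr> attrs;
  std::istringstream in(body);
  std::string tok;
  while (in >> tok) {
    if (tok.size() >= 2 && tok.front() == '"' && tok.back() == '"') {
      // Quoted (target-dependent) attribute: strip the quotes, keep it whole.
      attrs.emplace_back(tok.substr(1, tok.size() - 2), "");
      continue;
    }
    const std::string::size_type eq = tok.find('=');
    if (eq == std::string::npos)
      attrs.emplace_back(tok, "");  // bare name such as noinline
    else
      attrs.emplace_back(tok.substr(0, eq), tok.substr(eq + 1));
  }
  return attrs;
}
```

Calling `parseAttrList("noinline alignstack=4 \"no-sse\"")` under these assumptions gives three entries: `noinline`, `alignstack` with value `4`, and `no-sse`.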

-bw
