thr3ads.net - llvm dev - [LLVMdev] [RFC] Section Declarations in LLVM IR [Apr 2014]

David Majnemer

2014-Mar-11 07:44 UTC

[LLVMdev] [RFC] Section Declarations in LLVM IR

Hi all,

I'd like to propose that LLVM IR have a mechanism to describe sections in a
more explicit way than we can today.

Currently, we provide an attribute called "section" on GlobalVariables and
Functions.  This attribute will choose which section the Value will end up
in.
However, it does not describe the attributes of the section.
Without a way of describing the section, we try to infer the section's
attribute from the first Value from that section that MC comes across.

This means that if the first value is constant, the rest of the values
will end up in .rodata even if the intention was for them to be mutable.
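
The "first value wins" behaviour described above can be illustrated with a toy model. This is only a sketch: the `Global` record and `infer_section_kinds` are hypothetical names invented for illustration and are not part of LLVM or MC.

```python
from dataclasses import dataclass


@dataclass
class Global:
    # Hypothetical stand-in for a GlobalVariable with a "section" attribute.
    name: str
    section: str
    is_constant: bool


def infer_section_kinds(globals_in_emission_order):
    """Toy model: the first global seen in a section decides its kind."""
    kinds = {}
    for g in globals_in_emission_order:
        # setdefault keeps the kind chosen by the first global and
        # ignores every later global placed in the same section.
        kinds.setdefault(g.section, "read-only" if g.is_constant else "writable")
    return kinds


module = [
    Global("table", ".foo", is_constant=True),
    Global("counter", ".foo", is_constant=False),
]

print(infer_section_kinds(module))                   # constant comes first
print(infer_section_kinds(list(reversed(module))))   # mutable comes first
```

In this model the same two globals yield a read-only or a writable ".foo" depending only on which one is seen first, which is the fragility an explicit section declaration would remove.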

Equally problematic is our inability to verify the appropriate use of
sections, consider the following:

- One global value is defined to be thread_local and in section ".foo"
- Another value is *not* defined to be thread_local and in section
".foo"

The IR verifier does not catch this nonsensical arrangement of IR.
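
As a rough sketch of the kind of check that is missing, the following toy function flags sections that mix thread_local and non-thread_local globals. The `TlsGlobal` record and `find_tls_conflicts` are hypothetical and do not correspond to anything in the LLVM verifier.

```python
from collections import defaultdict
from typing import NamedTuple


class TlsGlobal(NamedTuple):
    # Hypothetical stand-in for a global value placed in a named section.
    name: str
    section: str
    thread_local: bool


def find_tls_conflicts(globals_):
    """Return the sorted names of sections mixing TLS and non-TLS globals."""
    flags_by_section = defaultdict(set)
    for g in globals_:
        flags_by_section[g.section].add(g.thread_local)
    # A section is inconsistent when both True and False were recorded.
    return sorted(s for s, flags in flags_by_section.items() if len(flags) > 1)


example = [
    TlsGlobal("a", ".foo", thread_local=True),
    TlsGlobal("b", ".foo", thread_local=False),
    TlsGlobal("c", ".bar", thread_local=False),
]
print(find_tls_conflicts(example))
```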

Further motivation stems from being able to represent the MS ABI's RTTI
data:

- A single COMDAT section is created which holds both the RTTI data and the
vftable for a type.
- If there is no vftable, the section will hold just the RTTI data.
- The entire section is marked with a linkage that indicates the linker to
pick the largest.

I think LLVM needs new IR to represent these semantics properly.

I propose that we do the following:

- Sections are represented at module scope and have an identifier that
starts with '$'
- Sections have linkage, all Values inside of a section must agree with the
section's linkage.
- Sections don't have visibility, Values may disagree with one another
about how visible they are.
- Sections have attributes annotating what semantics they provide (read,
write, execute, etc.)

A concrete example of a const variable inside of a read-only section:

$.my_section = appending read
@my_var = constant float 1.0, section $.my_section, align 4

The following is how I imagine MS RTTI would look like if we had this IR
construct:

$.vdata_for_type = pick_largest read
@my_rtti_for_type = pick_largest unnamed_addr constant %rtti_ptr_ty
@rtti_complete_object_locator, section $.vdata_for_type, align 4
@vftable_for_type = pick_largest unnamed_addr constant [1 x i8*] [i8*
bitcast (void (%struct.S*)* @"\01?fun@S@@UAEXXZ" to i8*)], section
$.vdata_for_type, align 4

Attached is a patch to the LangRef.

Thanks for reading!
-------------- next part --------------
A non-text attachment was scrubbed...
Name: SectionIRLangRef.patch
Type: application/octet-stream
Size: 2245 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20140311/0c06bae6/attachment.obj>

Rafael Espíndola

2014-Mar-11 19:17 UTC

[LLVMdev] [RFC] Section Declarations in LLVM IR

On 11 March 2014 03:44, David Majnemer <david.majnemer at gmail.com> wrote:
> Hi all,
>
> I'd like to propose that LLVM IR have a mechanism to describe sections in a
> more explicit way than we can today.
>
> Currently, we provide an attribute called "section" on GlobalVariables and
> Functions.  This attribute will choose which section the Value will end up
> in.
> However, it does not describe the attributes of the section.
> Without a way of describing the section, we try to infer the section's
> attribute from the first Value from that section that MC comes across.
>
> This means that if the first value is constant, the rest of the values
> will end up in .rodata even if the intention was for them to be mutable.
>
> Equally problematic is our inability to verify the appropriate use of
> sections, consider the following:
>
> - One global value is defined to be thread_local and in section ".foo"
> - Another value is *not* defined to be thread_local and in section
> ".foo"
>
> The IR verifier does not catch this nonsensical arrangement of IR.
>
> Further motivation stems from being able to represent the MS ABI's RTTI
> data:
>
> - A single COMDAT section is created which holds both the RTTI data and the
> vftable for a type.
> - If there is no vftable, the section will hold just the RTTI data.
> - The entire section is marked with a linkage that indicates the linker to
> pick the largest.
>
> I think LLVM needs new IR to represent these semantics properly.
>
> I propose that we do the following:
>
> - Sections are represented at module scope and have an identifier that
> starts with '$'

OK until here.

> - Sections have linkage, all Values inside of a section must agree with the
> section's linkage.

This seems way too restrictive and I don't think it maps to what
object files actually do. We should be able to do, for example:

$foo = section "bar", ..... comdat "zed", select_largest, etc

@baz = private ... section $foo
@bah = linkonce_odr alias baz, offset 4

This is fully general, I think. In particular, the above example
represents a section name bar, in a comdat represented by the symbol
zed. The contents of the section is that of @baz and the only visible
symbol in the section is bah, which is at offset 4.

I think we can simply require that every global object in a section
that is a comdat must be isDiscardableIfUnused.
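
The suggested requirement could be sketched as follows. This is a toy check, not LLVM code: the set of linkages treated as "discardable if unused" below is an assumption made for illustration (local and linkonce linkages), and `comdat_violations` is a hypothetical helper.

```python
# Assumption for illustration: linkages a linker or optimizer may drop
# when the symbol is unused. The real predicate lives in LLVM itself.
DISCARDABLE_IF_UNUSED = {"private", "internal", "linkonce", "linkonce_odr"}


def comdat_violations(section_is_comdat, members):
    """Return the names of globals that break the proposed comdat rule.

    `members` maps a global's name to its linkage keyword.
    Non-comdat sections accept any mix of linkages, as they do today.
    """
    if not section_is_comdat:
        return []
    return sorted(
        name
        for name, linkage in members.items()
        if linkage not in DISCARDABLE_IF_UNUSED
    )


# @baz from the example above is private, so it satisfies the rule.
print(comdat_violations(True, {"baz": "private"}))
# A hypothetical external global in the same comdat section would not.
print(comdat_violations(True, {"baz": "private", "qux": "external"}))
```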

This should also be usable to represent sections that are not comdats,
and those can have any mix of global values as they do now. We just
gain the ability to define more information about the section. In
fact, I would probably suggest getting this in first, for making the
patches incremental.
> - Sections don't have visibility, Values may disagree with one another about
> how visible they are.
> - Sections have attributes annotating what semantics they provide (read,
> write, execute, etc.)

Both good.

> A concrete example of a const variable inside of a read-only section:
>
> $.my_section = appending read
> @my_var = constant float 1.0, section $.my_section, align 4
>
> The following is how I imagine MS RTTI would look like if we had this IR
> construct:
>
> $.vdata_for_type = pick_largest read
> @my_rtti_for_type = pick_largest unnamed_addr constant %rtti_ptr_ty
> @rtti_complete_object_locator, section $.vdata_for_type, align 4
> @vftable_for_type = pick_largest unnamed_addr constant [1 x i8*] [i8*
> bitcast (void (%struct.S*)* @"\01?fun@S@@UAEXXZ" to i8*)], section
> $.vdata_for_type, align 4
Part of the problem is that it seems that the order is important. We
really should not require that at the llvm IR level. The above global
values could be output in any order.
> Attached is a patch to the LangRef.
>
> Thanks for reading!
Thanks for working on this! It is an excellent step towards getting
better comdat support in LLVM!

Cheers,
Rafael

Reid Kleckner

2014-Apr-01 21:14 UTC

[LLVMdev] [RFC] Section Declarations in LLVM IR

On Tue, Mar 11, 2014 at 12:17 PM, Rafael Espíndola <
rafael.espindola at gmail.com> wrote:
> > I think LLVM needs new IR to represent these semantics properly.
>
Cool!  This proposal makes a lot of sense to me.

> > - Sections have linkage, all Values inside of a section must agree with the
> > section's linkage.
>
> This seems way too restrictive and I don't think it maps to what
> object files actually do. We should be able to for example do
>
> $foo = section "bar", ..... comdat "zed", select_largest, etc
>
> @baz = private ... section $foo
> @bah = linkonce_odr alias baz, offset 4
>
> This is fully general, I think. In particular, the above example
> represents a section name bar, in a comdat represented by the symbol
> zed. The contents of the section is that of @baz and the only visible
> symbol in the section is bah, which is at offset 4.
>
I like this proposal.  Any reason to use an explicit offset rather than
allow GEPs into aliases?

> > The following is how I imagine MS RTTI would look like if we had this IR
> > construct:
> >
> > $.vdata_for_type = pick_largest read
> > @my_rtti_for_type = pick_largest unnamed_addr constant %rtti_ptr_ty
> > @rtti_complete_object_locator, section $.vdata_for_type, align 4
> > @vftable_for_type = pick_largest unnamed_addr constant [1 x i8*] [i8*
> > bitcast (void (%struct.S*)* @"\01?fun@S@@UAEXXZ" to i8*)], section
> > $.vdata_for_type, align 4
>
> Part of the problem is that it seems that the order is important. We
> really should not require that at the llvm IR level. The above global
> values could be output in any order.

Yeah, let's not rely on order of the IR.

----

+The linkage must be one of ``appending``, ``linkonce_odr``, ``linkonce``.
+
+#. All values inside of a section must have ``external`` linkage if the section has
+   ``appending``.

It's kind of cute to use appending here, but why not just let 'external'
mean the same as 'appending' for sections?  Then we can say something
simple like "the linkage of all symbols in a section must be private,
internal, or the linkage of the section."
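
The simplified rule quoted above is small enough to state as a predicate. A minimal sketch, assuming linkages are compared as plain keyword strings; `linkage_allowed` is a hypothetical name and not part of any proposed patch.

```python
# Linkages that are always permitted inside a section under the rule.
LOCAL_LINKAGES = {"private", "internal"}


def linkage_allowed(symbol_linkage, section_linkage):
    """True if a symbol with this linkage may live in the section.

    Implements: "the linkage of all symbols in a section must be
    private, internal, or the linkage of the section."
    """
    return symbol_linkage in LOCAL_LINKAGES or symbol_linkage == section_linkage


print(linkage_allowed("private", "linkonce_odr"))       # local symbol
print(linkage_allowed("linkonce_odr", "linkonce_odr"))  # matches section
print(linkage_allowed("external", "linkonce_odr"))      # mismatch
```

With 'external' standing in for 'appending' on sections, a single comparison covers the three per-linkage rules in the patch text.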

+#. All values inside of a section must have ``linkonce_odr`` linkage if the
+   section has ``linkonce_odr``.
+
+#. All values inside of a section must have ``linkonce`` linkage if the
+   section has ``linkonce``.

Sounds like we're going to drop some of these restrictions.