Hi all,

I'd like to propose that LLVM IR have a mechanism to describe sections in a
more explicit way than we can today.

Currently, we provide an attribute called "section" on GlobalVariables and
Functions. This attribute will choose which section the Value will end up
in. However, it does not describe the attributes of the section.
Without a way of describing the section, we try to infer the section's
attributes from the first Value from that section that MC comes across.

This means that if the first value is constant, the rest of the values
will end up in .rodata even if the intention was for them to be mutable.

Equally problematic is our inability to verify the appropriate use of
sections. Consider the following:

- One global value is defined to be thread_local and in section ".foo"
- Another value is *not* defined to be thread_local and in section ".foo"

The IR verifier does not catch this nonsensical arrangement of IR.

Further motivation stems from being able to represent the MS ABI's RTTI
data:

- A single COMDAT section is created which holds both the RTTI data and the
  vftable for a type.
- If there is no RTTI data, the section will start with just the vftable.
- The entire section is marked with a linkage that tells the linker to
  pick the largest.

I think LLVM needs new IR to represent these semantics properly.

I propose that we do the following:

- Sections are represented at module scope and have an identifier that
  starts with '$'
- Sections have linkage; all Values inside of a section must agree with the
  section's linkage.
- Sections don't have visibility; Values may disagree with one another about
  how visible they are.
- Sections have attributes annotating what semantics they provide (read,
  write, execute, etc.)

A concrete example of a const variable inside of a read-only section:

  $.my_section = appending read
  @my_var = constant float 1.0, section $.my_section, align 4

The following is how I imagine MS RTTI would look if we had this IR
construct:

  $.vdata_for_type = pick_largest read
  @my_rtti_for_type = pick_largest unnamed_addr constant %rtti_ptr_ty @rtti_complete_object_locator, section $.vdata_for_type, align 4
  @vftable_for_type = pick_largest unnamed_addr constant [1 x i8*] [i8* bitcast (void (%struct.S*)* @"\01?fun@S@@UAEXXZ" to i8*)], section $.vdata_for_type, align 4

Attached is a patch to the LangRef.

Thanks for reading!

-------------- next part --------------
A non-text attachment was scrubbed...
Name: SectionIRLangRef.patch
Type: application/octet-stream
Size: 2245 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20140311/0c06bae6/attachment.obj>
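
The two problems described in the proposal above (section attributes inferred from whichever value comes first, and no verifier check for mixing thread_local and ordinary globals in one section) can be modelled in a few lines. This is only a toy sketch in Python, not LLVM code: the `Global` class and both helper functions are hypothetical names invented for illustration, and the "read-only"/"writable" labels stand in for whatever section kind is actually chosen.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Global:
    """Hypothetical stand-in for an IR global value (not an LLVM class)."""
    name: str
    section: str
    is_constant: bool = False
    is_thread_local: bool = False


def infer_section_kinds(globals_in_order):
    """Model first-value-wins inference: the first global seen fixes the kind."""
    kinds = {}
    for g in globals_in_order:
        kinds.setdefault(g.section, "read-only" if g.is_constant else "writable")
    return kinds


def find_tls_conflicts(globals_in_order):
    """Return sorted names of sections mixing thread_local and ordinary globals."""
    tls_flags = defaultdict(set)
    for g in globals_in_order:
        tls_flags[g.section].add(g.is_thread_local)
    return sorted(name for name, flags in tls_flags.items() if len(flags) > 1)


module = [
    Global("a", ".foo", is_constant=True, is_thread_local=True),
    Global("b", ".foo"),  # mutable and not thread_local, same section
]
print(infer_section_kinds(module))  # {'.foo': 'read-only'}
print(find_tls_conflicts(module))   # ['.foo']
```

Reversing the order of the two globals flips the inferred kind to "writable", which is the order dependence the proposal wants to remove by declaring the attributes on the section itself.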
On 11 March 2014 03:44, David Majnemer <david.majnemer at gmail.com> wrote:
> Hi all,
>
> I'd like to propose that LLVM IR have a mechanism to describe sections in a
> more explicit way than we can today.
>
> Currently, we provide an attribute called "section" on GlobalVariables and
> Functions. This attribute will choose which section the Value will end up
> in. However, it does not describe the attributes of the section.
> Without a way of describing the section, we try to infer the section's
> attributes from the first Value from that section that MC comes across.
>
> This means that if the first value is constant, the rest of the values
> will end up in .rodata even if the intention was for them to be mutable.
>
> Equally problematic is our inability to verify the appropriate use of
> sections. Consider the following:
>
> - One global value is defined to be thread_local and in section ".foo"
> - Another value is *not* defined to be thread_local and in section ".foo"
>
> The IR verifier does not catch this nonsensical arrangement of IR.
>
> Further motivation stems from being able to represent the MS ABI's RTTI
> data:
>
> - A single COMDAT section is created which holds both the RTTI data and the
>   vftable for a type.
> - If there is no RTTI data, the section will start with just the vftable.
> - The entire section is marked with a linkage that tells the linker to
>   pick the largest.
>
> I think LLVM needs new IR to represent these semantics properly.
>
> I propose that we do the following:
>
> - Sections are represented at module scope and have an identifier that
>   starts with '$'

OK until here.

> - Sections have linkage; all Values inside of a section must agree with the
>   section's linkage.

This seems way too restrictive and I don't think it maps to what
object files actually do. We should be able to, for example, do

  $foo = section "bar", ..... comdat "zed", select_largest, etc

  @baz = private ... section $foo
  @bah = linkonce_odr alias baz, offset 4

This is fully general, I think. In particular, the above example
represents a section named bar, in a comdat represented by the symbol
zed. The contents of the section are those of @baz and the only visible
symbol in the section is bah, which is at offset 4.

I think we can simply require that every global object in a section
that is a comdat must be isDiscardableIfUnused.

This should also be usable to represent sections that are not comdats,
and those can have any mix of global values as they do now. We just
gain the ability to define more information about the section. In
fact, I would probably suggest getting this in first, to keep the
patches incremental.

> - Sections don't have visibility; Values may disagree with one another about
>   how visible they are.
> - Sections have attributes annotating what semantics they provide (read,
>   write, execute, etc.)

Both good.

> A concrete example of a const variable inside of a read-only section:
>
> $.my_section = appending read
> @my_var = constant float 1.0, section $.my_section, align 4
>
> The following is how I imagine MS RTTI would look if we had this IR
> construct:
>
> $.vdata_for_type = pick_largest read
> @my_rtti_for_type = pick_largest unnamed_addr constant %rtti_ptr_ty @rtti_complete_object_locator, section $.vdata_for_type, align 4
> @vftable_for_type = pick_largest unnamed_addr constant [1 x i8*] [i8* bitcast (void (%struct.S*)* @"\01?fun@S@@UAEXXZ" to i8*)], section $.vdata_for_type, align 4

Part of the problem is that it seems that the order is important. We
really should not require that at the LLVM IR level. The above global
values could be output in any order.

> Attached is a patch to the LangRef.
>
> Thanks for reading!

Thanks for working on this! It is an excellent step towards getting
better comdat support in LLVM!

Cheers,
Rafael
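
The rule suggested in the reply above (every global object in a comdat section must be isDiscardableIfUnused, while non-comdat sections accept any mix) could be checked along these lines. This is a hedged Python sketch rather than verifier code: `comdat_violations` is a hypothetical helper, and the set of linkages treated as discardable is an assumption of the sketch that may not match LLVM's exact definition.

```python
# Assumption: linkages this sketch treats as "discardable if unused".
# The exact set used by LLVM's isDiscardableIfUnused may differ.
DISCARDABLE_IF_UNUSED = {"private", "internal", "linkonce", "linkonce_odr"}


def comdat_violations(section_is_comdat, members):
    """Return names of globals that break the comdat rule.

    members is a list of (name, linkage) pairs for one section.
    Sections that are not comdats accept any mix of linkages.
    """
    if not section_is_comdat:
        return []
    return [name for name, linkage in members
            if linkage not in DISCARDABLE_IF_UNUSED]


# The $foo example: @baz is private, @bah is linkonce_odr.
print(comdat_violations(True, [("baz", "private"), ("bah", "linkonce_odr")]))  # []
print(comdat_violations(True, [("baz", "external")]))   # ['baz']
print(comdat_violations(False, [("baz", "external")]))  # []
```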
On Tue, Mar 11, 2014 at 12:17 PM, Rafael Espíndola <rafael.espindola at gmail.com> wrote:
> > I think LLVM needs new IR to represent these semantics properly.

Cool! This proposal makes a lot of sense to me.

> > - Sections have linkage; all Values inside of a section must agree with
> >   the section's linkage.
>
> This seems way too restrictive and I don't think it maps to what
> object files actually do. We should be able to, for example, do
>
>   $foo = section "bar", ..... comdat "zed", select_largest, etc
>
>   @baz = private ... section $foo
>   @bah = linkonce_odr alias baz, offset 4
>
> This is fully general, I think. In particular, the above example
> represents a section named bar, in a comdat represented by the symbol
> zed. The contents of the section are those of @baz and the only visible
> symbol in the section is bah, which is at offset 4.

I like this proposal. Any reason to use an explicit offset rather than
allow GEPs into aliases?

> > The following is how I imagine MS RTTI would look if we had this IR
> > construct:
> >
> > $.vdata_for_type = pick_largest read
> > @my_rtti_for_type = pick_largest unnamed_addr constant %rtti_ptr_ty @rtti_complete_object_locator, section $.vdata_for_type, align 4
> > @vftable_for_type = pick_largest unnamed_addr constant [1 x i8*] [i8* bitcast (void (%struct.S*)* @"\01?fun@S@@UAEXXZ" to i8*)], section $.vdata_for_type, align 4
>
> Part of the problem is that it seems that the order is important. We
> really should not require that at the LLVM IR level. The above global
> values could be output in any order.

Yeah, let's not rely on order of the IR.

----

+The linkage must be one of ``appending``, ``linkonce_odr``, ``linkonce``.
+
+#. All values inside of a section must have ``external`` linkage if the section has
+   ``appending``.

It's kind of cute to use appending here, but why not just let 'external'
mean the same as 'appending' for sections? Then we can say something simple
like "the linkage of all symbols in a section must be private, internal, or
the linkage of the section."

+#. All values inside of a section must have ``linkonce_odr`` linkage if the
+   section has ``linkonce_odr``.
+
+#. All values inside of a section must have ``linkonce`` linkage if the
+   section has ``linkonce``.

Sounds like we're going to drop some of these restrictions.
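
The simplified wording floated above ("the linkage of all symbols in a section must be private, internal, or the linkage of the section") reduces to a one-line membership test. A minimal Python sketch follows, assuming linkages are plain strings; `linkage_mismatches` is a hypothetical name, not an existing LLVM function.

```python
# Linkages that are always allowed, whatever the section's linkage is.
LOCAL_LINKAGES = {"private", "internal"}


def linkage_mismatches(section_linkage, members):
    """Return names of symbols whose linkage is neither local nor the section's.

    members is a list of (name, linkage) pairs for one section.
    """
    return [name for name, linkage in members
            if linkage not in LOCAL_LINKAGES and linkage != section_linkage]


members = [("baz", "private"), ("bah", "linkonce_odr"), ("bad", "external")]
print(linkage_mismatches("linkonce_odr", members))  # ['bad']
# With 'external' standing in for 'appending' on the section, as suggested:
print(linkage_mismatches("external", [("my_var", "external")]))  # []
```

Under this formulation the three per-linkage rules from the patch collapse into a single check, which is the simplification the reply argues for.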