Petr Hosek via llvm-dev
2021-Feb-11 08:00 UTC
[llvm-dev] Support for zero flag ELF section groups in LLVM IR
D95851 introduces support for zero flag ELF section groups to LLVM. LLVM already supports COMDAT sections, which in ELF are a special type of ELF section group. Zero flag groups are generally useful to enable linker GC where you want a group of sections to always travel together, that is, to be either retained or discarded as a whole, but without the COMDAT semantics. Other ELF assemblers and linkers already support zero flag ELF section groups, and this change helps us reach feature parity.

An open question is how to best represent these in LLVM IR.

We represent COMDAT sections as global variables and other global variables can be included in COMDAT sections; see https://llvm.org/docs/LangRef.html#comdats for details.

We want to capture the fact that COMDAT sections are a special type of ELF section group. We also want to preserve the existing syntax and API, both for backwards compatibility and because other formats like COFF support COMDAT sections but not section groups.

Our proposal is to introduce ELF section groups as a new type of global variable akin to COMDAT sections. We would extend the language by changing:

    [, comdat[($name)]]

when declaring a global variable to:

    [, \(group[($name)] | [group] comdat[($name)]\)]

When it comes to the C++ API, we would introduce Group as a superclass of Comdat:

    class Group {
      StringRef getName() const;
    };
    class Comdat : public Group {
      ...
    };
    class GlobalObject : public GlobalValue {
      ...
      bool hasGroup();
      Group *getGroup();
      void setGroup(Group G);
      // has/get/setComdat functions re-implemented in terms of has/get/setGroup
      ...
    };

Does this make sense? Can anyone think of a better representation?
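To make the proposed C++ API easier to picture, here is a minimal standalone sketch of how the comdat accessors could be layered on top of the group accessors. It does not use LLVM headers: the classes are simplified stand-ins, and the `Kind` tag, the `std::string` name, and the pointer parameter of `setGroup` are assumptions made for illustration only, not part of the proposal or of LLVM's actual API.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for the proposed Group base class (not LLVM's type).
class Group {
public:
  // Assumed discriminator so a Comdat can be told apart from a plain group.
  enum class Kind { ZeroFlag, Comdat };

  explicit Group(std::string N, Kind K = Kind::ZeroFlag)
      : Name(std::move(N)), TheKind(K) {}

  const std::string &getName() const { return Name; }
  Kind getKind() const { return TheKind; }

private:
  std::string Name;
  Kind TheKind;
};

// A Comdat is a Group that additionally carries COMDAT semantics.
class Comdat : public Group {
public:
  explicit Comdat(std::string N) : Group(std::move(N), Kind::Comdat) {}
};

// Simplified stand-in for GlobalObject: one optional group per object.
class GlobalObject {
public:
  bool hasGroup() const { return TheGroup != nullptr; }
  Group *getGroup() const { return TheGroup; }
  void setGroup(Group *G) { TheGroup = G; }

  // The comdat accessors are expressed in terms of the group accessors.
  bool hasComdat() const { return getComdat() != nullptr; }
  Comdat *getComdat() const {
    if (TheGroup && TheGroup->getKind() == Group::Kind::Comdat)
      return static_cast<Comdat *>(TheGroup);
    return nullptr;
  }
  void setComdat(Comdat *C) { setGroup(C); }

private:
  Group *TheGroup = nullptr;
};
```

With this layering, existing callers of the comdat accessors would keep working, while a global placed in a plain group reports `hasGroup()` but not `hasComdat()`.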
Reid Kleckner via llvm-dev
2021-Feb-11 22:04 UTC
[llvm-dev] Support for zero flag ELF section groups in LLVM IR
We are already using LLVM IR comdat groups for the same purpose, linker GC association, on COFF. I think we just need a flag to mark ELF comdat groups as, essentially, not actually being common data that the linker should deduplicate by name, aka a zero flag group.

See how Windows ASan uses comdat groups on internal globals for metadata registration:

    $ cat t.cpp
    int f();
    static int gv = f();
    $ clang -S t.cpp --target=x86_64-windows-msvc -o - -emit-llvm -fsanitize=address
    ...
    $gv = comdat noduplicates
    ...
    @gv = internal global { i32, [60 x i8] } zeroinitializer, comdat, align 32
    ...
    @__asan_global_gv = private global { i64, i64, i64, i64, i64, i64, i64, i64 }
        { i64 ptrtoint ({ i32, [60 x i8] }* @gv to i64), i64 4, i64 64,
          i64 ptrtoint ([3 x i8]* @___asan_gen_.1 to i64),
          i64 ptrtoint ([6 x i8]* @___asan_gen_ to i64), i64 1,
          i64 ptrtoint ({ [6 x i8]*, i32, i32 }* @___asan_gen_.3 to i64), i64 -1 },
        section ".ASAN$GL", comdat($gv), align 64, !associated !0

We are using the "noduplicates" comdat flag here, but @gv has internal linkage, and COFF linkers merge symbols, not section group names, so this code does what we want it to.

Maybe it would make more sense if we used some kind of portable flag, like "internal" or "unique", on the comdat group to indicate that the group doesn't participate in merging. On COFF, we'd have the limitation that this feature only works for comdat groups named after internal linkage globals, but on ELF, the group could have any name.

You could rename the Comdat class to Group or SectionGroup or something, but I'm not sure there's much value in it. The terminology as it is makes sense for COFF, if not for ELF. ELF makes the distinction between comdat section groups and non-comdat section groups, but MSVC and clang-cl use the IMAGE_SCN_LNK_COMDAT section flag and the IMAGE_COMDAT_SELECT_ASSOCIATIVE selection flag to implement these types of groups.

Then, there's the cost of churning the textual IR spellings and method names.
We have the freedom to change these things, but we should acknowledge that it does create work for ourselves and others. IMO, it is worth living with COFF-centric naming of an IR feature to avoid paying these costs. However, I am probably biased, as I have been calling this idea of a group of sections that travel together a "comdat" for a while now.
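The alternative discussed in this reply, a single group concept with a flag saying whether the linker deduplicates by name, can be sketched as a toy model of group resolution. This is only an illustration under stated assumptions: `SectionGroup`, its `Deduplicate` field, and `resolveGroups` are hypothetical names and do not correspond to any LLVM or linker API.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical model of a section group as seen by a linker.
struct SectionGroup {
  std::string Name;       // group signature
  bool Deduplicate;       // true: COMDAT semantics; false: zero flag group
  std::string ObjectFile; // object file that defines the group
};

// Returns the groups that survive name-based resolution: the first COMDAT
// group seen for each name is kept and later ones are dropped, while every
// zero flag group is kept regardless of its name.
std::vector<SectionGroup>
resolveGroups(const std::vector<SectionGroup> &Input) {
  std::set<std::string> SeenComdats;
  std::vector<SectionGroup> Kept;
  for (const SectionGroup &G : Input) {
    if (G.Deduplicate && !SeenComdats.insert(G.Name).second)
      continue; // a COMDAT group with this name was already selected
    Kept.push_back(G);
  }
  return Kept;
}
```

In both cases the members of a kept group still travel together for GC purposes; the flag only controls whether same-named groups from different object files are merged.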