Zachary Turner via llvm-dev
2020-Jul-25 06:32 UTC
[llvm-dev] Any LLD guarantees on section alignment across TUs?
Suppose I write

    // foo.cpp
    __attribute__((section("foo"))) int x;

    // bar.cpp
    __attribute__((section("foo"))) int y;

and I compile and link these two object files together using lld. What assumptions can I make regarding alignment/padding between the two symbols?

I'm comfortable getting an answer by reading the source, but that won't tell me whether any properties I discover are guaranteed or just happenstance.

Are all of the following guaranteed?

A) The relative order of symbols within a TU is not modified by the linker.
B) No padding is inserted by the linker between symbols in a TU aside from that which was already inserted by the compiler/assembler.
C) When merging section A from inputs B and C, the minimal amount of padding necessary so that the first symbol from C is properly aligned is inserted.

I think(?) these conditions would be sufficient to guarantee, for example, that I could implement my own .ctor / @init_array logic using only the front end.

And are there any subtle interactions here with -fdata-sections, or behavioral differences across COFF/ELF/MachO?

Is there anyone who can provide some insight on this?
Fangrui Song via llvm-dev
2020-Jul-25 16:39 UTC
[llvm-dev] Any LLD guarantees on section alignment across TUs?
On 2020-07-24, Zachary Turner via llvm-dev wrote:
>Suppose i write
>
>// foo.cpp
>__attribute__((section("foo"))) int x;
>
>// bar.cpp
>__attribute__((section("foo"))) int y;
>
>And i compile and link these two object files together using lld. What
>assumptions can I make regarding alignment/padding between the two symbols?
>
>I'm comfortable getting an answer by reading the source, but that won't
>tell if any properties i discover are guaranteed or just happenstance.

I am only relatively confident about ELF semantics.

The ELF specification says (http://www.sco.com/developers/gabi/latest/ch4.sheader.html):

    ... When not otherwise constrained, sections should be emitted in input order.

For object files this is clear. For archives without --start-group this can be considered clear as well. For archives with --start-group (LLD implicitly adds --start-group for every archive), the input order is determined by the point at which each member is fetched.

    ld.lld ... foo.a bar.o  # bar.o(foo) may come before foo.a(foo.o(foo))

>Are all of the following guaranteed ?
>A) relative order of symbols within a TU is not modified by the linker

If you have more symbols defined relative to foo, e.g.

    __attribute__((section("foo"))) int x;
    __attribute__((section("foo"))) int z;

Yes, the order of x and z in the input section is their relative order in the output section. The input section is handled as a whole. A linker cannot reorder content within an input section.

>B) No padding is inserted by the linker between symbols in a TU aside from
>that which was already inserted by the compiler/assembler

Yes. The ELF spec says "... it must honor the alignment constraints of the input sections (asserted by the sh_addralign field)".

>C) When merging section A from inputs B and C, the minimal amount of
>padding necessary so that the first symbol from C is properly aligned is
>inserted.

Yes, unless you use a linker script that inserts data between the input sections:

    foo : { foo.o(foo) BYTE(0) bar.o(foo) }

>I think(?) these conditions would be sufficient to guarantee, for example,
>that I could implement my own .ctor / @init_array logic using only the
>front end.

You can implement .ctor, which is of type SHT_PROGBITS.

For .init_array, GCC's __attribute__((section(name))) cannot set the section type to SHT_INIT_ARRAY. GNU as will set SHT_PROGBITS anyway but also emit a warning:

    Warning: setting incorrect section attributes for .init_array

You can use module-level inline asm (.pushsection ... .popsection), though...

>And are there any subtle interactions here with -fdata-sections or
>behavioral differences across COFF/ELF/MachO?
>
>Is there anyone who can provide some insight on this?
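
To make the answer to (C) concrete, here is a small model of the alignment arithmetic: each input section is placed at the next offset that satisfies its sh_addralign, so the padding before it is the minimum needed. This is only a sketch of the rule described above, not LLD's implementation; InputSection, alignTo and assignOffsets are hypothetical names chosen for illustration.

```cpp
// Simplified model of placing same-named input sections in input order.
// Not LLD code; it only illustrates the "minimal padding" arithmetic.
#include <cassert>
#include <cstdint>
#include <vector>

struct InputSection {
  std::uint64_t size;   // corresponds to sh_size
  std::uint64_t align;  // corresponds to sh_addralign (0 or a power of two)
};

// Round offset up to the next multiple of align; 0 and 1 mean "no constraint".
constexpr std::uint64_t alignTo(std::uint64_t offset, std::uint64_t align) {
  return align <= 1 ? offset : (offset + align - 1) & ~(align - 1);
}

// Return the offset of each input section within the output section.
std::vector<std::uint64_t> assignOffsets(const std::vector<InputSection> &inputs) {
  std::vector<std::uint64_t> offsets;
  std::uint64_t cursor = 0;
  for (const InputSection &s : inputs) {
    assert((s.align & (s.align - 1)) == 0 && "alignment must be 0 or a power of two");
    cursor = alignTo(cursor, s.align);  // minimal padding to satisfy alignment
    offsets.push_back(cursor);
    cursor += s.size;                   // contents are copied as a whole
  }
  return offsets;
}
```

For example, a 4-byte, 4-aligned contribution followed by an 8-byte, 8-aligned one leaves 4 bytes of padding between them in this model, so a runtime walker over mixed-alignment records would need to account for such gaps.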
Zachary Turner via llvm-dev
2020-Jul-25 17:21 UTC
[llvm-dev] Any LLD guarantees on section alignment across TUs?
Thanks for your answers!

To clarify about .ctor / init_array: I didn't necessarily mean those exact sections, but rather something like them. Basically, I want to write some data into a section from multiple TUs, and then at runtime parse the ELF/MachO/COFF image via its load address in memory to find my section and iterate its contents. In the simple case the contents will just be a list of function pointers, but in a more complicated case I can imagine some size-prefixed variable-length records.

So I mostly just wanted some confidence that the linker is not allowed to change the final output section in such a way as to break this kind of thing. It sounds like I'm safe though (for ELF at least).

Thank you!
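
The simple case described above (a list of function pointers contributed to one section) can be sketched as follows. Note that this uses a different lookup technique from the one in the message: instead of parsing the loaded image, it relies on the __start_<name> / __stop_<name> symbols that GNU ld and LLD define on ELF for output sections whose names are valid C identifiers. The section name my_hooks, the REGISTER_HOOK macro, and the hook_a, hook_b, hook_count and run_hooks functions are all hypothetical, and the sketch assumes an ELF target built with GCC or Clang.

```cpp
// Sketch: register function pointers in a custom ELF section and iterate them.
// Assumes ELF, GCC or Clang, and a linker that defines __start_/__stop_ symbols.
#include <cassert>
#include <cstddef>
#include <cstdint>

using Hook = int (*)();

// Defined by the linker, not by this program: bounds of output section "my_hooks".
extern "C" {
extern Hook __start_my_hooks[];
extern Hook __stop_my_hooks[];
}

// Hypothetical helper: place one pointer-sized entry into "my_hooks".
// "used" keeps the otherwise unreferenced variable from being discarded.
#define REGISTER_HOOK(fn) \
  __attribute__((section("my_hooks"), used)) static Hook hook_entry_##fn = fn

static int hook_a() { return 1; }
static int hook_b() { return 2; }

REGISTER_HOOK(hook_a);
REGISTER_HOOK(hook_b);

// Number of entries contributed by all translation units linked together.
std::size_t hook_count() {
  return static_cast<std::size_t>(__stop_my_hooks - __start_my_hooks);
}

// Call every registered hook and sum the results.
int run_hooks() {
  // Every entry has pointer size and alignment, so no padding is expected.
  assert(reinterpret_cast<std::uintptr_t>(__start_my_hooks) % alignof(Hook) == 0);
  int sum = 0;
  for (Hook *p = __start_my_hooks; p != __stop_my_hooks; ++p)
    sum += (*p)();
  return sum;
}
```

Because every entry has the same size and alignment, the section reads as a plain array. Size-prefixed variable-length records would need each record's alignment chosen so that contributions from different TUs stay contiguous.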