bd1976 llvm via llvm-dev
2021-Sep-15 02:51 UTC
[llvm-dev] [RFC] Debug sections for hot-patching LLD's ELF output
Hi All,

Sony maintains a downstream patchset to optionally emit additional informational sections to the ELF output file created by LLD. These sections describe LLD's output and the transformations applied during linking. These additional sections are used with the static symbol table (.symtab) to facilitate the operation of hot-patching tools.

Our preferences are that:

- The information required for hot-patching is stored in the ELF output file as ELF sections, as opposed to being emitted into auxiliary files. Otherwise, customers have to adjust their processes to keep the ELF output file and auxiliary files together when packing/moving the ELF output file and ensure they are correctly matched.

- These metadata sections are created by LLD, rather than derived via a post-link procedure. Performance is important, as customers want to be able to enable the emission of hot-patching metadata by default, and having LLD directly emit the required sections is more efficient and a simpler work-flow.

The contents of these sections could be seen as debugging information for the linking process. Certainly, we would want to handle these sections with the same rules that apply to debugging sections when manipulating a linked ELF with binary utility tools. For that reason the sections are all named .debug_lld_* e.g. .debug_lld_linkmap.

Currently, Sony would like to emit the following sections and we believe that they are generally useful:

- A linkmap section that contains a subset of the information contained in a linker -Map file. This section specifies the linked address for each input section.

- A section which specifies the list of wrapped symbols.

- A section that describes the GOT. This provides:
-- A category for each entry, examples: GOT entry, PLTGOT entry, TLS GD entry, LD TLS tls_index structure entry etc.
-- A slot index at which the entry starts.
-- A size for the entry, as GOT entries may take more than one GOT slot (e.g. a TLS GD entry takes two slots).
-- An optional static symbol index to which the GOT entry is associated (some entries e.g. the LD TLS tls_index structure are not associated with a particular symbol).

- A section describing the PLT. This section needs to be somewhat flexible to deal with the many different PLTs that exist on ELF toolchains. However, for a fixed size entry PLT description the section will supply:
-- Which range of bytes comprises the PLT header.
-- The size of a PLT entry.
-- For each PLT entry, the GOT slot index of the associated GOT entry. Combined with the information on GOT entries from the GOT description section this allows for the association of a PLT entry with a symbol.

Similar to DWARF sections these are non-alloc sections. They are encoded as sequences of ULEB128 values. As these are debugging sections, not core ELF sections, a compact representation is justifiable, even if the encoding is more complex.

In order to anchor this discussion I have created https://reviews.llvm.org/D109804 which contains a prototype implementation of the linkmap section referenced above.

I would like to ascertain whether the LLVM community would be supportive of adding the ability to generate such sections to LLD?

Thanks.
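The association of a PLT entry with a symbol, as described in the message above, could look roughly like the following. This is only a minimal sketch working on already-decoded data, not the proposed implementation: the `GotEntry` type, its field names, the category strings and both helper functions are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class GotEntry:
    """One decoded GOT description entry (field names are hypothetical)."""
    category: str                 # illustrative label, e.g. "PLTGOT"
    start_slot: int               # slot index at which the entry starts
    size_slots: int               # number of GOT slots the entry occupies
    symbol_index: Optional[int]   # static symbol table index, if any


def symbol_for_plt_entry(plt_got_slots: List[int],
                         got_entries: List[GotEntry],
                         plt_index: int) -> Optional[int]:
    """Return the .symtab index associated with a PLT entry, or None."""
    slot = plt_got_slots[plt_index]
    for entry in got_entries:
        # A GOT entry may span several slots, so test for containment.
        if entry.start_slot <= slot < entry.start_slot + entry.size_slots:
            return entry.symbol_index
    return None


def plt_entry_offset(header_size: int, entry_size: int, plt_index: int) -> int:
    """Byte offset of a PLT entry in a header + N fixed-size entries layout."""
    return header_size + plt_index * entry_size


# Made-up example data: a 3-slot header followed by two PLTGOT entries.
GOT = [
    GotEntry("header", 0, 3, None),
    GotEntry("PLTGOT", 3, 1, 7),
    GotEntry("PLTGOT", 4, 1, 12),
]
PLT_GOT_SLOTS = [3, 4]  # GOT slot index for PLT entry 0 and PLT entry 1

for i in range(len(PLT_GOT_SLOTS)):
    print(plt_entry_offset(16, 16, i), symbol_for_plt_entry(PLT_GOT_SLOTS, GOT, i))
```

A tool that wants to synthesize a label for a PLT entry would then look up the returned index in .symtab; entries without a symbol (such as the header) yield None.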
Peter Smith via llvm-dev
2021-Sep-16 08:30 UTC
[llvm-dev] [RFC] Debug sections for hot-patching LLD's ELF output
Thanks for sending this out. My initial reaction is that this would be most useful for post linking tools. For human readable output only I expect that we’d be comfortable with existing map file output and a disassembly.

I have a small concern of upstream maintainability without the binary patching tools themselves. For example it may be that all we have is the llvm-readobj/llvm-objdump to textually dump the output. It is possible that we could make modifications with corresponding changes to the text dumps that could break assumptions the binary patching tools are making. I think this is likely to be rare, but I couldn’t rule it out.

While I wouldn’t object as I think the extra debug output is not likely to need a lot of maintenance I think it would be good to get someone actively interested in binary patching or some other post-link tool to comment.

Peter
bd1976 llvm via llvm-dev
2021-Sep-21 01:05 UTC
[llvm-dev] [RFC] Debug sections for hot-patching LLD's ELF output
As mentioned, Sony would like LLD to optionally emit sections that describe the GOT and PLT. The proposed binary format of these sections is as follows:

.debug_lld_got
=============

The .debug_lld_got section contains a GOT description. The GOT description begins with a header composed of the following fields:

length (uleb)
- The length in bytes of the GOT description not including the length field itself.
- This allows for padding to be added to the section, useful for purposes such as slop for incremental linking.
- The value cannot exceed Elf_Off.

version (uleb)
- The version of the description information.
- Currently, 0.
- The value cannot exceed Elf_Word.

The header is then followed by a list of entry descriptions. Each entry description describes the GOT entry with the same index.

Each entry description starts with three ulebs:
- The first uleb gives the number of ulebs used by this description (so that the description can be skipped if the category isn't understood). The value cannot exceed Elf_Word.
- The second uleb gives the number of GOT slots* used by this GOT entry. The value cannot exceed Elf_Word.
- The third uleb encodes the category of the GOT entry. The value cannot exceed Elf_Word.

* Except for GOT_CAT_PADDING entries where this field gives the number of bytes of padding (the value cannot exceed Elf_Off) not the number of GOT slots.

A category encoding can specify multiple associated arguments. Argument interpretation is specified by the encoding. If an encoding requires arguments, the bytes for those follow the bytes for the third uleb in the entry description.

Categories are:

Encoding                            Argument      Size (slots)*       Notes
GOT_CAT_UNKNOWN                     none          1                   Unknown area of the GOT.
GOT_CAT_PADDING                     none          <variable>          Padding between GOT regions. The size field gives the padding size in bytes not the number of GOT slots.
GOT_CAT_GOTPLT_HEADER               none          <target dependent>  The .got.plt header. x86_64 size = 3 slots.
GOT_CAT_GOT                         symbol index  1                   Normal entry for a symbol.
GOT_CAT_PLTGOT                      symbol index  1                   .got.plt entry for a PLT reference to a symbol.
GOT_CAT_IGOTPLT                     symbol index  1                   .igot.plt entry for an ifunc.
GOT_CAT_IGOTCANONICAL               symbol index  1                   GOT entry for canonical PLT entry for non-preemptible ifunc case.
GOT_CAT_TLSDESC                     symbol index  2                   GOT entry for a TLSDESC slot.
GOT_CAT_TLS_GD                      symbol index  2                   GOT entry for a GD TLS reference.
GOT_CAT_TLS_LD                      none          2                   GOT entry for tls_index structure for an LD TLS reference.
GOT_CAT_TLS_IE                      symbol index  1                   GOT entry for an IE TLS reference.
GOT_CAT_PPC64_V2_ABI_TLSLD_GOT_OFF  symbol index  1                   PPC64 specific TLSLD GOT slot.

.debug_lld_plt
=============

The .debug_lld_plt section contains a PLT description. A PLT description begins with a generic header composed of the following 3 ulebs:

length (uleb)
- The length in bytes of this PLT description not including the length field itself.
- This allows for padding to be added to the section, useful for purposes such as slop for incremental linking.
- The value cannot exceed Elf_Off.

version (uleb)
- The version of this description information.
- Currently, 0.
- The value cannot exceed Elf_Word.

type (uleb)
- The type of the PLT being described.
- This affects the interpretation of the remaining description.
- Currently, only PLT_FIXSZ_ENT (value = 0) is defined for describing PLT sections composed of a header and N fixed size entries.
- The value cannot exceed Elf_Word; although, currently as there is only one value specified a smaller representation is sufficient.

PLT_FIXSZ_ENT interpretation

Following the generic header is the PLT_FIXSZ_ENT description header which is composed of the following 2 ulebs:

PLT header size (uleb)
- The size of the PLT header in bytes.
- The value cannot exceed Elf_Off.

PLT entry size (uleb)
- The size of a PLT entry.
- The value cannot exceed Elf_Word.

The header is then followed by a list of entry descriptions.
- Each entry description is a single uleb and describes the PLT entry with the same index.
- The value of the uleb gives the index of the associated GOT entry.
- The value cannot exceed Elf_Off.

In addition to allowing hot-patching tools to work with the GOT and PLT, the information in these sections is of use to any tool that needs to display information on the GOT and PLT sections. For example, debuggers and binary tools synthesize labels of the form <symbol>@plt to label the PLT sections. The information in these sections could be used to simplify such tasks.
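The .debug_lld_got and .debug_lld_plt layouts proposed in this message could be decoded along the following lines. This is a hedged sketch rather than a reference reader, and it rests on several assumptions that the message does not settle: the leading count uleb of a GOT entry description is taken to exclude itself, arguments are taken to follow the category uleb, any bytes beyond `length` are treated as padding, and the numeric category values in the sample (2 and 4) are placeholders because the message gives category names only. All function and variable names are hypothetical.

```python
def read_uleb128(buf, pos):
    """Decode one ULEB128 value from buf at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift   # low 7 bits carry the payload
        if (byte & 0x80) == 0:             # high bit clear marks the last byte
            return result, pos
        shift += 7


def parse_got_description(buf):
    """Return (version, entries); each entry is (size, category, args)."""
    length, pos = read_uleb128(buf, 0)
    end = pos + length                     # bytes past `end` are padding
    version, pos = read_uleb128(buf, pos)
    entries = []
    while pos < end:
        # ASSUMPTION: count covers the ulebs that follow it (size, category, args).
        count, pos = read_uleb128(buf, pos)
        fields = []
        for _ in range(count):
            value, pos = read_uleb128(buf, pos)
            fields.append(value)
        size, category, *args = fields
        entries.append((size, category, args))
    return version, entries


def parse_plt_description(buf):
    """Return (version, header_size, entry_size, got_indices) for PLT_FIXSZ_ENT."""
    length, pos = read_uleb128(buf, 0)
    end = pos + length
    version, pos = read_uleb128(buf, pos)
    plt_type, pos = read_uleb128(buf, pos)
    if plt_type != 0:                      # only PLT_FIXSZ_ENT (0) is defined
        raise ValueError("unsupported PLT description type")
    header_size, pos = read_uleb128(buf, pos)
    entry_size, pos = read_uleb128(buf, pos)
    got_indices = []
    while pos < end:
        index, pos = read_uleb128(buf, pos)
        got_indices.append(index)
    return version, header_size, entry_size, got_indices


def plt_symbols(got_buf, plt_buf):
    """Pair each PLT entry's byte offset with its symbol index (or None)."""
    _, got_entries = parse_got_description(got_buf)
    _, header_size, entry_size, got_indices = parse_plt_description(plt_buf)
    result = []
    for i, got_index in enumerate(got_indices):
        _, _, args = got_entries[got_index]
        symbol = args[0] if args else None
        result.append((header_size + i * entry_size, symbol))
    return result


# Hand-built samples. Category values 2 and 4 are placeholders, not real encodings.
GOT_SAMPLE = bytes([
    0x0D,                          # length: 13 bytes follow
    0x00,                          # version 0
    0x02, 0x03, 0x02,              # header-like entry: 3 slots, no argument
    0x03, 0x01, 0x04, 0x05,        # 1 slot, symbol index 5
    0x03, 0x01, 0x04, 0xAC, 0x02,  # 1 slot, symbol index 300 (two-byte uleb)
    0x00, 0x00,                    # padding beyond `length`
])
PLT_SAMPLE = bytes([0x06, 0x00, 0x00, 0x10, 0x10, 0x01, 0x02])

print(plt_symbols(GOT_SAMPLE, PLT_SAMPLE))
```

Because every entry description carries its own uleb count, a reader like this can step over categories it does not recognise without knowing their argument layout, which is the skipping behaviour the format is designed to allow.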