thr3ads.net - llvm dev - [llvm-dev] [RFC] Debug sections for hot-patching LLD's ELF output [Sep 2021]


David Blaikie via llvm-dev

2021-Sep-21 01:21 UTC

[llvm-dev] [RFC] Debug sections for hot-patching LLD's ELF output

(minor quibble: I'd probably avoid using the ".debug_*" namespace for
things that seem pretty separate from/not a clear extension to DWARF - but
maybe there's precedent for this? Not sure)

On Mon, Sep 20, 2021 at 6:06 PM bd1976 llvm via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> As mentioned, Sony would like LLD to optionally emit sections that
> describe the GOT and PLT.
>
> The proposed binary format of these sections is as follows:
>
> .debug_lld_got
> ==============
>
> The .debug_lld_got section contains a GOT description. The GOT description
> begins with a header composed of the following fields:
>
> length (uleb)
> - The length in bytes of the GOT description not including the length
> field itself.
> - This allows for padding to be added to the section, useful for purposes
> such as slop for incremental linking.
> - The value cannot exceed Elf_Off.
>
> version (uleb)
> - The version of the description information.
> - Currently, 0.
> - The value cannot exceed Elf_Word.
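
A minimal sketch of how a reader might consume this header, assuming the section bytes are already in memory. The helper names (read_uleb128, read_got_header) and the sample bytes are illustrative only and do not come from the RFC or from LLD.

```python
def read_uleb128(data, pos):
    """Decode one ULEB128 value at data[pos]; return (value, next_pos)."""
    value = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return value, pos
        shift += 7


def read_got_header(section):
    """Parse the (length, version) header; names here are hypothetical."""
    length, pos = read_uleb128(section, 0)
    body_start = pos  # the length does not count the length field itself
    version, pos = read_uleb128(section, pos)
    return {
        "length": length,
        "version": version,
        "entries_start": pos,
        "entries_end": body_start + length,
    }


# Hypothetical section: length=3, version=0, two bytes of entry data,
# then four bytes of slop that the length field lets a reader ignore.
section = bytes([0x03, 0x00, 0xAA, 0xBB]) + bytes(4)
header = read_got_header(section)
```

Because the reader stops at entries_end rather than at the end of the section, trailing padding does not need any special marker.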
>
> The header is then followed by a list of entry descriptions.
> Each entry description describes the GOT entry with the same index.
> Each entry description starts with three ulebs:
>
> - The first uleb gives the number of ulebs used by this description (so
> that the description can be skipped if the category isn't understood).
> The value cannot exceed Elf_Word.
> - The second uleb gives the number of GOT slots* used by this GOT entry.
> The value cannot exceed Elf_Word.
> - The third uleb encodes the category of the GOT entry. The value cannot
> exceed Elf_Word.
>
> * Except for GOT_CAT_PADDING entries, where this field gives the number of
> bytes of padding (the value cannot exceed Elf_Off), not the number of GOT
> slots.
>
> A category encoding can specify multiple associated arguments. Argument
> interpretation is specified by the encoding. If an encoding requires
> arguments, the bytes for those follow the bytes for the second uleb in the
> entry description.
>
> Categories are:
>
> Encoding                             Argument *      Size (slots)        Notes
> GOT_CAT_UNKNOWN                      none            1                   Unknown area of the GOT.
> GOT_CAT_PADDING                      none            <variable>          Padding between GOT regions. The size field gives the padding size in bytes, not the number of GOT slots.
> GOT_CAT_GOTPLT_HEADER                none            <target dependent>  The .got.plt header. x86_64 size = 3 slots.
> GOT_CAT_GOT                          symbol index    1                   Normal entry for a symbol.
> GOT_CAT_PLTGOT                       symbol index    1                   .got.plt entry for a PLT reference to a symbol.
> GOT_CAT_IGOTPLT                      symbol index    1                   .igot.plt entry for an ifunc.
> GOT_CAT_IGOTCANONICAL                symbol index    1                   GOT entry for canonical PLT entry for non-preemptible ifunc case.
> GOT_CAT_TLSDESC                      symbol index    2                   GOT entry for a TLSDESC slot.
> GOT_CAT_TLS_GD                       symbol index    2                   GOT entry for a GD TLS reference.
> GOT_CAT_TLS_LD                       none            2                   GOT entry for tls_index structure for an LD TLS reference.
> GOT_CAT_TLS_IE                       symbol index    1                   GOT entry for an IE TLS reference.
> GOT_CAT_PPC64_V2_ABI_TLSLD_GOT_OFF   symbol index    1                   PPC64 specific TLSLD GOT slot.
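
The entry layout and category table can be sketched as a small walker that turns entry descriptions into byte offsets within the GOT. Several things here are assumptions, not part of the RFC: the numeric category values (the RFC names the categories but does not number them), the reading that the leading count covers the ulebs that follow it (size, category, arguments), that arguments come after the category, and an 8-byte slot size.

```python
# Hypothetical numeric values; the RFC text does not assign numbers.
GOT_CAT_PADDING = 1
GOT_CAT_GOT = 3
GOT_CAT_TLS_GD = 8


def read_uleb128(data, pos):
    """Decode one ULEB128 value at data[pos]; return (value, next_pos)."""
    value = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return value, pos
        shift += 7


def walk_got_entries(data, pos, end, slot_size=8):
    """Yield (byte_offset, size_in_bytes, category, args) per entry.

    Assumes the count uleb covers the ulebs after it and is at least 2.
    """
    offset = 0
    while pos < end:
        count, pos = read_uleb128(data, pos)
        fields = []
        for _ in range(count):
            value, pos = read_uleb128(data, pos)
            fields.append(value)
        size, category, args = fields[0], fields[1], fields[2:]
        # Padding sizes are in bytes; every other category is in slots.
        nbytes = size if category == GOT_CAT_PADDING else size * slot_size
        yield offset, nbytes, category, args
        offset += nbytes


# Hypothetical entry data: a normal entry for symbol 5, 16 bytes of
# padding, then a two-slot TLS GD entry for symbol 7.
data = bytes([3, 1, GOT_CAT_GOT, 5,
              2, 16, GOT_CAT_PADDING,
              3, 2, GOT_CAT_TLS_GD, 7])
entries = list(walk_got_entries(data, 0, len(data)))
```

Since every field of an entry is read by count before it is interpreted, an entry whose category is not understood still advances the cursor and the running offset correctly.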
>
> .debug_lld_plt
> ==============
>
> The .debug_lld_plt section contains a PLT description. A PLT description
> begins with a generic header composed of the following 3 ulebs:
>
> length (uleb)
> - The length in bytes of this PLT description not including the length
> field itself.
> - This allows for padding to be added to the section, useful for purposes
> such as slop for incremental linking.
> - The value cannot exceed Elf_Off.
>
> version (uleb)
> - The version of this description information. Currently, 0. The value
> cannot exceed Elf_Word.
>
> type (uleb)
> - The type of the PLT being described.
> - This affects the interpretation of the remaining description.
> - Currently, only PLT_FIXSZ_ENT (value = 0) is defined, for describing PLT
> sections composed of a header and N fixed size entries.
> - The value cannot exceed Elf_Word; although, as only one value is
> currently specified, a smaller representation is sufficient.
>
> PLT_FIXSZ_ENT interpretation
> Following the generic header is the PLT_FIXSZ_ENT description header which
> is composed of the following 2 ulebs:
>
> PLT header size (uleb)
> - The size of the PLT header in bytes.
> - The value cannot exceed Elf_Off.
>
> PLT entry size (uleb)
> - The size of a PLT entry.
> - The value cannot exceed Elf_Word.
>
> The header is then followed by a list of entry descriptions.
> - Each entry description is a single uleb and describes the PLT entry with
> the same index.
> - The value of the uleb gives the index of the associated GOT entry.
> - The value cannot exceed Elf_Off.
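
A minimal sketch of reading a PLT_FIXSZ_ENT description as laid out above: generic header, then the two size fields, then one GOT index per PLT entry. The function name and the sample bytes (a 16-byte header and 16-byte entries, as on x86_64) are illustrative assumptions.

```python
PLT_FIXSZ_ENT = 0  # the only type value the RFC defines


def read_uleb128(data, pos):
    """Decode one ULEB128 value at data[pos]; return (value, next_pos)."""
    value = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return value, pos
        shift += 7


def parse_plt_description(section):
    """Return (version, [(plt_byte_offset, got_index), ...])."""
    length, pos = read_uleb128(section, 0)
    end = pos + length  # the length does not count the length field itself
    version, pos = read_uleb128(section, pos)
    plt_type, pos = read_uleb128(section, pos)
    if plt_type != PLT_FIXSZ_ENT:
        raise ValueError("unknown PLT description type %d" % plt_type)
    header_size, pos = read_uleb128(section, pos)
    entry_size, pos = read_uleb128(section, pos)
    entries = []
    index = 0
    while pos < end:
        got_index, pos = read_uleb128(section, pos)
        # PLT entry N starts after the header and N fixed-size entries.
        entries.append((header_size + index * entry_size, got_index))
        index += 1
    return version, entries


# Hypothetical section: length=7, version=0, type=0, header size 16,
# entry size 16, GOT indices 3, 4, 5, then two bytes of slop.
section = bytes([7, 0, 0, 16, 16, 3, 4, 5]) + bytes(2)
version, plt_entries = parse_plt_description(section)
```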
>
> In addition to allowing hot-patching tools to work with the GOT and PLT,
> the information in these sections is of use to any tool that needs to
> display information on the GOT and PLT sections. For example, debuggers and
> binary tools synthesize labels of the form <symbol>@plt to label the PLT
> sections. The information in these sections could be used to simplify such
> tasks.
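
As a rough illustration of the labelling use case, the join between the two descriptions and .symtab could look like the following. It assumes the sections have already been decoded into plain Python lists; the function name, the list shapes, and the symbol names are all hypothetical.

```python
def plt_labels(plt_entries, got_symbols, symtab_names):
    """Map PLT byte offsets to '<symbol>@plt' labels.

    plt_entries:  [(plt_byte_offset, got_entry_index), ...]
    got_symbols:  per GOT entry, its symbol index or None (e.g. a header)
    symtab_names: symbol names indexed by static symbol table index
    """
    labels = {}
    for plt_offset, got_index in plt_entries:
        sym_index = got_symbols[got_index]
        if sym_index is not None:
            labels[plt_offset] = symtab_names[sym_index] + "@plt"
    return labels


# Hypothetical decoded data: GOT entry 0 is the .got.plt header (no
# symbol); entries 1 and 2 belong to symbols 1 and 2 in .symtab.
got_symbols = [None, 1, 2]
symtab_names = ["", "puts", "malloc"]
plt_entries = [(16, 1), (32, 2)]
labels = plt_labels(plt_entries, got_symbols, symtab_names)
```

With this data a tool needs no disassembly of the PLT stubs or relocation matching to name each entry.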
>
> On Wed, Sep 15, 2021 at 3:51 AM bd1976 llvm <bd1976llvm at gmail.com> wrote:
>
>> Hi All,
>>
>> Sony maintains a downstream patchset to optionally emit additional
>> informational sections to the ELF output file created by LLD. These
>> sections describe LLD's output and the transformations applied during
>> linking. These additional sections are used with the static symbol
>> table (.symtab) to facilitate the operation of hot-patching tools.
>>
>> Our preferences are that:
>>
>> - The information required for hot-patching is stored in the ELF
>>   output file as ELF sections, as opposed to being emitted into
>>   auxiliary files. Otherwise, customers have to adjust their processes
>>   to keep the ELF output file and auxiliary files together when
>>   packing/moving the ELF output file and ensure they are correctly
>>   matched.
>>
>> - These metadata sections are created by LLD, rather than derived via
>>   a post-link procedure. Performance is important, as customers want
>>   to be able to enable the emission of hot-patching metadata by
>>   default, and having LLD directly emit the required sections is more
>>   efficient and a simpler work-flow.
>>
>> The contents of these sections could be seen as debugging information
>> for the linking process. Certainly, we would want to handle these
>> sections with the same rules that apply to debugging sections when
>> manipulating a linked ELF with binary utility tools. For that reason
>> the sections are all named .debug_lld_* e.g. .debug_lld_linkmap.
>>
>> Currently, Sony would like to emit the following sections and we
>> believe that they are generally useful:
>>
>> - A linkmap section that contains a subset of the information contained
>>   in a linker -Map file. This section specifies the linked address for
>>   each input section.
>>
>> - A section which specifies the list of wrapped symbols.
>>
>> - A section that describes the GOT. This provides:
>> -- A category for each entry, for example: GOT entry, PLTGOT entry, TLS GD
>>    entry, LD TLS tls_index structure entry, etc.
>> -- A slot index at which the entry starts.
>> -- A size for the entry, as GOT entries may take more than one GOT
>>    slot (e.g. a TLS GD entry takes two slots).
>> -- An optional static symbol index to which the GOT entry is associated
>>    (some entries e.g. the LD TLS tls_index structure are not associated
>>    with a particular symbol).
>>
>> - A section describing the PLT. This section needs to be somewhat
>>   flexible to deal with the many different PLTs that exist on ELF
>>   toolchains. However, for a fixed size entry PLT description the
>>   section will supply:
>> -- Which range of bytes comprises the PLT header.
>> -- The size of a PLT entry.
>> -- For each PLT entry, the GOT slot index of the associated GOT entry.
>>    Combined with the information on GOT entries from the GOT
>>    description section this allows for the association of a PLT entry
>>    with a symbol.
>>
>> Similar to DWARF sections, these are non-alloc sections. They are
>> encoded as sequences of ULEB128 values. As these are debugging sections,
>> not core ELF sections, a compact representation is justifiable, even if
>> the encoding is more complex.
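
The compactness trade-off can be illustrated with a minimal ULEB128 encoder. This is the standard LEB128 scheme used by DWARF; the function name and the sample values are illustrative and not taken from the prototype patch.

```python
def encode_uleb128(value):
    """Encode a non-negative integer as ULEB128 bytes."""
    if value < 0:
        raise ValueError("ULEB128 encodes unsigned values only")
    out = bytearray()
    while True:
        byte = value & 0x7F  # low seven bits go into this byte
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


# Three small fields (for example a count, a size, and a category) take
# one byte each, where fixed-width 4-byte words would take twelve bytes.
small_fields = b"".join(encode_uleb128(v) for v in (3, 1, 4))
```

Values below 128 cost a single byte, which is the common case for sizes, categories, and small indices; the cost is that fields can no longer be located by fixed offsets and must be decoded sequentially.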
>>
>> In order to anchor this discussion I have created
>> https://reviews.llvm.org/D109804 which contains a prototype
>> implementation of the linkmap section referenced above.
>>
>> I would like to ascertain whether the LLVM community would be
>> supportive of adding the ability to generate such sections to LLD?
>>
>> Thanks.
>>

Petr Hosek via llvm-dev

2021-Sep-21 01:29 UTC

[llvm-dev] [RFC] Debug sections for hot-patching LLD's ELF output

Related to naming, is there a chance that other linkers might adopt this
feature as well? If so, maybe we should avoid including "lld" in the name
and use a more generic name like .debug_linker_got and .debug_linker_plt?

