thr3ads.net - llvm dev - [llvm-dev] [RFC] Debug sections for hot-patching LLD's ELF output [Sep 2021]


Petr Hosek via llvm-dev

2021-Sep-21 01:29 UTC

[llvm-dev] [RFC] Debug sections for hot-patching LLD's ELF output

Related to naming, is there a chance that other linkers might adopt this
feature as well? If so, maybe we should avoid including "lld" in the name
and use a more generic name like .debug_linker_got and .debug_linker_plt?

On Mon, Sep 20, 2021 at 6:22 PM David Blaikie via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> (minor quibble: I'd probably avoid using the ".debug_*" namespace for
> things that seem pretty separate from/not a clear extension to DWARF - but
> maybe there's precedent for this? Not sure)
>
> On Mon, Sep 20, 2021 at 6:06 PM bd1976 llvm via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> As mentioned Sony would like LLD to optionally emit sections that
>> describe the GOT and PLT.
>>
>> The proposed binary format of these sections is as follows:
>>
>> .debug_lld_got
>> =============>>
>> The .debug_lld_got section contains a GOT description. The GOT
>> description begins with a header composed of the following fields:
>>
>> length (uleb)
>> - The length in bytes of the GOT description not including the length
>> field itself.
>> - This allows for padding to be added to the section, useful for purposes
>> such as slop for incremental linking.
>> - The value cannot exceed Elf_Off.
>>
>> version (uleb)
>> - The version of the description information.
>> - Currently, 0.
>> - The value cannot exceed Elf_Word.
>>
>> The header is then followed by a list of entry descriptions.
>> Each entry description describes the GOT entry with the same index.
>> Each entry description starts with three ulebs:
>>
>> - The first uleb gives the number of ulebs used by this description (so
>> that the description can be skipped if the category isn't understood). The
>> value cannot exceed Elf_Word.
>> - The second uleb gives the number of GOT slots* used by this GOT entry.
>> The value cannot exceed Elf_Word.
>> - The third uleb encodes the category of the GOT entry. The value cannot
>> exceed Elf_Word.
>>
>> * Except for GOT_CAT_PADDING entries where this field gives the number of
>> bytes of padding (the value cannot exceed Elf_Off) not the number of GOT
>> slots.
>>
>> A category encoding can specify multiple associated arguments. Argument
>> interpretation is specified by the encoding. If an encoding requires
>> arguments, the bytes for those follow the bytes for the second uleb in the
>> entry description.
>>
>> Categories are:
>>
>> Encoding                             Argument *      Size (slots)        Notes
>> GOT_CAT_UNKNOWN                      none            1                   Unknown area of the GOT.
>> GOT_CAT_PADDING                      none            <variable>          Padding between GOT regions. The size field gives the padding size in bytes not the number of GOT slots.
>> GOT_CAT_GOTPLT_HEADER                none            <target dependent>  The .got.plt header. x86_64 size = 3 slots.
>> GOT_CAT_GOT                          symbol index    1                   Normal entry for a symbol.
>> GOT_CAT_PLTGOT                       symbol index    1                   .got.plt entry for a PLT reference to a symbol.
>> GOT_CAT_IGOTPLT                      symbol index    1                   .igot.plt entry for an ifunc.
>> GOT_CAT_IGOTCANONICAL                symbol index    1                   GOT entry for canonical PLT entry for non-preemptible ifunc case.
>> GOT_CAT_TLSDESC                      symbol index    2                   GOT entry for a TLSDESC slot.
>> GOT_CAT_TLS_GD                       symbol index    2                   GOT entry for a GD TLS reference.
>> GOT_CAT_TLS_LD                       none            2                   GOT entry for tls_index structure for an LD TLS reference.
>> GOT_CAT_TLS_IE                       symbol index    1                   GOT entry for an IE TLS reference.
>> GOT_CAT_PPC64_V2_ABI_TLSLD_GOT_OFF   symbol index    1                   PPC64 specific TLSLD GOT slot.
>>
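The GOT description layout above could be read with something like the following minimal Python sketch. Several things here are assumptions, not part of the proposal: the names read_uleb128, GotEntry and parse_got_description are hypothetical; the leading count is assumed to cover the ulebs that follow it (size, category, then any arguments, in that order); and category codes are kept as raw integers because the thread does not give their numeric values.

```python
from typing import List, NamedTuple, Tuple


def read_uleb128(data: bytes, pos: int) -> Tuple[int, int]:
    """Decode one ULEB128 value at pos; return (value, next position)."""
    value = 0
    shift = 0
    while True:
        if pos >= len(data):
            raise ValueError("truncated ULEB128 value")
        byte = data[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear: last byte of this value
            return value, pos
        shift += 7


class GotEntry(NamedTuple):
    size: int              # GOT slots, or bytes for a padding entry
    category: int          # raw numeric category code (values assumed)
    args: Tuple[int, ...]  # e.g. a symbol index, if the category has one


def parse_got_description(section: bytes) -> List[GotEntry]:
    """Parse a version 0 GOT description (layout assumptions noted above)."""
    length, pos = read_uleb128(section, 0)
    body = section[pos:pos + length]  # bytes past the body are padding
    if len(body) != length:
        raise ValueError("length field exceeds section size")
    version, p = read_uleb128(body, 0)
    if version != 0:
        raise ValueError("unsupported version %d" % version)
    entries = []
    while p < len(body):
        count, p = read_uleb128(body, p)
        if count < 2:
            raise ValueError("entry needs at least size and category")
        fields = []
        for _ in range(count):  # assumed: count excludes the count itself
            value, p = read_uleb128(body, p)
            fields.append(value)
        entries.append(GotEntry(fields[0], fields[1], tuple(fields[2:])))
    return entries


# Hand-built sample: length 8, version 0, two entries, two padding bytes.
# The category codes 3 and 9 are made up for illustration.
sample = bytes([8, 0, 3, 1, 3, 7, 2, 2, 9, 0, 0])
got_entries = parse_got_description(sample)
```

Because each entry carries its own uleb count, a reader that does not recognise a category can still step over the entry, which is the skipping behaviour the proposal describes.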
>> .debug_lld_plt
>> =============>>
>> The .debug_lld_plt section contains a PLT description. A PLT description
>> begins with a generic header composed of the following 3 ulebs:
>>
>> length (uleb)
>> - The length in bytes of this PLT description not including the length
>> field itself.
>> - This allows for padding to be added to the section, useful for purposes
>> such as slop for incremental linking.
>> - The value cannot exceed Elf_Off.
>>
>> version (uleb)
>> - The version of this description information. Currently, 0. The value
>> cannot exceed Elf_Word.
>>
>> type (uleb)
>> - The type of the PLT being described.
>> - This affects the interpretation of the remaining description.
>> - Currently, only PLT_FIXSZ_ENT (value = 0) is defined for describing PLT
>> sections composed of a header and N fixed size entries.
>> - The value cannot exceed Elf_Word; although, currently as there is only
>> one value specified a smaller representation is sufficient.
>>
>> PLT_FIXSZ_ENT interpretation
>> Following the generic header is the PLT_FIXSZ_ENT description header
>> which is composed of the following 2 ulebs:
>>
>> PLT header size (uleb)
>> - The size of the PLT header in bytes.
>> - The value cannot exceed Elf_Off.
>>
>> PLT entry size (uleb)
>> - The size of a PLT entry.
>> - The value cannot exceed Elf_Word.
>>
>> The header is then followed by a list of entry descriptions.
>> - Each entry description is a single uleb and describes the PLT entry
>> with the same index.
>> - The value of the uleb gives the index of the associated GOT entry.
>> - The value cannot exceed Elf_Off.
>>
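A PLT_FIXSZ_ENT description as laid out above might be read along these lines. This is only a sketch: the function and type names are hypothetical, and it assumes the per-entry ulebs run to the end of the region covered by the length field, with anything after that region treated as padding.

```python
from typing import NamedTuple, Tuple

PLT_FIXSZ_ENT = 0  # the only PLT type defined in the proposal


def read_uleb128(data: bytes, pos: int) -> Tuple[int, int]:
    """Decode one ULEB128 value at pos; return (value, next position)."""
    value = 0
    shift = 0
    while True:
        if pos >= len(data):
            raise ValueError("truncated ULEB128 value")
        byte = data[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear: last byte of this value
            return value, pos
        shift += 7


class PltDescription(NamedTuple):
    header_size: int               # size of the PLT header in bytes
    entry_size: int                # size of each PLT entry in bytes
    got_indices: Tuple[int, ...]   # GOT entry index for PLT entry i


def parse_plt_description(section: bytes) -> PltDescription:
    """Parse a version 0, PLT_FIXSZ_ENT description."""
    length, pos = read_uleb128(section, 0)
    body = section[pos:pos + length]  # bytes past the body are padding
    if len(body) != length:
        raise ValueError("length field exceeds section size")
    version, p = read_uleb128(body, 0)
    plt_type, p = read_uleb128(body, p)
    if version != 0 or plt_type != PLT_FIXSZ_ENT:
        raise ValueError("unsupported version or PLT type")
    header_size, p = read_uleb128(body, p)
    entry_size, p = read_uleb128(body, p)
    got_indices = []
    while p < len(body):  # one uleb per PLT entry, in PLT order
        index, p = read_uleb128(body, p)
        got_indices.append(index)
    return PltDescription(header_size, entry_size, tuple(got_indices))


# Hand-built sample: length 7, version 0, type 0, 16-byte header,
# 16-byte entries, three PLT entries, one trailing padding byte.
sample = bytes([7, 0, 0, 16, 16, 3, 4, 5, 0])
plt = parse_plt_description(sample)
```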
>> In addition to allowing hot-patching tools to work with the GOT and PLT
>> the information in these sections is of use to any tool that needs to
>> display information on the GOT and PLT sections. For example, debuggers and
>> binary tools synthesize labels of the form <symbol>@plt to label the PLT
>> sections. The information in these sections could be used to simplify such
>> tasks.
>>
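As a rough illustration of the <symbol>@plt use case, the sketch below chains the two descriptions together: PLT entry to GOT entry index, GOT entry to symbol index, symbol index to name. The function name and the already-decoded inputs are hypothetical; in a real tool they would come from the two proposed sections and from .symtab.

```python
from typing import List, Optional, Sequence


def plt_labels(plt_got_indices: Sequence[int],
               got_symbol_indices: Sequence[Optional[int]],
               symbol_names: Sequence[str]) -> List[Optional[str]]:
    """Return a '<symbol>@plt' label for each PLT entry, in PLT order.

    plt_got_indices[i]    GOT entry index recorded for PLT entry i
    got_symbol_indices[j] symbol index for GOT entry j, or None if that
                          entry has no symbol (e.g. the .got.plt header)
    symbol_names[k]       name of static symbol k
    """
    labels = []
    for got_index in plt_got_indices:
        symbol_index = got_symbol_indices[got_index]
        if symbol_index is None:
            labels.append(None)  # no symbol, so no label can be made
        else:
            labels.append(symbol_names[symbol_index] + "@plt")
    return labels


# Made-up example data: GOT entry 0 is a header with no symbol.
names = ["", "puts", "malloc"]
got_symbols = [None, 1, 2]
labels = plt_labels([1, 2], got_symbols, names)
```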
>> On Wed, Sep 15, 2021 at 3:51 AM bd1976 llvm <bd1976llvm at gmail.com> wrote:
>>
>>> Hi All,
>>>
>>> Sony maintains a downstream patchset to optionally emit additional
>>> informational sections to the ELF output file created by LLD. These
>>> sections describe LLD's output and the transformations applied during
>>> linking. These additional sections are used with the static symbol
>>> table (.symtab) to facilitate the operation of hot-patching tools.
>>>
>>> Our preferences are that:
>>>
>>> - The information required for hot-patching is stored in the ELF
>>>   output file as ELF sections, as opposed to being emitted into
>>>   auxiliary files. Otherwise, customers have to adjust their processes
>>>   to keep the ELF output file and auxiliary files together when
>>>   packing/moving the ELF output file and ensure they are correctly
>>>   matched.
>>>
>>> - These metadata sections are created by LLD, rather than derived via
>>>   a post-link procedure. Performance is important, as customers want
>>>   to be able to enable the emission of hot-patching metadata by
>>>   default, and having LLD directly emit the required sections is more
>>>   efficient and a simpler work-flow.
>>>
>>> The contents of these sections could be seen as debugging information
>>> for the linking process. Certainly, we would want to handle these
>>> sections with the same rules that apply to debugging sections when
>>> manipulating a linked ELF with binary utility tools. For that reason
>>> the sections are all named .debug_lld_* e.g. .debug_lld_linkmap.
>>>
>>> Currently, Sony would like to emit the following sections and we
>>> believe that they are generally useful:
>>>
>>> - A linkmap section that contains a subset of the information contained
>>>   in a linker -Map file. This section specifies the linked address for
>>>   each input section.
>>>
>>> - A section which specifies the list of wrapped symbols.
>>>
>>> - A section that describes the GOT. This provides:
>>> -- A category for each entry, examples: GOT entry, PLTGOT entry, TLS GD
>>>    entry, LD TLS tls_index structure entry etc.
>>> -- A slot index at which the entry starts.
>>> -- A size for the entry, as GOT entries may take more than one GOT
>>>    slot (e.g. a TLS GD entry takes two slots).
>>> -- An optional static symbol index to which the GOT entry is associated
>>>    (some entries e.g. the LD TLS tls_index structure are not associated
>>>    with a particular symbol).
>>>
>>> - A section describing the PLT. This section needs to be somewhat
>>>   flexible to deal with the many different PLTs that exist on ELF
>>>   toolchains. However, for a fixed size entry PLT description the section
>>>   will supply:
>>> -- Which range of bytes comprises the PLT header.
>>> -- The size of a PLT entry.
>>> -- For each PLT entry, the GOT slot index of the associated GOT entry.
>>>    Combined with the information on GOT entries from the GOT description
>>>    section this allows for the association of a PLT entry with a symbol.
>>>
>>> Similar to DWARF sections these are non-alloc sections. They are encoded
>>> as sequences of ULEB128 values. As these are debugging sections, not core
>>> ELF sections, a compact representation is justifiable, even if the
>>> encoding is more complex.
>>>
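For readers unfamiliar with ULEB128, the following sketch shows the standard encoding: seven payload bits per byte, least significant group first, with the high bit set on every byte except the last. The function names are illustrative and not taken from the patch.

```python
from typing import Tuple


def encode_uleb128(value: int) -> bytes:
    """Encode a non-negative integer as ULEB128."""
    if value < 0:
        raise ValueError("ULEB128 encodes unsigned values only")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)         # final byte: high bit clear
            return bytes(out)


def decode_uleb128(data: bytes, pos: int = 0) -> Tuple[int, int]:
    """Decode one ULEB128 value at pos; return (value, next position)."""
    value = 0
    shift = 0
    while True:
        if pos >= len(data):
            raise ValueError("truncated ULEB128 value")
        byte = data[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, pos
        shift += 7


small = encode_uleb128(5)       # one byte
large = encode_uleb128(624485)  # three bytes
```

Small values such as symbol indices or slot counts usually fit in one or two bytes, compared with the four bytes of a fixed-width Elf_Word, which is the compactness argument made above. The cost is that fields cannot be located by offset and must be decoded in sequence.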
>>> In order to anchor this discussion I have created
>>> https://reviews.llvm.org/D109804
>>> which contains a prototype implementation of the linkmap section
>>> referenced above.
>>>
>>> I would like to ascertain whether the LLVM community would be
>>> supportive of adding the ability to generate such sections to LLD?
>>>
>>> Thanks.
>>>
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>

David Blaikie via llvm-dev

2021-Sep-21 01:34 UTC


[llvm-dev] [RFC] Debug sections for hot-patching LLD's ELF output

On Mon, Sep 20, 2021 at 6:29 PM Petr Hosek <phosek at google.com> wrote:
> Related to naming, is there a chance that other linkers might adopt this
> feature as well? If so, maybe we should avoid including "lld" in the name
> and use a more generic name like .debug_linker_got and .debug_linker_plt?
>
Yeah, mixed feelings - using lld/llvm/something ensures we don't collide
with someone else's ideas, but may reduce the possibility of uptake
elsewhere. I'd usually err on a non-colliding name at first, and generalize
if there's interest, but it's possible the non-colliding name just
encourages other people to go make their own thing before anyone has a
chance to generalize it.

