thr3ads.net - llvm dev - [llvm-dev] [LLD] Support DWARF64, debug

Fangrui Song via llvm-dev

2020-Nov-12 02:10 UTC

[llvm-dev] [LLD] Support DWARF64, debug_info "sorting"

On 2020-11-12, Alexander Yermolovich wrote:
>Thanks for the feedback.
>
>I agree that with a patch and numbers this will be a more concrete
>discussion, but I wanted to gauge overall receptiveness to this approach and
>see whether there was a better way.
>
>"Whilst the majority of objects will only have a single CU in them,
>there will be exceptions (LTO-generated objects, -r merged objects etc), so we
>do need to consider this approach."
>David, can you elaborate under which conditions LTO-generated objects will
>have a mix of DWARF32/64 in the same .debug_info? Looking at how DWARF64 was
>implemented, the same flag will be used for the entirety of the DWARF output,
>even if multiple CUs are included.
>
>I think if an object does have a mix of CUs that are 32/64, the linker can do
>a best-effort ordering and output a warning. My approach is to cover the
>common cases while solving the problem of relocation overflow in large
>libraries/binaries.
>
>
>@Fangrui Song <maskray at google.com>
>That's a good point about relocations. Although is it always guaranteed that
>the first one will be representative of the entire relocation record?
>For debug_info, even with DWARF32 there can be 64-bit relocations.
>0000000000000c57  0000001800000001 R_X86_64_64            0000000000000000 .text._"some_mangeled_name" + 0

It may be weaker than "guaranteed", but it works in practice.

Let's look at sections that reference these large .debug_* sections
(.debug_info, .debug_str, .debug_loclists, .debug_rnglists, ...):

* .debug_info: the first relocation references .debug_abbrev, good indicator
* .debug_names references .debug_info: the first relocation (CU offset) is a
good indicator
* .debug_aranges references .debug_info: the first relocation
(debug_info_offset) is a good indicator
* .debug_str_offsets references .debug_str: the first relocation (.debug_str
offset) is a good indicator
* ...

So checking the first relocation is probably sufficient. Even if we miss
something, we can adjust the heuristic, or rather let the compiler generate an
artificial relocation (R_*_NONE), which will always work.
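
As an illustration of the heuristic described above, here is a minimal Python sketch (not LLD code). It assumes x86-64, where the first relocation in a DWARF32 .debug_info is a 4-byte R_X86_64_32 against .debug_abbrev and in DWARF64 it is an 8-byte R_X86_64_64. The `DebugSection` class and the `is_dwarf64` and `order_inputs` helpers are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass
from typing import List

# x86-64 relocation type numbers from the psABI.
R_X86_64_64 = 1
R_X86_64_32 = 10


@dataclass
class DebugSection:
    """Hypothetical stand-in for one input .debug_* section."""
    name: str
    reloc_types: List[int]  # relocation types, in offset order


def is_dwarf64(sec: DebugSection) -> bool:
    """Guess the DWARF format by looking at the first relocation only."""
    if not sec.reloc_types:
        return False  # assumption: no relocations means treat as DWARF32
    return sec.reloc_types[0] == R_X86_64_64


def order_inputs(sections: List[DebugSection]) -> List[DebugSection]:
    """Place DWARF32 inputs before DWARF64 ones, otherwise keep input order."""
    # sorted() is stable and False sorts before True.
    return sorted(sections, key=is_dwarf64)


inputs = [
    DebugSection("a.o:.debug_info", [R_X86_64_64, R_X86_64_64]),
    # DWARF32 unit that also has a 64-bit address relocation later on.
    DebugSection("b.o:.debug_info", [R_X86_64_32, R_X86_64_64]),
    DebugSection("c.o:.debug_info", [R_X86_64_32]),
]
ordered = [s.name for s in order_inputs(inputs)]
```

Note that `b.o` is still classified as DWARF32 even though it contains an R_X86_64_64, because only the first relocation is inspected.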

>On one hand, since this is only applicable when DWARF64 is used, a special
>option would be the way to go, although the user will need to be aware of yet
>another LLD option. Maybe the error reported when relocation overflow occurs
>can be modified to mention this option along with -fdebug-types-section.

I forgot to mention another drawback of .debug_* parsing. In the
presence of compressed debugging information, we currently uncompress
.debug_* sections on demand. We usually do so when writing the content of the
output section, which means we can potentially discard the uncompressed
buffers once we have finished processing one output section and moved on
to the next. This trick can potentially reduce peak memory usage.

However, if we do .debug_* parsing (to decide ordering among DWARF32/DWARF64),
we either cache the result (lose the trick) or end up uncompressing twice.
Neither is good.
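
To make the trade-off concrete, here is a small Python sketch of the header-parsing alternative. It relies on the DWARF rule that a 64-bit unit starts with the initial-length escape value 0xffffffff. The little-endian layout, the `CompressedSection` class and its decompression counter are assumptions made for this illustration and do not reflect how LLD stores sections.

```python
import struct
import zlib

DWARF64_ESCAPE = 0xFFFFFFFF  # initial-length escape value defined by DWARF


def is_dwarf64_header(section_bytes: bytes) -> bool:
    """True if the first unit's initial length uses the DWARF64 escape."""
    if len(section_bytes) < 4:
        return False
    (initial_length,) = struct.unpack_from("<I", section_bytes, 0)
    return initial_length == DWARF64_ESCAPE


class CompressedSection:
    """Hypothetical compressed input section that counts decompressions."""

    def __init__(self, payload: bytes):
        self.compressed = zlib.compress(payload)
        self.decompress_count = 0

    def contents(self) -> bytes:
        # Nothing is cached, so every call pays for a full decompression.
        self.decompress_count += 1
        return zlib.decompress(self.compressed)


# Fake DWARF64 unit header: escape value, 64-bit length, then padding.
payload = struct.pack("<IQ", DWARF64_ESCAPE, 100) + bytes(100)
sec = CompressedSection(payload)

fmt_is_64 = is_dwarf64_header(sec.contents())  # pass 1: decide the ordering
output = sec.contents()                        # pass 2: write the output section
```

Without a cache the section is decompressed twice. Caching the buffer between the two passes would avoid that, at the cost of keeping it alive and losing the peak-memory saving.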



I am quite happy with the relocation approach under a linker option. I'd still
want to know the generic-abi folks' thoughts, though. James may have prepared
something he wants to share with generic-abi :) Let's wait...

James Henderson via llvm-dev

2020-Nov-12 10:20 UTC

[llvm-dev] [LLD] Support DWARF64, debug_info "sorting"

On Thu, 12 Nov 2020 at 02:10, Fangrui Song <maskray at google.com> wrote:
> On 2020-11-12, Alexander Yermolovich wrote:
> >Thanks for the feedback.
> >
> >I agree that with a patch and numbers this will be a more concrete
> >discussion, but I wanted to gauge overall receptiveness to this approach
> >and see whether there was a better way.
> >
> >"Whilst the majority of objects will only have a single CU in them, there
> >will be exceptions (LTO-generated objects, -r merged objects etc), so we do
> >need to consider this approach."
> >David, can you elaborate under which conditions LTO-generated objects will
> >have a mix of DWARF32/64 in the same .debug_info? Looking at how DWARF64 was
> >implemented, the same flag will be used for the entirety of the DWARF
> >output, even if multiple CUs are included.

Thinking about it, I wouldn't expect an LTO generated object itself to have
a mixture of DWARF32/64, although I guess the 32/64 bit state could be
encoded in the IR (I am not familiar enough with it to know if it actually
is or not). It might be necessary to find ways to configure LTO to generate
DWARF64, possibly via a link-time option.

> >On one hand, since this is only applicable when DWARF64 is used, a special
> >option would be the way to go, although the user will need to be aware of
> >yet another LLD option. Maybe the error reported when relocation overflow
> >occurs can be modified to mention this option along with
> >-fdebug-types-section.
>
> I am quite happy with the relocation approach under a linker option. I'd
> still want to know the generic-abi folks' thoughts, though. James may have
> prepared something he wants to share with generic-abi :) Let's wait...

I hadn't prepared anything if I'm honest (though if there's widespread
agreement that this would be useful, I certainly can - it would have other
positive improvements too, reducing the need for tools to rely on section
names to identify debug data for example). It was more a case of bouncing
ideas off of people to see what they thought. Any discussion we have will
probably also need circulating on the DWARF mailing list too, since it is
more a DWARF issue than a gABI issue (unless the solution is a new section
type).

Further refinements to this idea that might make it more appealing
to the generic group: `SHT_DEBUG` for the section type name, with the first
N bytes of the sh_info used to specify the variant of debug data it
represents (e.g. 0x1 for DWARF, 0x2 for SOME_OTHER_STANDARD etc), and the
remainder for use as flags as defined by the standard (I'm thinking for
DWARF you could encode the 64-bit/32-bit state in there, possibly the
section variant (info/rnglists/line etc) and the DWARF version too), on the
understanding that consumers like the linker wouldn't combine sections in a
potentially broken way. This has the advantage that it could be retrofitted
to the existing standard versions, but as has been pointed out, this won't
help those with linker scripts - that could only be solved with a new DWARF
standard and separate names for 64/32 bit sections, at least if we wanted
to avoid the linker needing to do anything beyond reading the section
header.
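
One possible reading of the sh_info idea above, sketched in Python. Everything about the layout is an assumption made for illustration: N is taken to be 1 and "first" is read as the low byte of the 32-bit sh_info word, and the flag positions, the kind constants and the helper names are hypothetical. Only the 0x1 value for DWARF comes from the message itself.

```python
DEBUG_VARIANT_DWARF = 0x1  # value suggested in the message above

# Hypothetical layout of the remaining 24 flag bits.
FLAG_DWARF64 = 1 << 0                   # bit 0: 64-bit DWARF format
KIND_SHIFT, KIND_MASK = 1, 0x7F         # bits 1-7: section kind
VERSION_SHIFT, VERSION_MASK = 8, 0xFF   # bits 8-15: DWARF version

# Hypothetical section kind numbers.
KIND_INFO, KIND_RNGLISTS, KIND_LINE = 1, 2, 3


def encode_sh_info(variant: int, is64: bool, kind: int, version: int) -> int:
    """Pack the variant into the low byte and the flags into the upper 24 bits."""
    flags = (FLAG_DWARF64 if is64 else 0)
    flags |= (kind & KIND_MASK) << KIND_SHIFT
    flags |= (version & VERSION_MASK) << VERSION_SHIFT
    return (variant & 0xFF) | (flags << 8)


def decode_sh_info(sh_info: int) -> dict:
    """Inverse of encode_sh_info."""
    flags = sh_info >> 8
    return {
        "variant": sh_info & 0xFF,
        "is64": bool(flags & FLAG_DWARF64),
        "kind": (flags >> KIND_SHIFT) & KIND_MASK,
        "version": (flags >> VERSION_SHIFT) & VERSION_MASK,
    }


# A DWARF64, version 5 .debug_info section under this hypothetical scheme.
value = encode_sh_info(DEBUG_VARIANT_DWARF, True, KIND_INFO, 5)
fields = decode_sh_info(value)
```

With a layout like this, a linker could tell DWARF32 and DWARF64 inputs apart from the section header alone, without looking at section contents or relocations.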

The relocation approach sounds like a reasonable solution for the current
situation - even if we do decide to go the route of changing producers to
start emitting a new section type/update the standard etc, it doesn't
resolve the problem people may currently face.

Alexander Yermolovich via llvm-dev

2020-Nov-13 00:43 UTC

[llvm-dev] [LLD] Support DWARF64, debug_info "sorting"

It looks like there is agreement that this path, modifying lld to order
sections using relocations, should be explored.
If Igor doesn't object, since he has been the primary one driving DWARF64 so
far, I would like to take a shot at implementing it and collecting some
performance numbers. 🙂

Alex


Wenlei He via llvm-dev

2020-Nov-13 16:35 UTC

[llvm-dev] [LLD] Support DWARF64, debug_info "sorting"

> Thinking about it, I wouldn't expect an LTO generated object itself to
> have a mixture of DWARF32/64, although I guess the 32/64 bit state could be
> encoded in the IR (I am not familiar enough with it to know if it actually
> is or not). It might be necessary to find ways to configure LTO to generate
> DWARF64, possibly via a link-time option.

I don’t think we need to encode dwarf32/64 in the IR as an attribute for each
module. We’re not going to emit mixed dwarf32/64 for a merged LTO module
anyway, so allowing each module to express its dwarf setting would only
introduce a burden for LTO to deal with inconsistency (warning?) among input
modules. Having a linker switch to pass the setting from the driver to LTO
sounds better to me.

