thr3ads.net - llvm dev - [llvm-dev] [LLD] Support DWARF64, debug

David Blaikie via llvm-dev

2020-Nov-11 05:41 UTC

[llvm-dev] [LLD] Support DWARF64, debug_info "sorting"

+James for context too (always good to include the folks from the
original threads for continuity)

Yeah, my general attitude there was just twofold: one, that the
discussion had strayed fairly far from the review (so interested
parties might not see it, both because it's a targeted review thread
on the noisy llvm-commits, and because of the title not having much
connection to the discussion), and two, that it seemed to be somewhat
abstract/general - and there's a balance there. "We should do this
because I need it" isn't always a compelling motivation (we shouldn't
be implementing features for especially niche use cases, or ones that
don't generalize), but "we should do this because someone might need
it" isn't either (we shouldn't be implementing features that have no
users).

The major drawback of sorting is the need to parse DWARF, even a
little bit of it: only the first 4 bytes of a section to tell which
version it is, or the first 12 if you want to be able to jump over
contributions and check /all/ contributions coming from a given input
object file (it might contain a combination of DWARFv4 and DWARFv5).
Then there's the hairy uncertainty of which sections to check. Do you
check them all? Well, all the ones with length prefixes that
communicate DWARF32/64 - some sections don't
(debug_ranges/loc/str/macro for instance, if I recall correctly). And
if something has some 4 and 5, does it get sorted to the start? I
guess so.
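
For illustration only, here is a minimal Python sketch of the "first 4 bytes, or first 12" check described above, reading it as a DWARF32/DWARF64 check (the initial-length field encodes the 32/64-bit format, not the DWARF version number). The function name `unit_format` and the little-endian default are assumptions made for the example; this is not LLD code.

```python
import struct

DWARF64_ESCAPE = 0xFFFFFFFF   # initial-length value that marks a DWARF64 unit
RESERVED_START = 0xFFFFFFF0   # 0xfffffff0..0xfffffffe are reserved by the DWARF spec


def unit_format(data: bytes, offset: int = 0, little_endian: bool = True):
    """Classify the unit starting at `offset` and return (format, next_offset).

    Only the initial-length field is inspected: 4 bytes for DWARF32, or the
    4-byte escape plus an 8-byte length (12 bytes in total) for DWARF64.
    """
    prefix = "<" if little_endian else ">"
    (length32,) = struct.unpack_from(prefix + "I", data, offset)
    if length32 == DWARF64_ESCAPE:
        (length64,) = struct.unpack_from(prefix + "Q", data, offset + 4)
        return "DWARF64", offset + 12 + length64
    if length32 >= RESERVED_START:
        raise ValueError(f"reserved initial length {length32:#x} at offset {offset}")
    return "DWARF32", offset + 4 + length32
```

The returned `next_offset` is what makes it possible to jump over a contribution without decoding anything past its length prefix.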

On Tue, Nov 10, 2020 at 9:30 PM Eric Christopher via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
>
>
> On Wed, Nov 11, 2020 at 12:19 AM Alexander Yermolovich via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
>>
>> This year Igor Kudrin put in a lot of work in enabling DWARF64 support
>> in LLVM. At Facebook we are looking into it as one of the options for
>> handling debug information over 4 GiB in a production environment. One
>> concern is that, due to the mix of third-party libraries and
>> LLVM-compiled code, the final library/binary will have a mix of CUs that
>> are DWARF32/64. This is supported by the DWARF format. With this mix it is
>> possible that, even with DWARF64 enabled, one can still encounter
>> relocation overflow errors in LLD if DWARF32 sections happen to be
>> processed towards the end.
>>
>> One proposal that was discussed in https://reviews.llvm.org/D87011 is
>> to modify LLD to arrange debug_info sections so that DWARF32 comes
>> first, and DWARF64 after them. This way, as long as the DWARF32 sections
>> don't themselves go over 4 GiB, the final binary can contain debug
>> information that exceeds 4 GiB, which I think will be the common case.
>>
>> An alternative approach that was proposed by James Henderson is for
>> the build system to take care of it, and to use -u to enforce order.
>
>
> +Fangrui Song here for thread visibility
>
> Of these two approaches I think that the linker sorting is probably the one
> I'd go with for the reasons you list below - I'm particularly
> sympathetic to not wanting the unintended consequences of using -u here :)
>
> I do worry about slowing down general debug links so a "debug info
> sorting" option may make sense, or it may not be worth it after measuring
> the speed difference.
>
> Thanks for bringing this up on the list! :)
>
> -eric
>
>>
>>
>> I would imagine most projects of scale use a configurable build system
>> that pulls in all the various dependencies automatically in a
>> multi-language environment. I think the alternative approach will be more
>> fragile than modifying LLD, as it relies on a more complex system, and each
>> customer of LLD will have to implement this "sorting" in their own
>> build system. The use of -u also kind of abuses this flag, and might have
>> unintended consequences, as was pointed out by Wen Lei.
>>
>> From an overhead perspective, we only need to access a few bytes of DWARF
>> to determine if it's 32 or 64 bits. Customers who need DWARF64 already
>> accept the overhead that it entails.
>>
>> Any thoughts?
>>
>> Thank You
>> Alex
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

James Henderson via llvm-dev

2020-Nov-11 08:55 UTC

[llvm-dev] [LLD] Support DWARF64, debug_info "sorting"

On Wed, 11 Nov 2020 at 05:41, David Blaikie <dblaikie at gmail.com> wrote:
> +James for context too (always good to include the folks from the
> original threads for continuity)
>
> Yeah, my general attitude there was just twofold, one that the
> discussion had strayed fairly far from the review (so interested
> parties might not see it, both because it's a targeted review thread
> on the noisy llvm-commits, and because of the title not having much
> connection to the discussion) and it seemed to be somewhat
> abstract/general - and there's a balance there. "We should do this
> because I need it" (we shouldn't be implementing features for
> especially niche use cases/if they don't generalize) isn't always a
> compelling motivation but "we should do this because someone might
> need it" isn't either (we shouldn't be implementing features that
> have no users).
>
> The major drawback in sorting, is the need to parse DWARF, even a
> little bit of it (only the first 4 bytes of a section to tell which
> version it is - first 12 if you want to be able to jump over
> contributions and check /all/ contributions coming from a given input
> object file (it might contain a combination of DWARFv4 and DWARFv5)
> and then the hairy uncertainty of which sections to check (do you
> check them all? well, all the ones with length prefixes that
> communicate DWARF32/64 - some sections don't
> (debug_ranges/loc/str/macro for instance, if I recall correctly)...
> and if something has some 4 and 5, does it get sorted to the start? I
> guess so.
>

I assume this comment is meant to say DWARF32/DWARF64, not DWARFv4 and
DWARFv5, as the DWARF version (as opposed to the 32/64 bit style) is
irrelevant to this, I believe, at least for the current known DWARF
standards. Whilst the majority of objects will only have a single CU in
them, there will be exceptions (LTO-generated objects, -r merged objects
etc), so we do need to consider this approach. Mixtures would certainly be
possible, and there's no guarantee the CUs would be in a nice order with
32-bit blocks before 64-bit blocks. If I follow this to its full
conclusion, you could potentially end up with a single .debug_info
(.debug_line, .debug_rnglists etc) input section with a mixture of
DWARF32/DWARF64 sub-sections, which, if following the reordering approach,
the linker might have to split up internally in order to rearrange (aside -
there's some interesting crossover with ideas I've been considering
regarding the Fragmented DWARF topic discussed elsewhere).

Maybe the solution here would be to change producers to produce separate
.debug_info sections containing DWARF32 and DWARF64. This would require
other tools, like llvm-dwarfdump, to be updated too to handle multiple
input .debug_info sections.
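
To make the mixed-input case concrete, here is a rough Python sketch (not LLD code) of walking one input section's bytes and listing each contribution with its 32/64-bit format. It assumes a section such as .debug_info where every contribution starts with an initial-length field, and little-endian data by default; `contributions` and `is_mixed` are hypothetical names chosen for the example.

```python
import struct


def contributions(data: bytes, little_endian: bool = True):
    """Return a list of (start, end, format) tuples, one per contribution.

    Only each contribution's initial-length field is read; the body is
    skipped using the length it declares.
    """
    prefix = "<" if little_endian else ">"
    result = []
    offset = 0
    while offset < len(data):
        (length,) = struct.unpack_from(prefix + "I", data, offset)
        if length == 0xFFFFFFFF:
            # DWARF64: the real length is the following 8 bytes.
            (length,) = struct.unpack_from(prefix + "Q", data, offset + 4)
            fmt, header_size = "DWARF64", 12
        elif length >= 0xFFFFFFF0:
            raise ValueError(f"reserved initial length at offset {offset}")
        else:
            fmt, header_size = "DWARF32", 4
        end = offset + header_size + length
        result.append((offset, end, fmt))
        offset = end
    return result


def is_mixed(data: bytes) -> bool:
    """True if the section holds both DWARF32 and DWARF64 contributions."""
    return len({fmt for _, _, fmt in contributions(data)}) > 1
```

The (start, end) pairs are the boundaries a linker would need if it chose to split a mixed input section internally; whether that is worth doing is exactly the open question here.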

I used the -u option more as an example that it might be possible to get
things to work the way we want without needing to have the linker do the
work. The linker currently has a --symbol-ordering-file option which can be
used to request an order for the specified list of symbols. The linker does
this by rearranging the input sections to get as close as it can to the
requested order. We could maybe implement the same on a file/section basis.
It would avoid needing to read the sections themselves, but doesn't solve
the "what to do about mixed single input" case directly (though it might
allow the user to dodge the decision at least).

Other ideas I had involved changing the section header properties.
Currently DWARF sections are all SHT_PROGBITS, but we could change that to
e.g. SHT_DWARF_32 or similar, and/or use the sh_info field to contain a
value that would indicate the 32/64 bit nature. I'm not convinced by these
ideas though, as a) I don't know if it translates well to other non-ELF
formats, and b) we can't really control the producers of DWARF at this
stage to conform.

It would be nice if there was a solution that could be consistently applied
across all build systems, linkers and DWARF producers. I don't have one as
yet though.


David Blaikie via llvm-dev

2020-Nov-11 17:46 UTC

[llvm-dev] [LLD] Support DWARF64, debug_info "sorting"

On Wed, Nov 11, 2020 at 12:55 AM James Henderson
<jh7370.2008 at my.bristol.ac.uk> wrote:
>
>
> On Wed, 11 Nov 2020 at 05:41, David Blaikie <dblaikie at gmail.com> wrote:
>>
>> +James for context too (always good to include the folks from the
>> original threads for continuity)
>>
>> Yeah, my general attitude there was just twofold, one that the
>> discussion had strayed fairly far from the review (so interested
>> parties might not see it, both because it's a targeted review thread
>> on the noisy llvm-commits, and because of the title not having much
>> connection to the discussion) and it seemed to be somewhat
>> abstract/general - and there's a balance there. "We should do this
>> because I need it" (we shouldn't be implementing features for
>> especially niche use cases/if they don't generalize) isn't always a
>> compelling motivation but "we should do this because someone might
>> need it" isn't either (we shouldn't be implementing features that
>> have no users).
>>
>> The major drawback in sorting, is the need to parse DWARF, even a
>> little bit of it (only the first 4 bytes of a section to tell which
>> version it is - first 12 if you want to be able to jump over
>> contributions and check /all/ contributions coming from a given input
>> object file (it might contain a combination of DWARFv4 and DWARFv5)
>> and then the hairy uncertainty of which sections to check (do you
>> check them all? well, all the ones with length prefixes that
>> communicate DWARF32/64 - some sections don't
>> (debug_ranges/loc/str/macro for instance, if I recall correctly)...
>> and if something has some 4 and 5, does it get sorted to the start? I
>> guess so.
>>
> I assume this comment is meant to say DWARF32/DWARF64, not DWARFv4 and
> DWARFv5, as the DWARF version (as opposed to the 32/64 bit style) is
> irrelevant to this, I believe, at least for the current known DWARF
> standards.

Yep! Thanks for the correction - had a lot of DWARFv4/v5 on my mind
due to other work, so got the terms jumbled up.

> Whilst the majority of objects will only have a single CU in them, there
> will be exceptions (LTO-generated objects, -r merged objects etc), so we do
> need to consider this approach. Mixtures would certainly be possible, and
> there's no guarantee the CUs would be in a nice order with 32-bit blocks
> before 64-bit blocks. If I follow this to its full conclusion, you could
> potentially end up with a single .debug_info (.debug_line, .debug_rnglists
> etc) input section with a mixture of DWARF32/DWARF64 sub-sections, which,
> if following the reordering approach, the linker might have to split up
> internally in order to rearrange (aside - there's some interesting
> crossover with ideas I've been considering regarding the Fragmented DWARF
> topic discussed elsewhere).

I think given this is a pragmatic feature I'd be inclined to say "eh,
sort any input object containing at least one DWARF32 contribution
before input objects not containing any DWARF32 contribution" - if that
doesn't solve some real world issues/situations, I'd be willing to
revisit this direction/consider more invasive/expensive solutions.
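
As a rough sketch of that rule (a simplified model, not a patch), the Python below does a stable sort of input objects so that any object with at least one DWARF32 contribution comes first, then checks whether every such object ends below the 4 GiB boundary that 32-bit offsets can reach. `InputObject`, `order_inputs` and `dwarf32_fits` are hypothetical names, and treating "the object's debug_info ends below 4 GiB" as the overflow condition is an assumption for illustration; real relocation overflow depends on the individual relocations involved.

```python
from dataclasses import dataclass

LIMIT_32 = 1 << 32  # first offset that no longer fits in 32 bits


@dataclass
class InputObject:
    name: str
    debug_info_size: int   # bytes this object contributes to .debug_info
    has_dwarf32: bool      # True if at least one contribution is DWARF32


def order_inputs(objs):
    """Stable sort: objects with any DWARF32 contribution first.

    sorted() is stable and False sorts before True, so relative input
    order is preserved within each group.
    """
    return sorted(objs, key=lambda o: not o.has_dwarf32)


def dwarf32_fits(objs):
    """Check, in layout order, that every DWARF32-bearing object ends
    at or below the 4 GiB limit (conservative, simplified model)."""
    offset = 0
    for o in objs:
        end = offset + o.debug_info_size
        if o.has_dwarf32 and end > LIMIT_32:
            return False
        offset = end
    return True


GIB = 1 << 30
inputs = [
    InputObject("a.o", 3 * GIB, has_dwarf32=False),
    InputObject("b.o", 2 * GIB, has_dwarf32=True),
    InputObject("c.o", 1 * GIB, has_dwarf32=True),
]
ordered = order_inputs(inputs)
```

In this made-up example the original order places b.o past the 4 GiB mark, while the sorted order keeps both DWARF32 objects within the first 3 GiB even though the total is 6 GiB.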

Though, as Eric said - some of this conversation might be better had
in terms of concrete patches with concrete performance measurements.

> Maybe the solution here would be to change producers to produce separate
> .debug_info sections containing DWARF32 and DWARF64.

That'd involve changing how certain objects were generated - if that's
possible, then I assume it'd be possible to change that generation to
use DWARF64 anyway. In the limit, one might have precompiled binaries
with debug info that one cannot recompile, so I doubt any new format
options are able to address the original/likely use case for this
functionality.

> I used the -u option more as an example that it might be possible to get
> things to work the way we want without needing to have the linker do the
> work. The linker currently has a --symbol-ordering-file option which can be
> used to request an order for the specified list of symbols. The linker does
> this by rearranging the input sections to get as close as it can to the
> requested order. We could maybe implement the same on a file/section basis.
> It would avoid needing to read the sections themselves, but doesn't solve
> the "what to do about mixed single input" case directly (though might allow
> the user to dodge the decision at least).
>
> Other ideas I had involved changing the section header properties.
> Currently DWARF sections are all SHT_PROGBITS, but we could change that to
> e.g. SHT_DWARF_32 or similar, and/or use the sh_info field to contain a
> value that would indicate the 32/64 bit nature. I'm not convinced by these
> ideas though, as a) I don't know if it translates well to other non-ELF
> formats, and b) we can't really control the producers of DWARF at this
> stage to conform.
>
> It would be nice if there was a solution that could be consistently applied
> across all build systems, linkers and DWARF producers. I don't have one as
> yet though.
