thr3ads.net - llvm dev - [llvm-dev] unified debug information despite function/data sections flags [Sep 2021]

via llvm-dev

2021-Sep-30 13:30 UTC

[llvm-dev] unified debug information despite function/data sections flags

I agree with James about using `-fstack-size-section` to get static stack size
information.  Deriving that info from DWARF seems like a lot of work; I imagine
you’d have to parse all of the locations within a function, looking for frame
offsets.  Even then the result would be incomplete because it would describe
only the stack slots used by declared variables.  Temporaries and even spill
slots probably would not be accounted for.

Regarding partitioning DWARF, just for completeness I’ll say that we did also
(at least briefly) look at using DWARF partial-units, but the size overhead
seemed like it would not be a net win.
--paulr

From: llvm-dev <llvm-dev-bounces at lists.llvm.org> On Behalf Of James
Henderson via llvm-dev
Sent: Thursday, September 30, 2021 3:44 AM
To: David Blaikie <dblaikie at gmail.com>
Cc: llvm-dev at lists.llvm.org; Youssefi, Anna <a-youssefi at ti.com>
Subject: Re: [llvm-dev] unified debug information despite function/data sections
flags

Yep, I took a look at this last year/early this year, but never really came up
with a fully functioning prototype that was actually efficient enough, and have
since switched teams, so haven't had the time to work on it further.

You can see my lightning talk from last year on the topic here:
https://www.youtube.com/watch?v=0y6TlfFhCsU
and a mailing thread where I discussed it further here:
https://lists.llvm.org/pipermail/llvm-dev/2020-November/146469.html
The main issue I ran into was the number of hard-coded relative references
within DWARF. Every single one of these needs to be updated at link time, if any
of the data is dropped, or the DWARF will end up invalid. To do this, I had to
add relocations to the DWARF which patched the relevant fields at link time,
based on the final computed offset, but this had a serious performance cost in
the linker (not to mention any potential cost in the assembler). This approach
is certainly possible for the most part, at least for .debug_line and
.debug_info (it's not necessarily clear whether it can be done with some of
the other DWARF sections, although the benefits in most of them aren't
particularly clear), but the difficulty is getting it to be fast.
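
To illustrate why every relative reference has to be patched once data is
dropped, here is a minimal Python sketch. It is not the prototype described
above; the function name and the (start, end) range representation are
hypothetical and chosen only for illustration. It maps an offset in the
original section to the offset it would have after some byte ranges are removed.

```python
def remap_offset(offset, removed):
    """Map an offset in the original section to its post-removal offset.

    `removed` is a sorted list of non-overlapping, half-open (start, end)
    byte ranges that were dropped from the section. Returns None when the
    offset itself pointed into dropped data.
    """
    shift = 0
    for start, end in removed:
        if offset < start:
            break              # later ranges cannot affect this offset
        if offset < end:
            return None        # the referenced data no longer exists
        shift += end - start   # everything after this range moves down
    return offset - shift


if __name__ == "__main__":
    dropped = [(0x10, 0x30), (0x50, 0x58)]
    for off in (0x08, 0x20, 0x40, 0x60):
        print(hex(off), "->", remap_offset(off, dropped))
```

A real implementation would have to apply this kind of adjustment to every
offset-bearing field in the section, which is where the link-time cost
comes from.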

I'd be happy to discuss this further, and provide any feedback on other
ideas, if you have any, but I have no plans to continue this work myself at
this time.

By the way, if you are using the DWARF for stack usage analysis, have you
considered the .stack_sizes section? Compiling with -fstack-size-section emits
a section that contains the stack size of every function in the output, and it
can be dumped using llvm-readobj. It is split up so that the linker can strip
bits that reference dead data, so you should only end up with the actually
useful information in the output.
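
As a rough illustration of what such a dump involves, the Python sketch below
decodes the raw bytes of a .stack_sizes section, assuming the documented
layout of repeated pairs: a pointer-sized function address followed by an
unsigned LEB128 stack size. The function names, the default address size, and
the sample bytes are assumptions made for this example; extracting the section
bytes from the ELF file is left out. In practice `llvm-readobj --stack-sizes`
does this for you.

```python
import struct


def read_uleb128(data, pos):
    """Decode one unsigned LEB128 value; return (value, next_position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:      # high bit clear marks the last byte
            return result, pos
        shift += 7


def parse_stack_sizes(data, address_size=8, little_endian=True):
    """Return a list of (function_address, stack_size) pairs."""
    fmt = ("<" if little_endian else ">") + {4: "I", 8: "Q"}[address_size]
    entries = []
    pos = 0
    while pos < len(data):
        (address,) = struct.unpack_from(fmt, data, pos)
        pos += address_size
        size, pos = read_uleb128(data, pos)
        entries.append((address, size))
    return entries


if __name__ == "__main__":
    # Two made-up entries: 24 bytes at 0x1000 and 136 bytes at 0x1040.
    sample = (struct.pack("<Q", 0x1000) + bytes([0x18]) +
              struct.pack("<Q", 0x1040) + bytes([0x88, 0x01]))
    for address, size in parse_stack_sizes(sample):
        print(hex(address), size)
```

The address fields are filled in by relocations, so the pairs are most
meaningful when read from linked output rather than from an individual
object file.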

James


On Thu, 30 Sept 2021 at 07:51, David Blaikie <dblaikie at gmail.com> wrote:
You can differentiate dead function descriptions from others on most platforms
by checking if the low_pc == 0. If 0 is a valid instruction address on your
architecture, you can use an lld feature to set a more authoritative/unambiguous
tombstone value for dead code addresses, passing something like:


 -z 'dead-reloc-in-nonalloc=.debug_ranges=0xfffffffffffffffe'
 -z 'dead-reloc-in-nonalloc=.debug_loc=0xfffffffffffffffe'
 -z 'dead-reloc-in-nonalloc=.debug_*=0xffffffffffffffff'

to the linker.
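
A consumer of the debug info could apply the check described above roughly as
in the following Python sketch. The function names and the
`zero_is_valid_address` parameter are hypothetical; the tombstone values are
the ones from the options quoted above. The .debug_ranges and .debug_loc
sections get all-ones minus one because, in those sections, an all-ones
address already marks a base address selection entry.

```python
def tombstone_for(section, address_size=8):
    """Tombstone value for a section, following the -z options above."""
    all_ones = (1 << (8 * address_size)) - 1
    if section in (".debug_ranges", ".debug_loc"):
        return all_ones - 1   # all-ones is reserved in these sections
    return all_ones


def is_dead_low_pc(low_pc, address_size=8, zero_is_valid_address=False):
    """Guess whether a subprogram's low_pc refers to discarded code."""
    if low_pc == tombstone_for(".debug_info", address_size):
        return True
    # Without an explicit tombstone, dead code typically shows up as 0.
    return low_pc == 0 and not zero_is_valid_address


if __name__ == "__main__":
    for pc in (0x0, 0x401000, 0xFFFFFFFFFFFFFFFF):
        print(hex(pc), is_dead_low_pc(pc))
```

A stack usage report could then skip any subprogram for which
`is_dead_low_pc` returns true.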

As for reducing debug info size by omitting debug info descriptions of dead code
- Apple/MachO's dsymutil does this, and I believe Alexey Lapshin is working
on getting similar behavior into lld, or possibly into a post-link tool.

There's also the possibility of using comdats to make the linker's job
easier - I think there might be ways to structure the DWARF into chunks that
could be deduplicated and dropped naturally by a linker's existing comdat
support, but I haven't fully prototyped it. I think there was a thread a
while back with JHenderson and others discussing this possibility further.

- Dave
On Wed, Sep 29, 2021 at 12:50 PM Youssefi, Anna via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
Hi,

I was wondering if there are any plans to separate debug information into
distinct sections accordingly when the compiler flags -ffunction-sections and/or
-fdata-sections are used.  If an unreferenced function is removed from the link,
it makes no sense for its associated debug information to still be included.  As
we rely on the debug information for stack usage analysis, we wind up displaying
stack usage statistics for unreferenced functions that were eliminated from the
link if debug information for any other referenced functions is in the same
debug section.  It seems that others have run into this problem previously so I
wanted to check whether there are any plans to change the behavior.

Thanks,
Anna Youssefi
Texas Instruments, Codegen group


_______________________________________________
LLVM Developers mailing list
llvm-dev at lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Youssefi, Anna via llvm-dev

2021-Sep-30 14:49 UTC

[llvm-dev] unified debug information despite function/data sections flags

We are emitting our own DWARF extensions because our object file editor and a
utility script use these for generating a call graph with stack sizes.  We are
not deriving stack sizes from DWARF but rather emitting a vendor-specific
attribute in the subprogram DIE with the MachineFrameInfo::getStackSize() value,
which appears to be the same value used for LLVM’s own stack size section.

We are also using our own linker, rather than lld.  Our linker already removes
unreferenced subsections, and in the case of our proprietary compiler, the DWARF
information is already separated by function, so it also gets removed if it
pertains to an unreferenced function subsection.  So we are only having this
problem with our LLVM-based front end, because its debug information is combined.

I can see Todd Snider just re-asked my question.  I believe this was already
answered as being problematic due to hard-coded addresses and size overhead?

Thanks,
Anna

