thr3ads.net - llvm dev - [llvm-dev] Remove obsolete debug info while garbage collecting [Sep 2019]

Alexey Lapshin via llvm-dev

2019-Sep-18 14:25 UTC

[llvm-dev] Remove obsolete debug info while garbage collecting

17.09.2019 3:12, David Blaikie wrote:
>
> On Wed, Sep 11, 2019 at 3:32 PM Alexey Lapshin via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
>
>     Debuginfo and linker folks, we (AccessSoftek) would like to
>     suggest a proposal for removing obsolete debug info. If you find
>     it useful we will be happy to work on improving it. Thank you for
>     any opinions and suggestions.
>
>     Alexey.
>
>         Currently when the linker does garbage collection a lot of
>     abandoned debug info is left behind (see Appendix A for
>     documentation). Besides inflated debug info size, we end up with
>     overlapping address ranges and no way to tell valid ranges from
>     garbage ones. We propose removing debug info along with removing
>     code. This would reduce debug info size and keep the debug info
>     accurate.
>
>     There are several approaches which could be used to solve that
>     problem:
>
>     1.  Require dwarf producers to generate fragmented debug data
>     according to DWARF5 specification: "E.3.3
>     Single-function-per-DWARF-compilation-unit" page 388. That
>     approach assumes fragmenting the whole debug info per function
>     basis and glue fragmented sections at the link time using section
>     groups.
>
>     2.  Use an additional tool, which would optimize out unnecessary
>     debug data, something similar to dwz (dwarf compressor tool),
>     dsymutil (links the DWARF debug information). This approach
>     assumes additional post-link binaries processing.
>
>     3.  Teach the linker to parse debug data and let it remove unused
>     debug data.
>
>     In this proposal, we focus on approach #3. We show that this
>     approach is viable and discuss some preliminary results, leaving
>     particular implementation out of the scope. We attach the Proof of
>     Concept (PoC) implementation(https://reviews.llvm.org/D67469) for
>     illustrative purposes. Please keep in mind that it is not final,
>     and there is room for improvements (see Appendix B). However, the
>     achieved results look quite promising and demonstrate up to 2
>     times size reduction and performance overhead is 30% of linking
>     time (which is in the same ballpark as the already done section
>     compressing (see table 2 point F)).
>
>
> Have you considered/tried reusing the DWARF 
> minimization/deduplication/linking logic that's already in llvm's 
> dsymutil implementation? If we're going to do that having a singular 
> implementation would be desirable.
>
> (bonus points if we could do something like the dsymutil approach when 
> using Split DWARF and building a DWP - taking some address table 
> output from the linker, and using that to help trim things (or, even 
> when having no input from the linker - at least doing more aggressive 
> deduplication during DWP construction than can be currently done with 
> only type units (& potentially removing/avoiding type unit overhead
> too))

Generally speaking, dsymutil does a very similar thing. It parses DWARF
DIEs, analyzes relocations, scans through references and throws out
unused DIEs. But its current interface does not allow it to be used at
the link stage.
I think it would be perfect to have a singular implementation.
I have not analyzed whether, or how easily, its code could be reused at
the link stage, but it looked like it would need significant rework.

The implementation from this proposal removes obsolete debug info at the
link stage. It therefore benefits from the already loaded object files
and the already computed liveness information, and it generates an
optimized binary from scratch.

If dsymutil could be refactored so that it can be used at the link
stage, then its implementation could be reused. I will research the
possibility of such a refactoring.
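
For illustration, the "scan through references and throw out unused DIEs" step can be sketched as a reachability pass over the DIE reference graph. This is a minimal C++ sketch under stated assumptions: the `Die` struct, its fields, and `markLiveDies` are hypothetical simplifications, not the actual data structures of dsymutil or lld, and the liveness flag is assumed to come from the linker's garbage collection.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, heavily simplified DIE. Real DIEs carry tags, attributes
// and forms; here only what the marking pass needs is modeled.
struct Die {
  bool describesLiveCode;         // assumed input: linker GC kept this code
  std::vector<std::size_t> refs;  // indices of DIEs this DIE references
};

// Returns keep[i] == true for every DIE that must survive: DIEs that
// describe live code plus everything reachable from them via references.
std::vector<bool> markLiveDies(const std::vector<Die> &dies) {
  std::vector<bool> keep(dies.size(), false);
  std::vector<std::size_t> worklist;
  for (std::size_t i = 0; i < dies.size(); ++i) {
    if (dies[i].describesLiveCode) {
      keep[i] = true;
      worklist.push_back(i);
    }
  }
  while (!worklist.empty()) {
    std::size_t cur = worklist.back();
    worklist.pop_back();
    for (std::size_t ref : dies[cur].refs) {
      assert(ref < dies.size() && "reference must point at a known DIE");
      if (!keep[ref]) {
        keep[ref] = true;
        worklist.push_back(ref);
      }
    }
  }
  return keep;
}
```

Every reference that has to be followed adds work to this pass, which is why the number of references matters for link time.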

>     1. Minimize or entirely avoid references from subprograms into
>     other parts of .debug_info section. That would simplify splitting
>     and removing subprograms out in that sense that it would minimize
>     the number of references that should be parsed and followed.
>     (DW_FORM_ref_subroutine instead of DW_FORM_ref_*, ?)
>
>
> Not sure I follow - by "other parts of the .debug_info section" do you
> mean in the same CU, or cross CU references? Any particular references
> you have in mind? Or encountered in practice?

I mean all kinds of references into the .debug_info section. Following
references is time-consuming, so the fewer references that have to be
followed, the faster it works.

Cross-CU references require loading the referenced CU. I do not know of
use cases where cross-CU references are used. If they are a special case
and are not usually used inside subprograms, then it is probably
possible to avoid them.

For references within the same CU, there are probably cases where they
could be ignored: https://reviews.llvm.org/P8165

>
>     2. Create additional section - global types table
>     (.debug_types_table). That would significantly reduce the number
>     of references inside .debug_info section. It also makes it
>     possible to have a 4-byte reference in this section instead of
>     8-bytes reference into type unit (DW_FORM_ref_types instead of
>     DW_FORM_ref_sig8). It also makes it possible to place base types
>     into this section and avoid per-compile unit duplication of them.
>     Additionally, there could be achieved size reduction by not
>     generating type unit header. Note, that new section -
>     .debug_types_table - differs from DWARF4 section .debug_types in
>     that sense that: it contains unique type descriptors referenced by
>     offsets instead of list of type units referenced by
>     DW_FORM_ref_sig8;  all table entries share the same abbreviations
>     and do not have type unit headers.
>
>
> What do you mean when you say "global types table" the phrasing in the
> above paragraph is present-tense, as though this thing exists but
> doesn't seem to describe what it actually is and how it achieves the
> things the text says it achieves. Perhaps I've missed some context here.

The "global types table" does not exist yet. It could be created if the
discussed approach is considered useful.
Please check the comparison of a possible "global types table" with the
existing type units: https://reviews.llvm.org/P8164

The benefit of a "global types table" is that it needs less space to
store types than the type units solution.

>     3. Define the limited scope for line programs which could be
>     removed independently. I.e. currently .debug_line section contains
>     a program in byte-coded language for a state machine. That program
>     actually represents a matrix [instruction][line information]. In
>     general, it is hard to cut out part of that program and to keep
>     the whole program correct. Thus it would be good to specify
>     separate scopes (related to address ranges) which could be easily
>     removed from the program body.
>
>
> In my experience line tables are /tiny/ - have you prototyped any 
> change in this space to have a sense of whether it would have 
> significant savings? (it'd potentially help address the address 
> ambiguity issues when the linker discards code, though - so might be a 
> correctness issue rather than a size performance issue)

I did not measure the size reduction for the line table, though I
expect it would be small.
The more important thing is correctness: the line table could contain
information for overlapping address ranges.

There is another attempt to fix that issue:
https://reviews.llvm.org/D59553.
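
For illustration, here is a minimal C++ sketch of how overlapping ranges can be detected. It assumes, purely for the example, that the addresses of discarded functions end up resolved to a low value such as 0 in the debug data, so that several dead ranges share the same start address. `AddressRange` and `hasOverlap` are hypothetical names, not part of any LLVM API.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Half-open address range [low, high) as described by debug info.
struct AddressRange {
  std::uint64_t low;
  std::uint64_t high;
};

// Returns true if any two ranges overlap. After sorting by start address
// it is enough to compare each range with its predecessor.
bool hasOverlap(std::vector<AddressRange> ranges) {
  std::sort(ranges.begin(), ranges.end(),
            [](const AddressRange &a, const AddressRange &b) {
              return a.low < b.low;
            });
  for (std::size_t i = 1; i < ranges.size(); ++i) {
    assert(ranges[i - 1].low <= ranges[i - 1].high && "malformed range");
    if (ranges[i].low < ranges[i - 1].high)
      return true;
  }
  return false;
}
```

A consumer that sees such an overlap cannot tell from the ranges alone which entry describes code that is really present in the binary.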

>
>     We evaluated the approach on LLVM and Clang codebases. The results
>     obtained are summarized in the tables below:
>
>
> Memory usage statistics (& confidence intervals for the build time) 
> would probably be especially useful for comparing these tradeoffs.
> Doubly so when using compression (since the decompression would need 
> to use more memory, as would the recompression - so, two different 
> tradeoffs (compressed input, compressed output, and then both at the 
> same time))

I will measure the memory impact of the PoC implementation, but I expect
it to be significant.
Memory usage has not been optimized yet. Several things could be done to
reduce the memory footprint: not loading all compile units into memory,
and not adding a Parent field to every DIE.

Alexey.
>
>     _______________________________________________
>     LLVM Developers mailing list
>     llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>     https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

David Blaikie via llvm-dev

2019-Sep-19 01:24 UTC

[llvm-dev] Remove obsolete debug info while garbage collecting

On Wed, Sep 18, 2019 at 7:25 AM Alexey Lapshin <a.v.lapshin at mail.ru> wrote:
> Have you considered/tried reusing the DWARF
> minimization/deduplication/linking logic that's already in llvm's
> dsymutil implementation? If we're going to do that having a singular
> implementation would be desirable.
>
> (bonus points if we could do something like the dsymutil approach when
> using Split DWARF and building a DWP - taking some address table output
> from the linker, and using that to help trim things (or, even when having
> no input from the linker - at least doing more aggressive deduplication
> during DWP construction than can be currently done with only type units
> (& potentially removing/avoiding type unit overhead too))
>
>
> Generally speaking, dsymutil does a very similar thing. It parses DWARF
> DIEs, analyzes relocations, scans through references and throws out unused
> DIEs. But its current interface does not allow to use it at link stage.
>  I think it would be perfect to have a singular implementation.
>  Though I did not analyze how easy or is it possible to reuse its code at
> the link stage, it looked like it needs a significant rework.
>
>  Implementation from this proposal does removing of obsolete debug info at
> link stage.
>  And so has benefits of already loaded object files, already created
> liveness information,
>  generating an optimized binary from scratch.
>
>
> If dsymutil could be refactored in such manner that could be used at the
> link stage, then its implementation could be reused. I would research the
> possibility of such a refactoring.
>

Yeah, if this is going to be implemented, I think that would be strongly
preferred - though I realize it may be substantial work to refactor. The
alternative - duplicating all this work - doesn't seem like something that
would be good for the LLVM project.

>> 1. Minimize or entirely avoid references from subprograms into other parts
>> of .debug_info section. That would simplify splitting and removing
>> subprograms out in that sense that it would minimize the number of
>> references that should be parsed and followed. (DW_FORM_ref_subroutine
>> instead of DW_FORM_ref_*, ?)
>>
>
> Not sure I follow - by "other parts of the .debug_info section" do you
> mean in the same CU, or cross CU references? Any particular references you
> have in mind? Or encountered in practice?
>
> I mean here all kinds of references into .debug_info section.
>

Ah, not only references from other places /into/ .debug_info (which don't
really exist, so far as I know) but any references to locations within
debug_info.

Reducing these isn't super-viable - types being the most common examples.
Though now I understand what you're getting at partly around the
debug_type_table idea - adding a level of indirection to type references.
So it'd be easy to find only one place to fix when removing chunks of
debug_info (updating only the type table without having to find all the
places inside debug_info to touch). That indirection would come at a size
cost, of course - and an overhead for DWARF parsers having to follow that
indirection. Doesn't make it impossible - just tradeoffs to be aware of.

Though that's not the only DIE references - without removing them all
there'd still be a fair bit of overhead for finding any remaining ones and
applying them. If an indirection table is to be added, maybe a generalized
one (for any DIE reference) rather than one only for types would be good.

(aspects of this have been discussed before - we've sometimes nicknamed it
"bag of DWARF" when discussing it in the context of type units (currently
you can only reference the type DIE in a type unit - which adds overhead
when wanting to reference subprogram declaration DIEs, etc (or maybe
multiple types are clustered together and don't need a separate type unit
each - if only you could refer to multiple types in a type unit) - so we've
discussed generalizing the type unit header (actually it could generalize
even as far as the classic CU header) to have N type DIE offset+hash pairs
(zero for a normal CU, one for a classic type unit, and any number for more
interesting cases))

> Going through references is the time-consuming task.
> Thus the fewer references there should be followed then the faster it
> works.
>
> For the cross CU references - It requires to load referenced CU. I do not
> know use cases where cross CU references are used.
>
Cross-CU inlining due to LTO. Try something like this:

a.cpp:
  void f2();
  __attribute__((always_inline)) void f1() {
    f2();
  }

b.cpp:
  void f1();
  int main() {
    f1();
  }

$ clang++ a.cpp b.cpp -emit-llvm -S -c -g
$ llvm-link a.ll b.ll -o ab.bc
$ clang++ ab.bc -c
$ llvm-dwarfdump ab.o -v -debug-info |
0x0b: DW_TAG_compile_unit
        DW_AT_name "a.cpp"
0x2a:   DW_TAG_subprogram
          DW_AT_abstract_origin [DW_FORM_ref4] (cu + 0x0056 => {0x00000056} "_Z2f1v")
        DW_TAG_subprogram
          DW_AT_name "f1"
0x6e: DW_TAG_compile_unit
        DW_AT_name "b.cpp"
0x8d:   DW_TAG_subprogram
          DW_AT_name "main"
0xa6:     DW_TAG_inlined_subroutine
            DW_AT_abstract_origin [DW_FORM_ref_addr] (0x0000000000000056 "_Z2f1v")

Notice that the inlined_subroutine's abstract_origin uses a linker
relocation into the debug_info section to give an absolute offset within
the finally linked debug_info section (since the debugger wouldn't know
that these two compile_units are bound together and to use some particular
compile_unit as the base offset - either it's absolute across the whole
debug_info section (FORM_ref_addr) or it's local to the CU (FORM_refN (such
as FORM_ref4 above)))
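
For illustration, the difference between the two reference kinds can be written as a tiny C++ helper. This is a sketch only: `RefForm` and `resolveDieOffset` are hypothetical names, and the enum stands in for the real forms (DW_FORM_ref4 and its siblings versus DW_FORM_ref_addr).

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Stand-ins for the two families of DIE reference forms.
enum class RefForm {
  CuRelative,      // e.g. DW_FORM_ref4: value is an offset from the CU start
  SectionAbsolute  // DW_FORM_ref_addr: value is an offset into .debug_info
};

// Returns the offset of the referenced DIE within .debug_info.
// cuStart is the section offset of the unit that contains the reference.
std::uint64_t resolveDieOffset(RefForm form, std::uint64_t cuStart,
                               std::uint64_t value) {
  if (form == RefForm::SectionAbsolute)
    return value;
  assert(value <= std::numeric_limits<std::uint64_t>::max() - cuStart &&
         "CU-relative reference must not overflow");
  return cuStart + value;
}
```

In the dump above, the first unit appears to start at offset 0x00 and the second at 0x63 (inferred from the DIE offsets 0x0b and 0x6e, assuming an 11-byte unit header). A CU-relative 0x56 resolved from the second unit would give 0xb9 rather than 0x56, which is why the cross-CU reference needs the section-absolute form.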

> If that is the specific case and is not used inside subprograms usually,
> then probably it is possible to avoid it.
>
It's fairly specifically used inside subprograms (& would need to be
adjusted even if it wasn't inside a subprogram - when bytes are removed,
etc) - though possibly general relocation handling in the linker could be
used to implement handling ref_addr.
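
For illustration, the adjustment needed "when bytes are removed" can be sketched as an offset remapping. This is a minimal C++ sketch under stated assumptions: `RemovedRange` and `remapOffset` are hypothetical names, the removed ranges are assumed to be sorted and non-overlapping, and a reference is assumed never to point into removed bytes.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// A span of bytes cut out of .debug_info: [start, start + size).
struct RemovedRange {
  std::uint64_t start;
  std::uint64_t size;
};

// Maps an offset in the original section to the offset of the same byte
// after all ranges in `removed` have been cut out.
// Precondition: `removed` is sorted by start and non-overlapping.
std::uint64_t remapOffset(const std::vector<RemovedRange> &removed,
                          std::uint64_t oldOffset) {
  std::uint64_t shift = 0;
  for (const RemovedRange &r : removed) {
    if (r.start + r.size <= oldOffset) {
      shift += r.size;  // this whole range lies before the target
    } else {
      assert(oldOffset < r.start && "offset points into removed bytes");
      break;  // later ranges lie after the target
    }
  }
  return oldOffset - shift;
}
```

Every section-absolute reference that survives would need its value passed through a mapping like this, while CU-relative references only change when bytes are removed inside their own unit.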

> For the same CU - there could probably be cases when references could be
> ignored: https://reviews.llvm.org/P8165
>
How would references be ignored while keeping them correct? Ah, by making
subprograms more self-contained - maybe, but the work to figure out which
things are only referenced from one place and structure the DWARF
differently probably wouldn't be ideal in the compiler & wouldn't save the
debug info linker from having to have code to handle the case where it
wasn't only used from that subprogram anyway.

>
>
>> 2. Create additional section - global types table (.debug_types_table).
>> That would significantly reduce the number of references inside
>> .debug_info section. It also makes it possible to have a 4-byte reference
>> in this section instead of 8-bytes reference into type unit
>> (DW_FORM_ref_types instead of DW_FORM_ref_sig8). It also makes it possible
>> to place base types into this section and avoid per-compile unit
>> duplication of them. Additionally, there could be achieved size reduction
>> by not generating type unit header. Note, that new section -
>> .debug_types_table - differs from DWARF4 section .debug_types in that
>> sense that: it contains unique type descriptors referenced by offsets
>> instead of list of type units referenced by DW_FORM_ref_sig8;  all table
>> entries share the same abbreviations and do not have type unit headers.
>>
>
> What do you mean when you say "global types table" the phrasing in the
> above paragraph is present-tense, as though this thing exists but doesn't
> seem to describe what it actually is and how it achieves the things the
> text says it achieves. Perhaps I've missed some context here.
>
>
> The "global types table" does not exist yet. It could be created if the
> discussed approach would be considered useful.
>

Ah, the present-tense language was a bit confusing for me when discussing a
thing that doesn't exist yet & not having provided a description of what it
might be or might contain and why it would exist/what it would achieve.

> Please check the comparison of possible "global types table" and currently
> existed type units: https://reviews.llvm.org/P8164
>

Ah, that proposed version makes it easy to remove subprograms from
debug_info without having to fix up type references (but you still have to
have the code to fix up other cross-CU references, like abstract_origin, so
I'm not sure it provides that much value) but doesn't make it easy to
remove types (because you'd have to go looking through the debug_info
section to update all the type offsets (which I guess you have to do anyway
to find the type references)) and removing the types still also requires
fixing up the types that reference each other...

So I'm not seeing a big win there.

> The benefit of using "global types table" is that it saves the space
> required to keep types comparing with type units solution.
>
>
>> 3. Define the limited scope for line programs which could be removed
>> independently. I.e. currently .debug_line section contains a program in
>> byte-coded language for a state machine. That program actually represents
>> a matrix [instruction][line information]. In general, it is hard to cut
>> out part of that program and to keep the whole program correct. Thus it
>> would be good to specify separate scopes (related to address ranges)
>> which could be easily removed from the program body.
>>
>
> In my experience line tables are /tiny/ - have you prototyped any change
> in this space to have a sense of whether it would have significant savings?
> (it'd potentially help address the address ambiguity issues when the
> linker discards code, though - so might be a correctness issue rather than
> a size performance issue)
>
> I did not measure the value of size reduction for line table, though I
> think that it would be a small value.
> The more important thing is a correctness issue. Line table could contain
> information for overlapping address ranges.
>
> There is another attempt to fix that issue -
> https://reviews.llvm.org/D59553.
>

Yep. It's a complicated problem, and fixing the line table would be a good
way to deal with some of it. (Split DWARF makes it hard to fix up the rest
of the debug info, though - so there would still be some ambiguity in the
DWARF with a binary using Split DWARF).

>
>
>
>>
>> We evaluated the approach on LLVM and Clang codebases. The results
>> obtained are summarized in the tables below:
>>
>
> Memory usage statistics (& confidence intervals for the build time) would
> probably be especially useful for comparing these tradeoffs.
> Doubly so when using compression (since the decompression would need to
> use more memory, as would the recompression - so, two different tradeoffs
> (compressed input, compressed output, and then both at the same time))
>
> I would measure memory impact for that PoC implementation, but I expect it
> would be significant.
> Memory usage was not optimized yet. There are several things which might
> be done to reduce memory footprint:
> do not load all compile units into memory, avoid adding Parent field to
> all DIEs.
>

Yep, this is the sort of thing where I suspect the dsymutil implementation
may've already had at least some of that work done - or, if not, that doing
the work once for both/all implementations would be very preferable to
duplicating the effort.

- Dave

Alexey Lapshin via llvm-dev

2019-Sep-20 20:41 UTC

[llvm-dev] Remove obsolete debug info while garbage collecting

19.09.2019 4:24, David Blaikie wrote:
>
> On Wed, Sep 18, 2019 at 7:25 AM Alexey Lapshin <a.v.lapshin at mail.ru>
> wrote:
>
>     Generally speaking, dsymutil does a very similar thing. It parses
>     DWARF DIEs, analyzes relocations, scans through references and
>     throws out unused DIEs. But its current interface does not allow
>     to use it at link stage.
>      I think it would be perfect to have a singular implementation.
>      Though I did not analyze how easy or is it possible to reuse its
>     code at the link stage, it looked like it needs a significant rework.
>
>      Implementation from this proposal does removing of obsolete debug
>     info at link stage.
>      And so has benefits of already loaded object files, already
>     created liveness information,
>      generating an optimized binary from scratch.
>
>
>     If dsymutil could be refactored in such manner that could be used
>     at the link stage, then its implementation could be reused. I
>     would research the possibility of such a refactoring.
>
> Yeah, if this is going to be implemented, I think that would be 
> strongly preferred - though I realize it may be substantial work to 
> refactor. The alternative - duplicating all this work - doesn't seem 
> like something that would be good for the LLVM project.

I see. I will look into whether it can be refactored accordingly.

>>         1. Minimize or entirely avoid references from subprograms
>>         into other parts of .debug_info section. That would simplify
>>         splitting and removing subprograms out in that sense that it
>>         would minimize the number of references that should be parsed
>>         and followed. (DW_FORM_ref_subroutine instead of
>>         DW_FORM_ref_*, ?)
>>
>>
>>     Not sure I follow - by "other parts of the .debug_info section"
>>     do you mean in the same CU, or cross CU references? Any
>>     particular references you have in mind? Or encountered in practice?
>     I mean here all kinds of references into .debug_info section.
>
>
> Ah, not only references from other places /into/ .debug_info (which 
> don't really exist, so far as I know) but any references to locations 
> within debug_info.
>
> Reducing these isn't super-viable - types being the most common 
> examples. Though now I understand what you're getting at partly around 
> the debug_type_table idea - adding a level of indirection to type 
> references. So it'd be easy to find only one place to fix when 
> removing chunks of debug_info (updating only the type table without 
> having to find all the places inside debug_info to touch). That 
> indirection would come at a size cost, of course - and an overhead for 
> DWARF parsers having to follow that indirection. Doesn't make it 
> impossible - just tradeoffs to be aware of.
>
> Though that's not the only DIE references - without removing them all 
> there'd still be a fair bit of overhead for finding any remaining ones 
> and applying them. If an indirection table is to be added, maybe a 
> generalized one (for any DIE reference) rather than one only for types 
> would be good.
>

Yes, some general indirection table would probably be useful.
But types would still require specialized handling:
types have a "type hash" and need some specific logic around that.

> (aspects of this have been discussed before - we've sometimes
> nicknamed it "bag of DWARF" when discussing it in the context of type
> units (currently you can only reference the type DIE in a type unit -
> which adds overhead when wanting to reference subprogram declaration
> DIEs, etc (or maybe multiple types are clustered together and don't
> need a separate type unit each - if only you could refer to multiple
> types in a type unit) - so we've discussed generalizing the type unit
> header (actually it could generalize even as far as the classic CU
> header) to have N type DIE offset+hash pairs (zero for a normal CU,
> one for a classic type unit, and any number for more interesting cases))

As far as I understand, "generalizing the type unit header (actually it
could generalize even as far as the classic CU header) to have N type
DIE offset+hash pairs" looks very close to the "global type table" which
I am talking about.

>     Going through references is the time-consuming task.
>     Thus the fewer references there should be followed then the faster
>     it works.
>
>     For the cross CU references - It requires to load referenced CU. I
>     do not know use cases where cross CU references are used.
>
>
> Cross-CU inlining due to LTO. Try something like this:
>
> a.cpp:
>   void f2();
>   __attribute__((always_inline)) void f1() {
>     f2();
>   }
>
> b.cpp:
>   void f1();
>   int main() {
>     f1();
>   }
>
> $ clang++ a.cpp b.cpp -emit-llvm -S -c -g
> $ llvm-link a.ll b.ll -o ab.bc
> $ clang++ ab.bc -c
> $ llvm-dwarfdump ab.o -v -debug-info |
> 0x0b: DW_TAG_compile_unit
>         DW_AT_name "a.cpp"
> 0x2a:   DW_TAG_subprogram
>           DW_AT_abstract_origin [DW_FORM_ref4] (cu + 0x0056 => {0x00000056} "_Z2f1v")
>         DW_TAG_subprogram
>           DW_AT_name "f1"
> 0x6e: DW_TAG_compile_unit
>         DW_AT_name "b.cpp"
> 0x8d:   DW_TAG_subprogram
>           DW_AT_name "main"
> 0xa6:     DW_TAG_inlined_subroutine
>             DW_AT_abstract_origin [DW_FORM_ref_addr] (0x0000000000000056 "_Z2f1v")
>
> Notice that the inlined_subroutine's abstract_origin uses a linker 
> relocation into the debug_info section to give an absolute offset 
> within the finally linked debug_info section (since the debugger 
> wouldn't know that these two compile_units are bound together and to 
> use some particular compile_unit as the base offset - either it's 
> absolute across the whole debug_info section (FORM_ref_addr) or it's 
> local to the CU (FORM_refN (such as FORM_ref4 above)))

Got it. Thank you.

>     If that is the specific case and is not used inside subprograms
>     usually, then probably it is possible to avoid it.
>
>
> It's fairly specifically used inside subprograms (& would need to be
> adjusted even if it wasn't inside a subprogram - when bytes are
> removed, etc) - though possibly general relocation handling in the
> linker could be used to implement handling ref_addr.
>
>     For the same CU - there could probably be cases when references
>     could be ignored: https://reviews.llvm.org/P8165
>
>
> How would references be ignored while keeping them correct? Ah, by 
> making subprograms more self-contained - maybe, but the work to figure 
> out which things are only referenced from one place and structure the 
> DWARF differently probably wouldn't be ideal in the compiler &
> wouldn't save the debug info linker from having to have code to handle
> the case where it wasn't only used from that subprogram anyway.
>
>>         2. Create additional section - global types table
>>         (.debug_types_table). That would significantly reduce the
>>         number of references inside .debug_info section. It also
>>         makes it possible to have a 4-byte reference in this section
>>         instead of 8-bytes reference into type unit
>>         (DW_FORM_ref_types instead of DW_FORM_ref_sig8). It also
>>         makes it possible to place base types into this section and
>>         avoid per-compile unit duplication of them. Additionally,
>>         there could be achieved size reduction by not generating type
>>         unit header. Note, that new section - .debug_types_table -
>>         differs from DWARF4 section .debug_types in that sense that:
>>         it contains unique type descriptors referenced by offsets
>>         instead of list of type units referenced by
>>         DW_FORM_ref_sig8;  all table entries share the same
>>         abbreviations and do not have type unit headers.
>>
>>
>>     What do you mean when you say "global types table" the phrasing
>>     in the above paragraph is present-tense, as though this thing
>>     exists but doesn't seem to describe what it actually is and how
>>     it achieves the things the text says it achieves. Perhaps I've
>>     missed some context here.
>
>
>     The "global types table" does not exist yet. It could be created
>     if the discussed approach would be considered useful.
>
>
> Ah, the present-tense language was a bit confusing for me when
> discussing a thing that doesn't exist yet & not having provided a
> description of what it might be or might contain and why it would
> exist/what it would achieve.

I should have written it more precisely.

>     Please check the comparison of possible "global types table" and
>     currently existed type units: https://reviews.llvm.org/P8164
>
> Ah, that proposed version makes it easy to remove subprograms from
> debug_info without having to fix up type references (but you still
> have to have the code to fix up other cross-CU references, like
> abstract_origin, so I'm not sure it provides that much value) but
> doesn't make it easy to remove types (because you'd have to go looking
> through the debug_info section to update all the type offsets (which I
> guess you have to do anyway to find the type references)) and removing
> the types still also requires fixing up the types that reference each
> other...
>
> So I'm not seeing a big win there.

Correct. Even if types were put into a separate table, it would still be
necessary to:
  "go looking through the debug_info section to update all the type
offsets";
  "removing the types still also requires fixing up the types that
reference each other".

But it additionally brings the following benefits:

1. Size reduction by removing fragmentation. In the
   "-fdebug-types-section" solution, every type that is put into a type
   unit requires:
     - an additional type unit header,
     - a section header (since it is put into a separate section),
     - proxy type copies inside the compilation unit.

   Putting types into a separate table avoids creating that data for
   every type.

2. Size reduction by deduplicating base types. In the
   "-fdebug-types-section" solution, base types are not deduplicated at
   all.

3. Performance improvement by handling less data. #1 leads to loading
   and parsing fewer bits.

4. Performance improvement by handling fewer references. Simpler
   reference chains allow parsing references faster.
   Instead of this:

   type_offset->proxy_type->DW_FORM_ref_sig8->type_unit->type_offset->type

   there would be this:

   type_offset->type_table->type
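
For illustration, the two lookup styles can be contrasted in a minimal C++ sketch. Everything here is a hypothetical simplification: the proposed table and its reference form do not exist, `TypeDesc`, `TypeUnits`, `findBySignature` and `findInTable` are made-up names, the proxy type hop is left out, and the table reference is modeled as an index rather than a byte offset.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Simplified type description; a real one would be a tree of DIEs.
struct TypeDesc {
  std::string name;
};

// Type-unit style (simplified): an 8-byte signature selects a type unit,
// and the unit holds the type.
using TypeUnits = std::unordered_map<std::uint64_t, TypeDesc>;

const TypeDesc *findBySignature(const TypeUnits &units, std::uint64_t sig) {
  auto it = units.find(sig);
  return it == units.end() ? nullptr : &it->second;
}

// Table style (hypothetical): a 4-byte reference selects the entry
// directly, with no signature lookup and no per-type unit header.
const TypeDesc &findInTable(const std::vector<TypeDesc> &table,
                            std::uint32_t ref) {
  assert(ref < table.size() && "reference must be inside the table");
  return table[ref];
}
```

The sketch shows the shorter chain and the smaller reference (4 bytes versus 8). It does not model the cost of updating table references when entries are removed, which is the concern raised earlier in the thread.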
>>
>>         We evaluated the approach on LLVM and Clang codebases. The
>>         results obtained are summarized in the tables below:
>>
>>
>>     Memory usage statistics (& confidence intervals for the build
>>     time) would probably be especially useful for comparing these
>>     tradeoffs.
>>     Doubly so when using compression (since the decompression would
>>     need to use more memory, as would the recompression - so, two
>>     different tradeoffs (compressed input, compressed output, and
>>     then both at the same time))
>
>     I would measure memory impact for that PoC implementation, but I
>     expect it would be significant.
>     Memory usage was not optimized yet. There are several things which
>     might be done to reduce memory footprint:
>     do not load all compile units into memory, avoid adding Parent
>     field to all DIEs.
>
> Yep, this is the sort of thing where I suspect the dsymutil 
> implementation may've already had at least some of that work done - 
> or, if not, that doing the work once for both/all implementations 
> would be very preferable to duplicating the effort.
Ok,


Thank you, Alexey.

