David Blaikie via llvm-dev
2020-Jun-04 21:40 UTC
[llvm-dev] [Debuginfo][DWARF][LLD] Remove obsolete debug info in lld.
FWIW, I think it's probably best to at least initially frame the discussion around non-configurable value for the sake of reducing the scope/possible surface area of the feature/users/etc. I'd probably only encourage adding the user-configurable flag if/when someone has a use case for it. On Thu, Jun 4, 2020 at 2:31 PM Fangrui Song <maskray at google.com> wrote:> > > On 2020-06-03, James Henderson via llvm-dev wrote: > >It makes me sad that the linker (via a library or otherwise) has to be > >"DWARF-aware" to be able to effectively handle --gc-sections, COMDATs, > >--icf etc for debug info, without leaving large blocks of data kicking > >around. > > > >The patching to -1 (or equivalent) is probably a good lightweight solution > >(though I'd love it if it could be done based on section type in the future > >rather than section name, but that's probably outside the realm of DWARF), > >as it requires only minimal understanding in the linker, but anything > >beyond that seems to be complicated logic that is mostly due to the > >structure of DWARF. Patching to -1 does feel a bit like a sticking > >plaster/band aid to patch over the issue rather than properly solving it > >too - there will still be debug data (potentially significant amounts in > >COMDAT-heavy objects) that the linker has to write and the debugger has to > >somehow know how to skip (even if it knows that -1 is special-case due to > >the standard being updated, it needs to get as far as the -1), which is all > >wasted effort. > > > >We've already seen from Alexey's prototyping, and from our own experiences > >with the Sony proprietary linker (which tried to rewrite .debug_line only) > >that deconstructing the DWARF so that it can be more optimally reassembled > >at link time is slow going, and will probably inevitably be however much > >effort is put into optimising it. For a start, given the current standards, > >it's impossible to know how to deconstruct it without having to parse vast > >amounts of DWARF, which is typically going to mean a lot more parsing work > >than the linker would normally have to deal with. Additionally, much of > >this parsing work is wasted effort, since it seems unlikely in many links > >that large amounts of the DWARF will be redundant. Having an option to > >opt-in doesn't help much there, since it just means the logic exists > >without most people using it, due to it not being good enough, or > >potentially they don't even know it exists. > > > >I don't have particularly concrete suggestions as to how to solve the > >structural problems with DWARF at this point. The only thing that seems > >obvious to me is a more "blessed" approach to fragmentation of sections, > >similar to what I tried with my prototype mentioned earlier in the thread, > >although we'd need to figure out the previously stated performance issues. > >Other ideas might tie into this, like somehow sharing the various table > >headers a bit like CIEs in .eh_frame that could be merged by the linker - > >each object could have separate table header sections, which are referenced > >by the individual .debug_* blocks, which in turn are one per function/data > >piece and easily discardable/merged by the linker. > > > >Just some thoughts. > > > >James > > Your proposed option --dead-reloc-addend=.debug_info=0xffffffffffffffff > seems like a good idea. (I'd expect it to support signed -1 and -2 for > convenience & consistency in some other places (we sometimes use addends > as signed values)). > > LLD only supports absolute relocation types (plus R_PPC64_DTPREL64 which > can go to .debug_addr, plus R_RISCV_{ADD,SUB}*). > > The computed value is S + A. > We still consider the symbolic value S as zero, but override A with the > supplied option --dead-reloc-addend=.debug_info=-1 > I particularly like that `addend` is part of the option name. > > My mere complaint is that the relocation record is not dead, but rather > its referenced symbol is dead. However, I can't think of a better > name... > > Checked with Martin Storsjö, this option may be useful for other binary > formats supporting DWARF. (binutils does not like ELF-specific options > not called -z foobar). > I think it is fine to add this option to LLD if GNU ld is also happy > with the name. I'll check with them. > > "There is a danger that one community won't accept an extension that > they haven't been involved in the design process for." :) (Coutesy of Peter) > > The built-in rules of the linker are the following: > > --dead-reloc-addend=.debug_loc=-2 > --dead-reloc-addend=.debug_ranges=-2 > --dead-reloc-addend=.debug_*=-1 > > They can be overridden. > > >On Tue, 2 Jun 2020 at 19:24, David Blaikie via llvm-dev < > >llvm-dev at lists.llvm.org> wrote: > > > >> On Tue, May 19, 2020 at 7:17 AM Alexey Lapshin > >> <alapshin at accesssoftek.com> wrote: > >> > > >> > Hi David, please find my comments inside: > >> > > >> > > >> > >>>Broad question: Do you have any specific motivation/users/etc in > >> implementing this (if you can speak about it)? > >> > > >> > >>> - it might help motivate the work, understand what tradeoffs might > >> be suitable for you/your users, etc. > >> > > >> > >>There are two general requirements: > >> > >> 1) Remove (or clean) invalid debug info. > >> > > >> > > > >> > >Perhaps a simpler direct solution for your immediate needs might be a > >> much narrower, > >> > >and more efficient linker-DWARF-awareness feature: > >> > > > >> > > With DWARFv5, rnglists present an opportunity for a DWARF linker to > >> rewrite the ranges > >> > > without parsing the rest of the DWARF. /technically/ this isn't > >> guaranteed - rnglist entries > >> > > can be referenced either directly, or by index. If all rnglists are > >> referenced by index, then > >> > > a linker could parse only the debug_rnglists section and rewrite > >> ranges to remove any > >> > > address ranges that refer to optimized-out code. > >> > > > >> > > This would only be correct for rnglists that had no direct references > >> to them (that only were > >> > > referenced via the indexes) - but we could either implement it with > >> that assumption, or could > >> > > add an LLVM extension attribute on the CU that would say "I promise I > >> only referenced rnglists > >> > > via rnglistx forms/indexes). If this DWARF-aware linking would have to > >> read the CU DIE (not > >> > > all the other DIEs) it /could/ also then rewrite high/low_pc if the CU > >> wasn't using ranges... > >> > > but that wouldn't come up in the function-removal case, because then > >> you'd have ranges anyway, > >> > > so no need for that. > >> > > > >> > > Such a DWARF-aware rnglist linking could also simplify rnglists, in > >> cases where functions > >> > > ended up being laid out next to each other, the linker could coalesce > >> their ranges together. > >> > > > >> > > I imagine this could be implemented with very little overhead to > >> linking, especially compared > >> > > to the overhead of full DWARF-aware linking. > >> > > > >> > >Though none of this fixes Split DWARF, where the linker doesn't get a > >> chance to see the > >> > > addresses being used - but if you only want/need the CU-level ranges > >> to be correct, this > >> > > might be a viable fix, and quite efficient. > >> > > >> > Yes, we think about that alternative. This would resolve our problem of > >> invalid debug info > >> > and would work much faster. Thus, if we would not have good results for > >> D74169 then we > >> > will implement it. Do you think it could be useful to have this solution > >> in upstream? > >> > >> A pure rnglist rewriting - I think it'd be OK to have in upstream - > >> again, cost/benefit/etc would have to be weighed. I'm not sure it > >> would save enough space to be particularly valuable beyond the > >> correctness issue - and it doesn't completely solve the correctness > >> issue for zero-address usage or low-address usage (because you could > >> still have overlapping subprograms inside a CU - so if you were > >> symbolizing you could use the correct rnglist to filter, but then go > >> look inside the CU only to find two subprograms that had that address > >> & not know which one was the correct one an which one was the > >> discarded one). > >> > >> rnglist rewriting might be easy enough to prototype - but depends what > >> you want to spend your time on, I know this whole issue has been a > >> huge investment of your time already - but maybe this recent > >> revitalization of the conversation around having an explicit value in > >> the linker might be sufficient to address everyone's needs... *fingers > >> crossed*) > >> > >> > >> > >> 2) Optimize the DWARF size. > >> > > >> > > >> > > Do your users care much about this? I imagine if they had significant > >> DWARF size issues, > >> > > they'd have significant link time issues and the kind of cost to link > >> time this feature has would > >> > > be prohibitive - but perhaps they're sharing linked binaries much more > >> often than they're > >> > > actually performing linking. > >> > > >> > Yes, they do. They also have significant link-time issues. > >> > So current performance results of D74169 are not very acceptable. > >> > We hope to improve it. > >> > > >> > > >> > > >> > >>The specifics which our users have: > >> > >> - embedded platform which uses 0 as start of .text section. > >> > >> - custom toolset which does not support all features yet(f.e. split > >> dwarf). > >> > >> - tolerant of the link-time increase. > >> > >> - need a useful way to share debug builds. > >> > > >> > > >> > > Sharing two files (executable and dwp) is significantly less useful > >> than sharing one file? > >> > > >> > Probably not significantly, but yes, it looks less useful comparing to > >> D74169. > >> > Having only two files (executable and .dwp) looks significantly better > >> than having executable and multiple .dwo files. > >> > Having only one file(executable) with minimal size looks better than the > >> two files with a bigger size. > >> > > >> > clang compiled with -gsplitdwarf takes 0.9G for executable and 0.9G for > >> .dwp. > >> > clang compiled with -gc-debuginfo takes only 0.76G for single executable. > >> > > >> > > >> > > >> > >>For the first point: we have a problem "Overlapping address ranges > >> starting from 0"(D59553). > >> > > >> > >>We use custom solution, but the general solution like D74169 would be > >> better here. > >> > > >> > > >> > > If CU ranges are the only ones that need fixing, then I think the > >> above solution might be as > >> > > good/better - if more than CU ranges need fixing, then I think we > >> might want to start talking about > >> > > how to fix DWARF itself (split and non-split) to signal certain > >> addresses point to dead code with a > >> > > specific blessed value that linkers would need to implement - because > >> with Split DWARF there's > >> > > no way to solve the non-CU addresses at the linker. > >> > > >> > I think the worthful solution for that signal value would be LowPC > > >> HighPC. > >> > That does not require additional bits in DWARF. > >> > It would be natural to skip such address ranges since they explicitly > >> marked as invalid. > >> > It could be implemented in a linker very easily. Probably, it would make > >> sense to describe that > >> > usage in DWARF standard. > >> > > >> > As to the addresses which are not seen by the linker(since they are in > >> .dwo files) - yes, > >> > they need to have another solution. Could you show an example of such a > >> case, please? > >> > > >> > > >> > > >> > >>>2. Support of type units. > >> > > >> > >>> > >> > > >> > >>>> That could be implemented further. > >> > > >> > >>>Enabling type units increases object size to make it easier to > >> deduplicate at link time by a DWARF-unaware > >> > > >> > >>>linker. With a DWARF aware linker it'd be generally desirable not to > >> have to add that object size overhead to > >> > > >> > >>>get the linking improvements. > >> > > >> > >> > >> > > >> > >>But, DWARFLinker should adequately work with type units since they are > >> already implemented. > >> > > >> > > >> > > Maybe - it'd be nice & all, but I don't think it's an outright > >> necessity - if someone knows they're using > >> > > a DWARF-aware linker, they'd probably not use type units in their > >> object files. It's possible someone > >> > > doesn't know for sure & maybe they have pre-canned debug object files > >> from someone else, etc. > >> > > >> > I see. > >> > > >> > >>Another thing is that the idea behind type units has the potential to > >> help Dwarf-aware linker to work faster. > >> > > >> > >>Currently, DWARFLinker analyzes context to understand whether types > >> are the same or not. > >> > > >> > > >> > >When you say "analyzes context" what do you mean? Usually I'd take that > >> to mean > >> > > "looks at things outside the type itself - like what namespace it's > >> in, etc" - which, yes, > >> > > it should do that, but it doesn't seem very expensive to do. But I > >> guess you actually > >> > > mean something about doing structural equivalence in some way, looking > >> at things inside the type? > >> > > >> > I think it could be useful for both cases. Currently, dsymutil does only > >> first thing > >> > (look at type name, namespace name, etc..) and does not do the second > >> thing > >> > (doing structural equivalence). Analyzing type names is currently quite > >> expensive > >> > (the only search in string pool takes ~10 sec from 70 sec of overall > >> time). > >> > That is expensive because of many things should be done to work with > >> strings: > >> > parse DWARF, search and resolve relocations, compute a hash for strings, > >> > put data into a string pool, create a fully qualified name(like > >> namespace::function::name). > >> > It looks like it could be optimized and finally require less time, but > >> it still would be a noticeable > >> > part of the overall time. > >> > > >> > If dsymutil starts to check for the structural equivalence, then the > >> process would be even more slowly. > >> > So, If instead of comparing types structure, there would be checked > >> single hash-id - then this process > >> > would also be faster. > >> > > >> > Thus I think using hash-id to compare types would allow to make current > >> implementation faster and would > >> > allow handling incomplete types by DWARFLinker without massive > >> performance degradation also. > >> > > >> > >> But the context is known when types are generated. So, no need to > >> spent the time analyzing it. > >> > > >> > >> If types could be compared without analyzing context, then > >> Dwarf-aware linker would work faster. > >> > > >> > >> That is just an idea(not for immediate implementation): If types > >> would be stored in some "type table" > >> > > >> > >> (instead of COMDAT section group) and could be accessed through > >> hash-id(like type units > >> > > >> > >> - then it would be the solution requiring fewer bits to store but > >> allowing to compare types > >> > > >> > >> by hash-id(not analysing context). > >> > >> In this case, size increasing would be small. And processing time > >> could be done faster. > >> > >> > >> > >> this is just an idea and could be discussed separately from the > >> problem of integrating of D74169. > >> > > >> > >> >> 6. -flto=thin > >> > > >> > >> >> That problem was described in this review > >> https://reviews.llvm.org/D54747#1503720. It also exists in > >> > > >> > >> >> current DWARFLinker/dsymutil implementation. I think that problem > >> should be discussed more: it could > >> > > >> > >> >> probably be fixed by avoiding generation of such incomplete > >> declaration during thinlto, > >> > > >> > >> >> That would be costly to produce extra/redundant debug info in > >> ThinLTO - actually ThinLTO could be doing > >> > > >> > >> >> more to reduce that redundancy early on (actually removing > >> definitions from some llvm Modules if the type > >> > > >> > >> >> definition is known to exist in another Module, etc) > >> > >> >I don't know if it's a problem since that patch was reverted. > >> > > >> > >> > >> > > >> > >> Yes. That patch was reverted, but this patch(D74169) has the same > >> problem. > >> > > >> > >> if D74169 would be applied and --gc-debuginfo used then structure type > >> > >> definition would be removed. > >> > > >> > >> DWARFLinker could handle that case - "removing definitions from some > >> llvm Modules if the type > >> > >> definition is known to exist in another Module". > >> > >> i.e. DWARFLinker could replace the declaration with the definition. > >> > > >> > >> But that problem could be more easily resolved when debug info is > >> generated(probably without > >> > >> significant increase of debug info size): > >> > > >> > >> Here we have: > >> > > >> > >> DW_TAG_compile_unit(0x0000000b) - compile unit containing concrete > >> instance for function "f". > >> > >> DW_TAG_compile_unit(0x00000073) - compile unit containing abstract > >> instance root for function "f". > >> > >> DW_TAG_compile_unit(0x000000c1) - compile unit containing function > >> "f" definition. > >> > > >> > >> Code for function "f" was deleted. gc-debuginfo deletes compile unit > >> DW_TAG_compile_unit(0x000000c1) > >> > >> containing "f" definition (since there is no corresponding code). But > >> it has structure "Foo" definition > >> > >> DW_TAG_structure_type(0x0000011e) referenced from > >> DW_TAG_compile_unit(0x00000073) > >> > >> by declaration DW_TAG_structure_type(0x000000ae). That declaration is > >> exactly the case when definition > >> > >> was removed by thinlto and replaced with declaration. > >> > > >> > >> Would it cost too much if type definition would not be replaced with > >> declaration for "abstract instance root"? > >> > >> The number of concrete instances is bigger than number of abstract > >> instance roots. > >> > >> Probably, it would not be too costly to leave definition in abstract > >> instance root? > >> > > >> > > >> > > >> > >> Alternatively, Would it cost too much if type definition would not be > >> replaced with declaration when > >> > >> declaration references type from not used function? (lto could > >> understand that concrete function is not used). > >> > > >> > > >> > >I don't follow this example - could you provide a small concrete test > >> case I could reproduce? > >> > > >> > I would provide a test case if necessary. But it looks like this issue > >> is finally clear, and you already commented on that. > >> > > >> > > Oh, I guess this is happening perhaps because ThinLTO can't know for > >> sure that a standalone > >> > > definition of 'f' won't be needed - so it produces one in case one of > >> the inlining opportunities > >> > > doesn't end up inlining. Then it turns out all calls got inlined, so > >> the external definition wasn't needed. > >> > > >> > > Oh, you're suggesting that these 3 CUs got emitted into one object > >> file during LTO, but that DWARFLinker > >> > > drops a CU without any code in it - even though... So far as I know, > >> in LTO, LLVM directly references > >> > > types across units if the CUs are all emitted in the same object file. > >> (and if they weren't in the same > >> > > object file - then the abstract_origin couldn't be pointing cross-CU). > >> > > >> > > I guess some basic things to say: > >> > > >> > > With ThinLTO, the concrete/standalone function definition is emitted > >> in case some call sites don't end up > >> > > being inlined. So we know it'll be emitted (but might not be needed by > >> the actual linker) > >> > > ANy number of inline calls might exist - but we shouldn't put the type > >> information into those, because > >> > > they aren't guaranteed to emit it (if the inline function gets > >> optimized away, there would be nothing to > >> > > enforce the type being emitted) - and even if we forced the type > >> information to be emitted into one > >> > > object file that has an inline copy of the function - there's no > >> guarantee that object file will get linked in either. > >> > > >> > > So, no, I don't think there's much we can do to keep the size of > >> object files down, while guaranteeing > >> > > the type information will be emitted with the usual linker semantics. > >> > > >> > Then dsymutil/DWARFLinker could be changed to handle that(though it > >> would probably be not very efficient). > >> > If thinlto would understand that function is not used finally(and then > >> must not contain referenced type definition), > >> > then this situation could be handled more effectively. > >> > > >> > Thank you, Alexey. > >> > > >> >>> > >> >>> > >> >>> > >> >>> > >> >>> _______________________________________________ > >> >>> LLVM Developers mailing list > >> >>> llvm-dev at lists.llvm.org > >> >>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > >> _______________________________________________ > >> LLVM Developers mailing list > >> llvm-dev at lists.llvm.org > >> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > >> > > >_______________________________________________ > >LLVM Developers mailing list > >llvm-dev at lists.llvm.org > >https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >
Alexey Lapshin via llvm-dev
2020-Jun-05 11:32 UTC
[llvm-dev] [Debuginfo][DWARF][LLD] Remove obsolete debug info in lld.
>FWIW, I think it's probably best to at least initially frame the >discussion around non-configurable value for the sake of reducing the >scope/possible surface area of the feature/users/etc. I'd probably >only encourage adding the user-configurable flag if/when someone has a >use case for it.I second that: "it's probably best to at least initially frame the discussion around non-configurable value for the sake of reducing the scope/possible surface area of the feature/users/etc". The necessity of using some different concrete value most probably would arise if there is a tool which uses this another value. Until there is a known use case, it would be better to use just: --dead-reloc-addend Thank you, Alexey. On Thu, Jun 4, 2020 at 2:31 PM Fangrui Song <maskray at google.com> wrote:> > > On 2020-06-03, James Henderson via llvm-dev wrote: > >It makes me sad that the linker (via a library or otherwise) has to be > >"DWARF-aware" to be able to effectively handle --gc-sections, COMDATs, > >--icf etc for debug info, without leaving large blocks of data kicking > >around. > > > >The patching to -1 (or equivalent) is probably a good lightweight solution > >(though I'd love it if it could be done based on section type in the future > >rather than section name, but that's probably outside the realm of DWARF), > >as it requires only minimal understanding in the linker, but anything > >beyond that seems to be complicated logic that is mostly due to the > >structure of DWARF. Patching to -1 does feel a bit like a sticking > >plaster/band aid to patch over the issue rather than properly solving it > >too - there will still be debug data (potentially significant amounts in > >COMDAT-heavy objects) that the linker has to write and the debugger has to > >somehow know how to skip (even if it knows that -1 is special-case due to > >the standard being updated, it needs to get as far as the -1), which is all > >wasted effort. > > > >We've already seen from Alexey's prototyping, and from our own experiences > >with the Sony proprietary linker (which tried to rewrite .debug_line only) > >that deconstructing the DWARF so that it can be more optimally reassembled > >at link time is slow going, and will probably inevitably be however much > >effort is put into optimising it. For a start, given the current standards, > >it's impossible to know how to deconstruct it without having to parse vast > >amounts of DWARF, which is typically going to mean a lot more parsing work > >than the linker would normally have to deal with. Additionally, much of > >this parsing work is wasted effort, since it seems unlikely in many links > >that large amounts of the DWARF will be redundant. Having an option to > >opt-in doesn't help much there, since it just means the logic exists > >without most people using it, due to it not being good enough, or > >potentially they don't even know it exists. > > > >I don't have particularly concrete suggestions as to how to solve the > >structural problems with DWARF at this point. The only thing that seems > >obvious to me is a more "blessed" approach to fragmentation of sections, > >similar to what I tried with my prototype mentioned earlier in the thread, > >although we'd need to figure out the previously stated performance issues. > >Other ideas might tie into this, like somehow sharing the various table > >headers a bit like CIEs in .eh_frame that could be merged by the linker - > >each object could have separate table header sections, which are referenced > >by the individual .debug_* blocks, which in turn are one per function/data > >piece and easily discardable/merged by the linker. > > > >Just some thoughts. > > > >James > > Your proposed option --dead-reloc-addend=.debug_info=0xffffffffffffffff > seems like a good idea. (I'd expect it to support signed -1 and -2 for > convenience & consistency in some other places (we sometimes use addends > as signed values)). > > LLD only supports absolute relocation types (plus R_PPC64_DTPREL64 which > can go to .debug_addr, plus R_RISCV_{ADD,SUB}*). > > The computed value is S + A. > We still consider the symbolic value S as zero, but override A with the > supplied option --dead-reloc-addend=.debug_info=-1 > I particularly like that `addend` is part of the option name. > > My mere complaint is that the relocation record is not dead, but rather > its referenced symbol is dead. However, I can't think of a better > name... > > Checked with Martin Storsjö, this option may be useful for other binary > formats supporting DWARF. (binutils does not like ELF-specific options > not called -z foobar). > I think it is fine to add this option to LLD if GNU ld is also happy > with the name. I'll check with them. > > "There is a danger that one community won't accept an extension that > they haven't been involved in the design process for." :) (Coutesy of Peter) > > The built-in rules of the linker are the following: > > --dead-reloc-addend=.debug_loc=-2 > --dead-reloc-addend=.debug_ranges=-2 > --dead-reloc-addend=.debug_*=-1 > > They can be overridden. > > >On Tue, 2 Jun 2020 at 19:24, David Blaikie via llvm-dev < > >llvm-dev at lists.llvm.org> wrote: > > > >> On Tue, May 19, 2020 at 7:17 AM Alexey Lapshin > >> <alapshin at accesssoftek.com> wrote: > >> > > >> > Hi David, please find my comments inside: > >> > > >> > > >> > >>>Broad question: Do you have any specific motivation/users/etc in > >> implementing this (if you can speak about it)? > >> > > >> > >>> - it might help motivate the work, understand what tradeoffs might > >> be suitable for you/your users, etc. > >> > > >> > >>There are two general requirements: > >> > >> 1) Remove (or clean) invalid debug info. > >> > > >> > > > >> > >Perhaps a simpler direct solution for your immediate needs might be a > >> much narrower, > >> > >and more efficient linker-DWARF-awareness feature: > >> > > > >> > > With DWARFv5, rnglists present an opportunity for a DWARF linker to > >> rewrite the ranges > >> > > without parsing the rest of the DWARF. /technically/ this isn't > >> guaranteed - rnglist entries > >> > > can be referenced either directly, or by index. If all rnglists are > >> referenced by index, then > >> > > a linker could parse only the debug_rnglists section and rewrite > >> ranges to remove any > >> > > address ranges that refer to optimized-out code. > >> > > > >> > > This would only be correct for rnglists that had no direct references > >> to them (that only were > >> > > referenced via the indexes) - but we could either implement it with > >> that assumption, or could > >> > > add an LLVM extension attribute on the CU that would say "I promise I > >> only referenced rnglists > >> > > via rnglistx forms/indexes). If this DWARF-aware linking would have to > >> read the CU DIE (not > >> > > all the other DIEs) it /could/ also then rewrite high/low_pc if the CU > >> wasn't using ranges... > >> > > but that wouldn't come up in the function-removal case, because then > >> you'd have ranges anyway, > >> > > so no need for that. > >> > > > >> > > Such a DWARF-aware rnglist linking could also simplify rnglists, in > >> cases where functions > >> > > ended up being laid out next to each other, the linker could coalesce > >> their ranges together. > >> > > > >> > > I imagine this could be implemented with very little overhead to > >> linking, especially compared > >> > > to the overhead of full DWARF-aware linking. > >> > > > >> > >Though none of this fixes Split DWARF, where the linker doesn't get a > >> chance to see the > >> > > addresses being used - but if you only want/need the CU-level ranges > >> to be correct, this > >> > > might be a viable fix, and quite efficient. > >> > > >> > Yes, we think about that alternative. This would resolve our problem of > >> invalid debug info > >> > and would work much faster. Thus, if we would not have good results for > >> D74169 then we > >> > will implement it. Do you think it could be useful to have this solution > >> in upstream? > >> > >> A pure rnglist rewriting - I think it'd be OK to have in upstream - > >> again, cost/benefit/etc would have to be weighed. I'm not sure it > >> would save enough space to be particularly valuable beyond the > >> correctness issue - and it doesn't completely solve the correctness > >> issue for zero-address usage or low-address usage (because you could > >> still have overlapping subprograms inside a CU - so if you were > >> symbolizing you could use the correct rnglist to filter, but then go > >> look inside the CU only to find two subprograms that had that address > >> & not know which one was the correct one an which one was the > >> discarded one). > >> > >> rnglist rewriting might be easy enough to prototype - but depends what > >> you want to spend your time on, I know this whole issue has been a > >> huge investment of your time already - but maybe this recent > >> revitalization of the conversation around having an explicit value in > >> the linker might be sufficient to address everyone's needs... *fingers > >> crossed*) > >> > >> > >> > >> 2) Optimize the DWARF size. > >> > > >> > > >> > > Do your users care much about this? I imagine if they had significant > >> DWARF size issues, > >> > > they'd have significant link time issues and the kind of cost to link > >> time this feature has would > >> > > be prohibitive - but perhaps they're sharing linked binaries much more > >> often than they're > >> > > actually performing linking. > >> > > >> > Yes, they do. They also have significant link-time issues. > >> > So current performance results of D74169 are not very acceptable. > >> > We hope to improve it. > >> > > >> > > >> > > >> > >>The specifics which our users have: > >> > >> - embedded platform which uses 0 as start of .text section. > >> > >> - custom toolset which does not support all features yet(f.e. split > >> dwarf). > >> > >> - tolerant of the link-time increase. > >> > >> - need a useful way to share debug builds. > >> > > >> > > >> > > Sharing two files (executable and dwp) is significantly less useful > >> than sharing one file? > >> > > >> > Probably not significantly, but yes, it looks less useful comparing to > >> D74169. > >> > Having only two files (executable and .dwp) looks significantly better > >> than having executable and multiple .dwo files. > >> > Having only one file(executable) with minimal size looks better than the > >> two files with a bigger size. > >> > > >> > clang compiled with -gsplitdwarf takes 0.9G for executable and 0.9G for > >> .dwp. > >> > clang compiled with -gc-debuginfo takes only 0.76G for single executable. > >> > > >> > > >> > > >> > >>For the first point: we have a problem "Overlapping address ranges > >> starting from 0"(D59553). > >> > > >> > >>We use custom solution, but the general solution like D74169 would be > >> better here. > >> > > >> > > >> > > If CU ranges are the only ones that need fixing, then I think the > >> above solution might be as > >> > > good/better - if more than CU ranges need fixing, then I think we > >> might want to start talking about > >> > > how to fix DWARF itself (split and non-split) to signal certain > >> addresses point to dead code with a > >> > > specific blessed value that linkers would need to implement - because > >> with Split DWARF there's > >> > > no way to solve the non-CU addresses at the linker. > >> > > >> > I think the worthful solution for that signal value would be LowPC > > >> HighPC. > >> > That does not require additional bits in DWARF. > >> > It would be natural to skip such address ranges since they explicitly > >> marked as invalid. > >> > It could be implemented in a linker very easily. Probably, it would make > >> sense to describe that > >> > usage in DWARF standard. > >> > > >> > As to the addresses which are not seen by the linker(since they are in > >> .dwo files) - yes, > >> > they need to have another solution. Could you show an example of such a > >> case, please? > >> > > >> > > >> > > >> > >>>2. Support of type units. > >> > > >> > >>> > >> > > >> > >>>> That could be implemented further. > >> > > >> > >>>Enabling type units increases object size to make it easier to > >> deduplicate at link time by a DWARF-unaware > >> > > >> > >>>linker. With a DWARF aware linker it'd be generally desirable not to > >> have to add that object size overhead to > >> > > >> > >>>get the linking improvements. > >> > > >> > >> > >> > > >> > >>But, DWARFLinker should adequately work with type units since they are > >> already implemented. > >> > > >> > > >> > > Maybe - it'd be nice & all, but I don't think it's an outright > >> necessity - if someone knows they're using > >> > > a DWARF-aware linker, they'd probably not use type units in their > >> object files. It's possible someone > >> > > doesn't know for sure & maybe they have pre-canned debug object files > >> from someone else, etc. > >> > > >> > I see. > >> > > >> > >>Another thing is that the idea behind type units has the potential to > >> help Dwarf-aware linker to work faster. > >> > > >> > >>Currently, DWARFLinker analyzes context to understand whether types > >> are the same or not. > >> > > >> > > >> > >When you say "analyzes context" what do you mean? Usually I'd take that > >> to mean > >> > > "looks at things outside the type itself - like what namespace it's > >> in, etc" - which, yes, > >> > > it should do that, but it doesn't seem very expensive to do. But I > >> guess you actually > >> > > mean something about doing structural equivalence in some way, looking > >> at things inside the type? > >> > > >> > I think it could be useful for both cases. Currently, dsymutil does only > >> first thing > >> > (look at type name, namespace name, etc..) and does not do the second > >> thing > >> > (doing structural equivalence). Analyzing type names is currently quite > >> expensive > >> > (the only search in string pool takes ~10 sec from 70 sec of overall > >> time). > >> > That is expensive because of many things should be done to work with > >> strings: > >> > parse DWARF, search and resolve relocations, compute a hash for strings, > >> > put data into a string pool, create a fully qualified name(like > >> namespace::function::name). > >> > It looks like it could be optimized and finally require less time, but > >> it still would be a noticeable > >> > part of the overall time. > >> > > >> > If dsymutil starts to check for the structural equivalence, then the > >> process would be even more slowly. > >> > So, If instead of comparing types structure, there would be checked > >> single hash-id - then this process > >> > would also be faster. > >> > > >> > Thus I think using hash-id to compare types would allow to make current > >> implementation faster and would > >> > allow handling incomplete types by DWARFLinker without massive > >> performance degradation also. > >> > > >> > >> But the context is known when types are generated. So, no need to > >> spent the time analyzing it. > >> > > >> > >> If types could be compared without analyzing context, then > >> Dwarf-aware linker would work faster. > >> > > >> > >> That is just an idea(not for immediate implementation): If types > >> would be stored in some "type table" > >> > > >> > >> (instead of COMDAT section group) and could be accessed through > >> hash-id(like type units > >> > > >> > >> - then it would be the solution requiring fewer bits to store but > >> allowing to compare types > >> > > >> > >> by hash-id(not analysing context). > >> > >> In this case, size increasing would be small. And processing time > >> could be done faster. > >> > >> > >> > >> this is just an idea and could be discussed separately from the > >> problem of integrating of D74169. > >> > > >> > >> >> 6. -flto=thin > >> > > >> > >> >> That problem was described in this review > >> https://reviews.llvm.org/D54747#1503720. It also exists in > >> > > >> > >> >> current DWARFLinker/dsymutil implementation. I think that problem > >> should be discussed more: it could > >> > > >> > >> >> probably be fixed by avoiding generation of such incomplete > >> declaration during thinlto, > >> > > >> > >> >> That would be costly to produce extra/redundant debug info in > >> ThinLTO - actually ThinLTO could be doing > >> > > >> > >> >> more to reduce that redundancy early on (actually removing > >> definitions from some llvm Modules if the type > >> > > >> > >> >> definition is known to exist in another Module, etc) > >> > >> >I don't know if it's a problem since that patch was reverted. > >> > > >> > >> > >> > > >> > >> Yes. That patch was reverted, but this patch(D74169) has the same > >> problem. > >> > > >> > >> if D74169 would be applied and --gc-debuginfo used then structure type > >> > >> definition would be removed. > >> > > >> > >> DWARFLinker could handle that case - "removing definitions from some > >> llvm Modules if the type > >> > >> definition is known to exist in another Module". > >> > >> i.e. DWARFLinker could replace the declaration with the definition. > >> > > >> > >> But that problem could be more easily resolved when debug info is > >> generated(probably without > >> > >> significant increase of debug info size): > >> > > >> > >> Here we have: > >> > > >> > >> DW_TAG_compile_unit(0x0000000b) - compile unit containing concrete > >> instance for function "f". > >> > >> DW_TAG_compile_unit(0x00000073) - compile unit containing abstract > >> instance root for function "f". > >> > >> DW_TAG_compile_unit(0x000000c1) - compile unit containing function > >> "f" definition. > >> > > >> > >> Code for function "f" was deleted. gc-debuginfo deletes compile unit > >> DW_TAG_compile_unit(0x000000c1) > >> > >> containing "f" definition (since there is no corresponding code). But > >> it has structure "Foo" definition > >> > >> DW_TAG_structure_type(0x0000011e) referenced from > >> DW_TAG_compile_unit(0x00000073) > >> > >> by declaration DW_TAG_structure_type(0x000000ae). That declaration is > >> exactly the case when definition > >> > >> was removed by thinlto and replaced with declaration. > >> > > >> > >> Would it cost too much if type definition would not be replaced with > >> declaration for "abstract instance root"? > >> > >> The number of concrete instances is bigger than number of abstract > >> instance roots. > >> > >> Probably, it would not be too costly to leave definition in abstract > >> instance root? > >> > > >> > > >> > > >> > >> Alternatively, Would it cost too much if type definition would not be > >> replaced with declaration when > >> > >> declaration references type from not used function? (lto could > >> understand that concrete function is not used). > >> > > >> > > >> > >I don't follow this example - could you provide a small concrete test > >> case I could reproduce? > >> > > >> > I would provide a test case if necessary. But it looks like this issue > >> is finally clear, and you already commented on that. > >> > > >> > > Oh, I guess this is happening perhaps because ThinLTO can't know for > >> sure that a standalone > >> > > definition of 'f' won't be needed - so it produces one in case one of > >> the inlining opportunities > >> > > doesn't end up inlining. Then it turns out all calls got inlined, so > >> the external definition wasn't needed. > >> > > >> > > Oh, you're suggesting that these 3 CUs got emitted into one object > >> file during LTO, but that DWARFLinker > >> > > drops a CU without any code in it - even though... So far as I know, > >> in LTO, LLVM directly references > >> > > types across units if the CUs are all emitted in the same object file. > >> (and if they weren't in the same > >> > > object file - then the abstract_origin couldn't be pointing cross-CU). > >> > > >> > > I guess some basic things to say: > >> > > >> > > With ThinLTO, the concrete/standalone function definition is emitted > >> in case some call sites don't end up > >> > > being inlined. So we know it'll be emitted (but might not be needed by > >> the actual linker) > >> > > ANy number of inline calls might exist - but we shouldn't put the type > >> information into those, because > >> > > they aren't guaranteed to emit it (if the inline function gets > >> optimized away, there would be nothing to > >> > > enforce the type being emitted) - and even if we forced the type > >> information to be emitted into one > >> > > object file that has an inline copy of the function - there's no > >> guarantee that object file will get linked in either. > >> > > >> > > So, no, I don't think there's much we can do to keep the size of > >> object files down, while guaranteeing > >> > > the type information will be emitted with the usual linker semantics. > >> > > >> > Then dsymutil/DWARFLinker could be changed to handle that(though it > >> would probably be not very efficient). > >> > If thinlto would understand that function is not used finally(and then > >> must not contain referenced type definition), > >> > then this situation could be handled more effectively. > >> > > >> > Thank you, Alexey. > >> > > >> >>> > >> >>> > >> >>> > >> >>> > >> >>> _______________________________________________ > >> >>> LLVM Developers mailing list > >> >>> llvm-dev at lists.llvm.org > >> >>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > >> _______________________________________________ > >> LLVM Developers mailing list > >> llvm-dev at lists.llvm.org > >> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > >> > > >_______________________________________________ > >LLVM Developers mailing list > >llvm-dev at lists.llvm.org > >https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >_______________________________________________ LLVM Developers mailing list llvm-dev at lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Alexey Lapshin via llvm-dev
2020-Jun-05 12:03 UTC
[llvm-dev] [Debuginfo][DWARF][LLD] Remove obsolete debug info in lld.
small typo in previous letter:>>FWIW, I think it's probably best to at least initially frame the >>discussion around non-configurable value for the sake of reducing the >>scope/possible surface area of the feature/users/etc. I'd probably >>only encourage adding the user-configurable flag if/when someone has a >>use case for it.>I second that: "it's probably best to at least initially frame the >discussion around non-configurable value for the sake of reducing the >scope/possible surface area of the feature/users/etc".>The necessity of using some different concrete value most probably >would arise if there is a tool which uses this another value. >Until there is a known use case, it would be better to use just:>--dead-reloc-addend--dead-reloc-addend=<value>>Thank you, Alexey.On Thu, Jun 4, 2020 at 2:31 PM Fangrui Song <maskray at google.com> wrote:> > > On 2020-06-03, James Henderson via llvm-dev wrote: > >It makes me sad that the linker (via a library or otherwise) has to be > >"DWARF-aware" to be able to effectively handle --gc-sections, COMDATs, > >--icf etc for debug info, without leaving large blocks of data kicking > >around. > > > >The patching to -1 (or equivalent) is probably a good lightweight solution > >(though I'd love it if it could be done based on section type in the future > >rather than section name, but that's probably outside the realm of DWARF), > >as it requires only minimal understanding in the linker, but anything > >beyond that seems to be complicated logic that is mostly due to the > >structure of DWARF. Patching to -1 does feel a bit like a sticking > >plaster/band aid to patch over the issue rather than properly solving it > >too - there will still be debug data (potentially significant amounts in > >COMDAT-heavy objects) that the linker has to write and the debugger has to > >somehow know how to skip (even if it knows that -1 is special-case due to > >the standard being updated, it needs to get as far as the -1), which is all > >wasted effort. > > > >We've already seen from Alexey's prototyping, and from our own experiences > >with the Sony proprietary linker (which tried to rewrite .debug_line only) > >that deconstructing the DWARF so that it can be more optimally reassembled > >at link time is slow going, and will probably inevitably be however much > >effort is put into optimising it. For a start, given the current standards, > >it's impossible to know how to deconstruct it without having to parse vast > >amounts of DWARF, which is typically going to mean a lot more parsing work > >than the linker would normally have to deal with. Additionally, much of > >this parsing work is wasted effort, since it seems unlikely in many links > >that large amounts of the DWARF will be redundant. Having an option to > >opt-in doesn't help much there, since it just means the logic exists > >without most people using it, due to it not being good enough, or > >potentially they don't even know it exists. > > > >I don't have particularly concrete suggestions as to how to solve the > >structural problems with DWARF at this point. The only thing that seems > >obvious to me is a more "blessed" approach to fragmentation of sections, > >similar to what I tried with my prototype mentioned earlier in the thread, > >although we'd need to figure out the previously stated performance issues. > >Other ideas might tie into this, like somehow sharing the various table > >headers a bit like CIEs in .eh_frame that could be merged by the linker - > >each object could have separate table header sections, which are referenced > >by the individual .debug_* blocks, which in turn are one per function/data > >piece and easily discardable/merged by the linker. > > > >Just some thoughts. > > > >James > > Your proposed option --dead-reloc-addend=.debug_info=0xffffffffffffffff > seems like a good idea. (I'd expect it to support signed -1 and -2 for > convenience & consistency in some other places (we sometimes use addends > as signed values)). > > LLD only supports absolute relocation types (plus R_PPC64_DTPREL64 which > can go to .debug_addr, plus R_RISCV_{ADD,SUB}*). > > The computed value is S + A. > We still consider the symbolic value S as zero, but override A with the > supplied option --dead-reloc-addend=.debug_info=-1 > I particularly like that `addend` is part of the option name. > > My mere complaint is that the relocation record is not dead, but rather > its referenced symbol is dead. However, I can't think of a better > name... > > Checked with Martin Storsjö, this option may be useful for other binary > formats supporting DWARF. (binutils does not like ELF-specific options > not called -z foobar). > I think it is fine to add this option to LLD if GNU ld is also happy > with the name. I'll check with them. > > "There is a danger that one community won't accept an extension that > they haven't been involved in the design process for." :) (Coutesy of Peter) > > The built-in rules of the linker are the following: > > --dead-reloc-addend=.debug_loc=-2 > --dead-reloc-addend=.debug_ranges=-2 > --dead-reloc-addend=.debug_*=-1 > > They can be overridden. > > >On Tue, 2 Jun 2020 at 19:24, David Blaikie via llvm-dev < > >llvm-dev at lists.llvm.org> wrote: > > > >> On Tue, May 19, 2020 at 7:17 AM Alexey Lapshin > >> <alapshin at accesssoftek.com> wrote: > >> > > >> > Hi David, please find my comments inside: > >> > > >> > > >> > >>>Broad question: Do you have any specific motivation/users/etc in > >> implementing this (if you can speak about it)? > >> > > >> > >>> - it might help motivate the work, understand what tradeoffs might > >> be suitable for you/your users, etc. > >> > > >> > >>There are two general requirements: > >> > >> 1) Remove (or clean) invalid debug info. > >> > > >> > > > >> > >Perhaps a simpler direct solution for your immediate needs might be a > >> much narrower, > >> > >and more efficient linker-DWARF-awareness feature: > >> > > > >> > > With DWARFv5, rnglists present an opportunity for a DWARF linker to > >> rewrite the ranges > >> > > without parsing the rest of the DWARF. /technically/ this isn't > >> guaranteed - rnglist entries > >> > > can be referenced either directly, or by index. If all rnglists are > >> referenced by index, then > >> > > a linker could parse only the debug_rnglists section and rewrite > >> ranges to remove any > >> > > address ranges that refer to optimized-out code. > >> > > > >> > > This would only be correct for rnglists that had no direct references > >> to them (that only were > >> > > referenced via the indexes) - but we could either implement it with > >> that assumption, or could > >> > > add an LLVM extension attribute on the CU that would say "I promise I > >> only referenced rnglists > >> > > via rnglistx forms/indexes). If this DWARF-aware linking would have to > >> read the CU DIE (not > >> > > all the other DIEs) it /could/ also then rewrite high/low_pc if the CU > >> wasn't using ranges... > >> > > but that wouldn't come up in the function-removal case, because then > >> you'd have ranges anyway, > >> > > so no need for that. > >> > > > >> > > Such a DWARF-aware rnglist linking could also simplify rnglists, in > >> cases where functions > >> > > ended up being laid out next to each other, the linker could coalesce > >> their ranges together. > >> > > > >> > > I imagine this could be implemented with very little overhead to > >> linking, especially compared > >> > > to the overhead of full DWARF-aware linking. > >> > > > >> > >Though none of this fixes Split DWARF, where the linker doesn't get a > >> chance to see the > >> > > addresses being used - but if you only want/need the CU-level ranges > >> to be correct, this > >> > > might be a viable fix, and quite efficient. > >> > > >> > Yes, we think about that alternative. This would resolve our problem of > >> invalid debug info > >> > and would work much faster. Thus, if we would not have good results for > >> D74169 then we > >> > will implement it. Do you think it could be useful to have this solution > >> in upstream? > >> > >> A pure rnglist rewriting - I think it'd be OK to have in upstream - > >> again, cost/benefit/etc would have to be weighed. I'm not sure it > >> would save enough space to be particularly valuable beyond the > >> correctness issue - and it doesn't completely solve the correctness > >> issue for zero-address usage or low-address usage (because you could > >> still have overlapping subprograms inside a CU - so if you were > >> symbolizing you could use the correct rnglist to filter, but then go > >> look inside the CU only to find two subprograms that had that address > >> & not know which one was the correct one an which one was the > >> discarded one). > >> > >> rnglist rewriting might be easy enough to prototype - but depends what > >> you want to spend your time on, I know this whole issue has been a > >> huge investment of your time already - but maybe this recent > >> revitalization of the conversation around having an explicit value in > >> the linker might be sufficient to address everyone's needs... *fingers > >> crossed*) > >> > >> > >> > >> 2) Optimize the DWARF size. > >> > > >> > > >> > > Do your users care much about this? I imagine if they had significant > >> DWARF size issues, > >> > > they'd have significant link time issues and the kind of cost to link > >> time this feature has would > >> > > be prohibitive - but perhaps they're sharing linked binaries much more > >> often than they're > >> > > actually performing linking. > >> > > >> > Yes, they do. They also have significant link-time issues. > >> > So current performance results of D74169 are not very acceptable. > >> > We hope to improve it. > >> > > >> > > >> > > >> > >>The specifics which our users have: > >> > >> - embedded platform which uses 0 as start of .text section. > >> > >> - custom toolset which does not support all features yet(f.e. split > >> dwarf). > >> > >> - tolerant of the link-time increase. > >> > >> - need a useful way to share debug builds. > >> > > >> > > >> > > Sharing two files (executable and dwp) is significantly less useful > >> than sharing one file? > >> > > >> > Probably not significantly, but yes, it looks less useful comparing to > >> D74169. > >> > Having only two files (executable and .dwp) looks significantly better > >> than having executable and multiple .dwo files. > >> > Having only one file(executable) with minimal size looks better than the > >> two files with a bigger size. > >> > > >> > clang compiled with -gsplitdwarf takes 0.9G for executable and 0.9G for > >> .dwp. > >> > clang compiled with -gc-debuginfo takes only 0.76G for single executable. > >> > > >> > > >> > > >> > >>For the first point: we have a problem "Overlapping address ranges > >> starting from 0"(D59553). > >> > > >> > >>We use custom solution, but the general solution like D74169 would be > >> better here. > >> > > >> > > >> > > If CU ranges are the only ones that need fixing, then I think the > >> above solution might be as > >> > > good/better - if more than CU ranges need fixing, then I think we > >> might want to start talking about > >> > > how to fix DWARF itself (split and non-split) to signal certain > >> addresses point to dead code with a > >> > > specific blessed value that linkers would need to implement - because > >> with Split DWARF there's > >> > > no way to solve the non-CU addresses at the linker. > >> > > >> > I think the worthful solution for that signal value would be LowPC > > >> HighPC. > >> > That does not require additional bits in DWARF. > >> > It would be natural to skip such address ranges since they explicitly > >> marked as invalid. > >> > It could be implemented in a linker very easily. Probably, it would make > >> sense to describe that > >> > usage in DWARF standard. > >> > > >> > As to the addresses which are not seen by the linker(since they are in > >> .dwo files) - yes, > >> > they need to have another solution. Could you show an example of such a > >> case, please? > >> > > >> > > >> > > >> > >>>2. Support of type units. > >> > > >> > >>> > >> > > >> > >>>> That could be implemented further. > >> > > >> > >>>Enabling type units increases object size to make it easier to > >> deduplicate at link time by a DWARF-unaware > >> > > >> > >>>linker. With a DWARF aware linker it'd be generally desirable not to > >> have to add that object size overhead to > >> > > >> > >>>get the linking improvements. > >> > > >> > >> > >> > > >> > >>But, DWARFLinker should adequately work with type units since they are > >> already implemented. > >> > > >> > > >> > > Maybe - it'd be nice & all, but I don't think it's an outright > >> necessity - if someone knows they're using > >> > > a DWARF-aware linker, they'd probably not use type units in their > >> object files. It's possible someone > >> > > doesn't know for sure & maybe they have pre-canned debug object files > >> from someone else, etc. > >> > > >> > I see. > >> > > >> > >>Another thing is that the idea behind type units has the potential to > >> help Dwarf-aware linker to work faster. > >> > > >> > >>Currently, DWARFLinker analyzes context to understand whether types > >> are the same or not. > >> > > >> > > >> > >When you say "analyzes context" what do you mean? Usually I'd take that > >> to mean > >> > > "looks at things outside the type itself - like what namespace it's > >> in, etc" - which, yes, > >> > > it should do that, but it doesn't seem very expensive to do. But I > >> guess you actually > >> > > mean something about doing structural equivalence in some way, looking > >> at things inside the type? > >> > > >> > I think it could be useful for both cases. Currently, dsymutil does only > >> first thing > >> > (look at type name, namespace name, etc..) and does not do the second > >> thing > >> > (doing structural equivalence). Analyzing type names is currently quite > >> expensive > >> > (the only search in string pool takes ~10 sec from 70 sec of overall > >> time). > >> > That is expensive because of many things should be done to work with > >> strings: > >> > parse DWARF, search and resolve relocations, compute a hash for strings, > >> > put data into a string pool, create a fully qualified name(like > >> namespace::function::name). > >> > It looks like it could be optimized and finally require less time, but > >> it still would be a noticeable > >> > part of the overall time. > >> > > >> > If dsymutil starts to check for the structural equivalence, then the > >> process would be even more slowly. > >> > So, If instead of comparing types structure, there would be checked > >> single hash-id - then this process > >> > would also be faster. > >> > > >> > Thus I think using hash-id to compare types would allow to make current > >> implementation faster and would > >> > allow handling incomplete types by DWARFLinker without massive > >> performance degradation also. > >> > > >> > >> But the context is known when types are generated. So, no need to > >> spent the time analyzing it. > >> > > >> > >> If types could be compared without analyzing context, then > >> Dwarf-aware linker would work faster. > >> > > >> > >> That is just an idea(not for immediate implementation): If types > >> would be stored in some "type table" > >> > > >> > >> (instead of COMDAT section group) and could be accessed through > >> hash-id(like type units > >> > > >> > >> - then it would be the solution requiring fewer bits to store but > >> allowing to compare types > >> > > >> > >> by hash-id(not analysing context). > >> > >> In this case, size increasing would be small. And processing time > >> could be done faster. > >> > >> > >> > >> this is just an idea and could be discussed separately from the > >> problem of integrating of D74169. > >> > > >> > >> >> 6. -flto=thin > >> > > >> > >> >> That problem was described in this review > >> https://reviews.llvm.org/D54747#1503720. It also exists in > >> > > >> > >> >> current DWARFLinker/dsymutil implementation. I think that problem > >> should be discussed more: it could > >> > > >> > >> >> probably be fixed by avoiding generation of such incomplete > >> declaration during thinlto, > >> > > >> > >> >> That would be costly to produce extra/redundant debug info in > >> ThinLTO - actually ThinLTO could be doing > >> > > >> > >> >> more to reduce that redundancy early on (actually removing > >> definitions from some llvm Modules if the type > >> > > >> > >> >> definition is known to exist in another Module, etc) > >> > >> >I don't know if it's a problem since that patch was reverted. > >> > > >> > >> > >> > > >> > >> Yes. That patch was reverted, but this patch(D74169) has the same > >> problem. > >> > > >> > >> if D74169 would be applied and --gc-debuginfo used then structure type > >> > >> definition would be removed. > >> > > >> > >> DWARFLinker could handle that case - "removing definitions from some > >> llvm Modules if the type > >> > >> definition is known to exist in another Module". > >> > >> i.e. DWARFLinker could replace the declaration with the definition. > >> > > >> > >> But that problem could be more easily resolved when debug info is > >> generated(probably without > >> > >> significant increase of debug info size): > >> > > >> > >> Here we have: > >> > > >> > >> DW_TAG_compile_unit(0x0000000b) - compile unit containing concrete > >> instance for function "f". > >> > >> DW_TAG_compile_unit(0x00000073) - compile unit containing abstract > >> instance root for function "f". > >> > >> DW_TAG_compile_unit(0x000000c1) - compile unit containing function > >> "f" definition. > >> > > >> > >> Code for function "f" was deleted. gc-debuginfo deletes compile unit > >> DW_TAG_compile_unit(0x000000c1) > >> > >> containing "f" definition (since there is no corresponding code). But > >> it has structure "Foo" definition > >> > >> DW_TAG_structure_type(0x0000011e) referenced from > >> DW_TAG_compile_unit(0x00000073) > >> > >> by declaration DW_TAG_structure_type(0x000000ae). That declaration is > >> exactly the case when definition > >> > >> was removed by thinlto and replaced with declaration. > >> > > >> > >> Would it cost too much if type definition would not be replaced with > >> declaration for "abstract instance root"? > >> > >> The number of concrete instances is bigger than number of abstract > >> instance roots. > >> > >> Probably, it would not be too costly to leave definition in abstract > >> instance root? > >> > > >> > > >> > > >> > >> Alternatively, Would it cost too much if type definition would not be > >> replaced with declaration when > >> > >> declaration references type from not used function? (lto could > >> understand that concrete function is not used). > >> > > >> > > >> > >I don't follow this example - could you provide a small concrete test > >> case I could reproduce? > >> > > >> > I would provide a test case if necessary. But it looks like this issue > >> is finally clear, and you already commented on that. > >> > > >> > > Oh, I guess this is happening perhaps because ThinLTO can't know for > >> sure that a standalone > >> > > definition of 'f' won't be needed - so it produces one in case one of > >> the inlining opportunities > >> > > doesn't end up inlining. Then it turns out all calls got inlined, so > >> the external definition wasn't needed. > >> > > >> > > Oh, you're suggesting that these 3 CUs got emitted into one object > >> file during LTO, but that DWARFLinker > >> > > drops a CU without any code in it - even though... So far as I know, > >> in LTO, LLVM directly references > >> > > types across units if the CUs are all emitted in the same object file. > >> (and if they weren't in the same > >> > > object file - then the abstract_origin couldn't be pointing cross-CU). > >> > > >> > > I guess some basic things to say: > >> > > >> > > With ThinLTO, the concrete/standalone function definition is emitted > >> in case some call sites don't end up > >> > > being inlined. So we know it'll be emitted (but might not be needed by > >> the actual linker) > >> > > ANy number of inline calls might exist - but we shouldn't put the type > >> information into those, because > >> > > they aren't guaranteed to emit it (if the inline function gets > >> optimized away, there would be nothing to > >> > > enforce the type being emitted) - and even if we forced the type > >> information to be emitted into one > >> > > object file that has an inline copy of the function - there's no > >> guarantee that object file will get linked in either. > >> > > >> > > So, no, I don't think there's much we can do to keep the size of > >> object files down, while guaranteeing > >> > > the type information will be emitted with the usual linker semantics. > >> > > >> > Then dsymutil/DWARFLinker could be changed to handle that(though it > >> would probably be not very efficient). > >> > If thinlto would understand that function is not used finally(and then > >> must not contain referenced type definition), > >> > then this situation could be handled more effectively. > >> > > >> > Thank you, Alexey. > >> > > >> >>> > >> >>> > >> >>> > >> >>> > >> >>> _______________________________________________ > >> >>> LLVM Developers mailing list > >> >>> llvm-dev at lists.llvm.org > >> >>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > >> _______________________________________________ > >> LLVM Developers mailing list > >> llvm-dev at lists.llvm.org > >> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > >> > > >_______________________________________________ > >LLVM Developers mailing list > >llvm-dev at lists.llvm.org > >https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >_______________________________________________ LLVM Developers mailing list llvm-dev at lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev _______________________________________________ LLVM Developers mailing list llvm-dev at lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev