Alexander Yermolovich via llvm-dev
2021-Jul-28 19:49 UTC
[llvm-dev] [RFC][Dwarf Library] Relocations for DWO sections
Somewhat different topic.
Have you seen multiple Skeleton CUs having the exact same DWO ID, and pointing to two different .dwo files with the exact same debug information? This is produced by ThinLTO. My understanding is that this is not legal.

P.S. I updated the patch, https://reviews.llvm.org/D106624, with all suggestions. I kept it as one patch for now, until all parts are in. Then I can break it up.

________________________________
From: David Blaikie <dblaikie at gmail.com>
Sent: Monday, July 26, 2021 10:04 AM
To: Alexander Yermolovich <ayermolo at fb.com>
Cc: llvm-dev at lists.llvm.org <llvm-dev at lists.llvm.org>
Subject: Re: [llvm-dev] [RFC][Dwarf Library] Relocations for DWO sections

On Mon, Jul 26, 2021 at 12:58 PM Alexander Yermolovich <ayermolo at fb.com> wrote:

> Haven't seen overflows in Split DWARF yet

Careful, as they're totally silent (at least with gold dwp, and probably also with llvm dwp) - the str_offsets get overflowed values, and then when the data is read by the DWARF consumer, the strings end up corrupted - because you're reading from arbitrary/incorrect offsets.

> , but thanks for letting me know, and the links to discussions. Is there a plan to productize either one or both?

Yep, the plan on both counts is to upstream them. I have the simplified template names implementation on the go at the moment - adding a flag to clang that implements the functionality, but also implements a "mangled" mode, where if a name should be able to be simplified, it's instead emitted in full with a special prefix ("_STN") - and then the consumer can attempt to reconstitute that name and compare it against the name provided (& the llvm-dwarfdump --verify mode does this checking and fails if they don't match). So I'm going through lots of cases, either adding the rebuilding logic that's needed, or modifying the frontend not to simplify/mark certain names that can't be rebuilt.

> For us, in monolithic format, it was .debug_info that was growing too large, with relocations failing into or out of it. The .debug_aranges relocations point into it, and I don't quite remember off the top of my head which section the outgoing relocations pointed into. I think it was .debug_loc

Huh, fascinating. Good to know!

- Dave

> Alex

________________________________
From: David Blaikie <dblaikie at gmail.com>
Sent: Friday, July 23, 2021 11:58 AM
To: Alexander Yermolovich <ayermolo at fb.com>
Cc: llvm-dev at lists.llvm.org <llvm-dev at lists.llvm.org>
Subject: Re: [llvm-dev] [RFC][Dwarf Library] Relocations for DWO sections

On Fri, Jul 23, 2021 at 1:18 PM Alexander Yermolovich <ayermolo at fb.com> wrote:

> Thanks for replying in the patch. Left my reply.
> We are using it to deal with dwarf relocation overflows.

Ah, that's good to know. FWIW we've started to hit some overflows even in Split DWARF on larger binaries (and/or those making especially heavy use of expression templates - creating an exceptional amount of DWARF/long symbol names).

A couple of ideas to address this particular overflow (which section(s) did you manage to overflow? We're dealing with .debug_str[.dwo] overflow in particular) that I'm looking into are:

Simplified template names ( https://lists.llvm.org/pipermail/llvm-dev/2021-June/150903.html ) - emit only the base name ("foo") of a template rather than all the template parameters ("foo<int>") - and then reconstruct the full name by using the DW_TAG_template_type_parameter entries, etc.

Reconstituted Mangled names ( https://groups.google.com/g/llvm-dev/c/2jMqDjdChuQ/m/HpOpWy8pAwAJ ) - skip mangled names when they can be reconstituted from the DWARF structural representation (eg: "void f1(int) { }" -> "_Z2f1i" but we could build the latter from DWARF's representation that says f1 has one "int" parameter).

> We considered DWARF64, but split dwarf seems like a more traveled path. As for single vs split my understanding is that single plays nicer with our build system ATM.

Ah, fair enough.

________________________________
From: David Blaikie <dblaikie at gmail.com>
Sent: Friday, July 23, 2021 7:41 AM
To: Alexander Yermolovich <ayermolo at fb.com>
Cc: llvm-dev at lists.llvm.org <llvm-dev at lists.llvm.org>
Subject: Re: [llvm-dev] [RFC][Dwarf Library] Relocations for DWO sections

General premise sounds correct to me (that we shouldn't be processing those sections, etc). I've replied to the patch - thanks for taking a look at this!

(out of curiosity: What are you using Split DWARF single mode for (if you can speak to the application)?)

On Thu, Jul 22, 2021 at 9:10 PM Alexander Yermolovich via llvm-dev <llvm-dev at lists.llvm.org> wrote:

> Hello
>
> I observed that when a DWARF Context is created for a DWO object (split dwarf single mode), relocations for .debug_info are processed and stored in a map. This adds quite a bit of memory overhead. This doesn't seem like it is needed for a DWO Context. The context is created through the API DWARFContext::getDWOContext. Am I missing something?
>
> Illustrative patch to fix this: https://reviews.llvm.org/D106624
>
> Thank you,
> Alex

_______________________________________________
LLVM Developers mailing list
llvm-dev at lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
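The silent string-offset overflow described in the thread can be sketched as follows. This is only an illustration under stated assumptions: 32-bit DWARF, where each offset is stored in 4 bytes, and a writer that truncates rather than checks. The function names (`store_offset32`, `load_offset32`, `checked_store_offset32`) are hypothetical and do not reflect how gold dwp or llvm-dwp are actually implemented.

```python
import struct

OFFSET_SIZE = 4                              # 32-bit DWARF: offsets occupy 4 bytes
MAX_OFFSET = (1 << (8 * OFFSET_SIZE)) - 1    # largest representable offset


def store_offset32(offset: int) -> bytes:
    """Hypothetical writer that silently truncates to 32 bits."""
    return struct.pack("<I", offset & MAX_OFFSET)


def load_offset32(raw: bytes) -> int:
    """Read back a little-endian 32-bit offset."""
    return struct.unpack("<I", raw)[0]


def checked_store_offset32(offset: int) -> bytes:
    """Hypothetical writer that reports the overflow instead of wrapping."""
    if offset > MAX_OFFSET:
        raise OverflowError(f"offset {offset:#x} does not fit in {OFFSET_SIZE} bytes")
    return struct.pack("<I", offset)


# A string that really lives just past the 4 GiB mark of the string section.
true_offset = (1 << 32) + 0x10
read_back = load_offset32(store_offset32(true_offset))
print(hex(read_back))  # 0x10: the consumer now reads an unrelated string
```

A consumer following the truncated offset lands on whatever string happens to sit at 0x10, which matches the corrupted-string symptom described above. The checked variant shows the kind of diagnostic that would make the failure visible.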
David Blaikie via llvm-dev
2021-Jul-28 20:12 UTC
[llvm-dev] [RFC][Dwarf Library] Relocations for DWO sections
On Wed, Jul 28, 2021 at 12:49 PM Alexander Yermolovich <ayermolo at fb.com> wrote:

> Somewhat different topic.
> Have you seen multiple Skeleton CUs having the exact same DWO ID, and pointing to
> two different .dwo files with the exact same debug information? This is
> produced by ThinLTO. My understanding is that this is not legal.

Yeah, we've seen that a few times. I have yet to see it due to a compiler bug - so far it has been due to "interesting" builds. The most recent was a case of using "gmlt"/g1+split DWARF - but where two files differ only by global variables (since global variables don't have any DWARF emission under -g1, the only thing remaining was a static function with the same name in both files, so the hashes were identical). I've not decided what to do about that yet (it's not high on my list to think about, but rolling around there from time to time) - maybe to just allow this/no longer warn in the dwp tool if there are these cases where it's reasonable/correct to have duplicate units...

There were some other cases I came across years ago (both old and new cases were in ffmpeg, FWIW - they build the same files multiple times with different defines (to emit wide and narrow versions of the interface, I think?), so it's easier for the split unit/hash to become identical due to that idiom). I think those did involve fixes to ffmpeg's build so that it no longer built files that ended up basically empty (& thus had identical split units) when certain features weren't enabled...

Partly the issue is that the dwp tool doesn't have the same behavior as the real linker - it doesn't discard things where the real linker discards them because they're unreferenced. Arguably that build could be fixed to avoid the duplication or empty object files (eg: an object file containing only a global ctor - if that's in a library, it'll never actually be used/linked in).

> P.S. I updated the patch, https://reviews.llvm.org/D106624, with all
> suggestions. I kept it as one patch for now, until all parts are in. Then I
> can break it up.

Yeah, I'm looking at it, but getting myself a bit muddled up. See if I can get my head on straight.
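The duplicate-DWO-ID situation discussed above can be sketched as a simple grouping check. This is a minimal sketch, not the dwp tool's actual logic: the function `find_duplicate_dwo_ids`, the IDs, and the file names are all hypothetical, and the input is assumed to be a list of (DWO ID, .dwo path) pairs already extracted from the skeleton CUs.

```python
from collections import defaultdict


def find_duplicate_dwo_ids(units):
    """Group (dwo_id, dwo_path) pairs and return the IDs seen more than once.

    Returns a dict mapping each duplicated DWO ID to the list of paths
    that claimed it, in input order.
    """
    by_id = defaultdict(list)
    for dwo_id, path in units:
        by_id[dwo_id].append(path)
    return {dwo_id: paths for dwo_id, paths in by_id.items() if len(paths) > 1}


# Made-up data: two objects whose split units hashed to the same ID.
units = [
    (0x1A2B3C4D5E6F7081, "wide/codec.dwo"),
    (0x0000000000000042, "util.dwo"),
    (0x1A2B3C4D5E6F7081, "narrow/codec.dwo"),
]

for dwo_id, paths in find_duplicate_dwo_ids(units).items():
    print(f"duplicate DWO ID {dwo_id:#018x}: {', '.join(paths)}")
```

Whether such a duplicate should be an error, a warning, or silently accepted when the unit contents are identical is exactly the open question raised in the message above; the sketch only detects the condition.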