David Blaikie via llvm-dev
2021-Feb-16 06:09 UTC
[llvm-dev] Extracting LocList address ranges from DWO .debug_info
This stuff is a bit ad-hoc at best. I believe some of these APIs have been generalized enough to be usable for your use case, but it might be at a lower level. Specifically, I think the loclist infrastructure is used by lldb when parsing DWARFv5, but it might be used without some of the LLVM DWARF unit abstractions you're using. (Those abstractions are used in llvm-dwarfdump, which often isn't dealing with both .o and .dwo, but only dumping one of the files and doing what it can, or sometimes dumping one file containing both sets of sections, in which case it can do some address lookup, etc., more conveniently.)

On Fri, Feb 12, 2021 at 6:07 PM Alexander Yermolovich via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> Hello
>
> I am wondering if this is a bug, or more likely something I am doing wrong/using the wrong APIs.
> I have binary A and object file A.o, compiled with Clang debug fission single mode, so the .dwo sections are in the object file. Although with split mode it would be the same behavior.
> Relevant parts of the code:
>   for (const auto &CU : DwCtx->compile_units()) {
>     auto *const DwarfUnit = CU.get();
>     if (llvm::Optional<uint64_t> DWOId = DwarfUnit->getDWOId()) {
>       auto *CUDWO = static_cast<DWARFCompileUnit*>(DwarfUnit->getNonSkeletonUnitDIE(false).getDwarfUnit());
>       ...
>     }
>   }
>
> Later in the code I iterate over DIEs for .debug_info.dwo and call
>   DIE.getLocations(dwarf::DW_AT_location);
>
> Alternatively, I can manually extract the offset and call
>   CUnit->findLoclistFromOffset(Offset);
>
> It fails because it tries to look up the address using a DWARFUnit in NormalUnits that it extracts from A.o.
> Under the hood visitAbsoluteLocationList is called with getAddrOffsetSectionItem passed in.
> Since this DWARFUnit is DWO, it invokes Context.info_section_units(), which uses A.o to create DW_SECT_INFO and DW_SECT_EXT_TYPES.
> Then it calls itself, but from the newly constructed DWARFUnit, i.e. the skeleton CU that is in A.o.
>
> Because of the way it's constructed, AddrOffsetSectionBase is never set, so getAddrOffsetSectionItem returns None. Eventually an error is returned from the high-level API call.
>
> I ended up doing this to get address ranges:
>   DWARFLocationExpressionsVector LocEVector;
>   auto CallBack = [&](const DWARFLocationEntry &Entry) -> bool {
>     auto StartAddress =
>         BaseUnit->getAddrOffsetSectionItem(Entry.Value0);
>     if (!StartAddress) {
>       // TODO: Handle Error
>       return false;
>     }
>     LocEVector.emplace_back(DWARFLocationExpression{DWARFAddressRange{
>         (*StartAddress).Address, (*StartAddress).Address + Entry.Value1,
>         Entry.SectionIndex}, Entry.Loc});
>     return true;
>   };
>
>   if(Unit->getLocationTable().visitLocationList(&Offset, CallBack))
>   ...
>
> But back to the original API calls. Are they just not designed to work with DWO CUs, or am I missing something?
>
> Even if AddrOffsetSectionBase was set to 0, the address section it is accessing is in A.o and is not relocated. One would still need to get the base address from the skeleton CU to get fully resolved address ranges, or do what I did and use the index to access the binary's .debug_addr section directly (with the appropriate AddrOffsetSectionBase).
>
> Thank You
> Alex
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
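For readers following the workaround quoted above, here is a minimal sketch of what the callback computes, written without any LLVM types. It assumes every location list entry is of the "start index plus length" kind, so that Value0 is an index into the relocated address table and Value1 is a byte length. The names StartxLengthEntry, AddressRange, lookupAddress and resolveEntry are made up for illustration and are not LLVM APIs.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Toy stand-in for a location list entry (hypothetical, not an LLVM type).
struct StartxLengthEntry {
  uint64_t AddrIndex; // plays the role of Entry.Value0
  uint64_t Length;    // plays the role of Entry.Value1
};

// Toy stand-in for a resolved [LowPC, HighPC) range.
struct AddressRange {
  uint64_t LowPC;
  uint64_t HighPC;
};

// Looks up an index in an already relocated address table.
// Returns nullopt when the index is out of range.
std::optional<uint64_t> lookupAddress(const std::vector<uint64_t> &AddrTable,
                                      uint64_t Index) {
  if (Index >= AddrTable.size())
    return std::nullopt;
  return AddrTable[Index];
}

// Turns one index/length entry into an absolute address range.
std::optional<AddressRange>
resolveEntry(const std::vector<uint64_t> &AddrTable,
             const StartxLengthEntry &E) {
  std::optional<uint64_t> Start = lookupAddress(AddrTable, E.AddrIndex);
  if (!Start)
    return std::nullopt; // mirrors the "return false" path in the callback
  return AddressRange{*Start, *Start + E.Length};
}

// Small usage example with made-up addresses.
void exampleUsage() {
  std::vector<uint64_t> AddrTable = {0x401000, 0x401200};
  std::optional<AddressRange> R = resolveEntry(AddrTable, {1, 0x10});
  assert(R && R->LowPC == 0x401200 && R->HighPC == 0x401210);
  assert(!resolveEntry(AddrTable, {5, 0x10})); // index past the table
}
```

The point of the sketch is only that the lookup has to go through the address table of the linked binary; the same index resolved against the unrelocated table in A.o would not give a usable address.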
Alexander Yermolovich via llvm-dev
2021-Feb-17 02:39 UTC
[llvm-dev] Extracting LocList address ranges from DWO .debug_info
I was hoping you would see this. 🙂
Digging some more, DW_TAG_subprogram is also affected: DWARFDie::getLowAndHighPC eventually calls DWARFUnit::getAddrOffsetSectionItem under the hood.

I played/hacked around with things some more, and would like opinions on potential fixes.

One is to add the skeleton CU to the DWO CU when it is created.
https://reviews.llvm.org/D96826
It kind of follows the logic in getAddrOffsetSectionItem, in the sense that if the unit is DWO, it looks into NormalUnits and uses that to get the address.

Alternatively, there is just adding a check in getAddrOffsetSectionItem itself.
https://reviews.llvm.org/D96827
The reason this works is that when parseDWO is invoked, by way of getNonSkeletonUnitDIE, AddrOffsetSection and AddrOffsetSectionBase get set correctly. AddrOffsetSection points to the base of .debug_addr in binary A, and AddrOffsetSectionBase is the proper offset for this unit within it. So when getAddrOffsetSectionItem is invoked, the address is looked up correctly in the section. The kind of weird part is that Obj is A.o, since we are using the DWO unit's Context.

I am just digging into these APIs, so maybe I am missing something, and there is a better option.

Thank You
Alex
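To make the two ideas in this message concrete, below is a toy model of an indexed address lookup. It is a sketch under stated assumptions, not the code from either review: ToyUnit, its Skeleton pointer, getAddr and makeToyAddrSection are hypothetical names, the section is assumed to be little-endian with an 8-byte header, and the entry offset is taken to be the unit's base plus index times address size. A unit whose base was never set finds nothing unless it can defer to a linked skeleton unit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Toy unit (hypothetical, not llvm::DWARFUnit).
struct ToyUnit {
  const std::vector<uint8_t> *AddrSection = nullptr; // bytes of .debug_addr
  std::optional<uint64_t> AddrBase; // offset of this unit's first entry
  uint8_t AddrSize = 8;             // bytes per address
  const ToyUnit *Skeleton = nullptr; // assumed link from DWO unit to skeleton
};

// Reads entry Index as a little-endian value at AddrBase + Index * AddrSize.
// With no base of its own, the unit defers to its skeleton if one is linked.
std::optional<uint64_t> getAddr(const ToyUnit &U, uint64_t Index) {
  if (!U.AddrBase || !U.AddrSection) {
    if (U.Skeleton)
      return getAddr(*U.Skeleton, Index);
    return std::nullopt; // the failure mode described in the thread
  }
  const std::vector<uint8_t> &Bytes = *U.AddrSection;
  if (U.AddrSize == 0 || U.AddrSize > 8 || *U.AddrBase > Bytes.size())
    return std::nullopt;
  uint64_t Avail = Bytes.size() - *U.AddrBase;
  if (Index >= Avail / U.AddrSize)
    return std::nullopt; // entry would run past the end of the section
  uint64_t Offset = *U.AddrBase + Index * U.AddrSize;
  uint64_t Value = 0;
  for (uint8_t I = 0; I < U.AddrSize; ++I)
    Value |= static_cast<uint64_t>(Bytes[Offset + I]) << (8 * I);
  return Value;
}

// Builds a toy section: 8 placeholder header bytes, then 8-byte LE entries.
std::vector<uint8_t> makeToyAddrSection(const std::vector<uint64_t> &Addrs) {
  std::vector<uint8_t> Sec(8, 0);
  for (uint64_t A : Addrs)
    for (int I = 0; I < 8; ++I)
      Sec.push_back(static_cast<uint8_t>(A >> (8 * I)));
  return Sec;
}

// Usage: the DWO-like unit resolves indices only once a skeleton is linked.
void exampleUsage() {
  std::vector<uint8_t> Sec = makeToyAddrSection({0x401000, 0x401200});
  ToyUnit Skel;
  Skel.AddrSection = &Sec;
  Skel.AddrBase = 8;
  ToyUnit Dwo;
  assert(!getAddr(Dwo, 1));
  Dwo.Skeleton = &Skel;
  assert(getAddr(Dwo, 1).value() == 0x401200);
}
```

In this model the first option corresponds to setting the Skeleton pointer, and the second corresponds to giving the DWO unit a valid AddrSection and AddrBase itself so that the deferral branch is never taken.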
Alexander Yermolovich via llvm-dev
2021-Feb-22 18:50 UTC
[llvm-dev] Extracting LocList address ranges from DWO .debug_info
Hello David.
My apologies, let me provide some context.
I am helping with the BOLT binary optimizer (soon to be upstreamed). As part of its functionality it updates debug information to reflect the changes it has made to the binary: moving functions around, extracting cold blocks, ICF, etc. Right now it works with monolithic debug information, but not with Fission. It completely rewrites debug line and ranges/aranges, and patches the relevant DIE entries to point to new offsets within those sections. That means finding what the current addresses are in a DIE, mapping them to new addresses, and from that to new offsets within the sections. For debug fission it will also need to rewrite .debug_addr and update the indices that point to it.

I looked at llvm-symbolizer and this seems a bit high level. So the usage model is closer to 1), I think. Right now there is no link from the DWO CU back to the skeleton CU, but one solution would be to add it when getNonSkeletonUnitDIE/parseDWO is called. This reflects the code in getAddrOffsetSectionItem that tries to parse normal CUs when the current DWARFUnit is DWO. I don't know what the original intent of that code was, but as it stands I don't think it works, because it parses the non-relocated skeleton CU in A.o.
Rough idea:
* https://reviews.llvm.org/D96826

Alternatively, that whole code can be skipped entirely.
* https://reviews.llvm.org/D96827
This works because in parseDWO we set AddrOffsetSectionBase and AddrOffsetSection from .debug_addr in the binary. Then in getAddrOffsetSectionItem we have all the information to get addresses from indices. One weird part is that the DWARFDataExtractor is created with the A.o file, while AddrOffsetSection is from the A binary.

getAddrOffsetSectionItem is an important low-level API. For example, it is also used by DWARFDie::getLowAndHighPC, along with DWARFDie::getLocations and DWARFUnit::findLoclistFromOffset. So making a fix at that level would make other, more high-level APIs work for DWO contents.

*Diffs are the same ones as previously mentioned.

Alex

"
Sorry I'm not really following all these pieces.

There's two basic ways these APIs are predominantly used:
1) llvm-dwarfdump: This opens one file/context at a time, and generally doesn't open other files - such as dwos or o/exe for skeleton. (indeed, there's no reliable way to find a skeleton, given a dwo - only to find dwos given skeletons)
2) llvm-symbolizer: this opens executable files (or .o files) and from there can load dwo/dwp/dsym related files as needed

What sort of use case do you have? I guess it can/should look something like (2) so can you use the LLVM debug info APIs in a similar manner to llvm-symbolizer to achieve your goals?
"
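As a rough illustration of the address mapping step described in this message, here is a toy sketch that rewrites an address table after functions have moved. It is not how BOLT does it; AddressMap, remapAddress and rewriteAddrTable are hypothetical names, the map only handles exact matches on function start addresses, and the sketch assumes entries are rewritten in place so that existing indices stay valid.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Old start address -> new start address (made-up data, exact matches only).
using AddressMap = std::map<uint64_t, uint64_t>;

// Returns the new address for Old, or Old itself if it did not move.
uint64_t remapAddress(const AddressMap &Moved, uint64_t Old) {
  auto It = Moved.find(Old);
  return It == Moved.end() ? Old : It->second;
}

// Produces a new address table with the same number of entries in the same
// order, so index-based references from the DWO side do not need to change
// in this simplified model.
std::vector<uint64_t> rewriteAddrTable(const std::vector<uint64_t> &OldTable,
                                       const AddressMap &Moved) {
  std::vector<uint64_t> NewTable;
  NewTable.reserve(OldTable.size());
  for (uint64_t Addr : OldTable)
    NewTable.push_back(remapAddress(Moved, Addr));
  return NewTable;
}

// Usage with made-up addresses: one function moved, one stayed put.
void exampleUsage() {
  std::vector<uint64_t> OldTable = {0x401000, 0x401200};
  AddressMap Moved = {{0x401200, 0x500000}};
  std::vector<uint64_t> NewTable = rewriteAddrTable(OldTable, Moved);
  assert(NewTable.size() == 2);
  assert(NewTable[0] == 0x401000 && NewTable[1] == 0x500000);
}
```

A real rewriter would also have to deal with addresses that fall inside a moved function and with ranges that get split, which is where reading the current addresses through getAddrOffsetSectionItem matters.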