Luke Drummond via llvm-dev
2020-Mar-11 15:09 UTC
[llvm-dev] DWARF .debug_aranges data objects and address spaces
On Tue Mar 10, 2020 at 7:45 PM, David Blaikie wrote:
> If you only want code addresses, why not use the CU's
> low_pc/high_pc/ranges
> - those are guaranteed to be only code addresses, I think?

In the common case, for most targets LLVM supports, I think you're
right, but in my case, regrettably, not. Because my target is a Harvard
architecture, any code address can have the same numeric value as any
data address: code and data reside on different buses, so the whole
4GiB space is available to both code and data. `DW_AT_low_pc` and
`DW_AT_high_pc` can be used to find the range of the code segment, but
given an arbitrary address, they cannot conclusively determine whether
that address belongs to code or data when both segments contain
addresses in that numeric range.

All the Best
Luke

-- 
Codeplay Software Ltd.
Company registered in England and Wales, number: 04567874
Registered office: Regent House, 316 Beulah Hill, London, SE19 3HF
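To make the ambiguity described above concrete, here is a minimal sketch
in Python. It is not LLVM or debugger code: the `Range` type, the
"code"/"data" labels, and the example symbols are all hypothetical and
exist only to show that a lookup keyed on the numeric address alone can
match entries from both address spaces.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Range:
    """One address range; `space` is a hypothetical label, not a DWARF field."""
    space: str   # "code" or "data" (assumed labels for the two buses)
    start: int
    length: int
    owner: str   # name of the object covering the range (illustrative)

    def contains(self, addr: int) -> bool:
        return self.start <= addr < self.start + self.length


def lookup_by_address(ranges: List[Range], addr: int) -> List[Range]:
    """Address-only lookup: the address space is not part of the key."""
    return [r for r in ranges if r.contains(addr)]


# On a Harvard target both spaces can use the same numeric addresses.
RANGES = [
    Range("code", 0x1000, 0x200, "main"),
    Range("data", 0x1000, 0x40, "global_buffer"),
]

# The same number falls inside a code range and a data range.
hits = lookup_by_address(RANGES, 0x1010)
for r in hits:
    print(f"{r.space}: {r.owner}")
```

With only the number 0x1010 to go on, both entries match, so the lookup
cannot say whether the address refers to `main` or to `global_buffer`.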
David Blaikie via llvm-dev
2020-Mar-12 17:37 UTC
[llvm-dev] DWARF .debug_aranges data objects and address spaces
On Wed, Mar 11, 2020 at 8:09 AM Luke Drummond
<luke.drummond at codeplay.com> wrote:
> On Tue Mar 10, 2020 at 7:45 PM, David Blaikie wrote:
> > If you only want code addresses, why not use the CU's
> > low_pc/high_pc/ranges
> > - those are guaranteed to be only code addresses, I think?
> >
> In the common case, for most targets LLVM supports I think you're right,
> but for my case, regrettably, not. Because my target is a Harvard
> Architecture, any code address can have the same ordinal value as any
> data address: the code and data reside on different buses so the whole
> 4GiB space is available to both code, and data. `DW_AT_low_pc` and
> `DW_AT_high_pc` can be used to find the range of the code segment, but
> given an arbitrary address, cannot be used to conclusively determine
> whether that address belongs to code or data when both segments contain
> addresses in that numeric range.

Sorry, I'm not following, probably in part because I haven't worked
with such machines before.

But how are the code addresses and data addresses differentiated then?
(E.g., if you had segment selectors in debug_aranges, how would they be
used? Do the addresses taken from the system at runtime have some kind
of segment selector associated with them, which you can then match
against the addr+segment selector in aranges?)

Actually, coming at it from a different angle: it sounds like in the
original email you're suggesting that if debug_aranges did not contain
data addresses, this would be good/sufficient for you? So somehow you'd
be ensuring you only query debug_aranges using things you know are code
addresses, not data addresses? So why would the same solution/approach
not hold for querying low/high/ranges on a CU, which is already
guaranteed not to contain data addresses?
Luke Drummond via llvm-dev
2020-Mar-12 18:00 UTC
[llvm-dev] DWARF .debug_aranges data objects and address spaces
On Thu Mar 12, 2020 at 5:37 PM, David Blaikie wrote:
> On Wed, Mar 11, 2020 at 8:09 AM Luke Drummond
> <luke.drummond at codeplay.com> wrote:
>
> > On Tue Mar 10, 2020 at 7:45 PM, David Blaikie wrote:
> > > If you only want code addresses, why not use the CU's
> > > low_pc/high_pc/ranges
> > > - those are guaranteed to be only code addresses, I think?
> > >
> > In the common case, for most targets LLVM supports I think you're right,
> > but for my case, regrettably, not. Because my target is a Harvard
> > Architecture, any code address can have the same ordinal value as any
> > data address: the code and data reside on different buses so the whole
> > 4GiB space is available to both code, and data. `DW_AT_low_pc` and
> > `DW_AT_high_pc` can be used to find the range of the code segment, but
> > given an arbitrary address, cannot be used to conclusively determine
> > whether that address belongs to code or data when both segments contain
> > addresses in that numeric range.
> >
> Sorry I'm not following, partly probably due to my not having worked
> with such machines before.
>
> But how are the code addresses and data addresses differentiated then
> (eg: if you had segment selectors in debug_aranges, how would they be
> used? The addresses taken from the system at runtime have some kind of
> segment selector associated with them, that you can then use to match
> with the addr+segment selector in aranges?).

Yes. This. The system mostly lets us disambiguate addresses, because
the device's simulator / debugger make this unambiguous, but the
current .debug_aranges does not allow it because that information is
missing.

> Actually, coming at it from a different angle: It sounds like in the
> original email you're suggesting if debug_aranges did not contain data
> addresses, this would be good/sufficient for you? So somehow you'd be
> ensuring you only query debug_aranges using things you know are code
> addresses, not data addresses? So why would the same solution/approach
> not hold to querying low/high/ranges on a CU that's already guaranteed
> not to contain data addresses?

That's the root of the issue: the .debug_aranges section emitted by
LLVM *does* contain data addresses by default and can therefore be
ambiguous. I've worked around this locally by hacking LLVM to only emit
aranges for text objects, but I was wondering if it's something that's
valuable to fix upstream. My guess is that it's probably too niche to
worry about for the moment, but if there's interest I can propose a
design (probably a target hook to ask whether segment selectors are
required and how to get their number from an object).

Thanks for your help
Luke
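The two options mentioned above (emitting aranges for text objects only,
or attaching a segment selector to each entry) can be sketched roughly
as follows. This is an illustration in Python, not the local LLVM patch
or a proposed upstream design: `Symbol`, `ArangeTuple`,
`harvard_segment_for`, and the selector values 0 and 1 are all
assumptions made up for the example.

```python
from typing import Callable, List, NamedTuple, Optional


class Symbol(NamedTuple):
    name: str
    section: str   # e.g. ".text" or ".data"
    address: int
    size: int


class ArangeTuple(NamedTuple):
    segment: Optional[int]   # None models "no segment selector emitted"
    address: int
    length: int


def text_only_aranges(symbols: List[Symbol]) -> List[ArangeTuple]:
    """Workaround sketch: emit entries only for objects in .text."""
    return [ArangeTuple(None, s.address, s.size)
            for s in symbols if s.section == ".text"]


def segmented_aranges(symbols: List[Symbol],
                      segment_for: Callable[[Symbol], int]) -> List[ArangeTuple]:
    """Alternative sketch: keep every entry but tag it with a selector
    obtained from a (hypothetical) target hook."""
    return [ArangeTuple(segment_for(s), s.address, s.size) for s in symbols]


def harvard_segment_for(sym: Symbol) -> int:
    """Hypothetical target hook: 0 for code, 1 for data (assumed values)."""
    return 0 if sym.section == ".text" else 1


def find(aranges: List[ArangeTuple], segment: Optional[int],
         addr: int) -> List[ArangeTuple]:
    """Lookup keyed on (segment, address) rather than address alone."""
    return [t for t in aranges
            if t.segment == segment and t.address <= addr < t.address + t.length]


SYMBOLS = [
    Symbol("main", ".text", 0x1000, 0x200),
    Symbol("global_buffer", ".data", 0x1000, 0x40),
]

text_only = text_only_aranges(SYMBOLS)
segmented = segmented_aranges(SYMBOLS, harvard_segment_for)
code_hit = find(segmented, 0, 0x1010)
data_hit = find(segmented, 1, 0x1010)
```

In this sketch the text-only table avoids the ambiguity by dropping data
entries altogether, while the segmented table keeps them and relies on
the caller knowing which selector goes with the address it is asking
about.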