James Henderson via llvm-dev
2020-Jul-21 08:25 UTC
[llvm-dev] [DWARF] Handling empty ranges/location lists in ET_REL files
Hi all,

I've put this email in a different thread, although it is quite similar to some of the threads on tombstoning etc, with similar underlying structural issues.

Whilst prototyping my fragmented DWARF idea for GC-ing DWARF sections properly, I ran into an object in the game code I was using as my input where a v4 .debug_loc section had a location description that looked something like this:

    .quad foo
    .quad foo
    ... # location description

where foo was a section symbol, i.e. the start and end address were the same. Consequently, there would be two relocations with 0 addend patching the start and end offset. When I was using llvm-dwarfdump to dump the .debug_loc section, I ended up with a decoding error, and eventually a parsing error, because it saw a 0, 0 pair, so treated the entry as an end of list entry, and assumed the location description was the start of the next list.

The debug_loc parsing code treats 0, 0 pairs as end of list entries, whether or not they are relocated. I think this is a bug - if there are relocations we can be reasonably confident that the compiler did not intend it to be the end of the list, and at link time, this probably won't get resolved to 0, 0 (it's still technically possible it will, if 0 is a valid address, and the corresponding section was put at that address, but that's outside the scope of this email).

I've got a fairly simple change that could solve this, but it would require checking for the presence of a relocation at either address, in the event 0, 0 was read. Should I go ahead with tidying up the change/testing it etc? Or do we want a different solution to this problem (aside from using DWARFv5 of course!)?

Related aside: I haven't checked, but it's quite possible there's a similar problem in .debug_ranges parsing.

James
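The check James proposes can be illustrated with a small standalone sketch. This is not the llvm-dwarfdump code: `LocEntry`, `readLE`, `parseLocList` and the `RelocatedOffsets` set are hypothetical names invented here, and the sketch assumes 8-byte little-endian addresses, a 2-byte expression length as in DWARF v4 .debug_loc, and no base address selection entries.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <utility>
#include <vector>

// Hypothetical stand-in for one decoded DWARF v4 .debug_loc entry.
struct LocEntry {
  uint64_t Begin;
  uint64_t End;
  std::vector<uint8_t> Expr; // raw location description bytes
};

// Reads a little-endian integer of type T at Offset and advances Offset.
template <typename T>
static T readLE(const std::vector<uint8_t> &Data, size_t &Offset) {
  assert(Offset + sizeof(T) <= Data.size() && "read past end of section");
  T Value = 0;
  for (size_t I = 0; I < sizeof(T); ++I)
    Value |= static_cast<T>(Data[Offset + I]) << (8 * I);
  Offset += sizeof(T);
  return Value;
}

// Parses one location list starting at Offset. RelocatedOffsets holds the
// section offsets that a relocation patches (an assumption about how the
// caller tracks relocations for the section).
std::vector<LocEntry> parseLocList(const std::vector<uint8_t> &Data,
                                   const std::set<size_t> &RelocatedOffsets,
                                   size_t &Offset) {
  std::vector<LocEntry> List;
  while (Offset + 16 <= Data.size()) {
    size_t BeginOff = Offset;
    uint64_t Begin = readLE<uint64_t>(Data, Offset);
    size_t EndOff = Offset;
    uint64_t End = readLE<uint64_t>(Data, Offset);

    // Only an unrelocated 0, 0 pair terminates the list.
    bool Relocated = RelocatedOffsets.count(BeginOff) != 0 ||
                     RelocatedOffsets.count(EndOff) != 0;
    if (Begin == 0 && End == 0 && !Relocated)
      break;

    uint16_t Len = readLE<uint16_t>(Data, Offset);
    assert(Offset + Len <= Data.size() && "expression runs past section");
    LocEntry Entry{Begin, End, {}};
    Entry.Expr.assign(Data.begin() + Offset, Data.begin() + Offset + Len);
    Offset += Len;
    List.push_back(std::move(Entry));
  }
  return List;
}
```

With relocations recorded at offsets 0 and 8, a leading 0, 0 pair is kept as an (empty) entry and its location description is consumed; with no relocations recorded, the same bytes are treated as the end of the list.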
Robinson, Paul via llvm-dev
2020-Jul-21 15:29 UTC
[llvm-dev] [DWARF] Handling empty ranges/location lists in ET_REL files
I agree it’s a bug. An absolute (0, 0) pair is what indicates end-of-list. You can get pairs of 0 addends with `.quad foo; .quad foo` or `.quad foo; .quad bar` but the former is an empty range and the latter would be a real range.

I’d expect the identical issue to pop up in .debug_ranges, so a patch should address both.

--paulr
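The three cases Paul distinguishes can be written down as a tiny classifier. This is only a sketch: `Slot`, `EntryKind` and `classify` are hypothetical names, a slot is modelled as "relocation symbol (empty if unrelocated) plus stored value", and anything that is neither a terminator nor an identical pair is simply called a real range without further validation.

```cpp
#include <cstdint>
#include <string>

// Hypothetical model of one address slot in an ET_REL range or location
// list: the symbol of the relocation patching it (empty if the slot is
// not relocated) and the addend or literal value stored in the slot.
struct Slot {
  std::string Symbol;
  uint64_t Value;
};

enum class EntryKind { EndOfList, EmptyRange, RealRange };

EntryKind classify(const Slot &Begin, const Slot &End) {
  // Only an absolute (unrelocated) 0, 0 pair marks the end of the list.
  bool Absolute = Begin.Symbol.empty() && End.Symbol.empty();
  if (Absolute && Begin.Value == 0 && End.Value == 0)
    return EntryKind::EndOfList;

  // Same symbol and same addend: start and end coincide, e.g.
  // `.quad foo; .quad foo`.
  if (Begin.Symbol == End.Symbol && Begin.Value == End.Value)
    return EntryKind::EmptyRange;

  // Everything else, e.g. `.quad foo; .quad bar`, is treated as a range.
  return EntryKind::RealRange;
}
```

So `classify({"foo", 0}, {"foo", 0})` gives an empty range and `classify({"foo", 0}, {"bar", 0})` gives a real range, even though both pairs read as 0, 0 before relocations are applied.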
David Blaikie via llvm-dev
2020-Jul-21 18:31 UTC
[llvm-dev] [DWARF] Handling empty ranges/location lists in ET_REL files
Yep, sounds like a reasonable bug in llvm-dwarfdump to fix.

Probably the right way to fix this is to check whether the start/end addresses have a section number. If they're zero with no section number, then they're really zero & should terminate the list. Otherwise they shouldn't.

Here's a reproducer for debug_ranges at least, without needing any patches to LLVM:

    $ cat range.cpp
    void f1() { }
    void f2() { __builtin_unreachable(); }
    // alternatively: "int f2() { }" - both constructs are valid so long as
    // f2 is never called, though it may still need to have a valid address
    // (could use pointers to it in a map for some reason, etc)
    int main() { }
    $ clang++ range.cpp -ffunction-sections -g -O1 -c
    $ llvm-dwarfdump-tot range.o -debug-info -debug-ranges
    range.o: file format elf64-x86-64

    .debug_info contents:
    0x00000000: Compile Unit: length = 0x00000079, format = DWARF32, version 0x0004, abbr_offset = 0x0000, addr_size = 0x08 (next unit at 0x0000007d)

    0x0000000b: DW_TAG_compile_unit
                  ...
                  DW_AT_ranges (0x00000000
                     [0x0000000000000000, 0x0000000000000001))
                  ...

    .debug_ranges contents:
    00000000 0000000000000000 0000000000000001
    00000000 <End of list>
    00000020 0000000000000000 0000000000000003
    00000020 <End of list>
    $ llvm-dwarfdump-tot a.out -debug-info -debug-ranges
    a.out: file format elf64-x86-64

    .debug_info contents:
    0x00000000: Compile Unit: length = 0x00000079, format = DWARF32, version 0x0004, abbr_offset = 0x0000, addr_size = 0x08 (next unit at 0x0000007d)

    0x0000000b: DW_TAG_compile_unit
                  ...
                  DW_AT_ranges (0x00000000
                     [0x0000000000401110, 0x0000000000401111)
                     [0x0000000000401120, 0x0000000000401120)
                     [0x0000000000401120, 0x0000000000401123))
                  ...

    .debug_ranges contents:
    00000000 0000000000401110 0000000000401111
    00000000 0000000000401120 0000000000401120
    00000000 0000000000401120 0000000000401123
    00000000 <End of list>
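David's suggested rule, applied to the shape of the range.o list above, might look roughly like the following. It is a sketch under assumptions: `SectionedAddr` is a hypothetical stand-in loosely modelled on the address-plus-section-index pairing that LLVM's object library uses, `UndefSection` is an illustrative sentinel for "no section", and `readRangeList` with its `CheckSection` flag exists only to contrast the current and proposed behaviour.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Illustrative sentinel meaning "this address has no section".
constexpr uint64_t UndefSection = UINT64_MAX;

// Hypothetical stand-in for an address tagged with the section it was
// relocated against.
struct SectionedAddr {
  uint64_t Address;
  uint64_t SectionIndex;
};

using RangeEntry = std::pair<SectionedAddr, SectionedAddr>;

// Collects entries up to the end-of-list marker. With CheckSection set,
// a 0, 0 pair only terminates the list if neither address has a section.
std::vector<RangeEntry> readRangeList(const std::vector<RangeEntry> &Raw,
                                      bool CheckSection) {
  std::vector<RangeEntry> Ranges;
  for (const RangeEntry &Entry : Raw) {
    bool IsZero = Entry.first.Address == 0 && Entry.second.Address == 0;
    bool IsAbsolute = Entry.first.SectionIndex == UndefSection &&
                      Entry.second.SectionIndex == UndefSection;
    if (IsZero && (!CheckSection || IsAbsolute))
      break;
    Ranges.push_back(Entry);
  }
  return Ranges;
}
```

Fed entries shaped like f1 `[0, 1)`, f2 `[0, 0)` and main `[0, 3)`, each in its own section, followed by an absolute 0, 0 terminator, the unchecked mode stops after f1 (matching the truncated DW_AT_ranges in the range.o dump), while the checked mode returns all three ranges.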