Fangrui Song via llvm-dev
2020-May-29 05:06 UTC
[llvm-dev] Range lists, zero-length functions, linker gc
On 2020-05-28, David Blaikie wrote:
>On Thu, May 28, 2020 at 2:52 PM Robinson, Paul <paul.robinson at sony.com>
>wrote:
>
>> As has been mentioned elsewhere, Sony generally fixes up references from
>> debug info to stripped functions (of any length) using -1, because that’s a
>> less-likely-to-be-real address than 0x0 or 0x1. (0x0 is a typical base
>> address for shared libraries, I’d think using it has the potential to
>> mislead various consumers.) For .debug_ranges we use -2, because both a
>> 0/0 pair and a -1/-1 pair have a reserved meaning in that section.
>>
>
>Any harm in using -2 everywhere, for consistency?

When resolving a relocation, in certain cases we have to give an undefined
symbol a value. This can happen with:

* an undefined weak symbol
* an undefined global symbol in --noinhibit-exec mode (a buggy --gc-sections
  implementation can trigger this as well)
* a relocation referencing an undefined symbol in a non-SHF_ALLOC section

We always ("most of the time" would be more accurate) respect the addend in a
relocation entry for an absolute/PC-relative relocation (R_ARM_THM_PC8,
R_AARCH64_ADR_PREL_PG_HI21, R_X86_64_64, local exec TLS relocation types, ...)
Ignoring the addend (using -2 everywhere) will break this consistency.

The relocated code may do pointer subtraction which would work if addends were
respected, but will break using -2 everywhere.

The relocated code can be allocatable or not. Non-allocatable non-debug
code can have meaningful pointer subtraction as well. This is why I am
not too fond of using a fixed value everywhere.

>(also, I had a silly idea, but what would happen if we added a CU attribute
>with an address value that was a reference to a weak always-unused symbol,
>that way the linker would fix it up with whatever its preferred magic value
>was, and the consumer would then know what the magic value was that
>represented dead code? (though this would only work if the value were used
>consistently everywhere - which is zero for gold/lld (well, almost... you
>can still create situations where a non-zero value is used even for a
>low_pc), but wouldn't work for binutils ld (1 in debug_ranges, 0 elsewhere)
>or Sony (-2 in debug_ranges, -1 elsewhere)... - so, wouldn't actually work
>for any producer currently, so maybe there's little value in that as a
>feature))

For a non-SHF_ALLOC section, LLD currently considers it a GC root if all
the conditions below are satisfied:

* not SHT_REL[A]
* not SHF_LINK_ORDER
* not in a section group

(I managed to lobby for these ideas in GNU ld. GNU ld from binutils 2.35
onwards will have mostly compatible semantics with LLD.)

There is a cost to fragmenting a .debug_* section: sizeof(Elf64_Shdr)=64 ->
each section takes 64 bytes in the section header table. SHF_LINK_ORDER
has the semantics of a lightweight section group. Assuming we don't want to
have one .debug_* for each function section, this .debug_* will be a GC
root. Relocations from it (even if the symbol is weak) will retain the
sections defining the symbols.

So, this trick can't work without refining the --gc-sections rules further.

>
>> If you’re looking only at zero-length functions, you can stop there; but
>> I’m not sure why stopping there solves much of a real problem, as
>> zero-length functions seem like a weird corner case.
>>
>
>They're the case that breaks existing usage by terminating the range list
>early - the other existing usage seems to be fine with "resolve to addend"
>strategy that lld and gold use - in that it moves most dead/deduplicated
>functions outside the executable range and so consumers never come asking
>for "what code is at instruction 5" because they're never executing code at
>a pc of 5. But, yes, this existing solution doesn't work once you have code
>mapped into low address spaces or have utterly massive functions that might
>have a length that would reach into the executable address space even when
>their start is remapped to zero.

For posterity, David gave me an example offline:

    void f1() { }
    void f2() { }
    int main() { f1(); }

    clang -fuse-ld=bfd -ffunction-sections -Wl,--gc-sections -g a.c -o a.bfd
    llvm-dwarfdump -debug-ranges a.bfd

=> R_X86_64_64 relocations in .debug_ranges are resolved to 1, ignoring the
addend

(Behavior introduced in
https://sourceware.org/git/?p=binutils-gdb.git;a=blobdiff;f=bfd/ChangeLog;h=8fbaed21fa2c8238459acb637545583f3cfbbfdf;hp=18a3a67be3a5980998c4461b5a739e54f3551b17;hb=e4067dbb2a3368dbf908b39c5435c84d51abc9f3;hpb=c0621d88b096cc046adf6ed484baea9ba5bfe721)

The comments below are also insightful. I need to ponder more (and need
to read the DWARF v4 and v5 specs more, as I am not so familiar with these
DWARF constructs). But it is too late now. Will probably comment
another day :)

>
>> Linkers know how to strip dead functions (gc) or deduplicate them (icf,
>> COMDAT) and people do this all the time, in some cases (COMDAT) without
>> explicitly asking for it, so non-zero-length functions seem like the much
>> more interesting case. In that situation, -1 (or -2) seems like a much
>> wiser choice of blessed-as-not-real address, versus 0x0 or 0x1.
>>
>> Stripping non-zero-length functions does mean you have to care about more
>> sections. For example .debug_locs would want to be fixed up the same way
>> as .debug_ranges, not because a debugger would care but so that dumpers
>> would not run into the 0/0 brick wall.
>>
>
>Yep - in theory a consumer could actually use a loclist across multiple
>sections (if a global variable got hoisted into a register for a function
>for instance), but I don't know of any producers doing this today - until
>then, yeah, it's just a dumping problem and ld.bfd does produce DWARF that
>has that problem (because it resolves both relocations to dead code
>(begin/end of a range) to zero in all sections except debug_ranges, so
>terminates the loclist early) - binutils objdump avoids dumping the
>following corrupted fragment by only dumping hunks of debug_loc starting at
>places referenced from debug_info. Without debug_info it won't dump
>anything from debug_loc - and if the references from debug_info, parsed
>until the 0,0 terminator don't cover the whole debug_loc section, it prints
>messages saying there are "gaps".
>
>Agreed that you'd want debug_loc to have the same special handling as
>debug_ranges if it has special handling. Though ideally we'd pick a value
>that works equally everywhere? (-2, by the sounds of it)
>
>
>> We also fix up lengths in .debug_aranges to zero, although there might be
>> history behind that tactic that I’m not aware of; it seems like it ought to
>> be unnecessary, if consumers are aware of the special address(es).
>>
>
>Yeah, no idea about debug_aranges... I'd have thought it'd be fine with the
>same approach as debug_ranges, but I haven't looked at debug_aranges in a
>long time.
>
>I guess the only remaining question is: Since it's possible to have code on
>some systems down at address zero, or close enough to it that [0, length)
>might overlap with real executable code addresses - does anyone know of
>the inverse: where code is mapped up near uint32 max? Such that that usage
>wouldn't be able to sacrifice uint32 max - 1 to use as a blessed value here?
>
>- Dave
>
>>
>> --paulr
>>
>> *From:* Alexey Lapshin <alapshin at accesssoftek.com>
>> *Sent:* Thursday, May 28, 2020 9:03 AM
>> *To:* Sriraman Tallam <tmsriram at google.com>; Wei Mi <wmi at google.com>;
>> Robinson, Paul <paul.robinson at sony.com>; Adrian Prantl <aprantl at apple.com>;
>> Jonas Devlieghere <jdevlieghere at apple.com>; Alexey Lapshin <
>> a.v.lapshin at mail.ru>; Eric Christopher <echristo at gmail.com>; Fangrui Song
>> <maskray at google.com>; David Blaikie <dblaikie at gmail.com>;
>> llvm-dev at lists.llvm.org
>> *Subject:* Re: [llvm-dev] Range lists, zero-length functions, linker gc
>>
>> Hi David,
>>
>> >So there have been several recent discussions about the issues around
>> >DWARF-agnostic linking and gc-sections, linkonce function definitions being
>> >dropped, etc - and just how much DWARF-awareness would be suitable
>> >in a linker to help with this situation.
>>
>> > I'd like to discuss a narrower instance of this issue: Zero length
>> > gc'd/deduplicated functions.
>>
>> > LLVM seems to at least produce zero length functions in a few cases:
>> > * non-void function without a return statement
>> > * function definition containing only llvm_unreachable
>> > (both of these trap at -O0, but at higher optimization levels even the trap
>> > instruction is removed & you get the full power UB of control flowing off
>> > the end of the function into whatever other bytes are after that function)
>>
>> > So, for context, debug_ranges (this whole issue doesn't exist in DWARFv5,
>> > FWIW) is a list of address pairs, terminated by a pair of zeros.
>> > With function sections, or even just with normal C++ inline functions,
>> > the CU will have a range entry for that function that consists of two relocations
>> > - to the start and end of the function. Generally the start of the function is the
>> > start of the section, and the end is "start of function + length of function (aka addend)".
>>
>> > Usually any relocation to the section would keep that section "alive" during linking -
>> > but that would cause debug info to defeat linker GC and deduplication. So there's
>> > special rules for how linkers handle these relocations in debug info to allow the
>> > sections to be dropped - what do you write in the bytes that requested the relocation?
>>
>> > Binutils ld: Special cases only debug_ranges, resolving all relocations to dead
>> > code to 1. In other debug sections, these values are all resolved to zero.
>> > Gold and lld: Special cases all debug info sections - resolving all relocations
>> > to "addend" (so begin usually goes to zero, end goes to "size of function")
>>
>> > These special rules are designed to ensure omitted/gc'd/deduplicated functions
>> > don't cause the range list to terminate prematurely (which would happen if begin/end
>> > were both resolved to zero).
>>
>> >But with an empty function, gold and lld's strategy here fails to avoid terminating a
>> >range list by accident.
>>
>> > What should we do about it?
>>
>> > 1) Ensure no zero-length functions exist? (doesn't address backwards
>> > compatibility/existing functions/other compilers)
>> > 2) adopt the binutils approach to this (at least in debug_ranges - maybe in all
>> > debug sections? (doing it in other sections could break )
>> > 3) Revisit the discussion about using an even more 'blessed' value,
>> > like int max-1? ( https://reviews.llvm.org/D59553 )
>>
>> > (I don't have links to all the recent threads about this discussion - I think D59553
>> > might've spawned a separate broader discussion/non-review - oh, Alexey wrote a
>> > good summary with links to other discussions here:
>> > http://lists.llvm.org/pipermail/llvm-dev/2019-September/135068.html )
>>
>> > Thoughts?
>>
>> I think for the problem of "zero length functions and .debug_ranges"
>> binutils approach looks good:
>>
>> >Special cases only debug_ranges, resolving all relocations to
>> >dead code to 1. In other debug sections, these values are all resolved to
>> >zero.
>>
>> But, this would not completely solve the problem from
>> https://reviews.llvm.org/D59553
>> - Overlapped address ranges. Binutils approach will solve the problem if
>> the address range specified as start_address:end_address. While resolving
>> relocations, it would replace such a range with 1:1.
>> However, It would not work if address ranges were specified as
>> start_address:length since the length is not relocated. This case could be
>> additionally fixed by fast scan debug_info for High_PC defined as length
>> and changing it to 1. Something which you suggested here:
>> http://lists.llvm.org/pipermail/llvm-dev/2020-May/141599.html
>> .
>>
>> So it looks like following solution could fix both problems and be
>> relatively fast:
>>
>> "Resolve all relocations from debug sections into dead code to 1. Parse
>> debug sections and replace HighPc of an address range pointing to dead code
>> and specified as length to 1".
>>
>> As the result all address ranges pointing into dead code would be marked
>> as zero length.
>>
>> There still exist another problem:
>>
>> DWARF4: "A range list entry (but not a base address selection or end of
>> list entry) whose beginning and
>> ending addresses are equal has no effect because the size of the range
>> covered by such an
>> entry is zero."
>>
>> DWARF5: "A bounded range entry whose beginning and ending address offsets
>> are equal
>> (including zero) indicates an empty range and may be ignored."
>>
>> These rules allow us to ignore zero-length address ranges. I.e., some tool
>> reading DWARF is permitted to ignore related DWARF entries. In that case,
>> there could be ignored essential descriptions. That problem could happen
>> with -flto=thin example https://reviews.llvm.org/D54747#1503720
>> . In this example, all type definitions except one were replaced with
>> declarations by thinlto. The definition, which was left, is in a piece of
>> debug info related to deleted code. According to zero-length rule, that
>> definition could be ignored, and finally, incomplete debug info could be
>> used.
>>
>> So, it probably should be forbidden to generate debug_info, which could
>> become incomplete after removing pieces related to zero length address
>> ranges. Otherwise, creating zero-length address ranges could lead to
>> incomplete debug info.
>>
>> Thank you, Alexey.
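Editorial sketch, not part of the original message: the premature-termination problem described in the quoted text can be reproduced with a toy model of a DWARF v4 .debug_ranges list. The helper names (resolve_dead, build_ranges, read_ranges) and the strategy labels are hypothetical and exist only for illustration. The three strategies follow the behaviours as described in this thread (gold/lld: symbol value 0 plus addend; ld.bfd: 1 in debug_ranges; Sony: -2 in debug_ranges). Base address selection entries are deliberately not modelled.

```python
import struct

ADDR_SIZE = 8                       # assume a 64-bit target
MASK = (1 << (8 * ADDR_SIZE)) - 1


def resolve_dead(strategy, addend):
    """Value written for a relocation whose target section was discarded."""
    if strategy == "addend":        # gold/lld as described above: 0 + addend
        return addend & MASK
    if strategy == "one":           # ld.bfd in .debug_ranges: 1, addend ignored
        return 1
    if strategy == "minus2":        # Sony in .debug_ranges: -2, addend ignored
        return (-2) & MASK
    raise ValueError(strategy)


def build_ranges(functions, strategy):
    """functions: list of (start or None if discarded, size in bytes)."""
    out = bytearray()
    for start, size in functions:
        if start is None:
            lo = resolve_dead(strategy, 0)       # relocation to function start
            hi = resolve_dead(strategy, size)    # relocation to start + size
        else:
            lo, hi = start, start + size
        out += struct.pack("<QQ", lo, hi)
    out += struct.pack("<QQ", 0, 0)              # end-of-list entry
    return bytes(out)


def read_ranges(data):
    """Read (begin, end) pairs until the first 0/0 pair, as a simple consumer would."""
    entries = []
    for off in range(0, len(data), 2 * ADDR_SIZE):
        lo, hi = struct.unpack_from("<QQ", data, off)
        if lo == 0 and hi == 0:
            break
        entries.append((lo, hi))
    return entries


# A live function, a discarded zero-length function, then another live function.
funcs = [(0x401000, 0x10), (None, 0), (0x401020, 0x8)]
for strategy in ("addend", "one", "minus2"):
    print(strategy, [(hex(lo), hex(hi)) for lo, hi in read_ranges(build_ranges(funcs, strategy))])
```

Under the "addend" strategy the discarded zero-length function becomes a 0/0 pair, so the reader stops before the second live function. With 1 or -2 the pair is merely an empty range and the list is read to its real end.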
Robinson, Paul via llvm-dev
2020-May-29 16:21 UTC
[llvm-dev] Range lists, zero-length functions, linker gc
> -----Original Message-----
> From: Fangrui Song <maskray at google.com>
> Sent: Friday, May 29, 2020 1:07 AM
> To: David Blaikie <dblaikie at gmail.com>
> Cc: Robinson, Paul <paul.robinson at sony.com>; Alexey Lapshin
> <alapshin at accesssoftek.com>; Sriraman Tallam <tmsriram at google.com>; Wei Mi
> <wmi at google.com>; Adrian Prantl <aprantl at apple.com>; Jonas Devlieghere
> <jdevlieghere at apple.com>; Alexey Lapshin <a.v.lapshin at mail.ru>; Eric
> Christopher <echristo at gmail.com>; peter.smith at arm.com;
> grimar at accesssoftek.com; llvm-dev at lists.llvm.org
> Subject: Re: [llvm-dev] Range lists, zero-length functions, linker gc
>
> On 2020-05-28, David Blaikie wrote:
> >On Thu, May 28, 2020 at 2:52 PM Robinson, Paul <paul.robinson at sony.com>
> >wrote:
> >
> >> As has been mentioned elsewhere, Sony generally fixes up references from
> >> debug info to stripped functions (of any length) using -1, because that’s a
> >> less-likely-to-be-real address than 0x0 or 0x1. (0x0 is a typical base
> >> address for shared libraries, I’d think using it has the potential to
> >> mislead various consumers.) For .debug_ranges we use -2, because both a
> >> 0/0 pair and a -1/-1 pair have a reserved meaning in that section.
> >>
> >
> >Any harm in using -2 everywhere, for consistency?
>
> When resolving a relocation, in certain cases we have to give an undefined
> symbol a value.
> This can happen with:
>
> * an undefined weak symbol
> * an undefined global symbol in --noinhibit-exec mode (a buggy
>   --gc-sections implementation can trigger this as well)
> * a relocation referencing an undefined symbol in a non-SHF_ALLOC section
>
> We always respect the addend in a relocation entry for an
> absolute/PC-relative (I can use "most" here)
> relocation (R_ARM_THM_PC8, R_AARCH64_ADR_PREL_PG_HI21, R_X86_64_64,
> local exec TLS relocation types, ...)
> Ignoring the addend (using -2 everywhere) will break this consistency.
>
> The relocated code may do pointer subtraction which would work if addends
> were respected, but will break using -2 everywhere.

I suspect David meant "any harm to using -2 in all .debug_* sections?"
and not literally everywhere. Sony special-cases only the .debug_* sections.

I've been meaning to propose that DWARF v6 reserve a special address for
this kind of situation. Whether the committee would be willing to make it
be -1 or -2 for all targets, or make it target-defined, I don't know.
(Dreading the inevitable argument over whether addresses are signed or
unsigned, or more to the point whether they wrap. They've been unsigned
and wrapping was undefined on the small set of machines I'm familiar with.)
Certainly the toolchain community would benefit from making it be the
same everywhere.

Personally I'd vote for -1, and make pre-v5 .debug_loc/.debug_ranges
sections be an extra-special case using -2. We can (I hope) standardize
on -1 for v6 onward, and document -1/-2 on the DWARF wiki as recommended
practice for prior versions.

>
> The relocated code can be allocatable or not. Non-allocatable non-debug
> code can have meaningful pointer subtraction as well. This is why I am
> not too fond of (using a fixed value everywhere).
>
> >(also, I had a silly idea, but what would happen if we added a CU attribute
> >with an address value that was a reference to a weak always-unused symbol,
> >that way the linker would fix it up with whatever its preferred magic value
> >was, and the consumer would then know what the magic value was that
> >represented dead code? (though this would only work if the value were used
> >consistently everywhere - which is zero for gold/lld (well, almost... you
> >can still create situations where a non-zero value is used even for a
> >low_pc), but wouldn't work for binutils ld (1 in debug_ranges, 0 elsewhere)
> >or Sony (-2 in debug_ranges, -1 elsewhere)... - so, wouldn't actually work
> >for any producer currently, so maybe there's little value in that as a
> >feature))
>
> For a non-SHF_ALLOC section, LLD currently considers it a GC root if all
> the conditions below are satisfied:
>
> * not SHT_REL[A]
> * not SHF_LINK_ORDER
> * not in a section group
>
> (I managed to lobby the ideas to GNU ld. GNU ld from binutils 2.35
> onwards will have mostly compatible semantics with LLD)
>
> There is a cost fragmenting a .debug_* section: sizeof(Elf64_Shdr)=64 ->
> each section takes 64 bytes in the section header table. SHF_LINK_ORDER
> has semantics of a lightweight section group. Assume we don't want to
> have one .debug_* for each function section, this .debug_* will be a GC
> root. Relocations from it (even if the symbol is weak) will retain the
> sections defining the symbols.

We did some quick research into per-function .debug_info fragments a
while back, putting the subprogram info into the same section group as
the function; it was not an unqualified win. The very large number of
sections costs processing time, and cross-section references added to
the relocation count (I believe these can generally be resolved by MC in
a non-fragmented .debug_info section). James Henderson might have the
actual results stashed somewhere. That approach *might* still be faster
than post-processing a unified section, which IIUC is what D59553 does.

>
> So, this trick can't work without refining the --gc-sections rules
> further.

If I understand the objection, yeah, we can't have .debug_* sections
being gc roots.

--paulr

> >
> >> If you’re looking only at zero-length functions, you can stop there; but
> >> I’m not sure why stopping there solves much of a real problem, as
> >> zero-length functions seem like a weird corner case.
Alexey Lapshin via llvm-dev
2020-May-29 19:08 UTC
[llvm-dev] Range lists, zero-length functions, linker gc
> Subject: Re: [llvm-dev] Range lists, zero-length functions, linker gc > > On 2020-05-28, David Blaikie wrote: > >On Thu, May 28, 2020 at 2:52 PM Robinson, Paul <paul.robinson at sony.com> > >wrote: > > > >> As has been mentioned elsewhere, Sony generally fixes up references > from > >> debug info to stripped functions (of any length) using -1, because > that’s a > >> less-likely-to-be-real address than 0x0 or 0x1. (0x0 is a typical base > >> address for shared libraries, I’d think using it has the potential to > >> mislead various consumers.) For .debug_ranges we use -2, because both > a > >> 0/0 pair and a -1/-1 pair have a reserved meaning in that section. > >> > > > >Any harm in using -2 everywhere, for consistency? > > When resolving a relocation, in certain cases we have to give an undefined > symbol a value. > This can happen with: > > * an undefined weak symbol > * an undefined global symbol in --noinhibit-exec mode (a buggy --gc- > sections implementation can trigger this as well) > * a relocation referencing an undefined symbol in a non-SHF_ALLOC section > > We always respect the addend in a relocation entry for an absolute/PC- > relative (I can use "most" here) > relocation (R_ARM_THM_PC8, R_AARCH64_ADR_PREL_PG_HI21, R_X86_64_64, > local exec TLS relocation types, ...) > Ignoring the addend (using -2 everywhere) will break this consistency. > > The relocated code may do pointer subtraction which would work if addends > were > respected, but will break using -2 everywhere.>I suspect David meant "any harm to using -2 in all .debug_* sections?" >and not literally everywhere. Sony does special cases only for the >.debug_* sections.>I've been meaning to propose that DWARF v6 reserve a special address for >this kind of situation. 
>Whether the committee would be willing to make
>it be -1 or -2 for all targets, or make it target-defined, I don't know.
>(Dreading the inevitable argument over whether addresses are signed or
>unsigned, or more to the point whether they wrap. They've been unsigned
>and wrapping was undefined on the small set of machines I'm familiar with.)
>Certainly the toolchain community would benefit from making it be the
>same everywhere.
>
>Personally I'd vote for -1, and make pre-v5 .debug_loc/.debug_ranges
>sections be an extra-special case using -2. We can (I hope) standardize
>on -1 for v6 onward, and document -1/-2 on the DWARF wiki as recommended
>practice for prior versions.

Would it make sense for the DWARF documentation to use "LowPC > HighPC" as the sign for that case, instead of -1 or -2? Or, more precisely: to indicate that an address range points into deleted code, should either a zero-length range or a LowPC > HighPC range be used?

A zero-length address range is already defined in the DWARF documentation; LowPC > HighPC is currently not described. It could be documented and used as the general representation instead of a concrete special value. An implementation could still use -2 when resolving relocations, and that would satisfy the above definition.

Thank you, Alexey.
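For illustration, a consumer-side check for the representation suggested above could look like the sketch below. This is only a sketch under assumptions: the names RangeKind and classify_range are hypothetical, addresses are treated as unsigned 64-bit values, and no existing DWARF consumer is claimed to behave this way.

```python
from enum import Enum


class RangeKind(Enum):
    LIVE = "live"        # low < high: an ordinary address range
    EMPTY = "empty"      # low == high: zero length, already defined by DWARF
    DELETED = "deleted"  # low > high: the marker suggested in this message (hypothetical)


def classify_range(low_pc: int, high_pc: int) -> RangeKind:
    """Classify an address range given as unsigned begin/end addresses."""
    if low_pc == high_pc:
        return RangeKind.EMPTY
    if low_pc > high_pc:
        return RangeKind.DELETED
    return RangeKind.LIVE


# -2 written as an unsigned 64-bit address.
MINUS_TWO = (1 << 64) - 2

# A linker that writes -2 into both ends of a range produces a zero-length range.
both_minus_two = classify_range(MINUS_TWO, MINUS_TWO)

# Assumption for illustration: if only the begin address is replaced by -2 and the
# end address wraps around to a small value, the range has low > high.
wrapped = classify_range(MINUS_TWO, 0xE)
```

With this definition a consumer would treat both EMPTY and DELETED ranges as "points into deleted code", so the exact value the linker writes matters less than the relation between the two ends.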
David Blaikie via llvm-dev
2020-May-29 21:00 UTC
[llvm-dev] Range lists, zero-length functions, linker gc
On Fri, May 29, 2020 at 9:21 AM Robinson, Paul <paul.robinson at sony.com> wrote:> > > > > -----Original Message----- > > From: Fangrui Song <maskray at google.com> > > Sent: Friday, May 29, 2020 1:07 AM > > To: David Blaikie <dblaikie at gmail.com> > > Cc: Robinson, Paul <paul.robinson at sony.com>; Alexey Lapshin > > <alapshin at accesssoftek.com>; Sriraman Tallam <tmsriram at google.com>; Wei Mi > > <wmi at google.com>; Adrian Prantl <aprantl at apple.com>; Jonas Devlieghere > > <jdevlieghere at apple.com>; Alexey Lapshin <a.v.lapshin at mail.ru>; Eric > > Christopher <echristo at gmail.com>; peter.smith at arm.com; > > grimar at accesssoftek.com; llvm-dev at lists.llvm.org > > Subject: Re: [llvm-dev] Range lists, zero-length functions, linker gc > > > > On 2020-05-28, David Blaikie wrote: > > >On Thu, May 28, 2020 at 2:52 PM Robinson, Paul <paul.robinson at sony.com> > > >wrote: > > > > > >> As has been mentioned elsewhere, Sony generally fixes up references > > from > > >> debug info to stripped functions (of any length) using -1, because > > that’s a > > >> less-likely-to-be-real address than 0x0 or 0x1. (0x0 is a typical base > > >> address for shared libraries, I’d think using it has the potential to > > >> mislead various consumers.) For .debug_ranges we use -2, because both > > a > > >> 0/0 pair and a -1/-1 pair have a reserved meaning in that section. > > >> > > > > > >Any harm in using -2 everywhere, for consistency? > > > > When resolving a relocation, in certain cases we have to give an undefined > > symbol a value. 
> > This can happen with:
> >
> > * an undefined weak symbol
> > * an undefined global symbol in --noinhibit-exec mode (a buggy
> >   --gc-sections implementation can trigger this as well)
> > * a relocation referencing an undefined symbol in a non-SHF_ALLOC section
> >
> > We always respect the addend in a relocation entry for an
> > absolute/PC-relative (I can use "most" here)
> > relocation (R_ARM_THM_PC8, R_AARCH64_ADR_PREL_PG_HI21, R_X86_64_64,
> > local exec TLS relocation types, ...)
> > Ignoring the addend (using -2 everywhere) will break this consistency.
> >
> > The relocated code may do pointer subtraction which would work if addends
> > were respected, but will break using -2 everywhere.
>
> I suspect David meant "any harm to using -2 in all .debug_* sections?"
> and not literally everywhere. Sony does special cases only for the
> .debug_* sections.

Right - thanks for the clarification.

> I've been meaning to propose that DWARF v6 reserve a special address for
> this kind of situation. Whether the committee would be willing to make
> it be -1 or -2 for all targets, or make it target-defined, I don't know.
> (Dreading the inevitable argument over whether addresses are signed or
> unsigned, or more to the point whether they wrap. They've been unsigned
> and wrapping was undefined on the small set of machines I'm familiar with.)
> Certainly the toolchain community would benefit from making it be the
> same everywhere.
>
> Personally I'd vote for -1, and make pre-v5 .debug_loc/.debug_ranges
> sections be an extra-special case using -2.
> We can (I hope) standardize
> on -1 for v6 onward, and document -1/-2 on the DWARF wiki as recommended
> practice for prior versions.

That'd make linking difficult - the unix linkers at least, currently don't have to identify the DWARF version when linking - having to pass an extra linking flag or have the linker parse any DWARF (what if an object file contains more than one CU & the linker has to apply different relocations in different parts of the object file because of that?) would be a significant cost/problem, I think.

Though I like the tidiness of -1 everywhere, that backwards compatibility with debug_ranges (& debug_loc similarly) is a problem. Though ld.bfd does special case debug_ranges (& should special case debug_loc), perhaps that's the solution. -2 for debug_ranges and debug_loc, -1 everywhere else (which effectively means everywhere in DWARFv5 onwards)?

> > The relocated code can be allocatable or not. Non-allocatable non-debug
> > code can have meaningful pointer subtraction as well. This is why I am
> > not too fond of (using a fixed value everywhere).
> >
> > >(also, I had a silly idea, but what would happen if we added a CU attribute
> > >with an address value that was a reference to a weak always-unused symbol,
> > >that way the linker would fix it up with whatever its preferred magic value
> > >was, and the consumer would then know what the magic value was that
> > >represented dead code? (though this would only work if the value were used
> > >consistently everywhere - which is zero for gold/lld (well, almost... you
> > >can still create situations where a non-zero value is used even for a
> > >low_pc), but wouldn't work for binutils ld (1 in debug_ranges, 0 elsewhere)
> > >or Sony (-2 in debug_ranges, -1 elsewhere)...
> > >- so, wouldn't actually work
> > >for any producer currently, so maybe there's little value in that as a
> > >feature))
> >
> > For a non-SHF_ALLOC section, LLD currently considers it a GC root if all
> > the conditions below are satisfied:
> >
> > * not SHT_REL[A]
> > * not SHF_LINK_ORDER
> > * not in a section group
> >
> > (I managed to lobby the ideas to GNU ld. GNU ld from binutils 2.35
> > onwards will have mostly compatible semantics with LLD)
> >
> > There is a cost fragmenting a .debug_* section: sizeof(Elf64_Shdr)=64 ->
> > each section takes 64 bytes in the section header table.

Sorry, I missed a step here - you're talking about the cost of fragmenting .debug_* sections as an alternative to choosing a special address value to resolve for dead code? (by removing the DWARF that refers to the dead code, instead of keeping it and having to write a special address value into it?)

Unfortunately, no matter the cost - that solution doesn't apply to Split DWARF. Maybe at some point we'll want to have some output from the linker that lists the dead/live code, and use that for building a dwp (like dsymutil) but I don't think we can predicate correctness on such a thing - in part because we'd still want to be able to read the .dwo files without post-processing for more interactive/iterative development scenarios.

So we'd still need a special address value to write into debug_addr when using Split DWARF, and I think it's important to allow the non-split case to look like the split case where it doesn't /have/ to diverge - if divergence provides benefits, that's nice, but I don't think it'd be good to make that divergence /necessary/.

> > SHF_LINK_ORDER
> > has semantics of a lightweight section group. Assume we don't want to
> > have one .debug_* for each function section, this .debug_* will be a GC
> > root. Relocations from it (even if the symbol is weak) will retain the
> > sections defining the symbols.
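The per-section choice floated earlier in this message (-2 for debug_ranges and debug_loc, -1 everywhere else) can be sketched as below. This is a minimal illustration, not how any existing linker is implemented: the function name tombstone_for and the address_size parameter are assumptions made for the example. The point it shows is that the value depends only on the name of the section containing the relocation, so the linker does not need to know the DWARF version.

```python
def tombstone_for(section_name: str, address_size: int = 8) -> int:
    """Return the value to write for a relocation that refers to dead code.

    Hypothetical helper: -2 is used in .debug_ranges and .debug_loc, where
    0/0 and -1/-1 pairs already have a reserved meaning, and -1 is used in
    every other debug section.
    """
    # Mask that truncates a negative Python int to an unsigned address.
    mask = (1 << (8 * address_size)) - 1
    if section_name in (".debug_ranges", ".debug_loc"):
        return -2 & mask
    return -1 & mask


ranges_value = tombstone_for(".debug_ranges")       # 0xFFFFFFFFFFFFFFFE
info_value = tombstone_for(".debug_info")           # 0xFFFFFFFFFFFFFFFF
loc_value_32 = tombstone_for(".debug_loc", 4)       # 0xFFFFFFFE on a 32-bit target
```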
> >
> We did some quick research into per-function .debug_info fragments a
> while back, putting the subprogram info into the same section group as
> the function; it was not an unqualified win. The very large number of
> sections costs processing time, and cross-section references added to
> the relocation count (I believe these can generally be resolved by MC
> in a non-fragmented .debug_info section).

Yeah, you either pay more relocations (or size cost to use signatures instead) and/or more size to duplicate some DIEs (like type units currently duplicate fundamental/non-user-defined types into the type unit). (& it's a non-starter for Split DWARF anyway)

> James Henderson might have
> the actual results stashed somewhere.
>
> That approach *might* still be faster than post-processing a unified
> section, which IIUC is what D59553 does.
>
> > So, this trick can't work without refining the --gc-sections rules
> > further.
>
> If I understand the objection, yeah, we can't have .debug_* sections
> being gc roots.

I'm not sure I follow, here - "this trick" being "splitting debug info into droppable chunks for each function/subprogram and putting those chunks in the same comdat group as the function code itself" - that trick wouldn't work because the current rules would move that debug info from (where it currently is in one big debug_info section) a non-gc root to (where it would go - becoming part of a comdat group) a gc-root?

OK, right. Agreed that's another reason (apart from the Split DWARF one, and the size/reloc tradeoff ones) that would be problematic.

- Dave

> --paulr
>
> > >> If you’re looking only at zero-length functions, you can stop there; but
> > >> I’m not sure why stopping there solves much of a real problem, as
> > >> zero-length functions seem like a weird corner case.
> > >> > > > > > >They're the case that breaks existing usage by terminating the range list > > >early - the other existing usage seems to be fine with "resolve to > > addend" > > >strategy that lld and gold use - in that it moves most dead/deduplicated > > >functions outside the executable range and so consumers never come asking > > >for "what code is at instruction 5" because they're never executing code > > at > > >a pc of 5. But, yes, this existing solution doesn't work once you have > > code > > >mapped into low address spaces or have utterly massive functions that > > might > > >have a length that would reach into the executable address space even > > when > > >their start is remapped to zero. > > > > For posterity, David gave me an example offline: void f1() { } void f2() { > > } int main() { f1(); } > > > > clang -fuse-ld=bfd -ffunction-sections -Wl,--gc-sections -g a.c -o a.bfd > > llvm-dwarfdump -debug-ranges a.bfd > > => > > R_X86_64_64 relocations in .debug_ranges are resolved to 1, ignoring the > > addend > > > > (Behavior introduced in > > https://sourceware.org/git/?p=binutils- > > gdb.git;a=blobdiff;f=bfd*ChangeLog;h=8fbaed21fa2c8238459acb637545583f3cfbb > > fdf;hp=18a3a67be3a5980998c4461b5a739e54f3551b17;hb=e4067dbb2a3368dbf908b39 > > c5435c84d51abc9f3;hpb=c0621d88b096cc046adf6ed484baea9ba5bfe721) > > > > The comments below are also insightful. I need to ponder more (and need > > to read the DWARF v4 and v5 specs more as I am not so familiar these > > DWARF constructs). But it is too late now. Will probably comment > > another day :) > > > > > > > >> Linkers know how to strip dead functions (gc) or deduplicate them (icf, > > >> COMDAT) and people do this all the time, in some cases (COMDAT) without > > >> explicitly asking for it, so non-zero-length functions seem like the > > much > > >> more interesting case. In that situation, -1 (or -2) seems like a much > > >> wiser choice of blessed-as-not-real address, versus 0x0 or 0x1. 
> > >> > > >> > > >> > > >> Stripping non-zero-length functions does mean you have to care about > > more > > >> sections. For example .debug_locs would want to be fixed up the same > > way > > >> as .debug_ranges, not because a debugger would care but so that dumpers > > >> would not run into the 0/0 brick wall. > > >> > > > > > >Yep - in theory a consumer could actually use a loclist across multiple > > >sections (if a global variable got hoisted into a register for a function > > >for instance), but I don't know of any producers doing this today - until > > >then, yeah, it's just a dumping problem and ld.bfd does produce DWARF > > that > > >has that problem (because it resolves both relocations to dead code > > >(begin/end of a range) to zero in all sections except debug_ranges, so > > >terminates the loclist list early) - binutils objdump avoids dumping the > > >following corrupted fragment by only dumping hunks of debug_loc starting > > at > > >places referenced from debug_info. Without debug_info it won't dump > > >anything from debug_loc - and if the references from debug_info, parsed > > >until the 0,0 terminator don't cover the whole debug_loc section, it > > prints > > >messages saying there are "gaps". > > > > > >Agreed that you'd want debug_loc to have the same special handling as > > >debug_ranges if it has special handling. Though ideally we'd pick a value > > >that works equally everywhere? (-2, by the sounds of it) > > > > > > > > >> We also fix up lengths in .debug_aranges to zero, although there might > > be > > >> history behind that tactic that I’m not aware of; it seems like it > > ought to > > >> be unnecessary, if consumers are aware of the special address(es). > > >> > > > > > >Yeah, no idea about debug_aranges... I'd have thought it'd be fine with > > the > > >same approach as debug_ranges, but I haven't looked at debug_aranges in a > > >long time. 
> > > > > >I guess the only remaining question is: Since it's possible to have code > > on > > >some systems down at address zero, or close enough to it that [0, length) > > >might overlap with real exxecutable code addresses - does anyone know of > > >the inverse: where code is mapped up near uint32 max? Such that that > > usage > > >wouldn't be able to sacrifice uint32 max - 1 to use as a blessed value > > here? > > > > > >- Dave > > > > > > > > >> > > >> > > >> --paulr > > >> > > >> > > >> > > >> *From:* Alexey Lapshin <alapshin at accesssoftek.com> > > >> *Sent:* Thursday, May 28, 2020 9:03 AM > > >> *To:* Sriraman Tallam <tmsriram at google.com>; Wei Mi <wmi at google.com>; > > >> Robinson, Paul <paul.robinson at sony.com>; Adrian Prantl > > <aprantl at apple.com>; > > >> Jonas Devlieghere <jdevlieghere at apple.com>; Alexey Lapshin < > > >> a.v.lapshin at mail.ru>; Eric Christopher <echristo at gmail.com>; Fangrui > > Song > > >> <maskray at google.com>; David Blaikie <dblaikie at gmail.com>; > > >> llvm-dev at lists.llvm.org > > >> *Subject:* Re: [llvm-dev] Range lists, zero-length functions, linker gc > > >> > > >> > > >> > > >> Hi David, > > >> > > >> > > >> > > >> >So there have been several recent discussions about the issues around > > >> > > >> >DWARF-agnostic linking and gc-sections, linkonce function definitions > > >> being > > >> > > >> >dropped, etc - and just how much DWARF-awareness would be suitable > > >> > > >> >in a linker to help with this situation. > > >> > > >> > > >> > I'd like to discuss a narrower instance of this issue: Zero length > > >> gc'd/deduplicated functions. 
> > >> > > >> > LLVM seems to at least produce zero length functions in a few cases: > > >> > * non-void function without a return statement > > >> > * function definition containing only llvm_unreachable > > >> > (both of these trap at -O0, but at higher optimization levels even > > the > > >> trap > > >> > > >> > instruction is removed & you get the full power UB of control > > >> flowing off > > >> > > >> > the end of the function into whatever other bytes are after that > > >> function) > > >> > > >> > So, for context, debug_ranges (this whole issue doesn't exist in > > >> DWARFv5, > > >> > > >> > FWIW) is a list of address pairs, terminated by a pair of zeros. > > >> > > >> > With function sections, or even just with normal C++ inline > > functions, > > >> > > >> > the CU will have a range entry for that function that consists of two > > >> relocations > > >> > > >> > - to the start and end of the function. Generally the start of the > > >> function is the > > >> > > >> > start of the section, and the end is "start of function + length of > > >> function (aka addend)". > > >> > > >> > Usually any relocation to the section would keep that section > > "alive" > > >> during linking - > > >> > > >> > but that would cause debug info to defeat linker GC and > > deduplication. > > >> So there's > > >> > > >> > special rules for how linkers handle these relocations in debug info > > to > > >> allow the > > >> > > >> > sections to be dropped - what do you write in the bytes that > > requested > > >> the relocation? > > >> > > >> > Binutils ld: Special cases only debug_ranges, resolving all > > relocations > > >> to dead > > >> > > >> > code to 1. In other debug sections, these values are all resolved to > > >> zero. 
> > >> > > >> > Gold and lld: Special cases all debug info sections - resolving all > > >> relocations > > >> > > >> > to "addend" (so begin usually goes to zero, end goes to "size of > > >> function") > > >> > > >> > These special rules are designed to ensure omitted/gc'd/deduplicated > > >> functions > > >> > > >> > don't cause the range list to terminate prematurely (which would > > happen > > >> if begin/end > > >> > > >> > were both resolved to zero). > > >> > > >> >But with an empty function, gold and lld's strategy here fails to > > avoid > > >> terminating a > > >> > > >> >range list by accident. > > >> > > >> > What should we do about it? > > >> > > >> > 1) Ensure no zero-length functions exist? (doesn't address backwards > > >> > > >> > compatibility/existing functions/other compilers) > > >> > 2) adopt the binutils approach to this (at least in debug_ranges - > > maybe > > >> in all > > >> > > >> > debug sections? (doing it in other sections could break ) > > >> > 3) Revisit the discussion about using an even more 'blessed' value, > > >> > > >> > like int max-1? 
( > > https://urldefense.com/v3/__https://reviews.llvm.org/D59553__;!!JmoZiZGBv3 > > RvKRSx!sRjL4Vdx9oC8TPFhKZ-QbL7LtpIL-1Zdb4OydT2xVhpDTRyUixtaozLYiewZqMLtoA$ > > >> > > <https://urldefense.com/v3/__https:/reviews.llvm.org/D59553__;!!JmoZiZGBv3 > > RvKRSx!r2Jqc2yEgxrb2QcQEocDHJBizj0KUKE70_57b4_rsj1TN0qB8NpBvVKtY63HSqgMOg$ > > > > > >> ) > > >> > > >> > (I don't have links to all the recent threads about this discussion > > - I > > >> think D59553 > > >> > > >> > might've spawned a separate broader discussion/non-review - oh, > > Alexey > > >> wrote a > > >> > > >> > good summary with links to other discussions here: > > >> > > >> > https://urldefense.com/v3/__http://lists.llvm.org/pipermail/llvm- > > dev/2019-September/135068.html__;!!JmoZiZGBv3RvKRSx!sRjL4Vdx9oC8TPFhKZ- > > QbL7LtpIL-1Zdb4OydT2xVhpDTRyUixtaozLYiey_aMV0lQ$ > > >> <https://urldefense.com/v3/__http:/lists.llvm.org/pipermail/llvm- > > dev/2019- > > September/135068.html__;!!JmoZiZGBv3RvKRSx!r2Jqc2yEgxrb2QcQEocDHJBizj0KUKE > > 70_57b4_rsj1TN0qB8NpBvVKtY638NIRu2g$> > > >> ) > > >> > > >> > Thoughts? > > >> > > >> > > >> > > >> I think for the problem of "zero length functions and .debug_ranges" > > >> binutils approach looks good: > > >> > > >> >Special cases only debug_ranges, resolving all relocations to > > >> >dead code to 1. In other debug sections, these values are all resolved > > to > > >> >zero. > > >> > > >> But, this would not completely solve the problem from > > >> > > https://urldefense.com/v3/__https://reviews.llvm.org/D59553__;!!JmoZiZGBv3 > > RvKRSx!sRjL4Vdx9oC8TPFhKZ-QbL7LtpIL-1Zdb4OydT2xVhpDTRyUixtaozLYiewZqMLtoA$ > > >> > > <https://urldefense.com/v3/__https:/reviews.llvm.org/D59553__;!!JmoZiZGBv3 > > RvKRSx!r2Jqc2yEgxrb2QcQEocDHJBizj0KUKE70_57b4_rsj1TN0qB8NpBvVKtY63HSqgMOg$ > > > > > >> - Overlapped address ranges. Binutils approach will solve the problem > > if > > >> the address range specified as start_address:end_address. 
While > > resolving > > >> relocations, it would replace such a range with 1:1. > > >> However, It would not work if address ranges were specified as > > >> start_address:length since the length is not relocated. This case could > > be > > >> additionally fixed by fast scan debug_info for High_PC defined as > > length > > >> and changing it to 1. Something which you suggested here: > > >> https://urldefense.com/v3/__http://lists.llvm.org/pipermail/llvm- > > dev/2020-May/141599.html__;!!JmoZiZGBv3RvKRSx!sRjL4Vdx9oC8TPFhKZ- > > QbL7LtpIL-1Zdb4OydT2xVhpDTRyUixtaozLYiexb8NU_Fw$ > > >> <https://urldefense.com/v3/__http:/lists.llvm.org/pipermail/llvm- > > dev/2020- > > May/141599.html__;!!JmoZiZGBv3RvKRSx!r2Jqc2yEgxrb2QcQEocDHJBizj0KUKE70_57b > > 4_rsj1TN0qB8NpBvVKtY63PsubKJQ$> > > >> . > > >> > > >> So it looks like following solution could fix both problems and be > > >> relatively fast: > > >> > > >> "Resolve all relocations from debug sections into dead code to 1. Parse > > >> debug sections and replace HighPc of an address range pointing to dead > > code > > >> and specified as length to 1". > > >> > > >> As the result all address ranges pointing into dead code would be > > marked > > >> as zero length. > > >> > > >> There still exist another problem: > > >> > > >> DWARF4: "A range list entry (but not a base address selection or end of > > >> list entry) whose beginning and > > >> ending addresses are equal has no effect because the size of the range > > >> covered by such an > > >> entry is zero." > > >> > > >> DWARF5: "A bounded range entry whose beginning and ending address > > offsets > > >> are equal > > >> (including zero) indicates an empty range and may be ignored." > > >> > > >> These rules allow us to ignore zero-length address ranges. I.e., some > > tool > > >> reading DWARF is permitted to ignore related DWARF entries. In that > > case, > > >> there could be ignored essential descriptions. 
That problem could > > happen > > >> with -flto=thin example > > https://urldefense.com/v3/__https://reviews.llvm.org/D54747*1503720__;Iw!! > > JmoZiZGBv3RvKRSx!sRjL4Vdx9oC8TPFhKZ-QbL7LtpIL- > > 1Zdb4OydT2xVhpDTRyUixtaozLYiezSujGHwQ$ > > >> > > <https://urldefense.com/v3/__https:/reviews.llvm.org/D54747*1503720__;Iw!! > > JmoZiZGBv3RvKRSx!r2Jqc2yEgxrb2QcQEocDHJBizj0KUKE70_57b4_rsj1TN0qB8NpBvVKtY > > 637ju_eQw$> > > >> . In this example, all type definitions except one were replaced with > > >> declarations by thinlto. The definition, which was left, is in a piece > > of > > >> debug info related to deleted code. According to zero-length rule, that > > >> definition could be ignored, and finally, incomplete debug info could > > be > > >> used. > > >> > > >> So, it probably should be forbidden to generate debug_info, which could > > >> become incomplete after removing pieces related to zero length address > > >> ranges. Otherwise, creating zero-length address ranges could lead to > > >> incomplete debug info. > > >> > > >> > > >> > > >> Thank you, Alexey. > > >> > > >> > > >>
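To make the early-termination problem from the quoted original post concrete, here is a small sketch of a DWARF v4 style .debug_ranges reader. It rests on several assumptions: addresses are 8 bytes and little-endian, the helper names read_range_list and encode are made up for the example, the addresses are invented, and base address selection entries are not handled. It compares the "resolve to addend" strategy (gold/lld) with the "resolve to 1" strategy (binutils ld) for a dead zero-length function sitting between two live ones.

```python
import struct


def read_range_list(data: bytes, offset: int = 0):
    """Read (begin, end) address pairs until the 0/0 end-of-list entry."""
    ranges = []
    while offset + 16 <= len(data):
        begin, end = struct.unpack_from("<QQ", data, offset)
        offset += 16
        if begin == 0 and end == 0:
            break  # end-of-list entry: stop reading
        ranges.append((begin, end))
    return ranges


def encode(pairs):
    """Pack (begin, end) pairs the way they would appear in the section."""
    return b"".join(struct.pack("<QQ", b, e) for b, e in pairs)


# Three functions in one CU: live, dead with zero length, live.
# Resolve-to-addend: the dead function's begin is 0 and its end is 0 + size = 0.
addend_style = encode([(0x1000, 0x1010), (0, 0), (0x2000, 0x2020), (0, 0)])
# Resolve-to-1: both relocations become 1, so the entry is an empty range.
bfd_style = encode([(0x1000, 0x1010), (1, 1), (0x2000, 0x2020), (0, 0)])

addend_ranges = read_range_list(addend_style)
bfd_ranges = read_range_list(bfd_style)
```

In the resolve-to-addend case the reader stops at the dead function's 0/0 pair and never sees the range at 0x2000, while in the resolve-to-1 case it sees all three entries and can skip the empty one.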
James Henderson via llvm-dev
2020-Jun-01 11:51 UTC
[llvm-dev] Range lists, zero-length functions, linker gc
On Fri, 29 May 2020 at 17:22, Robinson, Paul via llvm-dev < llvm-dev at lists.llvm.org> wrote:> > > > -----Original Message----- > > From: Fangrui Song <maskray at google.com> > > Sent: Friday, May 29, 2020 1:07 AM > > To: David Blaikie <dblaikie at gmail.com> > > Cc: Robinson, Paul <paul.robinson at sony.com>; Alexey Lapshin > > <alapshin at accesssoftek.com>; Sriraman Tallam <tmsriram at google.com>; Wei > Mi > > <wmi at google.com>; Adrian Prantl <aprantl at apple.com>; Jonas Devlieghere > > <jdevlieghere at apple.com>; Alexey Lapshin <a.v.lapshin at mail.ru>; Eric > > Christopher <echristo at gmail.com>; peter.smith at arm.com; > > grimar at accesssoftek.com; llvm-dev at lists.llvm.org > > Subject: Re: [llvm-dev] Range lists, zero-length functions, linker gc > > > > On 2020-05-28, David Blaikie wrote: > > >On Thu, May 28, 2020 at 2:52 PM Robinson, Paul <paul.robinson at sony.com> > > >wrote: > > > > > >> As has been mentioned elsewhere, Sony generally fixes up references > > from > > >> debug info to stripped functions (of any length) using -1, because > > that’s a > > >> less-likely-to-be-real address than 0x0 or 0x1. (0x0 is a typical > base > > >> address for shared libraries, I’d think using it has the potential to > > >> mislead various consumers.) For .debug_ranges we use -2, because both > > a > > >> 0/0 pair and a -1/-1 pair have a reserved meaning in that section. > > >> > > > > > >Any harm in using -2 everywhere, for consistency? > > > > When resolving a relocation, in certain cases we have to give an > undefined > > symbol a value. 
> > This can happen with: > > > > * an undefined weak symbol > > * an undefined global symbol in --noinhibit-exec mode (a buggy --gc- > > sections implementation can trigger this as well) > > * a relocation referencing an undefined symbol in a non-SHF_ALLOC section > > > > We always respect the addend in a relocation entry for an absolute/PC- > > relative (I can use "most" here) > > relocation (R_ARM_THM_PC8, R_AARCH64_ADR_PREL_PG_HI21, R_X86_64_64, > > local exec TLS relocation types, ...) > > Ignoring the addend (using -2 everywhere) will break this consistency. > > > > The relocated code may do pointer subtraction which would work if addends > > were > > respected, but will break using -2 everywhere. > > I suspect David meant "any harm to using -2 in all .debug_* sections?" > and not literally everywhere. Sony does special cases only for the > .debug_* sections. > > I've been meaning to propose that DWARF v6 reserve a special address for > this kind of situation. Whether the committee would be willing to make > it be -1 or -2 for all targets, or make it target-defined, I don't know. > (Dreading the inevitable argument over whether addresses are signed or > unsigned, or more to the point whether they wrap. They've been unsigned > and wrapping was undefined on the small set of machines I'm familiar with.) > Certainly the toolchain community would benefit from making it be the > same everywhere. > > Personally I'd vote for -1, and make pre-v5 .debug_loc/.debug_ranges > sections be an extra-special case using -2. We can (I hope) standardize > on -1 for v6 onward, and document -1/-2 on the DWARF wiki as recommended > practice for prior versions. > > > > > > The relocated code can be allocatable or not. Non-allocatable non-debug > > code can have meaningful pointer subtraction as well. This is why I am > > not too fond of (using a fixed value everywhere). 
> > > > >(also, I had a silly idea, but what would happen if we added a CU > > attribute > > >with an address value that was a reference to a weak always-unused > > symbol, > > >that way the linker would fix it up with whatever its preferred magic > > value > > >was, and the consumer would then know what the magic value was that > > >represented dead code? (though this would only work if the value were > > used > > >consistently everywhere - which is zero for gold/lld (well, almost... > you > > >can still create situations where a non-zero value is used even for a > > >low_pc), but wouldn't work for binutils ld (1 in debug_ranges, 0 > > elsewhere) > > >or Sony (-2 in debug_ranges, -1 elsewhere)... - so, wouldn't actually > > work > > >for any producer currently, so maybe there's little value in that as a > > >feature)) > > > > For a non-SHF_ALLOC section, LLD currently considers it a GC root if all > > the conditions below are satisfied: > > > > * not SHT_REL[A] > > * not SHF_LINK_ORDER > > * not in a section group > > > > (I managed to lobby the ideas to GNU ld. GNU ld from binutils 2.35 > > onwards will have mostly compatible semantics with LLD) > > > > There is a cost fragmenting a .debug_* section: sizeof(Elf64_Shdr)=64 -> > > each section takes 64 bytes in the section header table. SHF_LINK_ORDER > > has semantics of a lightweight section group. Assume we don't want to > > have one .debug_* for each function section, this .debug_* will be a GC > > root. Relocations from it (even if the symbol is weak) will retain the > > sections defining the symbols. > > We did some quick research into per-function .debug_info fragments a > while back, putting the subprogram info into the same section group as > the function; it was not an unqualified win. The very large number of > sections costs processing time, and cross-section references added to > the relocation count (I believe these can generally be resolved by MC > in a non-fragmented .debug_info section). 
> James Henderson might have
> the actual results stashed somewhere.
>

Amazingly, I managed to find them. Loosely explained, the experiment fragmented the sections into smaller pieces that the linker glued back together using standard linker section grouping semantics. IIRC (I've not looked at the prototype in a long while), the fragments used the SHF_LINK_ORDER section flag to achieve natural GC-ing, and consequently were just thrown away when the associated executable section was discarded.

I measured both the size gains and link times for one of our games, both with and without --gc-sections, and got mixed results - good debug data size reductions when GC-ing was enabled, but bad link time degradations both with and without GC, due to the extra section overhead (and possibly relocation overhead - I never dug in to identify what exactly caused the slowdown). The prototype compiler was a modified version of our then-current clang compiler, and function and data sections were enabled to get effective GC-ing.

I don't know which linker I used for this unfortunately - it might well have been our proprietary one rather than LLD - but I suspect the impact will be similar regardless, and even if it had been LLD, it would be quite an old version by now, so we'd want to rerun these measurements if there was interest in reinvestigating this approach.
Results are below - they're somewhat rough-and-ready, as I haven't spent
a lot of time editing them since I drew them up whenever it was:

Baseline:
  File size-nogc = 221952512 (211.67MB)
  File size-gc   = 218083024 (207.98MB) (GC size change: -1.7%)

  gc:
    Name             Size
    .debug_info      0x07E21FAA
    .debug_line      0x01389E3D
    .debug_abbrev    0x00369F62
    .debug_aranges   0x00594E90
    .debug_frame     0x00000118
    .debug_loc       0x00523365
    .debug_macinfo   0x00000664
    .debug_pubtypes  0x00000000
    .debug_ranges    0x005A8300
    .debug_str       0x0039E38A
    Total debug_*    177294660 (169.08MB) (81.3% of total file size)
    Link time:       5.8s

  no gc:
    .debug_info      0x07E21FAA
    .debug_line      0x01389E3D
    .debug_abbrev    0x00369F62
    .debug_aranges   0x00594E90
    .debug_frame     0x00000118
    .debug_loc       0x00523365
    .debug_macinfo   0x00000664
    .debug_pubtypes  0x00000000
    .debug_ranges    0x005A8300
    .debug_str       0x0039E38A
    Total debug_*    177294660 (169.08MB) (79.9% of total file size)
    Link time:       5.8s

Prototype:
  File size-nogc = 279623672 (266.67MB) (126.0% vs. baseline)
  File size-gc   = 194166760 (185.17MB) (GC size change: -30.6%)
                   (89.0% of baseline)

  gc:
    .debug_info      0x071F0EFA
    .debug_line      0x00F7B5C4
    .debug_abbrev    0x003EED11
    .debug_aranges   0x00379410
    .debug_frame     0x00000118
    .debug_loc       0x00523365
    .debug_macinfo   0x00000664
    .debug_pubtypes  0x00000000
    .debug_ranges    0x0006BA00
    .debug_str       0x0039E38A
    Total debug_*:   153099850 (146.01MB) (78.8% of total file size)
                     (86.4% of baseline)
    Link time:       11.7s (201.7% of baseline)

  no gc:
    .debug_info      0x08AFE274
    .debug_line      0x037F9220
    .debug_abbrev    0x003EED11
    .debug_aranges   0x01001130
    .debug_frame     0x00000118
    .debug_loc       0x00523365
    .debug_macinfo   0x00000664
    .debug_pubtypes  0x00000000
    .debug_ranges    0x0006BA00
    .debug_str       0x0039E38A
    Total debug_*:   234965824 (224.08MB) (84.0% of total file size)
                     (132% of baseline)
    Link time:       29.2s (503.4% of baseline)

Another issue with this prototype was that the pre-link object files
were not DWARF compliant, since the contribution headers were kept in
separate sections to the body parts, so that they could be
kept without throwing away the body. Perhaps these issues could be
resolved with a bit more support from the standard though, allowing for
leaner (preferably non-existent) headers or something, as current
"COMDAT DWARF" tends to be very bloated (there needs to be one header
per function, which is impractical, especially with line tables).

> That approach *might* still be faster than post-processing a unified
> section, which IIUC is what D59553 does.
>
> >
> > So, this trick can't work without refining the --gc-sections rules
> > further.
>
> If I understand the objection, yeah, we can't have .debug_* sections
> being gc roots.
> --paulr
>
> > >
> > >> If you’re looking only at zero-length functions, you can stop
> > >> there; but I’m not sure why stopping there solves much of a real
> > >> problem, as zero-length functions seem like a weird corner case.
> > >>
> > >
> > > They're the case that breaks existing usage by terminating the range
> > > list early - the other existing usage seems to be fine with "resolve
> > > to addend" strategy that lld and gold use - in that it moves most
> > > dead/deduplicated functions outside the executable range and so
> > > consumers never come asking for "what code is at instruction 5"
> > > because they're never executing code at a pc of 5. But, yes, this
> > > existing solution doesn't work once you have code mapped into low
> > > address spaces or have utterly massive functions that might have a
> > > length that would reach into the executable address space even when
> > > their start is remapped to zero.
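The overlap condition in that last paragraph can be sketched as a plain interval check. This is only an illustration in Python: the helper name and the addresses are made up, and it assumes the "resolve to addend" behaviour described above, where a dead function ends up described as [0, size).

```python
def dead_range_overlaps_code(func_size, text_start, text_end):
    """Return True if a dead function described as [0, func_size)
    intersects the live code range [text_start, text_end)."""
    begin, end = 0, func_size
    return begin < text_end and text_start < end


# Small dead function, code mapped well above zero: no confusion.
print(dead_range_overlaps_code(0x40, 0x401000, 0x402000))

# Code mapped at address zero: the dead range aliases real code.
print(dead_range_overlaps_code(0x40, 0x0, 0x1000))

# Very large dead function whose length reaches into the code range.
print(dead_range_overlaps_code(0x500000, 0x401000, 0x402000))
```

Only the first case is safe for a consumer asking "what code is at this pc"; the other two are the situations where resolving to the addend stops working.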
> >
> > For posterity, David gave me an example offline:
> >
> >   void f1() { }
> >   void f2() { }
> >   int main() { f1(); }
> >
> >   clang -fuse-ld=bfd -ffunction-sections -Wl,--gc-sections -g a.c -o a.bfd
> >   llvm-dwarfdump -debug-ranges a.bfd
> >   =>
> >   R_X86_64_64 relocations in .debug_ranges are resolved to 1,
> >   ignoring the addend
> >
> > (Behavior introduced in
> > https://sourceware.org/git/?p=binutils-gdb.git;a=blobdiff;f=bfd*ChangeLog;h=8fbaed21fa2c8238459acb637545583f3cfbbfdf;hp=18a3a67be3a5980998c4461b5a739e54f3551b17;hb=e4067dbb2a3368dbf908b39c5435c84d51abc9f3;hpb=c0621d88b096cc046adf6ed484baea9ba5bfe721)
> >
> > The comments below are also insightful. I need to ponder more (and
> > need to read the DWARF v4 and v5 specs more as I am not so familiar
> > with these DWARF constructs). But it is too late now. Will probably
> > comment another day :)
> >
> > >
> > >> Linkers know how to strip dead functions (gc) or deduplicate them
> > >> (icf, COMDAT) and people do this all the time, in some cases
> > >> (COMDAT) without explicitly asking for it, so non-zero-length
> > >> functions seem like the much more interesting case. In that
> > >> situation, -1 (or -2) seems like a much wiser choice of
> > >> blessed-as-not-real address, versus 0x0 or 0x1.
> > >>
> > >> Stripping non-zero-length functions does mean you have to care
> > >> about more sections. For example .debug_loc would want to be fixed
> > >> up the same way as .debug_ranges, not because a debugger would care
> > >> but so that dumpers would not run into the 0/0 brick wall.
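For reference, a small sketch of how -1 and -2 look as unsigned addresses and how a DWARF v4 range-list reader would treat them. The helper names are hypothetical. The classification follows DWARF v4 section 2.17.3, where a first value equal to the largest representable address marks a base address selection entry and a 0/0 pair ends the list.

```python
def as_address(value, address_size=8):
    """Encode a signed marker such as -1 or -2 as an unsigned address."""
    return value & ((1 << (8 * address_size)) - 1)


def classify_entry(begin, end, address_size=8):
    """Classify one (begin, end) pair of a DWARF v4 .debug_ranges list."""
    max_addr = as_address(-1, address_size)
    if begin == 0 and end == 0:
        return "end of list"
    if begin == max_addr:
        return "base address selection"
    if begin == end:
        return "empty range"
    return "range"


print(hex(as_address(-1)), hex(as_address(-2)))
for marker in (0, 1, -1, -2):
    addr = as_address(marker)
    print(marker, classify_entry(addr, addr))
```

In this model a 0/0 pair ends the list and a -1/-1 pair is read as a base address selection entry, while 1/1 and -2/-2 are simply empty ranges. That matches the reasoning given earlier in the thread for using -2 rather than -1 in .debug_ranges.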
> > >>
> > >
> > > Yep - in theory a consumer could actually use a loclist across
> > > multiple sections (if a global variable got hoisted into a register
> > > for a function for instance), but I don't know of any producers doing
> > > this today - until then, yeah, it's just a dumping problem and ld.bfd
> > > does produce DWARF that has that problem (because it resolves both
> > > relocations to dead code (begin/end of a range) to zero in all
> > > sections except debug_ranges, so terminates the loclist early) -
> > > binutils objdump avoids dumping the following corrupted fragment by
> > > only dumping hunks of debug_loc starting at places referenced from
> > > debug_info. Without debug_info it won't dump anything from debug_loc -
> > > and if the references from debug_info, parsed until the 0,0
> > > terminator, don't cover the whole debug_loc section, it prints
> > > messages saying there are "gaps".
> > >
> > > Agreed that you'd want debug_loc to have the same special handling as
> > > debug_ranges if it has special handling. Though ideally we'd pick a
> > > value that works equally everywhere? (-2, by the sounds of it)
> > >
> > >
> > >> We also fix up lengths in .debug_aranges to zero, although there
> > >> might be history behind that tactic that I’m not aware of; it seems
> > >> like it ought to be unnecessary, if consumers are aware of the
> > >> special address(es).
> > >>
> > >
> > > Yeah, no idea about debug_aranges... I'd have thought it'd be fine
> > > with the same approach as debug_ranges, but I haven't looked at
> > > debug_aranges in a long time.
> > >
> > > I guess the only remaining question is: Since it's possible to have
> > > code on some systems down at address zero, or close enough to it that
> > > [0, length) might overlap with real executable code addresses - does
> > > anyone know of the inverse: where code is mapped up near uint32 max?
> > > Such that that usage wouldn't be able to sacrifice uint32 max - 1 to
> > > use as a blessed value here?
> > >
> > > - Dave
> > >
> > >
> > >> --paulr
> > >>
> > >> *From:* Alexey Lapshin <alapshin at accesssoftek.com>
> > >> *Sent:* Thursday, May 28, 2020 9:03 AM
> > >> *To:* Sriraman Tallam <tmsriram at google.com>; Wei Mi <wmi at google.com>;
> > >> Robinson, Paul <paul.robinson at sony.com>; Adrian Prantl <aprantl at apple.com>;
> > >> Jonas Devlieghere <jdevlieghere at apple.com>; Alexey Lapshin <a.v.lapshin at mail.ru>;
> > >> Eric Christopher <echristo at gmail.com>; Fangrui Song <maskray at google.com>;
> > >> David Blaikie <dblaikie at gmail.com>; llvm-dev at lists.llvm.org
> > >> *Subject:* Re: [llvm-dev] Range lists, zero-length functions, linker gc
> > >>
> > >> Hi David,
> > >>
> > >> > So there have been several recent discussions about the issues
> > >> > around DWARF-agnostic linking and gc-sections, linkonce function
> > >> > definitions being dropped, etc - and just how much DWARF-awareness
> > >> > would be suitable in a linker to help with this situation.
> > >> >
> > >> > I'd like to discuss a narrower instance of this issue: Zero length
> > >> > gc'd/deduplicated functions.
> > >> >
> > >> > LLVM seems to at least produce zero length functions in a few
> > >> > cases:
> > >> > * non-void function without a return statement
> > >> > * function definition containing only llvm_unreachable
> > >> > (both of these trap at -O0, but at higher optimization levels even
> > >> > the trap instruction is removed & you get the full power UB of
> > >> > control flowing off the end of the function into whatever other
> > >> > bytes are after that function)
> > >> >
> > >> > So, for context, debug_ranges (this whole issue doesn't exist in
> > >> > DWARFv5, FWIW) is a list of address pairs, terminated by a pair of
> > >> > zeros.
> > >> > With function sections, or even just with normal C++ inline
> > >> > functions, the CU will have a range entry for that function that
> > >> > consists of two relocations - to the start and end of the
> > >> > function. Generally the start of the function is the start of the
> > >> > section, and the end is "start of function + length of function
> > >> > (aka addend)".
> > >> >
> > >> > Usually any relocation to the section would keep that section
> > >> > "alive" during linking - but that would cause debug info to defeat
> > >> > linker GC and deduplication. So there's special rules for how
> > >> > linkers handle these relocations in debug info to allow the
> > >> > sections to be dropped - what do you write in the bytes that
> > >> > requested the relocation?
> > >> >
> > >> > Binutils ld: Special cases only debug_ranges, resolving all
> > >> > relocations to dead code to 1. In other debug sections, these
> > >> > values are all resolved to zero.
> > >> >
> > >> > Gold and lld: Special cases all debug info sections - resolving
> > >> > all relocations to "addend" (so begin usually goes to zero, end
> > >> > goes to "size of function")
> > >> >
> > >> > These special rules are designed to ensure
> > >> > omitted/gc'd/deduplicated functions don't cause the range list to
> > >> > terminate prematurely (which would happen if begin/end were both
> > >> > resolved to zero).
> > >> >
> > >> > But with an empty function, gold and lld's strategy here fails to
> > >> > avoid terminating a range list by accident.
> > >> >
> > >> > What should we do about it?
> > >> >
> > >> > 1) Ensure no zero-length functions exist?
> > >> >    (doesn't address backwards compatibility/existing
> > >> >    functions/other compilers)
> > >> > 2) adopt the binutils approach to this (at least in debug_ranges -
> > >> >    maybe in all debug sections? (doing it in other sections could
> > >> >    break )
> > >> > 3) Revisit the discussion about using an even more 'blessed' value,
> > >> >    like int max-1? (
> > >> >    https://urldefense.com/v3/__https://reviews.llvm.org/D59553__;!!JmoZiZGBv3RvKRSx!sRjL4Vdx9oC8TPFhKZ-QbL7LtpIL-1Zdb4OydT2xVhpDTRyUixtaozLYiewZqMLtoA$
> > >> >    )
> > >> >
> > >> > (I don't have links to all the recent threads about this
> > >> > discussion - I think D59553 might've spawned a separate broader
> > >> > discussion/non-review - oh, Alexey wrote a good summary with links
> > >> > to other discussions here:
> > >> > https://urldefense.com/v3/__http://lists.llvm.org/pipermail/llvm-dev/2019-September/135068.html__;!!JmoZiZGBv3RvKRSx!sRjL4Vdx9oC8TPFhKZ-QbL7LtpIL-1Zdb4OydT2xVhpDTRyUixtaozLYiey_aMV0lQ$
> > >> > )
> > >> >
> > >> > Thoughts?
> > >>
> > >> I think for the problem of "zero length functions and .debug_ranges"
> > >> binutils approach looks good:
> > >>
> > >> > Special cases only debug_ranges, resolving all relocations to
> > >> > dead code to 1. In other debug sections, these values are all
> > >> > resolved to zero.
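To make the failure mode concrete, here is a minimal sketch in Python of a DWARF v4 style range list being written and read back. It is a simplified model, not any linker's actual code: `resolve_dead`, `link_ranges` and `read_ranges` are hypothetical names, base address selection entries are ignored, and the only strategies modelled are the two described above (gold/lld resolve to the addend, binutils ld resolves to 1 in debug_ranges).

```python
def resolve_dead(strategy, addend):
    """Value written for a relocation that targets a discarded section."""
    if strategy == "addend":   # gold/lld behaviour as described in the thread
        return addend
    if strategy == "one":      # binutils ld behaviour in debug_ranges
        return 1
    raise ValueError(f"unknown strategy: {strategy}")


def link_ranges(functions, strategy):
    """functions: list of (address, size, live) tuples.
    Returns the (begin, end) pairs of the CU range list, including the
    final (0, 0) terminator."""
    pairs = []
    for address, size, live in functions:
        if live:
            pairs.append((address, address + size))
        else:
            # The begin relocation has addend 0, the end relocation has
            # addend == size of the function.
            pairs.append((resolve_dead(strategy, 0),
                          resolve_dead(strategy, size)))
    pairs.append((0, 0))
    return pairs


def read_ranges(pairs):
    """Consumer side: stop at the first (0, 0) pair, skip empty ranges."""
    ranges = []
    for begin, end in pairs:
        if begin == 0 and end == 0:
            break
        if begin != end:
            ranges.append((begin, end))
    return ranges


funcs = [
    (0x1000, 0x20, True),   # live function
    (0, 0, False),          # dead, zero-length function
    (0x2000, 0x10, True),   # live function listed after the dead one
]
for strategy in ("addend", "one"):
    found = read_ranges(link_ranges(funcs, strategy))
    print(strategy, [(hex(b), hex(e)) for b, e in found])
```

With "addend" the dead zero-length function becomes a 0/0 pair, so the reader stops early and never sees the function at 0x2000. With "one" it becomes an empty 1/1 range and both live functions are reported.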
> > >>
> > >> But, this would not completely solve the problem from
> > >> https://urldefense.com/v3/__https://reviews.llvm.org/D59553__;!!JmoZiZGBv3RvKRSx!sRjL4Vdx9oC8TPFhKZ-QbL7LtpIL-1Zdb4OydT2xVhpDTRyUixtaozLYiewZqMLtoA$
> > >> - Overlapped address ranges. Binutils approach will solve the
> > >> problem if the address range is specified as
> > >> start_address:end_address. While resolving relocations, it would
> > >> replace such a range with 1:1.
> > >> However, it would not work if address ranges were specified as
> > >> start_address:length since the length is not relocated. This case
> > >> could be additionally fixed by a fast scan of debug_info for High_PC
> > >> defined as length and changing it to 1. Something which you
> > >> suggested here:
> > >> https://urldefense.com/v3/__http://lists.llvm.org/pipermail/llvm-dev/2020-May/141599.html__;!!JmoZiZGBv3RvKRSx!sRjL4Vdx9oC8TPFhKZ-QbL7LtpIL-1Zdb4OydT2xVhpDTRyUixtaozLYiexb8NU_Fw$
> > >> .
> > >>
> > >> So it looks like the following solution could fix both problems and
> > >> be relatively fast:
> > >>
> > >> "Resolve all relocations from debug sections into dead code to 1.
> > >> Parse debug sections and replace HighPc of an address range pointing
> > >> to dead code and specified as length to 1".
> > >>
> > >> As a result all address ranges pointing into dead code would be
> > >> marked as zero length.
> > >>
> > >> There still exists another problem:
> > >>
> > >> DWARF4: "A range list entry (but not a base address selection or end
> > >> of list entry) whose beginning and ending addresses are equal has no
> > >> effect because the size of the range covered by such an entry is
> > >> zero."
> > >>
> > >> DWARF5: "A bounded range entry whose beginning and ending address
> > >> offsets are equal (including zero) indicates an empty range and may
> > >> be ignored."
> > >>
> > >> These rules allow us to ignore zero-length address ranges. I.e., some
> > >> tool reading DWARF is permitted to ignore related DWARF entries. In
> > >> that case, essential descriptions could be ignored. That problem
> > >> could happen with -flto=thin example
> > >> https://urldefense.com/v3/__https://reviews.llvm.org/D54747*1503720__;Iw!!JmoZiZGBv3RvKRSx!sRjL4Vdx9oC8TPFhKZ-QbL7LtpIL-1Zdb4OydT2xVhpDTRyUixtaozLYiezSujGHwQ$
> > >> . In this example, all type definitions except one were replaced
> > >> with declarations by thinlto. The definition, which was left, is in
> > >> a piece of debug info related to deleted code. According to the
> > >> zero-length rule, that definition could be ignored, and finally,
> > >> incomplete debug info could be used.
> > >>
> > >> So, it probably should be forbidden to generate debug_info which
> > >> could become incomplete after removing pieces related to zero length
> > >> address ranges. Otherwise, creating zero-length address ranges could
> > >> lead to incomplete debug info.
> > >>
> > >> Thank you, Alexey.