thr3ads.net - llvm dev - [llvm-dev] Range lists, zero-length functions, linker gc [May 2020]

If this information is useful, please help other people find it:
Share via:

David Blaikie via llvm-dev

2020-May-28 22:55 UTC

[llvm-dev] Range lists, zero-length functions, linker gc

On Thu, May 28, 2020 at 6:03 AM Alexey Lapshin <alapshin at
accesssoftek.com>
wrote:
> Hi David,
>
>
> >So there have been several recent discussions about the issues around
>
> >DWARF-agnostic linking and gc-sections, linkonce function definitions
> being
>
> >dropped, etc - and just how much DWARF-awareness would be suitable
>
> >in a linker to help with this situation.
>
> > I'd like to discuss a narrower instance of this issue: Zero length
> gc'd/deduplicated functions.
>
> > LLVM seems to at least produce zero length functions in a few cases:
> > * non-void function without a return statement
> > * function definition containing only llvm_unreachable
> > (both of these trap at -O0, but at higher optimization levels even the
> trap
> > instruction is removed & you get the full power UB of control
> flowing off
> > the end of the function into whatever other bytes are after that
> function)
>
> > So, for context, debug_ranges (this whole issue doesn't exist in
> DWARFv5,
> > FWIW) is a list of address pairs, terminated by a pair of zeros.
> > With function sections, or even just with normal C++ inline functions,
> > the CU will have a range entry for that function that consists of two
> relocations
> > - to the start and end of the function. Generally the start of the
> function is the
> > start of the section, and the end is "start of function + length
of
> function (aka addend)".
>
> >  Usually any relocation to the section would keep that section
"alive"
> during linking -
> > but that would cause debug info to defeat linker GC and deduplication.
> So there's
> > special rules for how linkers handle these relocations in debug info
to
> allow the
> > sections to be dropped - what do you write in the bytes that requested
> the relocation?
>
> > Binutils ld: Special cases only debug_ranges, resolving all
relocations
> to dead
> > code to 1. In other debug sections, these values are all resolved to
> zero.
> > Gold and lld: Special cases all debug info sections - resolving all
> relocations
> > to "addend" (so begin usually goes to zero, end goes to
"size of
> function")
>
> > These special rules are designed to ensure
omitted/gc'd/deduplicated
> functions
> > don't cause the range list to terminate prematurely (which would
happen
> if begin/end
> > were both resolved to zero).
>
> >But with an empty function, gold and lld's strategy here fails to
avoid
> terminating a
> >range list by accident.
>
> > What should we do about it?
>
> >  1) Ensure no zero-length functions exist? (doesn't address
backwards
> > compatibility/existing functions/other compilers)
> > 2) adopt the binutils approach to this (at least in debug_ranges -
maybe
> in all
> > debug sections? (doing it in other sections could break )
> >  3) Revisit the discussion about using an even more 'blessed'
value,
> > like int max-1? ( https://reviews.llvm.org/D59553 )
>
> >  (I don't have links to all the recent threads about this
discussion - I
> think D59553
> > might've spawned a separate broader discussion/non-review - oh,
Alexey
> wrote a
> > good summary with links to other discussions here:
> >  http://lists.llvm.org/pipermail/llvm-dev/2019-September/135068.html )
>
> > Thoughts?
>
> I think for the problem of "zero length functions and
.debug_ranges"
> binutils approach looks good:
>
> >Special cases only debug_ranges, resolving all relocations to
> >dead code to 1. In other debug sections, these values are all resolved
to
> >zero.
>
> But, this would not completely solve the problem from
> https://reviews.llvm.org/D59553 - Overlapped address ranges. Binutils
> approach will solve the problem if the address range specified as
> start_address:end_address. While resolving relocations, it would replace
> such a range with 1:1.
> However, It would not work if address ranges were specified as
> start_address:length since the length is not relocated.
>This case could be additionally fixed by fast scan debug_info for
High_PC> defined as length and changing it to 1. Something which you suggested here:
> http://lists.llvm.org/pipermail/llvm-dev/2020-May/141599.html.
>
Hmm, I don't /think/ I intended to suggest anything that would have to
parse all the debug_info, even if just to fixup high_pc. I meant that
debug_rnglist for the CU at least (rnglist has fewer problems - you can't
accidentally terminate it early, but still has the "large functions in
programs that use relatively low code addresses can't just be resolved to
"addend" because then [0, length) of the large function might overlap
into
that code address range") could be modified by a DWARF-aware linker to
remove the unused chunks. The DWARF that describes a specific function
using low_pc/high_pc - it may be split into a .dwo file and unreachable by
the linker - so it /needs/ a magic value for the address referenced by the
low_pc to indicate that it is invalid.

Which all comes back to "we probably need to pick a value that's
explicitly
invalid" and -2 (max - 1) seems to be about the right thing.

>
> So it looks like following solution could fix both problems and be
> relatively fast:
>
> "Resolve all relocations from debug sections into dead code to 1.
Parse
> debug sections and replace HighPc of an address range pointing to dead code
> and specified as length to 1".
>
That second part seems pretty expensive compared to anything else the
linker is doing with debug info. I'd try to avoid it if at all possible.

> As the result all address ranges pointing into dead code would be marked
> as zero length.
>
> There still exist another problem:
>
> DWARF4: "A range list entry (but not a base address selection or end
of
> list entry) whose beginning and
> ending addresses are equal has no effect because the size of the range
> covered by such an
> entry is zero."
>
> DWARF5: "A bounded range entry whose beginning and ending address
offsets
> are equal
> (including zero) indicates an empty range and may be ignored."
>
> These rules allow us to ignore zero-length address ranges. I.e., some tool
> reading DWARF is permitted to ignore related DWARF entries.
>
I agree it allows consumers to ignore that entry in the range list because
that entry is zero-length/equivalent to not being present at all - I don't
think that means consumers can ignore the DIE that refers to this range
list. I think it's valid DWARF to have a CU that only describes types,
without any code attached to it at all. Or for a subprogram that's been
eliminated to still be used by a consumer for name lookup purposes - so the
consumer can tell the user there is a function called "f1" and tell
the
user what parameter types, return type it has, etc - not ignore it entirely.

> In that case, there could be ignored essential descriptions. That problem
> could happen with -flto=thin example
> https://reviews.llvm.org/D54747#1503720 . In this example, all type
> definitions except one were replaced with declarations by thinlto. The
> definition, which was left, is in a piece of debug info related to deleted
> code. According to zero-length rule, that definition could be ignored, and
> finally, incomplete debug info could be used.
>
Yeah, I think the bug there is the linker dropping object files just
because they have no exxecutable code in them - I think the patch that did
that was reverted, if I'm remembering correctly.

>
> So, it probably should be forbidden to generate debug_info, which could
> become incomplete after removing pieces related to zero length address
> ranges. Otherwise, creating zero-length address ranges could lead to
> incomplete debug info.
>
> Thank you, Alexey.
>
>
>
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20200528/d6306619/attachment-0001.html>

Eric Christopher via llvm-dev

2020-May-29 00:39 UTC

head link

[llvm-dev] Range lists, zero-length functions, linker gc

(FWIW I'm following this, but also haven't disagreed with anything Dave
has
said yet :)

On Thu, May 28, 2020 at 3:55 PM David Blaikie <dblaikie at gmail.com>
wrote:
> On Thu, May 28, 2020 at 6:03 AM Alexey Lapshin <alapshin at
accesssoftek.com>
> wrote:
>
>> Hi David,
>>
>>
>> >So there have been several recent discussions about the issues
around
>>
>> >DWARF-agnostic linking and gc-sections, linkonce function
definitions
>> being
>>
>> >dropped, etc - and just how much DWARF-awareness would be suitable
>>
>> >in a linker to help with this situation.
>>
>> > I'd like to discuss a narrower instance of this issue: Zero
length
>> gc'd/deduplicated functions.
>>
>> > LLVM seems to at least produce zero length functions in a few
cases:
>> > * non-void function without a return statement
>> > * function definition containing only llvm_unreachable
>> > (both of these trap at -O0, but at higher optimization levels even
the
>> trap
>> > instruction is removed & you get the full power UB of control
>> flowing off
>> > the end of the function into whatever other bytes are after that
>> function)
>>
>> > So, for context, debug_ranges (this whole issue doesn't exist
in
>> DWARFv5,
>> > FWIW) is a list of address pairs, terminated by a pair of zeros.
>> > With function sections, or even just with normal C++ inline
functions,
>> > the CU will have a range entry for that function that consists of
two
>> relocations
>> > - to the start and end of the function. Generally the start of the
>> function is the
>> > start of the section, and the end is "start of function +
length of
>> function (aka addend)".
>>
>> >  Usually any relocation to the section would keep that section
"alive"
>> during linking -
>> > but that would cause debug info to defeat linker GC and
deduplication.
>> So there's
>> > special rules for how linkers handle these relocations in debug
info to
>> allow the
>> > sections to be dropped - what do you write in the bytes that
requested
>> the relocation?
>>
>> > Binutils ld: Special cases only debug_ranges, resolving all
relocations
>> to dead
>> > code to 1. In other debug sections, these values are all resolved
to
>> zero.
>> > Gold and lld: Special cases all debug info sections - resolving
all
>> relocations
>> > to "addend" (so begin usually goes to zero, end goes to
"size of
>> function")
>>
>> > These special rules are designed to ensure
omitted/gc'd/deduplicated
>> functions
>> > don't cause the range list to terminate prematurely (which
would happen
>> if begin/end
>> > were both resolved to zero).
>>
>> >But with an empty function, gold and lld's strategy here fails
to avoid
>> terminating a
>> >range list by accident.
>>
>> > What should we do about it?
>>
>> >  1) Ensure no zero-length functions exist? (doesn't address
backwards
>> > compatibility/existing functions/other compilers)
>> > 2) adopt the binutils approach to this (at least in debug_ranges -
>> maybe in all
>> > debug sections? (doing it in other sections could break )
>> >  3) Revisit the discussion about using an even more
'blessed' value,
>> > like int max-1? ( https://reviews.llvm.org/D59553 )
>>
>> >  (I don't have links to all the recent threads about this
discussion -
>> I think D59553
>> > might've spawned a separate broader discussion/non-review -
oh, Alexey
>> wrote a
>> > good summary with links to other discussions here:
>> > 
http://lists.llvm.org/pipermail/llvm-dev/2019-September/135068.html )
>>
>> > Thoughts?
>>
>> I think for the problem of "zero length functions and
.debug_ranges"
>> binutils approach looks good:
>>
>> >Special cases only debug_ranges, resolving all relocations to
>> >dead code to 1. In other debug sections, these values are all
resolved to
>> >zero.
>>
>> But, this would not completely solve the problem from
>> https://reviews.llvm.org/D59553 - Overlapped address ranges. Binutils
>> approach will solve the problem if the address range specified as
>> start_address:end_address. While resolving relocations, it would
replace
>> such a range with 1:1.
>> However, It would not work if address ranges were specified as
>> start_address:length since the length is not relocated.
>>
> This case could be additionally fixed by fast scan debug_info for High_PC
>> defined as length and changing it to 1. Something which you suggested
here:
>> http://lists.llvm.org/pipermail/llvm-dev/2020-May/141599.html.
>>
>
> Hmm, I don't /think/ I intended to suggest anything that would have to
> parse all the debug_info, even if just to fixup high_pc. I meant that
> debug_rnglist for the CU at least (rnglist has fewer problems - you
can't
> accidentally terminate it early, but still has the "large functions in
> programs that use relatively low code addresses can't just be resolved
to
> "addend" because then [0, length) of the large function might
overlap into
> that code address range") could be modified by a DWARF-aware linker to
> remove the unused chunks. The DWARF that describes a specific function
> using low_pc/high_pc - it may be split into a .dwo file and unreachable by
> the linker - so it /needs/ a magic value for the address referenced by the
> low_pc to indicate that it is invalid.
>
> Which all comes back to "we probably need to pick a value that's
> explicitly invalid" and -2 (max - 1) seems to be about the right
thing.
>
>
>>
>> So it looks like following solution could fix both problems and be
>> relatively fast:
>>
>> "Resolve all relocations from debug sections into dead code to 1.
Parse
>> debug sections and replace HighPc of an address range pointing to dead
code
>> and specified as length to 1".
>>
>
> That second part seems pretty expensive compared to anything else the
> linker is doing with debug info. I'd try to avoid it if at all
possible.
>
>
>> As the result all address ranges pointing into dead code would be
marked
>> as zero length.
>>
>> There still exist another problem:
>>
>> DWARF4: "A range list entry (but not a base address selection or
end of
>> list entry) whose beginning and
>> ending addresses are equal has no effect because the size of the range
>> covered by such an
>> entry is zero."
>>
>> DWARF5: "A bounded range entry whose beginning and ending address
offsets
>> are equal
>> (including zero) indicates an empty range and may be ignored."
>>
>> These rules allow us to ignore zero-length address ranges. I.e., some
>> tool reading DWARF is permitted to ignore related DWARF entries.
>>
>
> I agree it allows consumers to ignore that entry in the range list because
> that entry is zero-length/equivalent to not being present at all - I
don't
> think that means consumers can ignore the DIE that refers to this range
> list. I think it's valid DWARF to have a CU that only describes types,
> without any code attached to it at all. Or for a subprogram that's been
> eliminated to still be used by a consumer for name lookup purposes - so the
> consumer can tell the user there is a function called "f1" and
tell the
> user what parameter types, return type it has, etc - not ignore it
entirely.
>
>
>> In that case, there could be ignored essential descriptions. That
problem
>> could happen with -flto=thin example
>> https://reviews.llvm.org/D54747#1503720 . In this example, all type
>> definitions except one were replaced with declarations by thinlto. The
>> definition, which was left, is in a piece of debug info related to
deleted
>> code. According to zero-length rule, that definition could be ignored,
and
>> finally, incomplete debug info could be used.
>>
>
> Yeah, I think the bug there is the linker dropping object files just
> because they have no exxecutable code in them - I think the patch that did
> that was reverted, if I'm remembering correctly.
>
>
>>
>> So, it probably should be forbidden to generate debug_info, which could
>> become incomplete after removing pieces related to zero length address
>> ranges. Otherwise, creating zero-length address ranges could lead to
>> incomplete debug info.
>>
>> Thank you, Alexey.
>>
>>
>>
>>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20200528/6e94e509/attachment.html>

Alexey Lapshin via llvm-dev

2020-May-29 13:31 UTC

head link

[llvm-dev] Range lists, zero-length functions, linker gc

But, this would not completely solve the problem from
https://reviews.llvm.org/D59553 - Overlapped address ranges. Binutils approach
will solve the problem if the address range specified as
start_address:end_address. While resolving relocations, it would replace such a
range with 1:1.
>> However, It would not work if address ranges were specified as
start_address:length since the
>> length is not relocated.
>>This case could be additionally fixed by fast scan debug_info for
High_PC defined as length
>>and changing it to 1. Something which you suggested
>>here: http://lists.llvm.org/pipermail/llvm-dev/2020-May/141599.html.
> Hmm, I don't /think/ I intended to suggest anything that would have to
parse all the debug_info,
> even if just to fixup high_pc. I meant that debug_rnglist for the CU at
least (rnglist has fewer
> problems - you can't accidentally terminate it early, but still has the
"large functions in programs
> that use relatively low code addresses can't just be resolved to
"addend" because then [0, length)
> of the large function might overlap into that code address range")
could be modified by a
> DWARF-aware linker to remove the unused chunks.
right. you did not.
that is my suggestion to extend that idea - not only fix debug_rnglist
but all other occurrences of HighPC.
>The DWARF that describes a specific function
> using low_pc/high_pc - it may be split into a .dwo file and unreachable by
the linker - so it /needs/ a
> magic value for the address referenced by the low_pc to indicate that it is
invalid.
for the split-dwarf: solution which updates HighPC should patch .dwo files also.
> Which all comes back to "we probably need to pick a value that's
explicitly invalid" and -2 (max - 1)
> seems to be about the right thing.
>>So it looks like following solution could fix both problems and be
relatively fast:
>>"Resolve all relocations from debug sections into dead code to 1.
Parse debug sections
>> and replace HighPc of an address range pointing to dead code and
specified as length to 1".
> That second part seems pretty expensive compared to anything else the
linker is doing with
> debug info. I'd try to avoid it if at all possible.
Agreed with that. Though there are some concerns about -2 which could be
essential or not:

I do not know real problems caused by using UINT64_MAX-1 for address ranges
pointing to deleted code. Moreover, while testing
https://reviews.llvm.org/D59553
I noticed that the tools become work better: lldb, llvm-symbolizer, gnu
addr2line,
gnu objdump. They report code location correctly with the patch and incorrectly
without the patch.

But there is a corner case: address range is specified as start_address:length.
After replacing start_address with -2, LowPC becomes higher than
HighPC.>From the point of DWARF standard - this is "undefined behavior".
The standardsays nothing about that case. Different tools could interpret it differently.
Some tools could assume that such a situation is not possible and crash if it
occurs.
Some could ignore it. Others could report an error and stop working.
f.e. llvm-dwarfdump --verify reports error and continue to work.

llvm-dwarfdump --verify :
error: Invalid address range [0xfffffffffffffffe, 0x0000000000000004)

So after implementing this, some tools could potentially stop working.
I do not know, such tools. So, I am not sure whether that is the problem.

Additionally, It is necessary to document that behavior in DWARF standard to
avoid problems
in the future(same as for zero length address ranges):

"A bounded range entry whose beginning address offset greater than ending
address offset
indicates an invalid range and may be ignored. "

Note, that this does not specify an additional magic value(UINT64_MAX-1).
Instead, it describes general situation(LowPC>HighPC).

If backward compatibility is not a problem - then using LowPC>HighPC to
indicate invalid
address range pointing to deleted code seems to be the fastest solution(which
could be
implemented by resolving relocations from debug sections to deleted code to
UINT64_MAX-1).

If backward compatibility is a problem - then we could use already standardized
"zero-length address range" to mark address ranges pointing to deleted
code.
That solution would require to patch address range length in the dwarf.

Thank you, Alexey
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20200529/60230385/attachment.html>

Alexey Lapshin via llvm-dev

2020-May-29 21:32 UTC

head link

[llvm-dev] Range lists, zero-length functions, linker gc

>>There still exist another problem:
>> DWARF4: "A range list entry (but not a base address selection or
end of list entry) whose beginning and
>>ending addresses are equal has no effect because the size of the range
covered by such
>>an entry is zero."
>>DWARF5: "A bounded range entry whose beginning and ending address
offsets are equal
>>(including zero) indicates an empty range and may be ignored."
>>These rules allow us to ignore zero-length address ranges. I.e., some
tool reading DWARF is permitted to ignore related DWARF entries.
>I agree it allows consumers to ignore that entry in the range list because
that entry is zero-length/equivalent to not being present at all - I don't
think that >means consumers can ignore the DIE that refers to this range
list. I think it's valid DWARF to have a CU that only describes types,
without any code attached >to it at all. Or for a subprogram that's been
eliminated to still be used by a consumer for name lookup purposes - so the
consumer can tell the user there is a >function called "f1" and
tell the user what parameter types, return type it has, etc - not ignore it
entirely.
Probably it relies on interpretation. And then it would be good to clarify that
question in the DWARF standard.
I think there is a difference when CU does not relate to any address. And when
it relates to invalid
address(deleted code). Probably, these two situations should be handled
differently:

1. CU that only describes types without any code attached to it should not be
ignored by the tools.
2. CU that relates to the deleted code could be removed/ignored by the tools.
>> In that case, there could be ignored essential descriptions. That
problem could happen with -flto=thin
>> example https://reviews.llvm.org/D54747#1503720 . In this example, all
type definitions except one were
>> replaced with declarations by thinlto. The definition, which was left,
is in a piece of debug info related to
>> deleted code. According to zero-length rule, that definition could be
ignored, and finally, incomplete debug
>> info could be used.
> Yeah, I think the bug there is the linker dropping object files just
because they have no exxecutable
> code in them - I think the patch that did that was reverted, if I'm
remembering correctly.
Right. The patch was reverted. But that problem is actual for any tool which
tries to remove debug
info related to garbage collected code. For example, that problem exists for
dsymutil:

$ cat a.cpp
int f();
int main() {
  return f();
}

$ cat b.cpp
struct Foo {
  int x, y;
};
int f() {
  volatile Foo var;
  var.x = 13;
  var.y = 42;
  return var.x + var.y;
}

$ clang++ a.cpp b.cpp -O -g -flto=thin -Wl,-dead_strip
$ dsymutil a.out
$ llvm-dwarfdump -a a.out.dSYM/Contents/Resources/DWARF/a.out | grep Foo
   DW_AT_type (0x00000000000000b1 "volatile Foo")
   DW_AT_type (0x00000000000000b6 "Foo")
   DW_AT_name ("Foo")  
<<<<<<<<<<<<<<<<<< that is
a declaration(definition is removed)
0x000000af: "Foo"?

i.e. Probably we need to clarify that question in the standard: whether it is
allowed to
remove/ignore DIEs related to deleted code.
So that tools(dsymutil/DWARF aware linker) correctly handle such situations.

If it would be necessary to analyze debug info related to deleted code
(whether it contains something used in other parts of debug info)
then the linking process will become even more slow.
It would be better to not allow to generate such closely-coupled debug info.

Thank You, Alexey.

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20200529/6713b018/attachment.html>

David Blaikie via llvm-dev

2020-May-29 21:42 UTC

head link

[llvm-dev] Range lists, zero-length functions, linker gc

On Fri, May 29, 2020 at 2:32 PM Alexey Lapshin
<alapshin at accesssoftek.com> wrote:>
> >>There still exist another problem:
>
> >> DWARF4: "A range list entry (but not a base address selection
or end of list entry) whose beginning and
> >>ending addresses are equal has no effect because the size of the
range covered by such
> >>an entry is zero."
>
> >>DWARF5: "A bounded range entry whose beginning and ending
address offsets are equal
> >>(including zero) indicates an empty range and may be ignored."
>
> >>These rules allow us to ignore zero-length address ranges. I.e.,
some tool reading DWARF is permitted to ignore related DWARF entries.
>
>
> >I agree it allows consumers to ignore that entry in the range list
because that entry is zero-length/equivalent to not being present at all - I
don't think that >means consumers can ignore the DIE that refers to this
range list. I think it's valid DWARF to have a CU that only describes types,
without any code attached >to it at all. Or for a subprogram that's been
eliminated to still be used by a consumer for name lookup purposes - so the
consumer can tell the user there is a >function called "f1" and
tell the user what parameter types, return type it has, etc - not ignore it
entirely.
>
> Probably it relies on interpretation. And then it would be good to clarify
that question in the DWARF standard.
> I think there is a difference when CU does not relate to any address. And
when it relates to invalid
> address(deleted code). Probably, these two situations should be handled
differently:
>
> 1. CU that only describes types without any code attached to it should not
be ignored by the tools.
> 2. CU that relates to the deleted code could be removed/ignored by the
tools.
I think it's probably best not to handle those two cases differently -
because there's no way to know if (2) has some types that might be
useful even if the code is eliminated, in the same way that (1) has
types that might be useful.

That said, I don't think the current state of affairs - of linkers
implementing the "usual unix linker-y semantics" dropping some object
files & thus dropping their associated debug info. One can probably
create cases where, even within that semantic (without the extra lld
feature that was especially problematic) some strange things happen
that might not be ideal - but I think I'd have to see those cases
before I'd worry too much about that situation.


> >> In that case, there could be ignored essential descriptions. That
problem could happen with -flto=thin
> >> example https://reviews.llvm.org/D54747#1503720 . In this example,
all type definitions except one were
> >> replaced with declarations by thinlto. The definition, which was
left, is in a piece of debug info related to
> >> deleted code. According to zero-length rule, that definition could
be ignored, and finally, incomplete debug
> >> info could be used.
>
>
> > Yeah, I think the bug there is the linker dropping object files just
because they have no exxecutable
> > code in them - I think the patch that did that was reverted, if
I'm remembering correctly.
>
> Right. The patch was reverted. But that problem is actual for any tool
which tries to remove debug
> info related to garbage collected code. For example, that problem exists
for dsymutil:
>
> $ cat a.cpp
> int f();
> int main() {
>   return f();
> }
>
> $ cat b.cpp
> struct Foo {
>   int x, y;
> };
> int f() {
>   volatile Foo var;
>   var.x = 13;
>   var.y = 42;
>   return var.x + var.y;
> }
>
> $ clang++ a.cpp b.cpp -O -g -flto=thin -Wl,-dead_strip
> $ dsymutil a.out
> $ llvm-dwarfdump -a a.out.dSYM/Contents/Resources/DWARF/a.out | grep Foo
>    DW_AT_type (0x00000000000000b1 "volatile Foo")
>    DW_AT_type (0x00000000000000b6 "Foo")
>    DW_AT_name ("Foo")  
<<<<<<<<<<<<<<<<<< that is
a declaration(definition is removed)
> 0x000000af: "Foo"
Honestly, ThinLTO is a weird beast when it comes to linker semantics
and debug info & could probably do with a bunch of tweaking, perhaps
especially in the overlap with the Apple/MachO linking semantics. I'd
say there's room for improvement there, but I don't know that that
improvement requires any word from the DWARF committee/changes to the
DWARF standard.
> i.e. Probably we need to clarify that question in the standard: whether it
is allowed to
> remove/ignore DIEs related to deleted code.
> So that tools(dsymutil/DWARF aware linker) correctly handle such
situations.
>
> If it would be necessary to analyze debug info related to deleted code
> (whether it contains something used in other parts of debug info)
> then the linking process will become even more slow.
Yep - I'd rather avoid that & probably address such issues by changing
how ThinLTO works, to ensure it doesn't drop type information (there
are other solutions here other than changing linker semantics -
ThinLTO could be changed to produce different object files to begin
with - importing/exporting types to different files to ensure they are
preserved into the final linked binary)
> It would be better to not allow to generate such closely-coupled debug
info.
That would come at the cost of increasing duplication, potentially,
which I'd like to avoid.

- Dave

llvm dev - May 2020 - Range lists, zero-length functions, linker gc

[llvm-dev] Range lists, zero-length functions, linker gc

[llvm-dev] Range lists, zero-length functions, linker gc

[llvm-dev] Range lists, zero-length functions, linker gc

[llvm-dev] Range lists, zero-length functions, linker gc

[llvm-dev] Range lists, zero-length functions, linker gc