thr3ads.net - llvm dev - [llvm-dev] DebugInfo: Purpose of call site tags [Jan 2020]

David Blaikie via llvm-dev

2020-Jan-14 22:21 UTC

[llvm-dev] DebugInfo: Purpose of call site tags

Hey folks,

I'm trying to wrap my head around the implementation, purpose, and costs
involved in both the GCC-extension v4 and standard v5 DW_TAG_call_site,
call site parameters, addresses, etc.

So picking up from some of the design discussion in
https://reviews.llvm.org/D72489:

Me (Blaikie): I'm not sure why AT_call_return_pc would be needed at a tail
call site as the debugger must ignore it. As for emitting DW_AT_low_pc
under gdb tuning, I think this might be an artifact from the original GNU
implementation.


Djordje: Yes, that is the GNU implementation's heritage (I cannot remember
why GCC generated the low_pc info in the case of the tail calls), but GNU
GDB needs the low_pc (as an address) in order to handle the call_site and
call_site_parameters debug info for non-tail calls. Avoiding the pc
address info in the case of tail calls makes sense to me, since debuggers
should avoid that info.


OK, so a few questions on that:
1) Why would low_pc not be required for tail calls?
2) Why is the v4 low_pc predicated on GDB tuning too? If we're producing
the call_site tag, what's the point of that without an address?
3) What features do these call_site tags enable (in the absence of
call_site_parameters)?
4) What's the end goal in terms of what calls should be described in the
DWARF? (describing literally every call sounds /super/ expensive) - they
currently seem quite different between GCC and Clang on a few test cases
I've tried, so it's hard to tell the logic

(& if I understand correctly, the call_site_parameters are intended to work
collaboratively between callees and callers, so if, say, a parameter value
is caller saved & then clobbered in the callee - you could still print the
value of that parameter by looking at the saved copy in the caller?)

Vedant Kumar via llvm-dev

2020-Jan-14 23:36 UTC

> On Jan 14, 2020, at 2:21 PM, David Blaikie <dblaikie at gmail.com> wrote:
>
> Hey folks,
>
> I'm trying to wrap my head around the implementation, purpose, and costs
> involved in both the GCC-extension v4 and standard v5 DW_TAG_call_site,
> call site parameters, addresses, etc.
>
> So picking up from some of the design discussion in
> https://reviews.llvm.org/D72489:
>
> Me (Blaikie): I'm not sure why AT_call_return_pc would be needed at a tail
> call site as the debugger must ignore it. As for emitting DW_AT_low_pc
> under gdb tuning, I think this might be an artifact from the original GNU
> implementation.
>
> Djordje: Yes, that is the GNU implementation's heritage (I cannot remember
> why GCC generated the low_pc info in the case of the tail calls), but GNU
> GDB needs the low_pc (as an address) in order to handle the call_site and
> call_site_parameters debug info for non-tail calls. Avoiding the pc
> address info in the case of tail calls makes sense to me, since debuggers
> should avoid that info.
>
> OK, so a few questions on that:
> 1) Why would low_pc not be required for tail calls?

I don’t think a meaningful return PC can be encoded at a tail call site. Control
doesn’t transfer to `PC+4` past the jump instruction when the callee returns
(the PC is set to whatever the last saved return address is instead).

My understanding is that the point of AT_call_return_pc is to allow the debugger
to present better backtraces, i.e. to implement a solver to figure out where to
insert artificial tail call frames in the backtrace.
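
To make the "solver" idea concrete, here is a minimal Python sketch. It is an illustration only, not how GDB or LLDB implement this: the CallSite record, the (function, pc) frame tuples, and all function names are hypothetical stand-ins for what a debugger would read from call site tags. A physical backtrace skips frames that were replaced by tail calls. If the call site at an outer frame's return PC names a callee that differs from the next inner frame, the sketch looks for exactly one chain of tail calls connecting the two and inserts those functions as artificial frames. If the chain is ambiguous, it inserts nothing.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CallSite:
    """Hypothetical summary of one call site tag."""
    caller: str
    callee: str
    return_pc: Optional[int]   # None for tail calls: no meaningful return PC
    is_tail_call: bool = False


def tail_edges(call_sites):
    """Map each function to the set of functions it tail-calls."""
    edges = {}
    for cs in call_sites:
        if cs.is_tail_call:
            edges.setdefault(cs.caller, set()).add(cs.callee)
    return edges


def unique_tail_path(edges, start, target):
    """Return the only tail-call path start -> ... -> target, else None."""
    found = []

    def walk(node, path):
        if node == target:
            found.append(path)
            return
        for nxt in sorted(edges.get(node, ())):
            if nxt not in path:          # do not follow cycles
                walk(nxt, path + [nxt])

    walk(start, [start])
    return found[0] if len(found) == 1 else None


def add_artificial_frames(frames, call_sites):
    """frames: physical backtrace, innermost first, as (function, pc) pairs.
    For an outer frame, pc is the return address into that function.
    Artificial frames are returned with pc set to None."""
    by_return_pc = {cs.return_pc: cs for cs in call_sites
                    if cs.return_pc is not None}
    edges = tail_edges(call_sites)
    result = []
    for i, (func, pc) in enumerate(frames):
        if i > 0:
            inner = frames[i - 1][0]
            site = by_return_pc.get(pc)
            if site is not None and site.callee != inner:
                path = unique_tail_path(edges, site.callee, inner)
                if path is not None:
                    # path ends at the inner frame, which is already present
                    for missing in reversed(path[:-1]):
                        result.append((missing, None))
        result.append((func, pc))
    return result


# main calls a (returning to 0x104); a tail-calls b; b tail-calls c.
sites = [
    CallSite("main", "a", 0x104),
    CallSite("a", "b", None, is_tail_call=True),
    CallSite("b", "c", None, is_tail_call=True),
]
physical = [("c", 0x300), ("main", 0x104)]
full = add_artificial_frames(physical, sites)
```

Note that only the non-tail call site carries a return PC here. The tail call sites contribute nothing but the caller-to-callee edge and the tail call flag, which is all the path search needs.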

> 2) Why is the v4 low_pc predicated on GDB tuning too? If we're producing
> the call_site tag, what's the point of that without an address?

I’m fuzzy on this but IIUC the low_pc attribute in a call site tag is the GNU
predecessor to AT_call_return_pc. And a tag without return PC information just
gives a hint to the debugger that the function contains a tail call.

> 3) What features do these call_site tags enable (in the absence of
> call_site_parameters)?

At the moment, just artificial tail call frames, but there are some interesting
potential future applications. E.g.: disambiguating backtraces in the presence
of function merging (a bigger deal for Swift than it is for Clang - the call
site tag for a thunk-call could record the “original”/unmerged/source-level
callee), and surfacing rich(er) information about CFI failures at call sites.

> 4) What's the end goal in terms of what calls should be described in the
> DWARF? (describing literally every call sounds /super/ expensive) - they
> currently seem quite different between GCC and Clang on a few test cases
> I've tried, so it's hard to tell the logic

The goal is to describe all calls that aren’t optimized out. At least, I’m not
sure that there’s a leaner subset that would really be sufficient for Apple’s
use cases, and the size overhead hasn’t caused issues internally. We could
certainly add a mode to clang to elide some of this call site info, though.

> (& if I understand correctly, the call_site_parameters are intended to work
> collaboratively between callees and callers, so if, say, a parameter value
> is caller saved & then clobbered in the callee - you could still print the
> value of that parameter by looking at the saved copy in the caller?)

Yep!
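
A minimal Python sketch of that caller/callee cooperation, under stated assumptions: Frame, CallSiteParameter, and entry_value are hypothetical names, the lambda stands in for a DW_AT_call_value expression, and the register names are only an x86-64 flavored example. The callee has overwritten the register its argument arrived in, so the debugger asks the caller's call site parameter entry how to recompute the value at the time of the call, and evaluates that in the caller's frame.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Frame:
    function: str
    registers: Dict[str, int]   # register values as unwound for this frame


@dataclass
class CallSiteParameter:
    location: str                        # where the argument is passed
    call_value: Callable[[Frame], int]   # stand-in for the value expression,
                                         # evaluated in the caller's frame


def entry_value(register: str, caller: Frame,
                params: List[CallSiteParameter]) -> Optional[int]:
    """Value `register` held on entry to the callee, if the caller's call
    site describes it; None when no matching parameter entry exists."""
    for param in params:
        if param.location == register:
            return param.call_value(caller)
    return None


# The caller still holds the argument in rbx; it passed a copy in rdi.
caller = Frame("main", {"rbx": 42})
# Inside the callee, rdi has since been clobbered with an unrelated value.
callee = Frame("f", {"rdi": 7})
params = [CallSiteParameter("rdi", lambda frame: frame.registers["rbx"])]

recovered = entry_value("rdi", caller, params)
```

The lookup only works if the caller's copy is still recoverable when the debugger stops, which is why the sketch reads a register the unwinder can restore for the caller's frame.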

vedant


Djordje Todorovic via llvm-dev

2020-Jan-15 08:46 UTC

On 15.1.20. 00:36, Vedant Kumar wrote:
>> On Jan 14, 2020, at 2:21 PM, David Blaikie <dblaikie at gmail.com> wrote:
>>
>> Hey folks,
>>
>> I'm trying to wrap my head around the implementation, purpose, and costs
>> involved in both the GCC-extension v4 and standard v5 DW_TAG_call_site,
>> call site parameters, addresses, etc.
>>
>> So picking up from some of the design discussion in
>> https://reviews.llvm.org/D72489:
>>
>> Me (Blaikie): I'm not sure why AT_call_return_pc would be needed at a tail
>> call site as the debugger must ignore it. As for emitting DW_AT_low_pc
>> under gdb tuning, I think this might be an artifact from the original GNU
>> implementation.
>>
>> Djordje: Yes, that is the GNU implementation's heritage (I cannot remember
>> why GCC generated the low_pc info in the case of the tail calls), but GNU
>> GDB needs the low_pc (as an address) in order to handle the call_site and
>> call_site_parameters debug info for non-tail calls. Avoiding the pc
>> address info in the case of tail calls makes sense to me, since debuggers
>> should avoid that info.
>>
>> OK, so a few questions on that:
>> 1) Why would low_pc not be required for tail calls?
>
> I don’t think a meaningful return PC can be encoded at a tail call site.
> Control doesn’t transfer to `PC+4` past the jump instruction when the callee
> returns (the PC is set to whatever the last saved return address is instead).
>
> My understanding is that the point of AT_call_return_pc is to allow the
> debugger to present better backtraces, i.e. to implement a solver to figure
> out where to insert artificial tail call frames in the backtrace.

+1, but GCC still generates the low_pc (the GNU extension for v4) even for
tail calls.

>> 2) Why is the v4 low_pc predicated on GDB tuning too? If we're producing
>> the call_site tag, what's the point of that without an address?
>
> I’m fuzzy on this but IIUC the low_pc attribute in a call site tag is the
> GNU predecessor to AT_call_return_pc. And a tag without return PC information
> just gives a hint to the debugger that the function contains a tail call.

Yes, there was no attribute for this at the time, so they picked low_pc as a
solution. In addition, if a call_site tag corresponds to a tail call, it should
have a flag (DW_AT_call_tail_call/DW_AT_GNU_tail_call) indicating that it is a
tail call.
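
As a rough summary of the naming discussed so far, here is a small Python sketch of one possible emission policy. The function call_site_attributes and its parameters are hypothetical, and the policy is only my reading of this thread (skip the PC for tail calls, gate the v4 low_pc on GDB tuning). It is not a description of what Clang or GCC actually emit; as noted above, GCC still generates the low_pc for tail calls.

```python
def call_site_attributes(dwarf_version, is_tail_call, tune_for_gdb):
    """Return (tag, attributes) for one call site under the assumed policy.

    Only the tag/attribute names come from DWARF v5 and the GNU extension;
    the decision logic is an illustrative assumption.
    """
    if dwarf_version >= 5:
        tag = "DW_TAG_call_site"
        tail_flag = "DW_AT_call_tail_call"
        return_pc_attr = "DW_AT_call_return_pc"
        emit_return_pc = True
    else:
        tag = "DW_TAG_GNU_call_site"
        tail_flag = "DW_AT_GNU_tail_call"
        return_pc_attr = "DW_AT_low_pc"      # GNU predecessor of the return PC
        emit_return_pc = tune_for_gdb        # assumed: only GDB consumes it

    attributes = []
    if is_tail_call:
        # A tail call has no meaningful return PC; just flag it.
        attributes.append(tail_flag)
    elif emit_return_pc:
        attributes.append(return_pc_attr)
    return tag, attributes


v5_normal = call_site_attributes(5, is_tail_call=False, tune_for_gdb=False)
v4_tail = call_site_attributes(4, is_tail_call=True, tune_for_gdb=True)
```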

>> 3) What features do these call_site tags enable (in the absence of
>> call_site_parameters)?
>
> At the moment, just artificial tail call frames, but there are some
> interesting potential future applications. E.g.: disambiguating backtraces in
> the presence of function merging (a bigger deal for swift than it is for
> clang - the call site tag for a thunk-call could record the
> “original”/unmerged/source-level callee), and surfacing rich(er) information
> about CFI failures at call sites.
>
>> 4) What's the end goal in terms of what calls should be described in the
>> DWARF? (describing literally every call sounds /super/ expensive) - they
>> currently seem quite different between GCC and Clang on a few test cases
>> I've tried, so it's hard to tell the logic
>
> The goal is to describe all calls that aren’t optimized out. At least, I’m
> not sure that there’s a leaner subset that would really be sufficient for
> Apple’s use cases, and the size overhead hasn’t caused issues internally. We
> could certainly add a mode to clang to elide some of this call site info,
> though.
>
>> (& if I understand correctly, the call_site_parameters are intended to work
>> collaboratively between callees and callers, so if, say, a parameter value
>> is caller saved & then clobbered in the callee - you could still print the
>> value of that parameter by looking at the saved copy in the caller?)
>
> Yep!
>
> vedant

Thanks,
Djordje
