thr3ads.net - llvm dev - [llvm-dev] Stripping Debug Locations on cross BB moves, part 2 (PR31891) [Feb 2017]

If this information is useful, please help other people find it:
Share via:

David Blaikie via llvm-dev

2017-Feb-09 16:28 UTC

[llvm-dev] Stripping Debug Locations on cross BB moves, part 2 (PR31891)

On Wed, Feb 8, 2017 at 10:32 AM Adrian Prantl <aprantl at apple.com>
wrote:
> On Feb 8, 2017, at 10:17 AM, David Blaikie <dblaikie at gmail.com>
wrote:
>
>
>
> On Wed, Feb 8, 2017 at 9:56 AM Adrian Prantl <aprantl at apple.com>
wrote:
>
> On Feb 8, 2017, at 9:44 AM, David Blaikie <dblaikie at gmail.com>
wrote:
>
>
>
> On Wed, Feb 8, 2017 at 9:36 AM Adrian Prantl <aprantl at apple.com>
wrote:
>
>
> > On Feb 8, 2017, at 9:25 AM, David Blaikie <dblaikie at
gmail.com> wrote:
> >
> > So Reid came across a case where the current strategy (dropping
> locations when they move across basic blocks) proves to be at odds with
> another precept we have: inlinable calls must have locations, so that
> if/when they are inlined the location can be accurately described (due to
> the nature of our IR representation of inlining, a location must be given
> for the call site so that the DIEs for the inlined subroutine can be
> described)
> >
> > Any ideas on how we should approach this?
> >
> > We could have a bit in DebugLoc (or other choice of representation)
for
> the non-line location. For the line table this would work the same as
> absent DebugLoc - for calls it would cause call_file/line/discriminator (do
> we have a call_discriminator? We probably should, don't know if we do)
to
> be omitted which is something the LLVM IR representation currently
can't
> express.
>
> Even if the function calls are not inlined it could be preferable to keep
> a debug location. It's a very odd debugging experience to be stopped at
a
> breakpoint and then do "up" (or just "bt") to look at
the parent frame and
> not have a location for the call in the parent frame.
>
>
> Usually what will happen is it will have a location, it'll be the
flow-on
> from the previous location (ie: we won't emit any line entry for the
call,
> so it naturally ends up at the line of whatever instruction came before).
>
>
> Is the stripping in this case motivated by correctness for profiling or
> just to smooth the line table to improve the stepping experience?
>
>
> Mostly it seems the people pushing/contributing patches are motivated by
> profile correctness, as far as I understand (with the line table/debugging
> behavior 'improvement' being a nod to "there's at least an
argument to be
> made that this could be justified by debuggability too"). At least
that's
> the impression I get. I may've misunderstood.
>
>
> If its about correctness for profiling, we should consider having a
> separate profiling location or a skip-for-profiling bit as David suggests,
> that honored when the CU has the new tune-for-profiling flag set.
>
>
> I think that might be a bit too much, in my mind, for the
> tune-for-profiling flag. Adding things to make the profile more accurate
> seems good - removing things and 'hurting' debuggability because
you opted
> for profiling seems like a different and more difficult-to-tolerate
> situation.
>
>
> If it is about a better stepping experience, we just shouldn't do this
for
> call sites.
>
>
> Why? A call could produce weird stepping behavior as much as a non-call
> instruction, and could be similarly mis-attributed. CSE could hoist two
> distinct calls to pure functions into a common block and we'd pick one
> location - the stack trace and location would be confusing to the debugger
> user (it could end up indicating that the if branch was taken when it was
> really the else branch, being the canonical example)
>
> Calls don't seem fundamentally special here - except due to limitations
of
> the IR representation.
>
>
> The CSE example is a good one, though I *think* in this case we will end
> up calling getMergedDebugLocation() which will return a line 0 location.
>
>
> I think getMergedDebugLocation (or whatever it's called) currently
returns
> a null location, which is a bit different from line 0.
>
>
> Indeed it does, I misremembered. Looks like we have the same latent
> problem with all users of getMergedDebugLocation(), too.
>
Yep - though I imagine not all of them are liable to move calls.

>
> Line zero is tricky to use as it can make the line table larger (file
> size)/worse for the user (similarly jumpy behavior, possibly, depending on
> how the DWARF consumer handles it).
>
>
> LLDB handles line 0 on call sites well (and we could easily fix it if it
> didn't), I can't speak for gdb or the sce debugger. Do we have any
data?
>
Not off-hand, no.

>
>
>
> Which brings me to the next idea: Couldn't we just assign a line 0
> location instead of removing the DebugLoc and everybody would be happy?
>
>
> Will make the line table larger, which might not be ideal (I remember Paul
> tried this, then went to only using zero loc at the start of basic blocks
> to ensure that block ordering didn't affect locations by causing
un-lined
> instructions at the start of a block to inherit the location at the end of
> the previous block). But maybe for this new use case/concern the cost might
> be worthwhile? Not sure.
>
>
> If we just limit it to function calls, the size increase might be
> acceptable. We'd have to measure it.
>
Ah, true

>
>
> Or should we treat moving an instruction to a different BB different from
> CSE?
>
>
> CSE's just the most compelling example, even if it's only a single
move it
> could still be confusing to the user and the profile (eg: the block might
> be conditional, the destination unconditional - so as a debugger user
you'd
> still be mislead into believing the condition is true because your code is
> executing, when that's not the case)
>
> - Dave
>
>
>
> -- adrian
>
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20170209/1b0eec25/attachment.html>

Robinson, Paul via llvm-dev

2017-Feb-10 00:09 UTC

head link

[llvm-dev] Stripping Debug Locations on cross BB moves, part 2 (PR31891)

Sorry to be late to the party, I was out yesterday.

I do want to point out that there are different motivations for the two cases. 
When moving/hoisting a single instruction, the motivation is to avoid bad
debugger stepping and bad sample-profile data (we get a lot of complaints about
bad debugger stepping, btw) for which we pay the price of not having totally
precise instruction-to-source mapping.  I think this is the right tradeoff,
fwiw.  But, when *combining* or merging instructions (with different source
locations), the problem is that we *can't* provide a correct source
location, because there is not a single correct source location and we can't
do a one-to-many mapping.  As it happens, we're using the same mechanism to
solve these two different problems, i.e. erase the debug location from the
moved/combined instruction, but the motivations are still different.

Why the fuss about motivation?  Because it means the moved-call and merged-call
cases are different, and might want different solutions.  It would be fine to
add an "I've been moved" flag to the debug location of a moved
instruction, with whatever necessary effects that flag might need to have; but
it would *not* be fine to use this flag for merged/combined instructions,
because in those cases we do not have a correct "original" source
location to preserve.  In the latter case, we still need to use line-0 for
calls, because there's no other correct thing to do.

Probably getMergedLocation should optionally return line-0 instead of null;
whether it's an optional don’t-return-null parameter or a separate
getMergedLocationForCall API is an implementation detail.  The code-motion cases
won't be using getMergedLocation anyway, so if we want to make a new API to
set a "been-moved" flag, that seems pretty orthogonal to the
merged-call problem.  Or, of course, we can always skip the flag idea, using
line-0 for moved calls and null for moved anything-else.

There was also a question about whether the SCE debugger would tolerate a call
instruction with line-0.  I expect it's fine, but I will check.  I might not
have an answer right away as I am off again tomorrow.
--paulr

From: David Blaikie [mailto:dblaikie at gmail.com]
Sent: Thursday, February 09, 2017 8:28 AM
To: Adrian Prantl
Cc: llvm-dev; Dehao Chen; Eric Christopher; Reid Kleckner; Robinson, Paul
Subject: Re: Stripping Debug Locations on cross BB moves, part 2 (PR31891)

On Wed, Feb 8, 2017 at 10:32 AM Adrian Prantl <aprantl at
apple.com<mailto:aprantl at apple.com>> wrote:
On Feb 8, 2017, at 10:17 AM, David Blaikie <dblaikie at
gmail.com<mailto:dblaikie at gmail.com>> wrote:

On Wed, Feb 8, 2017 at 9:56 AM Adrian Prantl <aprantl at
apple.com<mailto:aprantl at apple.com>> wrote:
On Feb 8, 2017, at 9:44 AM, David Blaikie <dblaikie at
gmail.com<mailto:dblaikie at gmail.com>> wrote:

On Wed, Feb 8, 2017 at 9:36 AM Adrian Prantl <aprantl at
apple.com<mailto:aprantl at apple.com>> wrote:
> On Feb 8, 2017, at 9:25 AM, David Blaikie <dblaikie at
gmail.com<mailto:dblaikie at gmail.com>> wrote:
>
> So Reid came across a case where the current strategy (dropping locations
when they move across basic blocks) proves to be at odds with another precept we
have: inlinable calls must have locations, so that if/when they are inlined the
location can be accurately described (due to the nature of our IR representation
of inlining, a location must be given for the call site so that the DIEs for the
inlined subroutine can be described)
>
> Any ideas on how we should approach this?
>
> We could have a bit in DebugLoc (or other choice of representation) for the
non-line location. For the line table this would work the same as absent
DebugLoc - for calls it would cause call_file/line/discriminator (do we have a
call_discriminator? We probably should, don't know if we do) to be omitted
which is something the LLVM IR representation currently can't express.
Even if the function calls are not inlined it could be preferable to keep a
debug location. It's a very odd debugging experience to be stopped at a
breakpoint and then do "up" (or just "bt") to look at the
parent frame and not have a location for the call in the parent frame.

Usually what will happen is it will have a location, it'll be the flow-on
from the previous location (ie: we won't emit any line entry for the call,
so it naturally ends up at the line of whatever instruction came before).

Is the stripping in this case motivated by correctness for profiling or just to
smooth the line table to improve the stepping experience?

Mostly it seems the people pushing/contributing patches are motivated by profile
correctness, as far as I understand (with the line table/debugging behavior
'improvement' being a nod to "there's at least an argument to
be made that this could be justified by debuggability too"). At least
that's the impression I get. I may've misunderstood.

If its about correctness for profiling, we should consider having a separate
profiling location or a skip-for-profiling bit as David suggests, that honored
when the CU has the new tune-for-profiling flag set.

I think that might be a bit too much, in my mind, for the tune-for-profiling
flag. Adding things to make the profile more accurate seems good - removing
things and 'hurting' debuggability because you opted for profiling seems
like a different and more difficult-to-tolerate situation.

If it is about a better stepping experience, we just shouldn't do this for
call sites.

Why? A call could produce weird stepping behavior as much as a non-call
instruction, and could be similarly mis-attributed. CSE could hoist two distinct
calls to pure functions into a common block and we'd pick one location - the
stack trace and location would be confusing to the debugger user (it could end
up indicating that the if branch was taken when it was really the else branch,
being the canonical example)

Calls don't seem fundamentally special here - except due to limitations of
the IR representation.

The CSE example is a good one, though I *think* in this case we will end up
calling getMergedDebugLocation() which will return a line 0 location.

I think getMergedDebugLocation (or whatever it's called) currently returns a
null location, which is a bit different from line 0.

Indeed it does, I misremembered. Looks like we have the same latent problem with
all users of getMergedDebugLocation(), too.

Yep - though I imagine not all of them are liable to move calls.

Line zero is tricky to use as it can make the line table larger (file
size)/worse for the user (similarly jumpy behavior, possibly, depending on how
the DWARF consumer handles it).

LLDB handles line 0 on call sites well (and we could easily fix it if it
didn't), I can't speak for gdb or the sce debugger. Do we have any data?

Not off-hand, no.

Which brings me to the next idea: Couldn't we just assign a line 0 location
instead of removing the DebugLoc and everybody would be happy?

Will make the line table larger, which might not be ideal (I remember Paul tried
this, then went to only using zero loc at the start of basic blocks to ensure
that block ordering didn't affect locations by causing un-lined instructions
at the start of a block to inherit the location at the end of the previous
block). But maybe for this new use case/concern the cost might be worthwhile?
Not sure.

If we just limit it to function calls, the size increase might be acceptable.
We'd have to measure it.

Ah, true

Or should we treat moving an instruction to a different BB different from CSE?

CSE's just the most compelling example, even if it's only a single move
it could still be confusing to the user and the profile (eg: the block might be
conditional, the destination unconditional - so as a debugger user you'd
still be mislead into believing the condition is true because your code is
executing, when that's not the case)

- Dave

-- adrian
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20170210/e26515e4/attachment.html>

Adrian Prantl via llvm-dev

2017-Feb-10 01:16 UTC

head link

[llvm-dev] Stripping Debug Locations on cross BB moves, part 2 (PR31891)

> On Feb 9, 2017, at 4:09 PM, Robinson, Paul <paul.robinson at
sony.com> wrote:
> 
> Sorry to be late to the party, I was out yesterday. 
>  
> I do want to point out that there are different motivations for the two
cases.  When moving/hoisting a single instruction, the motivation is to avoid
bad debugger stepping and bad sample-profile data (we get a lot of complaints
about bad debugger stepping, btw) for which we pay the price of not having
totally precise instruction-to-source mapping.  I think this is the right
tradeoff, fwiw.  But, when *combining* or merging instructions (with different
source locations), the problem is that we *can't* provide a correct source
location, because there is not a single correct source location and we can't
do a one-to-many mapping.  As it happens, we're using the same mechanism to
solve these two different problems, i.e. erase the debug location from the
moved/combined instruction, but the motivations are still different.
>  
> Why the fuss about motivation?  Because it means the moved-call and
merged-call cases are different, and might want different solutions.  It would
be fine to add an "I've been moved" flag to the debug location of
a moved instruction, with whatever necessary effects that flag might need to
have; but it would *not* be fine to use this flag for merged/combined
instructions, because in those cases we do not have a correct
"original" source location to preserve.  In the latter case, we still
need to use line-0 for calls, because there's no other correct thing to do.
>  
> Probably getMergedLocation should optionally return line-0 instead of null;
whether it's an optional don’t-return-null parameter or a separate
getMergedLocationForCall API is an implementation detail.
We need something to this end to support getMergedLocationForCall for
CallInstructions. I'm not yet sure what the best API design for this. It
feels dangerous to let the caller decide whether a nullptr is acceptable.
>  The code-motion cases won't be using getMergedLocation anyway, so if
we want to make a new API to set a "been-moved" flag, that seems
pretty orthogonal to the merged-call problem.  Or, of course, we can always skip
the flag idea, using line-0 for moved calls and null for moved anything-else.
David, do you want to make an argument for why the "been-moved" flag
is superior over using line-0?
> There was also a question about whether the SCE debugger would tolerate a
call instruction with line-0.  I expect it's fine, but I will check.  I
might not have an answer right away as I am off again tomorrow.
Thanks for checking!

-- adrian
> --paulr
>  
>  
> From: David Blaikie [mailto:dblaikie at gmail.com] 
> Sent: Thursday, February 09, 2017 8:28 AM
> To: Adrian Prantl
> Cc: llvm-dev; Dehao Chen; Eric Christopher; Reid Kleckner; Robinson, Paul
> Subject: Re: Stripping Debug Locations on cross BB moves, part 2 (PR31891)
>  
>  
> 
> On Wed, Feb 8, 2017 at 10:32 AM Adrian Prantl <aprantl at apple.com>
wrote:
> On Feb 8, 2017, at 10:17 AM, David Blaikie <dblaikie at gmail.com>
wrote:
>  
>  
> 
> On Wed, Feb 8, 2017 at 9:56 AM Adrian Prantl <aprantl at apple.com>
wrote:
> On Feb 8, 2017, at 9:44 AM, David Blaikie <dblaikie at gmail.com>
wrote:
>  
>  
> 
> On Wed, Feb 8, 2017 at 9:36 AM Adrian Prantl <aprantl at apple.com>
wrote:
> 
> > On Feb 8, 2017, at 9:25 AM, David Blaikie <dblaikie at
gmail.com> wrote:
> >
> > So Reid came across a case where the current strategy (dropping
locations when they move across basic blocks) proves to be at odds with another
precept we have: inlinable calls must have locations, so that if/when they are
inlined the location can be accurately described (due to the nature of our IR
representation of inlining, a location must be given for the call site so that
the DIEs for the inlined subroutine can be described)
> >
> > Any ideas on how we should approach this?
> >
> > We could have a bit in DebugLoc (or other choice of representation)
for the non-line location. For the line table this would work the same as absent
DebugLoc - for calls it would cause call_file/line/discriminator (do we have a
call_discriminator? We probably should, don't know if we do) to be omitted
which is something the LLVM IR representation currently can't express.
> 
> Even if the function calls are not inlined it could be preferable to keep a
debug location. It's a very odd debugging experience to be stopped at a
breakpoint and then do "up" (or just "bt") to look at the
parent frame and not have a location for the call in the parent frame.
>  
> Usually what will happen is it will have a location, it'll be the
flow-on from the previous location (ie: we won't emit any line entry for the
call, so it naturally ends up at the line of whatever instruction came before).
>  
> Is the stripping in this case motivated by correctness for profiling or
just to smooth the line table to improve the stepping experience?
>  
> Mostly it seems the people pushing/contributing patches are motivated by
profile correctness, as far as I understand (with the line table/debugging
behavior 'improvement' being a nod to "there's at least an
argument to be made that this could be justified by debuggability too"). At
least that's the impression I get. I may've misunderstood.
>  
> If its about correctness for profiling, we should consider having a
separate profiling location or a skip-for-profiling bit as David suggests, that
honored when the CU has the new tune-for-profiling flag set.
>  
> I think that might be a bit too much, in my mind, for the
tune-for-profiling flag. Adding things to make the profile more accurate seems
good - removing things and 'hurting' debuggability because you opted for
profiling seems like a different and more difficult-to-tolerate situation.
>  
> If it is about a better stepping experience, we just shouldn't do this
for call sites.
>  
> Why? A call could produce weird stepping behavior as much as a non-call
instruction, and could be similarly mis-attributed. CSE could hoist two distinct
calls to pure functions into a common block and we'd pick one location - the
stack trace and location would be confusing to the debugger user (it could end
up indicating that the if branch was taken when it was really the else branch,
being the canonical example)
> 
> Calls don't seem fundamentally special here - except due to limitations
of the IR representation.
>  
> The CSE example is a good one, though I *think* in this case we will end up
calling getMergedDebugLocation() which will return a line 0 location.
>  
> I think getMergedDebugLocation (or whatever it's called) currently
returns a null location, which is a bit different from line 0.
>  
> Indeed it does, I misremembered. Looks like we have the same latent problem
with all users of getMergedDebugLocation(), too.
>  
> Yep - though I imagine not all of them are liable to move calls.
>  
>  
> Line zero is tricky to use as it can make the line table larger (file
size)/worse for the user (similarly jumpy behavior, possibly, depending on how
the DWARF consumer handles it).
>  
> LLDB handles line 0 on call sites well (and we could easily fix it if it
didn't), I can't speak for gdb or the sce debugger. Do we have any data?
>  
> Not off-hand, no.
>  
> 
> 
>  
> Which brings me to the next idea: Couldn't we just assign a line 0
location instead of removing the DebugLoc and everybody would be happy?
>  
> Will make the line table larger, which might not be ideal (I remember Paul
tried this, then went to only using zero loc at the start of basic blocks to
ensure that block ordering didn't affect locations by causing un-lined
instructions at the start of a block to inherit the location at the end of the
previous block). But maybe for this new use case/concern the cost might be
worthwhile? Not sure.
>  
> If we just limit it to function calls, the size increase might be
acceptable. We'd have to measure it.
>  
> Ah, true
>  
>  
>  
> Or should we treat moving an instruction to a different BB different from
CSE?
>  
> CSE's just the most compelling example, even if it's only a single
move it could still be confusing to the user and the profile (eg: the block
might be conditional, the destination unconditional - so as a debugger user
you'd still be mislead into believing the condition is true because your
code is executing, when that's not the case)
> 
> - Dave
>  
>  
> -- adrian

llvm dev - Feb 2017 - Stripping Debug Locations on cross BB moves, part 2 (PR31891)

[llvm-dev] Stripping Debug Locations on cross BB moves, part 2 (PR31891)

[llvm-dev] Stripping Debug Locations on cross BB moves, part 2 (PR31891)

[llvm-dev] Stripping Debug Locations on cross BB moves, part 2 (PR31891)