thr3ads.net - llvm dev - [llvm-dev] Propagation of debug information for variable into basic blocks. [Sep 2016]

Keith Walker via llvm-dev

2016-Sep-21 17:29 UTC

[llvm-dev] Propagation of debug information for variable into basic blocks.

Adrian,

I am currently investigating issues where variables that one would expect to be
available in a debugger are not available in code compiled at optimisation
levels other than -O0.

The main problem appears to be with the LiveDebugValues::join() method because
it does not allow variables to be propagated into blocks unless all predecessor
blocks have an Outgoing Location for that variable.

As a simple example in the C code:

int func2( int);
void func(int a) {
        int b = func2(10);
        for(int i = 1; i < a; i++) {
                func2(i+b);
        }
}

One would reasonably expect that, when stopped within the body of the for loop,
you could access the variable b in a debugger (especially as it is actually
referenced in the loop).

Unfortunately this is often not the case.   I believe that this is due to the
requirement stated in the descriptive comment of LiveDebugValues::join() which
states:
  "if the same source variable in all the predecessors of @MBB reside in
the same location."

In our simple example we end up with a series of blocks like

  Block  Name           Predecessors  Successors
  BB#0   Initial-block  (none)        BB#2
  BB#1   for-body       BB#2          BB#2
  BB#2   for-condition  BB#0 BB#1     BB#1 BB#3
  BB#3   after-for      BB#2          (none)

Now b is initially recorded as an "Outgoing Location" of BB#0, but it
isn't imported into BB#2 because it is not an "Outgoing Location" of
both predecessor blocks, BB#0 and BB#1.

So the outcome is that the variable b is not available in the debugging
information while in BB#2 (or BB#1).
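
To make the effect concrete, here is a minimal, self-contained C++ sketch of an
intersection-style join applied to the four blocks above. It is not the LLVM
implementation: the LocSet and PredList types, the "b@reg23" string encoding of
a location, and both function names are assumptions made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-ins: a location is "variable@register", and Preds[N]
// lists the predecessors of block BB#N.
using LocSet = std::set<std::string>;
using PredList = std::vector<std::vector<int>>;

// Intersection-style join: a location enters block MBB only if every
// predecessor has it among its outgoing locations. A predecessor with no
// recorded entry in OutLocs makes the whole join come back empty.
LocSet joinAllPreds(int MBB, const PredList &Preds,
                    const std::map<int, LocSet> &OutLocs) {
  assert(MBB >= 0 && static_cast<std::size_t>(MBB) < Preds.size());
  LocSet In;
  bool First = true;
  for (int P : Preds[MBB]) {
    auto OL = OutLocs.find(P);
    if (OL == OutLocs.end())
      return LocSet();  // predecessor not processed yet: nothing is joined
    if (First) {
      In = OL->second;
      First = false;
      continue;
    }
    LocSet Common;
    for (const std::string &Loc : In)
      if (OL->second.count(Loc))
        Common.insert(Loc);
    In = Common;
  }
  return In;
}

// The four blocks from the example, at the moment BB#2 is first joined.
LocSet inLocsOfLoopHeader() {
  PredList Preds = {{}, {2}, {0, 1}, {2}};
  std::map<int, LocSet> OutLocs;
  OutLocs[0].insert("b@reg23");  // only BB#0 has been processed so far
  return joinAllPreds(2, Preds, OutLocs);
}
```

Calling inLocsOfLoopHeader() yields an empty set: BB#1 has no recorded outgoing
locations when BB#2 is first joined, so b is dropped even though BB#0 provides
it.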

Now changing the algorithm in LiveDebugValues::join() to include all Outgoing
Locations from predecessor blocks appears to significantly improve the
visibility of variables in such cases.    However I am worried that doing this
possibly propagates the variables more than intended ... or maybe it is the
right thing to do.
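
The change described above could be sketched roughly as follows. This is an
illustration under assumptions, not the actual patch: LocSet, PredList, the
string encoding of locations, and the name joinAnyPred are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-ins: a location is "variable@register", and Preds[N]
// lists the predecessors of block BB#N.
using LocSet = std::set<std::string>;
using PredList = std::vector<std::vector<int>>;

// Union-style join: every location that leaves any predecessor is imported
// into MBB, whether or not the other predecessors agree.
LocSet joinAnyPred(int MBB, const PredList &Preds,
                   const std::map<int, LocSet> &OutLocs) {
  assert(MBB >= 0 && static_cast<std::size_t>(MBB) < Preds.size());
  LocSet In;
  for (int P : Preds[MBB]) {
    auto OL = OutLocs.find(P);
    if (OL != OutLocs.end())
      In.insert(OL->second.begin(), OL->second.end());
  }
  return In;
}
```

With only BB#0 recorded as providing "b@reg23", joining at BB#2 now returns
that location. The worry about over-propagation shows up if BB#1 were to record
a different location such as "b@reg5": the union then hands BB#2 two candidate
locations for the same variable, and nothing in this sketch decides between
them.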

So if you have any suggestions or alternative approaches to consider then please
let me know.

Keith



Adrian Prantl via llvm-dev

2016-Sep-21 20:59 UTC

> On Sep 21, 2016, at 10:29 AM, Keith Walker <Keith.Walker at arm.com> wrote:
> 
> Adrian,
>  
> I am currently investigating issues where variables that one would expect
> to be available in a debugger are not in code that is compiled at
> optimisations other than -O0
>
> The main problem appears to be with the LiveDebugValues::join() method
> because it does not allow variables to be propagated into blocks unless all
> predecessor blocks have an Outgoing Location for that variable.
>  
> As a simple example in the C code:
>  
> int func2( int);
> void func(int a) {
>         int b = func2(10);
>         for(int i = 1; i < a; i++) {
>                 func2(i+b);
>         }
> }
>  
> One would reasonably expect when stopped within the body of the for loop
> that you could access the variable b in a debugger (especially as it is
> actually referenced in the loop).
Side note:
In optimized code I would expect the loop to be rewritten into something like
  int func2( int);
  void func(int a) {
        int b = func2(10);
        for(int i = b+1, end = a+b; i < end; i++)
                func2(i);
  }
so I would expect the primary reason for b being unavailable in the loop body to
be that b is effectively dead and there is no reason to keep it in a register.
But that's not what happens in your example.
>  
> Unfortunately this is often not the case. I believe that this is due to
> the requirement stated in the descriptive comment of LiveDebugValues::join()
> which states:
>   “if the same source variable in all the predecessors of @MBB reside in
>   the same location.”
>
> In our simple example we end up with a series of blocks like
>
>   Block  Name           Predecessors  Successors
>   BB#0   Initial-block  (none)        BB#2
>   BB#1   for-body       BB#2          BB#2
>   BB#2   for-condition  BB#0 BB#1     BB#1 BB#3
>   BB#3   after-for      BB#2          (none)
>  
> Now b is initially defined to be an “Outgoing Location” to BB#0, but it
> isn’t imported into BB#2 because it is not defined as an “Outgoing
> Location” for both predecessor blocks BB#0 and BB#1.
>
> So the outcome is that the variable b is not available in the debugging
> information while in BB#2 (or BB#1).
>
> Now changing the algorithm in LiveDebugValues::join() to include all
> Outgoing Locations from predecessor blocks appears to significantly improve
> the visibility of variables in such cases. However I am worried that
> doing this possibly propagates the variables more than intended ... or
> maybe it is the right thing to do.
>
> So if you have any suggestions or alternative approaches to consider then
> please let me know.
Conceptually, the LiveDebugValues data flow analysis should be using
three-valued logic arranged in a lattice

    ⊥ (uninitialized / don't know)
   / \
true false (is (not) available)

where join(x, ⊥) = x, otherwise it behaves like boolean &.

All debug variable values are initialized to the bottom element first. After
processing BB#0 we have var[b, reg23] = true. When we join this with the unknown
⊥ from BB#1, we propagate var[b, reg23] into BB#1. Next time we join at BB#2 we
will have consistent information in both predecessors and the algorithm
converges. If, for example, BB#1 had conflicting information for b the next join
at BB#2 would delete the information for b and the result would still be
correct.
This is guaranteed to terminate because the information at the nodes can only
move in one direction in the lattice and can change at most once.

I haven't thought this through entirely, but it looks like we could
implement this by keeping track of which basic blocks we never visited before
and special-casing previously unvisited basic blocks in join().
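
As a rough illustration of the three-valued join described above, the lattice
for a single (variable, location) pair could be written like this. The Avail
enum and the join function are hypothetical names chosen for this sketch; they
do not exist in LiveDebugValues.

```cpp
// Three-valued availability for one (variable, location) pair.
// Bottom means "uninitialized / don't know".
enum class Avail { Bottom, True, False };

// join(x, Bottom) = x; for two known values it behaves like boolean AND.
constexpr Avail join(Avail A, Avail B) {
  return A == Avail::Bottom   ? B
         : B == Avail::Bottom ? A
         : (A == Avail::True && B == Avail::True) ? Avail::True
                                                  : Avail::False;
}

// First join at BB#2: BB#0 says "true", BB#1 has not been visited yet.
static_assert(join(Avail::True, Avail::Bottom) == Avail::True,
              "unknown predecessor does not block propagation");
// Later join at BB#2: both predecessors now agree.
static_assert(join(Avail::True, Avail::True) == Avail::True,
              "consistent predecessors keep the location");
// A visited predecessor that disagrees drops the location.
static_assert(join(Avail::True, Avail::False) == Avail::False,
              "conflicting predecessors remove the location");
```

The static_assert lines mirror the walk-through: the first join propagates
var[b, reg23] past the still-unknown BB#1, and a later conflicting value would
remove it again.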

-- adrian
>  
> Keith

Daniel Berlin via llvm-dev

2016-Sep-21 21:03 UTC

On Wed, Sep 21, 2016 at 12:29 PM, Keith Walker via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> Adrian,
>
> I am currently investigating issues where variables that one would expect
> to be available in a debugger are not in code that is compiled at
> optimisations other than -O0
>
> The main problem appears to be with the LiveDebugValues::join() method
> because it does not allow variables to be propagated into blocks unless all
> predecessor blocks have an Outgoing Location for that variable.
>
> As a simple example in the C code:
>
> int func2( int);
> void func(int a) {
>         int b = func2(10);
>         for(int i = 1; i < a; i++) {
>                 func2(i+b);
>         }
> }
>
> One would reasonably expect when stopped within the body of the for loop
> that you could access the variable b in a debugger (especially as it is
> actually referenced in the loop).
>
> Unfortunately this is often not the case. I believe that this is due to
> the requirement stated in the descriptive comment of
> LiveDebugValues::join() which states:
>
>   “if the same source variable in all the predecessors of @MBB reside in
>   the same location.”
>
> In our simple example we end up with a series of blocks like
>
>   Block  Name           Predecessors  Successors
>   BB#0   Initial-block  (none)        BB#2
>   BB#1   for-body       BB#2          BB#2
>   BB#2   for-condition  BB#0 BB#1     BB#1 BB#3
>   BB#3   after-for      BB#2          (none)
>
> Now b is initially defined to be an “Outgoing Location” to BB#0, but it
> isn’t imported into BB#2 because it is not defined as an “Outgoing
> Location” for both predecessor blocks BB#0 and BB#1.
>
> So the outcome is that the variable b is not available in the debugging
> information while in BB#2 (or BB#1).
>
> Now changing the algorithm in LiveDebugValues::join() to include all
> Outgoing Locations from predecessor blocks appears to significantly improve
> the visibility of variables in such cases. However I am worried that
> doing this possibly propagates the variables more than intended ... or
> maybe it is the right thing to do.

GCC uses union of predecessor outs.

This looks to be trying to convert it to a lattice problem, but its
handling of the lattice looks a little odd.

Daniel Berlin via llvm-dev

2016-Sep-21 21:09 UTC

>
>
> Conceptually, the LiveDebugValues data flow analysis should be using
> three-valued logic arranged in a lattice
>
>     ⊥ (uninitialized / don't know)
>    / \
> true false (is (not) available)
>
> where join(x, ⊥) = x, otherwise it behaves like boolean &.
>
> All debug variable values are initialized to the bottom element first.
> After processing BB#0 we have var[b, reg23] = true. When we join this with
> the unknown ⊥ from BB#1, we propagate var[b, reg23] into BB#1. Next time we
> join at BB#2 we will have consistent information in both predecessors and
> the algorithm converges.


FWIW: GCC does this as a union, so you get the maximal info available. If
it's not available along a given path, it's simply not there for that
path.
This will discard it if *any* path has missing info (not just inconsistent
info).

I'll skip whether this is or is not the right thing to do :)


> If, for example, BB#1 had conflicting information for b the next join at
> BB#2 would delete the information for b and the result would still be
> correct.
> This is guaranteed to terminate because the information at the nodes can
> only move in one direction in the lattice and can change at most once.
>
> I haven't thought this through entirely, but it looks like we could
> implement this by keeping track of which basic blocks we never visited
> before and special-casing previously unvisited basic blocks in join().
>
This is because you don't really init all the info to bottom for real. It
tries to be lazy.
Otherwise, they'd all have outlocs of bottom.
They are only theoretically initialized, so things get the wrong answer.

For example, this code is not right:


  // For all predecessors of this MBB, find the set of VarLocs that
  // can be joined.
  for (auto p : MBB.predecessors()) {
    auto OL = OutLocs.find(p);
    // Join is null in case of empty OutLocs from any of the pred.
    if (OL == OutLocs.end())
      return false;


This is wrong if the block is unvisited (as you say).
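
One way the "skip unvisited predecessors" idea might look, as a standalone
sketch rather than a patch to LiveDebugValues: the types, the string encoding
of locations, the Gen sets, and the names joinVisitedPreds and solve are all
assumptions for illustration, and the toy transfer function never kills a
location.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

using LocSet = std::set<std::string>;
using PredList = std::vector<std::vector<int>>;

// Intersect the OutLocs of visited predecessors only. Returns false when MBB
// has predecessors but none has been visited yet; the block then stays at
// "don't know" and is retried on a later pass.
bool joinVisitedPreds(int MBB, const PredList &Preds,
                      const std::map<int, LocSet> &OutLocs, LocSet &In) {
  assert(MBB >= 0 && static_cast<std::size_t>(MBB) < Preds.size());
  In.clear();
  bool SeenVisited = false;
  for (int P : Preds[MBB]) {
    auto OL = OutLocs.find(P);
    if (OL == OutLocs.end())
      continue;  // unvisited predecessor: imposes no constraint yet
    if (!SeenVisited) {
      In = OL->second;
      SeenVisited = true;
      continue;
    }
    LocSet Common;
    for (const std::string &Loc : In)
      if (OL->second.count(Loc))
        Common.insert(Loc);
    In = Common;
  }
  return SeenVisited || Preds[MBB].empty();
}

// Iterate to a fixed point. Gen[N] holds the locations defined inside BB#N.
std::map<int, LocSet> solve(const PredList &Preds,
                            const std::vector<LocSet> &Gen) {
  assert(Gen.size() == Preds.size());
  std::map<int, LocSet> OutLocs;
  bool Changed = true;
  while (Changed) {
    Changed = false;
    for (int N = 0; N < static_cast<int>(Preds.size()); ++N) {
      LocSet Out;
      if (!joinVisitedPreds(N, Preds, OutLocs, Out))
        continue;  // leave the block unvisited for now
      Out.insert(Gen[N].begin(), Gen[N].end());
      auto It = OutLocs.find(N);
      if (It == OutLocs.end() || It->second != Out) {
        OutLocs[N] = Out;
        Changed = true;
      }
    }
  }
  return OutLocs;
}
```

Run on the example's four blocks with "b@reg23" generated in BB#0, this
converges with the location in the outgoing set of every block. Note that a
block whose predecessors are all unvisited must itself stay unvisited;
recording an empty set for BB#1 on the first pass would make the later
intersection at BB#2 drop b again.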