thr3ads.net - llvm dev - [llvm-dev] Propagation of debug information for variable into basic blocks. [Sep 2016]

If this information is useful, please help other people find it:
Share via:

Daniel Berlin via llvm-dev

2016-Sep-21 21:09 UTC

[llvm-dev] Propagation of debug information for variable into basic blocks.

>
>
> Conceptually, the LiveDebugValues data flow analysis should be using
> three-valued logic arranged in a lattice
>
>     ⊥ (uninitialized / don't know)
>    / \
> true false (is (not) available)
>
> where join(x, ⊥) = x, otherwise it behaves like boolean &.
>
> All debug variable values are initialized to the bottom element first.
> After processing BB#0 we have var[b, reg23] = true. When we join this with
> the unknown ⊥ from BB#1, we propagate var[b, reg23] into BB#1. Next time we
> join at BB#2 we will have consistent information in both predecessors and
> the algorithm converges.


FWIW: GCC does this as a union, so you get the maximal info available. If
it's not available along a given path, it's simply not there for that
path.
This will discard it if *any* path has missing info (not just inconsistent
info).

I'll skip whether this is or is not the right thing to do :)


> If, for example, BB#1 had conflicing information for b the next join at
> BB#2 would delete the information for b and the result would still be
> correct.
> This is guaranteed to terminate because the information at the nodes can
> only move in one direction in the lattice and can change at most once.
>
> I haven't thought this through entirely, but it looks like we could
> implement this by keeping track of which basic blocks we never visited
> before and special-casing previously unvisited basic blocks in join().
>
This is because you don't really init all the info to bottom for real. It
tries to be lazy.
Otherwise, they'd all have outlocs of bottom.
They are only theoretically initialized, so things get the wrong answer.

For example, this code is not right:


  // For all predecessors of this MBB, find the set of VarLocs that
  // can be joined.
  for (auto p : MBB.predecessors()) {
    auto OL = OutLocs.find(p);
    // Join is null in case of empty OutLocs from any of the pred.
    if (OL == OutLocs.end())
      return false;


This is wrong  if the block is unvisited (as you say)
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160921/73ec80e2/attachment.html>

Daniel Berlin via llvm-dev

2016-Sep-21 21:23 UTC

head link

[llvm-dev] Propagation of debug information for variable into basic blocks.

>
>
>
>   // For all predecessors of this MBB, find the set of VarLocs that
>   // can be joined.
>   for (auto p : MBB.predecessors()) {
>     auto OL = OutLocs.find(p);
>     // Join is null in case of empty OutLocs from any of the pred.
>     if (OL == OutLocs.end())
>       return false;
>
>
> This is wrong  if the block is unvisited (as you say)
>
This is actually non-trivial to fix, IMHO.

For example, the following looks kinda right:

for (auto p : MBB.predecessors()) {
    if (p is not visited)
     continue;
    auto OL = OutLocs.find(p);
    // Join is null in case of empty OutLocs from any of the pred.
    if (OL == OutLocs.end())
      return false;

(and will work in a lot of cases)

This can be wrong, however, if you have more than a single backedge in the
program.
(it may be possible to have a block that has multiple uninited predecessors
that will require multiple iterations to resolve)

The problem here is that to get the maximal answer in an intersection
problem, you *must* initialize the sets to contain every value.  You can
look at dataflow algorithm papers for formal proofs if you want.

So you *actually* have to treat the default state of every unvisited block
as the equivalent of "containing every value" to get the maximal
answer.

How to achieve this for real is ... not obvious at a glance.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160921/4c17d31c/attachment-0001.html>

Adrian Prantl via llvm-dev

2016-Sep-21 22:42 UTC

head link

[llvm-dev] Propagation of debug information for variable into basic blocks.

> On Sep 21, 2016, at 2:23 PM, Daniel Berlin <dberlin at dberlin.org>
wrote:
> 
> 
> 
>   // For all predecessors of this MBB, find the set of VarLocs that
>   // can be joined.
>   for (auto p : MBB.predecessors()) {
>     auto OL = OutLocs.find(p);
>     // Join is null in case of empty OutLocs from any of the pred.
>     if (OL == OutLocs.end())
>       return false;
> 
> 
> This is wrong  if the block is unvisited (as you say)
> 
> This is actually non-trivial to fix, IMHO.
> 
> For example, the following looks kinda right:
>   
> for (auto p : MBB.predecessors()) {
>     if (p is not visited)
>      continue;
>     auto OL = OutLocs.find(p);
>     // Join is null in case of empty OutLocs from any of the pred.
>     if (OL == OutLocs.end())
>       return false;
> 
> (and will work in a lot of cases)
> 
> This can be wrong, however, if you have more than a single backedge in the
program.
> (it may be possible to have a block that has multiple uninited predecessors
that will require multiple iterations to resolve)
> 
> The problem here is that to get the maximal answer in an intersection
problem, you *must* initialize the sets to contain every value.  You can look at
dataflow algorithm papers for formal proofs if you want.
> 
> So you *actually* have to treat the default state of every unvisited block
as the equivalent of "containing every value" to get the maximal
answer.
> 
> How to achieve this for real is ... not obvious at a glance.
> 
I had to construct an example to see the problem, but my example is bit
contrived (it depends on starting in the middle of the graph) so I wonder if
there is a better counterexample.

BB0:y   BB1:x,y
 |        |
BB2:⊥  BB3:⊥  BB4:⊥
 |    /   |    /
BB5:x     |   /  
      \   |  /
        BB6: ?
         |
         +-> back-edge to BB1, BB4

BB5 has a definition for x but neither kills nor defines y. No block ever kills
y. Let's assume we for whatever reason started in BB5. The first join() at
BB6 will yield the set [x] (BB3 and BB4 are unvisited and thus skipped, just as
if they contained every value). This causes BB2 and BB4 to be added to the
working set. If we now visit BB4 first, it will see only [x] and it will be
impossible for y be propagated to BB6.

Is there a better example?

-- adrian

llvm dev - Sep 2016 - Propagation of debug information for variable into basic blocks.

[llvm-dev] Propagation of debug information for variable into basic blocks.

[llvm-dev] Propagation of debug information for variable into basic blocks.

[llvm-dev] Propagation of debug information for variable into basic blocks.