Adrian Prantl via llvm-dev
2016-Sep-21 22:42 UTC
[llvm-dev] Propagation of debug information for variable into basic blocks.
> On Sep 21, 2016, at 2:23 PM, Daniel Berlin <dberlin at dberlin.org> wrote: > > > > // For all predecessors of this MBB, find the set of VarLocs that > // can be joined. > for (auto p : MBB.predecessors()) { > auto OL = OutLocs.find(p); > // Join is null in case of empty OutLocs from any of the pred. > if (OL == OutLocs.end()) > return false; > > > This is wrong if the block is unvisited (as you say) > > This is actually non-trivial to fix, IMHO. > > For example, the following looks kinda right: > > for (auto p : MBB.predecessors()) { > if (p is not visited) > continue; > auto OL = OutLocs.find(p); > // Join is null in case of empty OutLocs from any of the pred. > if (OL == OutLocs.end()) > return false; > > (and will work in a lot of cases) > > This can be wrong, however, if you have more than a single backedge in the program. > (it may be possible to have a block that has multiple uninited predecessors that will require multiple iterations to resolve) > > The problem here is that to get the maximal answer in an intersection problem, you *must* initialize the sets to contain every value. You can look at dataflow algorithm papers for formal proofs if you want. > > So you *actually* have to treat the default state of every unvisited block as the equivalent of "containing every value" to get the maximal answer. > > How to achieve this for real is ... not obvious at a glance. >I had to construct an example to see the problem, but my example is bit contrived (it depends on starting in the middle of the graph) so I wonder if there is a better counterexample. BB0:y BB1:x,y | | BB2:⊥ BB3:⊥ BB4:⊥ | / | / BB5:x | / \ | / BB6: ? | +-> back-edge to BB1, BB4 BB5 has a definition for x but neither kills nor defines y. No block ever kills y. Let's assume we for whatever reason started in BB5. The first join() at BB6 will yield the set [x] (BB3 and BB4 are unvisited and thus skipped, just as if they contained every value). This causes BB2 and BB4 to be added to the working set. If we now visit BB4 first, it will see only [x] and it will be impossible for y be propagated to BB6. Is there a better example? -- adrian
Daniel Berlin via llvm-dev
2016-Sep-21 23:34 UTC
[llvm-dev] Propagation of debug information for variable into basic blocks.
On Wed, Sep 21, 2016 at 5:42 PM, Adrian Prantl <aprantl at apple.com> wrote:> > > On Sep 21, 2016, at 2:23 PM, Daniel Berlin <dberlin at dberlin.org> wrote: > > > > > > > > // For all predecessors of this MBB, find the set of VarLocs that > > // can be joined. > > for (auto p : MBB.predecessors()) { > > auto OL = OutLocs.find(p); > > // Join is null in case of empty OutLocs from any of the pred. > > if (OL == OutLocs.end()) > > return false; > > > > > > This is wrong if the block is unvisited (as you say) > > > > This is actually non-trivial to fix, IMHO. > > > > For example, the following looks kinda right: > > > > for (auto p : MBB.predecessors()) { > > if (p is not visited) > > continue; > > auto OL = OutLocs.find(p); > > // Join is null in case of empty OutLocs from any of the pred. > > if (OL == OutLocs.end()) > > return false; > > > > (and will work in a lot of cases) > > > > This can be wrong, however, if you have more than a single backedge in > the program. > > (it may be possible to have a block that has multiple uninited > predecessors that will require multiple iterations to resolve) > > > > The problem here is that to get the maximal answer in an intersection > problem, you *must* initialize the sets to contain every value. You can > look at dataflow algorithm papers for formal proofs if you want. > > > > So you *actually* have to treat the default state of every unvisited > block as the equivalent of "containing every value" to get the maximal > answer. > > > > How to achieve this for real is ... not obvious at a glance. > > > > I had to construct an example to see the problem, but my example is bit > contrived (it depends on starting in the middle of the graph) so I wonder > if there is a better counterexample. > > BB0:y BB1:x,y > | | > BB2:⊥ BB3:⊥ BB4:⊥ > | / | / > BB5:x | / > \ | / > BB6: ? > | > +-> back-edge to BB1, BB4 > > BB5 has a definition for x but neither kills nor defines y. No block ever > kills y. Let's assume we for whatever reason started in BB5. The first > join() at BB6 will yield the set [x] (BB3 and BB4 are unvisited and thus > skipped, just as if they contained every value). This causes BB2 and BB4 to > be added to the working set. If we now visit BB4 first, it will see only > [x] and it will be impossible for y be propagated to BB6. > > Is there a better example? >So staring a bit at it, i've convinced myself the fix i listed may actually be correct *as long* as we guarantee the proper iteration order. The only reason i know about this at all is because it happened many many moons ago in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=27755 :) For a while, we actually computed the maximal set and used it. Now, we use implicit ones, which for an intersection problem, *should be* the equivalent of skipping the predecessor (if the predecessor is maximal, anding it with 1's will simply never remove anything). However, this *only* works if you can guarantee you have visited at least one predecessor/successor (for forward/backwards problems) by the time you get to a given block otherwise, an implicit in/out set will be empty, instead of all ones. This is easy to prove. If you skip all of the predecessors or successors, and the set was not actually initialized to all 1's, it will contain absolutely nothing, and this nothing will propagate when it should be 1's :) If you have at least one predecessor/successor with some value, you can use that, and you are guaranteed a correct answer. The code i ended up writing for gcc looks STH like this (note, this was for ANTIC, which is a backwards problem, so it's intersection of successors instead of intersection of preds): SmallVector worklist: BasicBlock first = nullptr; for each successor s { if (!first && visited.count(s)) first = s; else if (visisted.count(s)) worklist.push_back(s); } assert (first != nullptr && "We should have visited at least one succ") for each thing on worklist intersect with first If you use preds instead of succs, it should work here, as long as we continue to use RPO. I *think* that code i wrote is equivalent to adding the continue where i have it, and an assert that at least one block was visited during the loop -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160921/1e930280/attachment.html>
Adrian Prantl via llvm-dev
2016-Sep-21 23:52 UTC
[llvm-dev] Propagation of debug information for variable into basic blocks.
> On Sep 21, 2016, at 4:34 PM, Daniel Berlin <dberlin at dberlin.org> wrote: > > > > On Wed, Sep 21, 2016 at 5:42 PM, Adrian Prantl <aprantl at apple.com> wrote: > > > On Sep 21, 2016, at 2:23 PM, Daniel Berlin <dberlin at dberlin.org> wrote: > > > > > > > > // For all predecessors of this MBB, find the set of VarLocs that > > // can be joined. > > for (auto p : MBB.predecessors()) { > > auto OL = OutLocs.find(p); > > // Join is null in case of empty OutLocs from any of the pred. > > if (OL == OutLocs.end()) > > return false; > > > > > > This is wrong if the block is unvisited (as you say) > > > > This is actually non-trivial to fix, IMHO. > > > > For example, the following looks kinda right: > > > > for (auto p : MBB.predecessors()) { > > if (p is not visited) > > continue; > > auto OL = OutLocs.find(p); > > // Join is null in case of empty OutLocs from any of the pred. > > if (OL == OutLocs.end()) > > return false; > > > > (and will work in a lot of cases) > > > > This can be wrong, however, if you have more than a single backedge in the program. > > (it may be possible to have a block that has multiple uninited predecessors that will require multiple iterations to resolve) > > > > The problem here is that to get the maximal answer in an intersection problem, you *must* initialize the sets to contain every value. You can look at dataflow algorithm papers for formal proofs if you want. > > > > So you *actually* have to treat the default state of every unvisited block as the equivalent of "containing every value" to get the maximal answer. > > > > How to achieve this for real is ... not obvious at a glance. > > > > I had to construct an example to see the problem, but my example is bit contrived (it depends on starting in the middle of the graph) so I wonder if there is a better counterexample. > > BB0:y BB1:x,y > | | > BB2:⊥ BB3:⊥ BB4:⊥ > | / | / > BB5:x | / > \ | / > BB6: ? > | > +-> back-edge to BB1, BB4 > > BB5 has a definition for x but neither kills nor defines y. No block ever kills y. Let's assume we for whatever reason started in BB5. The first join() at BB6 will yield the set [x] (BB3 and BB4 are unvisited and thus skipped, just as if they contained every value). This causes BB2 and BB4 to be added to the working set. If we now visit BB4 first, it will see only [x] and it will be impossible for y be propagated to BB6. > > Is there a better example? > > So staring a bit at it, i've convinced myself the fix i listed may actually be correct *as long* as we guarantee the proper iteration order. >Cool! That explains why I struggled to come up with an example that did not start processing in the middle of the graph :-)> The only reason i know about this at all is because it happened many many moons ago in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=27755 :) > > For a while, we actually computed the maximal set and used it. > > Now, we use implicit ones, which for an intersection problem, *should be* the equivalent of skipping the predecessor (if the predecessor is maximal, anding it with 1's will simply never remove anything). > > However, this *only* works if you can guarantee you have visited at least one predecessor/successor (for forward/backwards problems) by the time you get to a given block > otherwise, an implicit in/out set will be empty, instead of all ones. > > This is easy to prove. If you skip all of the predecessors or successors, and the set was not actually initialized to all 1's, it will contain absolutely nothing, and this nothing will propagate when it should be 1's :) > If you have at least one predecessor/successor with some value, you can use that, and you are guaranteed a correct answer.Exactly.> The code i ended up writing for gcc looks STH like this (note, this was for ANTIC, which is a backwards problem, so it's intersection of successors instead of intersection of preds): > > SmallVector worklist: > > BasicBlock first = nullptr; > for each successor s { > if (!first && visited.count(s)) > first = s; > else if (visisted.count(s)) > worklist.push_back(s); > } > assert (first != nullptr && "We should have visited at least one succ") > > for each thing on worklist > intersect with first > > > If you use preds instead of succs, it should work here, as long as we continue to use RPO. > > I *think* that code i wrote is equivalent to adding the continue where i have it, and an assert that at least one block was visited during the loopThat makes sense: either there are no predecessors or at least one of them must be in the visited set. thanks! -- adrian