Jonathan Ragan-Kelley
2009-Aug-24 23:58 UTC
[LLVMdev] Post-dominance analysis for multiple-exit functions
Many published analyses which build on post-dominance assume a canonical single-dominator-tree form induced by unifying all exits (and often adding a virtual edge from START to END). In contrast, it seems that the current LLVM post-dominator analysis only operates in a mode in which it generates a forest of post-dominator trees, with one rooted at each exit node. The problem this can cause is that, by nature, these trees omit some nodes which are not post-dominated by any one exit. This invalidates analyses which build on a traversal of the post-dominator tree. (In my case, I'm implementing Pingali & Bilardi's optimal control dependence algorithms: http://doi.acm.org/10.1145/256167.256217.) Consider, for example, a simple function with a single branch and two exits: define i32 @f(i32 %X) { entry: %0 = icmp eq i32 %X, 0 br i1 %0, label %exit1, label %exit2 exit1: ret i32 1 exit2: ret i32 0 } The existing PostDominatorTree analysis will produce a forest of two trees, each containing one node (exit1 and exit2, respectively). Critically, entry will not be present in either of these trees, because it is not post-dominated by either exit1 or exit2. Traditional analyses seem to assume that a shared virtual END node is added after all exit nodes, which would create a single tree like: [ END ] / | \ entry exit1 exit2 containing all nodes in the original function. Clearly it is possible to do this *destructively*, using the existing return unification transform, but for many uses it would seem useful to allow this as a non-destructive mode of operation of the post- dominator tree analysis. Am I missing something? Has this just not been an issue for previous analyses? Is there guidance on what the right thing to do should be to facilitate analyses which require complete post-dominance information? I was hoping to submit a control dependence analysis for consideration in the trunk at some point, since it seems to come up semi-frequently on the list as something people want to use, but if it requires significantly extending the existing post-dominator analysis, I would hope to be able to do so both with a minimum of hackery, and in an officially sanctioned style. Thanks.
Dan Gohman
2009-Aug-25 19:21 UTC
[LLVMdev] Post-dominance analysis for multiple-exit functions
On Aug 24, 2009, at 4:58 PM, Jonathan Ragan-Kelley wrote:> Many published analyses which build on post-dominance assume a > canonical single-dominator-tree form induced by unifying all exits > (and often adding a virtual edge from START to END). In contrast, it > seems that the current LLVM post-dominator analysis only operates in a > mode in which it generates a forest of post-dominator trees, with one > rooted at each exit node. The problem this can cause is that, by > nature, these trees omit some nodes which are not post-dominated by > any one exit. This invalidates analyses which build on a traversal of > the post-dominator tree. > > (In my case, I'm implementing Pingali & Bilardi's optimal control > dependence algorithms: http://doi.acm.org/10.1145/256167.256217.) > > Consider, for example, a simple function with a single branch and two > exits: > > define i32 @f(i32 %X) { > entry: > %0 = icmp eq i32 %X, 0 > br i1 %0, label %exit1, label %exit2 > exit1: > ret i32 1 > exit2: > ret i32 0 > } > > The existing PostDominatorTree analysis will produce a forest of two > trees, each containing one node (exit1 and exit2, respectively). > Critically, entry will not be present in either of these trees, > because it is not post-dominated by either exit1 or exit2. > > Traditional analyses seem to assume that a shared virtual END node is > added after all exit nodes, which would create a single tree like: > > [ END ] > / | \ > entry exit1 exit2 > > containing all nodes in the original function. > > Clearly it is possible to do this *destructively*, using the existing > return unification transform, but for many uses it would seem useful > to allow this as a non-destructive mode of operation of the post- > dominator tree analysis. > > Am I missing something? Has this just not been an issue for previous > analyses? Is there guidance on what the right thing to do should be to > facilitate analyses which require complete post-dominance information?Running opt -disable-output -analyze -postdomtree on your example gives Inorder PostDominator Tree: [1] <<exit node>> {4294967295,4294967295} [2] %exit1 {4294967295,4294967295} [2] %entry {4294967295,4294967295} [2] %exit2 {4294967295,4294967295} The PostDominatorTree pass uses an artificial "exit node" to represent the post-dominator parent of all the exits. Is this not what you're looking for?> > I was hoping to submit a control dependence analysis for consideration > in the trunk at some point, since it seems to come up semi-frequently > on the list as something people want to use, but if it requires > significantly extending the existing post-dominator analysis, I would > hope to be able to do so both with a minimum of hackery, and in an > officially sanctioned style.Sounds great, Dan
Jonathan Ragan-Kelley
2009-Aug-26 00:35 UTC
[LLVMdev] Post-dominance analysis for multiple-exit functions
> The PostDominatorTree pass uses an artificial "exit node" to > represent the post-dominator parent of all the exits. Is this > not what you're looking for?You are correct. This falls under the rubric of "am I missing something?" -- I was missing this: /// getRootNode - This returns the entry node for the CFG of the function. If /// this tree represents the post-dominance relations for a function, however, /// this root may be a node with the block == NULL. This is the case when /// there are multiple exit nodes from a particular function. Consumers of /// post-dominance information must be capable of dealing with this /// possibility. and improperly assuming that I had to use the getRoots() interface for sane results from a post-dominator tree, because I was caught up with: /// getRoots - Return the root blocks of the current CFG. This may include /// multiple blocks if we are computing post dominators. For forward /// dominators, this will always be a single block (the entry node). The world is sane once again (and my control dependence analysis consistently works without requiring a prior MergeReturns pass). Thanks for indulging my oversight.