thr3ads.net - llvm dev - [LLVMdev] Post-dominance analysis for multiple-exit functions [Aug 2009]

If this information is useful, please help other people find it:
Share via:

Jonathan Ragan-Kelley

2009-Aug-24 23:58 UTC

[LLVMdev] Post-dominance analysis for multiple-exit functions

Many published analyses which build on post-dominance assume a
canonical single-dominator-tree form induced by unifying all exits
(and often adding a virtual edge from START to END). In contrast, it
seems that the current LLVM post-dominator analysis only operates in a
mode in which it generates a forest of post-dominator trees, with one
rooted at each exit node. The problem this can cause is that, by
nature, these trees omit some nodes which are not post-dominated by
any one exit. This invalidates analyses which build on a traversal of
the post-dominator tree.

(In my case, I'm implementing Pingali & Bilardi's optimal control
dependence algorithms: http://doi.acm.org/10.1145/256167.256217.)

Consider, for example, a simple function with a single branch and two
exits:

    define i32 @f(i32 %X) {
    entry:
    	%0 = icmp eq i32 %X, 0
    	br i1 %0, label %exit1, label %exit2
    exit1:
    	ret i32 1
    exit2:
    	ret i32 0
    }

The existing PostDominatorTree analysis will produce a forest of two
trees, each containing one node (exit1 and exit2, respectively).
Critically, entry will not be present in either of these trees,
because it is not post-dominated by either exit1 or exit2.

Traditional analyses seem to assume that a shared virtual END node is
added after all exit nodes, which would create a single tree like:

        [   END   ]
        /    |    \
    entry  exit1  exit2

containing all nodes in the original function.

Clearly it is possible to do this *destructively*, using the existing
return unification transform, but for many uses it would seem useful
to allow this as a non-destructive mode of operation of the post-
dominator tree analysis.

Am I missing something? Has this just not been an issue for previous
analyses? Is there guidance on what the right thing to do should be to
facilitate analyses which require complete post-dominance information?

I was hoping to submit a control dependence analysis for consideration
in the trunk at some point, since it seems to come up semi-frequently
on the list as something people want to use, but if it requires
significantly extending the existing post-dominator analysis, I would
hope to be able to do so both with a minimum of hackery, and in an
officially sanctioned style.

Thanks.

Dan Gohman

2009-Aug-25 19:21 UTC

head link

[LLVMdev] Post-dominance analysis for multiple-exit functions

On Aug 24, 2009, at 4:58 PM, Jonathan Ragan-Kelley wrote:

> Many published analyses which build on post-dominance assume a
> canonical single-dominator-tree form induced by unifying all exits
> (and often adding a virtual edge from START to END). In contrast, it
> seems that the current LLVM post-dominator analysis only operates in a
> mode in which it generates a forest of post-dominator trees, with one
> rooted at each exit node. The problem this can cause is that, by
> nature, these trees omit some nodes which are not post-dominated by
> any one exit. This invalidates analyses which build on a traversal of
> the post-dominator tree.
>
> (In my case, I'm implementing Pingali & Bilardi's optimal
control
> dependence algorithms: http://doi.acm.org/10.1145/256167.256217.)
>
> Consider, for example, a simple function with a single branch and two
> exits:
>
>    define i32 @f(i32 %X) {
>    entry:
>    	%0 = icmp eq i32 %X, 0
>    	br i1 %0, label %exit1, label %exit2
>    exit1:
>    	ret i32 1
>    exit2:
>    	ret i32 0
>    }
>
> The existing PostDominatorTree analysis will produce a forest of two
> trees, each containing one node (exit1 and exit2, respectively).
> Critically, entry will not be present in either of these trees,
> because it is not post-dominated by either exit1 or exit2.
>
> Traditional analyses seem to assume that a shared virtual END node is
> added after all exit nodes, which would create a single tree like:
>
>        [   END   ]
>        /    |    \
>    entry  exit1  exit2
>
> containing all nodes in the original function.
>
> Clearly it is possible to do this *destructively*, using the existing
> return unification transform, but for many uses it would seem useful
> to allow this as a non-destructive mode of operation of the post-
> dominator tree analysis.
>
> Am I missing something? Has this just not been an issue for previous
> analyses? Is there guidance on what the right thing to do should be to
> facilitate analyses which require complete post-dominance information?
Running opt -disable-output -analyze -postdomtree on your example gives

Inorder PostDominator Tree:
   [1]  <<exit node>> {4294967295,4294967295}
     [2] %exit1 {4294967295,4294967295}
     [2] %entry {4294967295,4294967295}
     [2] %exit2 {4294967295,4294967295}

The PostDominatorTree pass uses an artificial "exit node" to
represent the post-dominator parent of all the exits.  Is this
not what you're looking for?
>
> I was hoping to submit a control dependence analysis for consideration
> in the trunk at some point, since it seems to come up semi-frequently
> on the list as something people want to use, but if it requires
> significantly extending the existing post-dominator analysis, I would
> hope to be able to do so both with a minimum of hackery, and in an
> officially sanctioned style.
Sounds great,

Dan

Jonathan Ragan-Kelley

2009-Aug-26 00:35 UTC

head link

[LLVMdev] Post-dominance analysis for multiple-exit functions

> The PostDominatorTree pass uses an artificial "exit node" to
> represent the post-dominator parent of all the exits.  Is this
> not what you're looking for?
You are correct.

This falls under the rubric of "am I missing something?" -- I was  
missing this:

   /// getRootNode - This returns the entry node for the CFG of the  
function.  If
   /// this tree represents the post-dominance relations for a  
function, however,
   /// this root may be a node with the block == NULL.  This is the  
case when
   /// there are multiple exit nodes from a particular function.   
Consumers of
   /// post-dominance information must be capable of dealing with this
   /// possibility.

and improperly assuming that I had to use the getRoots() interface for  
sane results from a post-dominator tree, because I was caught up with:

   /// getRoots -  Return the root blocks of the current CFG.  This  
may include
   /// multiple blocks if we are computing post dominators.  For forward
   /// dominators, this will always be a single block (the entry node).

The world is sane once again (and my control dependence analysis  
consistently works without requiring a prior MergeReturns pass).  
Thanks for indulging my oversight.

Possibly Parallel Threads

Search for more possibly parallel threads

llvm dev - Aug 2009 - [LLVMdev] Post-dominance analysis for multiple-exit functions

[LLVMdev] Post-dominance analysis for multiple-exit functions

[LLVMdev] Post-dominance analysis for multiple-exit functions

[LLVMdev] Post-dominance analysis for multiple-exit functions

Possibly Parallel Threads