thr3ads.net - llvm dev - [LLVMdev] Some thinking for callgraph builder [May 2006]

If this information is useful, please help other people find it:
Share via:

Nai Xia

2006-May-24 16:04 UTC

[LLVMdev] Some thinking for callgraph builder

Hi,

In response to Chris's suggestion to construct a more precise callgraph 
builder.
I express my personal thinking here on possible refinement. Any advice is 
welcome.

1. Basic data structure changes. Current callgraph builder maintains two null 
nodes, expressing the meaning of  "anything calls !internal linkage 
functions" and "functions not defined in this unit call anything"
respectively. However, although we know some functions call "anything"
and
others called by "anything", they just don't have explicit
callgraph edges
between them.
So I just want to:
    Merge these two null nodes to one representing "external unknown 
functions". It has edges to everything *might* be visible outside this
unit,
and has edges to itself from everything might call external functions having 
not appeared in this unit.  External functions already declared in this unit 
have their own callnodes and should have explicit edges to/from others. It 
gets its improvement if we have some "pre-knowledge" of external
function
already declared in this unit. For example, we are sure that standard libc 
function "strcmp" never calls *anything*(no matter what linkage it
might be)
in our own unit. 
 
2. Improvements regarding indirect calls and functions whose address being 
taken. Existing callgraph builder considers them rather conservatively.
I think the possible refinements are:
   a). Further classify the call edges to: MUST, ALIAS, MAY edges.
      MUST: for edges found though direct callsites or if we can prove that a 
function pointer called by some indirect call must alias some function 
(Actually I think this should be promoted to be direct callsite by 
some optimization). We have strong evidence that if A --> B, there must be 
some(if not all) run traces of the program that do make A call B. 
      ALIAS: for edges comes from indirect calls that may alias a set of 
functions. Furthur information could be maintained for which alias set a 
ALIAS edge belongs to. It means we have strong evidence that there must be 
some run traces of the program that make a caller call at least one of the 
functions belonging to an alias set, but we are not sure about which subset 
of it must be get called due to the convertive nature of any alias analysis.
      May: If there is a MAY edge from A to B, it only means we do not have 
strong evidence that A cannot call B. For example, if B's address is passed 
as a parameter to external A.
      Besides these three calling behaviors, if node A does not have an edge 
to B, We *MUST* have strong evidence that A cannot call B. Otherwise at least 
a MAY edge must exist from A to B.     



  b). Get more strong evidence about indirect calls and for functions whose 
address is being taken more than direct calls.
     Indirect call information could be constructed either from partial result 
of DSA as suggested by Andrew. And alternatively, it can use the alias 
information constructed by any AA. I think both interface may be 
provided to let clients chose between the generality and the tight 
integration with wonderful DSA.  
     As for the case of function address being taken. I think further analysis 
could help to find the evidence if the address is only propagated to be 
used within this unit and thus should not be visable to external node.  

  c) Light weight alias analysis for only function pointers may be good for 
clients whose interest is only function pointers alias or the construction of 
callgraph but not general point-to analysis. Because just as Milanova, et al 
pointed out in "Precise and Efficient Call Graph Construction for C
programs
with function pointers", "even inexpensive pointer analyses may
provide
precision comparable to the precision of expensive pointer anlayses, and 
therefore the use of more expensive analyses may be unnecessary." It could 
improve the performance and space used.


 
-- 
Regards,
Nai

Andrew Lenharth

2006-May-30 14:26 UTC

head link

[LLVMdev] Some thinking for callgraph builder

On Wed, 2006-05-24 at 11:04, Nai Xia wrote:>      Indirect call information could be constructed either from partial
result
> of DSA as suggested by Andrew. And alternatively, it can use the alias 
> information constructed by any AA. I think both interface may be 
> provided to let clients chose between the generality and the tight 
> integration with wonderful DSA.  
I committed a utility pass I use in a few out of tree projects.  It
gives very simple access to the function lists computed by DSA for
indirect call sites.  I've used it in several places where I needed
indirect call information.  See CallTarget.cpp in the DSA directory.

Andrew

Nai Xia

2006-May-31 03:00 UTC

head link

[LLVMdev] Some thinking for callgraph builder

On Tuesday 30 May 2006 22:26, Andrew Lenharth wrote:> On Wed, 2006-05-24 at 11:04, Nai Xia wrote:
> >      Indirect call information could be constructed either from
partial result
> > of DSA as suggested by Andrew. And alternatively, it can use the alias
> > information constructed by any AA. I think both interface may be 
> > provided to let clients chose between the generality and the tight 
> > integration with wonderful DSA.  
> 
> I committed a utility pass I use in a few out of tree projects.  It
> gives very simple access to the function lists computed by DSA for
> indirect call sites.  I've used it in several places where I needed
> indirect call information.  See CallTarget.cpp in the DSA directory.
I've checked it out. Thanks a lot.
> 
> Andrew
> 
> 
-- 
Regards,
Nai

Seemingly Similar Threads

Search for more reasonably related threads

llvm dev - May 2006 - [LLVMdev] Some thinking for callgraph builder

[LLVMdev] Some thinking for callgraph builder

[LLVMdev] Some thinking for callgraph builder

[LLVMdev] Some thinking for callgraph builder

Seemingly Similar Threads