thr3ads.net - llvm dev - [llvm-dev] Intended behavior of CGSCC pass manager. [Jun 2016]

If this information is useful, please help other people find it:
Share via:

Sean Silva via llvm-dev

2016-Jun-16 11:48 UTC

[llvm-dev] Intended behavior of CGSCC pass manager.

On Tue, Jun 14, 2016 at 11:29 PM, Xinliang David Li <davidxl at
google.com>
wrote:
>
>
> On Thu, Jun 9, 2016 at 3:26 AM, Sean Silva <chisophugis at gmail.com>
wrote:
>
>>
>>
>> On Wed, Jun 8, 2016 at 8:14 PM, Xinliang David Li <davidxl at
google.com>
>> wrote:
>>
>>>
>>>
>>> On Wed, Jun 8, 2016 at 4:38 PM, Sean Silva <chisophugis at
gmail.com>
>>> wrote:
>>>
>>>>
>>>>
>>>> On Wed, Jun 8, 2016 at 12:31 PM, Xinliang David Li <davidxl
at google.com>
>>>> wrote:
>>>>
>>>>>
>>>>>
>>>>> On Wed, Jun 8, 2016 at 4:19 AM, Sean Silva <chisophugis
at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi Chandler, Philip, Mehdi, (and llvm-dev,)
>>>>>>
>>>>>> (this is partially a summary of some discussions that
happened at the
>>>>>> last LLVM bay area social, and partially a discussion
about the direction
>>>>>> of the CGSCC pass manager)
>>>>>>
>>>>>>
>>>>>> A the last LLVM social we discussed the progress on the
CGSCC pass
>>>>>> manager. It seems like Chandler has a CGSCC pass
manager working, but it is
>>>>>> still unresolved exactly which semantics we want (more
about this below)
>>>>>> that are reasonably implementable.
>>>>>>
>>>>>> AFAICT, there has been no public discussion about what
exact
>>>>>> semantics we ultimately want to have. We should figure
that out.
>>>>>>
>>>>>> The main difficulty which Chandler described is the
apparently quite
>>>>>> complex logic surrounding needing to run function
passes nested within an
>>>>>> SCC pass manager, while providing some guarantees about
exactly what order
>>>>>> the function passes are run. The existing CGSCC pass
manager just punts on
>>>>>> some of the problems that arise (look in
CGPassManager::runOnModule,
>>>>>> CGPassManager::RunAllPassesOnSCC, and
CGPassManager::RunPassOnSCC in
>>>>>> llvm/lib/Analysis/CallGraphSCCPass.cpp), and these are
the problems that
>>>>>> Chandler has been trying to solve.
>>>>>>
>>>>>> (
>>>>>> Why is this "function passes inside CGSCC
passes" stuff interesting?
>>>>>> Because LLVM can do inlining on an SCC (often just a
single function) and
>>>>>> then run function passes to simplify the function(s) in
the SCC before it
>>>>>> tries to inline into a parent SCC. (the SCC visitation
order is post-order)
>>>>>> For example, we may inline a bunch of code, but after
inlining we can
>>>>>> tremendously simplify the function, and we want to do
so before considering
>>>>>> this function for inlining into its callers so that we
get an accurate
>>>>>> evaluation of the inline cost.
>>>>>> Based on what Chandler said, it seems that LLVM is
fairly unique in
>>>>>> this regard and other compilers don't do this
(which is why we can't just
>>>>>> look at how other compilers solve this problem; they
don't have this
>>>>>> problem (maybe they should? or maybe we
shouldn't?)). For example, he
>>>>>> described that GCC uses different inlining
"phases"; e.g. it does early
>>>>>> inlining on the entire module, then does
simplifications on the entire
>>>>>> module, then does late inlining on the entire module;
so it is not able to
>>>>>> incrementally simplify as it inlines like LLVM does.
>>>>>> )
>>>>>>
>>>>>> As background for what is below, the LazyCallGraph
tracks two graphs:
>>>>>> the "call graph" and the "ref
graph".
>>>>>> Conceptually, the call graph is the graph of direct
calls, where
>>>>>> indirect calls and calls to external functions do not
appear (or are
>>>>>> connected to dummy nodes). The ref graph is basically
the graph of all
>>>>>> functions transitively accessible based on the
globals/constants/etc.
>>>>>> referenced by a function (e.g. if a function `foo`
references a vtable that
>>>>>> is defined in the module, there is an edge in the ref
graph from `foo` to
>>>>>> every function in the vtable).
>>>>>> The call graph is a strict subset of the ref graph.
>>>>>>
>>>>>> Chandler described that he had a major breakthrough in
that the CGSCC
>>>>>> pass manager only had to deal with 3 classes of
modifications that can
>>>>>> occur:
>>>>>> - a pass may e.g. propagate a load of a function
pointer into an
>>>>>> indirect call, turning it into an direct call. This
requires adding an edge
>>>>>> in the CG but not in the ref graph.
>>>>>> - a pass may take a direct call and turn it into an
indirect call.
>>>>>> This requires removing an edge from the CG, but not in
the ref graph.
>>>>>> - a pass may delete a direct call. This removes an edge
in the CG and
>>>>>> also in the ref graph.
>>>>>>
>>>>>> From the perspective of the CGSCC pass manager, these
operations can
>>>>>> affect the SCC structure. Adding an edge might merge
SCC's and deleting an
>>>>>> edge might split SCC's. Chandler mentioned that
apparently the issues of
>>>>>> splitting and merging SCC's within the current
infrastructure are actually
>>>>>> quite challenging and lead to e.g. iterator
invalidation issues, and that
>>>>>> is what he is working on.
>>>>>>
>>>>>> (
>>>>>> The ref graph is important to guide the overall SCC
visitation order
>>>>>> because it basically represents "the largest graph
that the CG may turn
>>>>>> into due to our static analysis of this module".
I.e. no transformation we
>>>>>> can statically make in the CGSCC passes can ever cause
us to need to merge
>>>>>> SCC's in the ref graph.
>>>>>> )
>>>>>>
>>>>>>
>>>>>>
>>>>>> I have a couple overall questions/concerns:
>>>>>>
>>>>>>
>>>>>> 1. The ref graph can easily go quadratic. E.g.
>>>>>>
>>>>>> typedef void (*fp)();
>>>>>> fp funcs[] = {
>>>>>>   &foo1,
>>>>>>   &foo2,
>>>>>>   ...
>>>>>>   &fooN
>>>>>> }
>>>>>> void foo1() { funcs[something](); }
>>>>>> void foo2() { funcs[something](); }
>>>>>> ...
>>>>>> void fooN() { funcs[something](); }
>>>>>>
>>>>>> One real-world case where this might come about is in
the presence of
>>>>>> vtables.
>>>>>>
>>>>>> The existing CGSCC pass manager does not have this
issue AFAIK
>>>>>> because it does not consider the ref graph.
>>>>>>
>>>>>> Does anybody have any info/experience about how densely
connected the
>>>>>> ref graph can get in programs that might reasonably be
fed to the compiler?
>>>>>> I just did a quick sanity check with LLD/ELF using
>>>>>> `--lto-newpm-passes=cgscc(no-op-cgscc)` and it at least
seemed to terminate
>>>>>> / not run out of memory. Based on some rough
calculations looking at the
>>>>>> profile, it seem like the entire run of the inliner in
the old LTO pipeline
>>>>>> (which is about 5% of total LTO time on this particular
example I looked
>>>>>> at) is only 2-3x as expensive as just
>>>>>> `--lto-newpm-passes=cgscc(no-op-cgscc)`, so the
LazyCallGraph construction
>>>>>> is certainly not free.
>>>>>>
>>>>>
>>>>>
>>>>> Conceptually, reference graph should also include  variable
nodes.
>>>>> With variable nodes introduced, the quadratic behavior
mentioned can be
>>>>> avoided.
>>>>>
>>>>
>>>> Yeah, this is what I was trying to get at with the statement
"In the
>>>> current LazyCallGraph, this would require adding some sort of
notion of
>>>> hyperedge."
>>>> But you are right: from an implementation perspective of a call
graph
>>>> data structure that is trying to model the "ref
graph" of LazyCallGraph, it
>>>> is cleaner to have nodes that are not functions than to
introduce a notion
>>>> of hyperedge.
>>>>
>>>>
>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>>
>>>>>>
>>>>>> 2. What is the intended behavior of CGSCC passes when
SCC's are split
>>>>>> or merged? E.g. a CGSCC pass runs on an SCC (e.g. the
inliner). Now we run
>>>>>> some function passes nested inside the CGSCC pass
manager (e.g. to simplify
>>>>>> things after inlining). Consider:
>>>>>>
>>>>>> a) These function passes are e.g. now able to
devirtualize a call,
>>>>>> adding an edge to the CG, forming a larger CG SCC. Do
you re-run the CGSCC
>>>>>> pass (say, the inliner) on this larger SCC?
>>>>>>
>>>>>> b) These function passes are e.g. able to DCE a call,
removing an
>>>>>> edge from the CG. This converts, say, a CG SCC which is
a cycle graph (like
>>>>>> a->b->c->a) into a path graph (a->b->c,
with no edge back to a). The
>>>>>> inliner had already visited a, b, and c as a single
SCC. Now does it have
>>>>>> to re-visit c, then b, then a, as single-node
SCC's?
>>>>>>
>>>>>>
>>>>>> btw:
>>>>>>
>>>>>> One way that I have found it useful to think about this
is in terms
>>>>>> of the visitation during Tarjan's SCC algorithm.
I'll reference the
>>>>>> pseudocode in
>>>>>>
https://en.wikipedia.org/wiki/Tarjan%27s_strongly_connected_components_algorithm.
>>>>>> Inside the "strongconnect" routine when we
have identified an SCC (the true
>>>>>> branch of `if (v.lowlink = v.index)` test ) we can
visit
>>>>>> stack[v.index:stack.size()] as an SCC. This may or may
not invalidate some
>>>>>> things on the stack (the variable `S` in the
pseudocode) and we may need to
>>>>>> fix it up (e.g. inlining deleted a function, so we
can't have an entry on
>>>>>> the stack). Then, we can run function passes as we pop
individual functions
>>>>>> off the stack, but it is easier to think about IMO than
merging of SCC data
>>>>>> structures: if we add edges to the CG then we have to
do more DFS on the
>>>>>> new edges and if we delete edges then the DFS order of
the stack gives us
>>>>>> certain guarantees.
>>>>>> Personally I find this much easier to reason about than
the
>>>>>> description in terms of splitting and merging SCC's
in the CG and ref graph
>>>>>> (which the LazyCallGraph API makes one to think about
since it hides the
>>>>>> underlying Tarjan's algorithm).
>>>>>> The LazyCallGraph API makes the current loop in
>>>>>>
http://reviews.llvm.org/diffusion/L/browse/llvm/trunk/include/llvm/Analysis/CGSCCPassManager.h;272124$100
>>>>>> very clean, but at least for my thinking about the
problem, it seems like
>>>>>> the wrong abstraction (and most of the LazyCallGraph
API seems to be
>>>>>> unused, so it seems like it may be overly heavyweight).
>>>>>> E.g. I think that maybe the easiest thing to do is to
turn the
>>>>>> current approach inside out: instead of having the pass
manager logic be
>>>>>> the "normal code" and forcing the Tarjan
algorithm to become a state
>>>>>> machine of iterators, use an open-coded Tarjan
algorithm with some
>>>>>> callbacks and make the pass management logic be the
state machine.
>>>>>> This will also open the door to avoiding the
potentially quadratic
>>>>>> size of the ref graph, since e.g. in the example I gave
above, we can mark
>>>>>> the `funcs` array itself as already having been visited
during the walk. In
>>>>>> the current LazyCallGraph, this would require adding
some sort of notion of
>>>>>> hyperedge.
>>>>>>
>>>>>> Since this is such a high priority (due to blocking PGO
inlining), I
>>>>>> will probably try my hand at implementing the CGSCC
pass manager sometime
>>>>>> soon unless somebody beats me to it. (I'll probably
try the "open-coded SCC
>>>>>> visit" approach).
>>>>>>
>>>>>> Another possibility is implementing the new CGSCC pass
manager that
>>>>>> uses the same visitation semantics as the one in the
old PM, and then we
>>>>>> can refactor that as needed. In fact, that may be the
best approach so that
>>>>>> porting to the new PM is as NFC as possible and we can
isolate the
>>>>>> functional (i.e., need benchmarks, measurements ...)
changes in separate
>>>>>> commits.
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> A very high level comment: why do we need to update
callgraph on the
>>>>> fly ? Can we have a more general support of iterative SCC
pass invocation?
>>>>>
>>>>> something like:
>>>>>
>>>>> 1) build the callgraph
>>>>> 2) cache the post-order traversal order
>>>>>
>>>>> 3) if the order list is empty -- done
>>>>> 4) traversal: invoke function passes for each function on
the order
>>>>> (step 2 or 5). The call graph gets updated on the fly (with
new edges, or
>>>>> new nodes for cloned functions)
>>>>> 5) update the function traversal order from new nodes and
new edges
>>>>> created in 4)
>>>>> 6) go to step 3).
>>>>>
>>>>
>>>> (sorry for the delayed reply... this is a very poignant
question /
>>>> example)
>>>>
>>>> From the discussion with Chandler, I think he wants to provide
more
>>>> guarantees to function passes about the visitation order. He
will need to
>>>> explain his exact concerns. But IIRC the essence of one of the
issues is
>>>> captured in the example I gave in 2.b) in the OP:
>>>>
>>>> "These function passes are e.g. able to DCE a call,
removing an edge
>>>> from the CG. This converts, say, a CG SCC which is a cycle
graph (like
>>>> a->b->c->a) into a path graph (a->b->c, with no
edge back to a)."
>>>>
>>>> The issue that I remember Chandler brought up is that before
deleting
>>>> an edge from an SCC, the nodes in the SCC are
"unordered" w.r.t. each
>>>> other; but after deleting an edge from the CG which splits the
SCC, there
>>>> is an order.
>>>> In other words, that the cached post-order may no longer be a
>>>> post-order traversal of the new graph after a function pass
runs. For
>>>> example, in the cached post-order traversal we may have entered
the SCC
>>>> `a->b->c->a` via an edge `x->b` and so our cached
post-order traversal
>>>> visits the functions in the order `a, then c, then b`. If a
function pass
>>>> visits `c` and deletes the call `c->a`, then "a, then
c, then b" is not a
>>>> valid post-order for visiting the functions.
>>>>
>>>> Looking at the explanation I gave, I think there isn't
really a problem
>>>> here. The cached post-order can only be invalidated for
functions that have
>>>> already been visited.
>>>>
>>>
>>> If we merge CGSCC based transformation passes with an IPA analysis
pass
>>> that requires strict/valid bottom up order, then updating the SCCs
on the
>>> fly might be needed (or not if the result will still be
conservatively
>>> correct) -- but  interleaving IPA analyses with IPA transformations
like
>>> this is not the right thing to do also for other reasons.
>>>
>>> Assuming CGSCC pass just does a predefined set of transformations
at
>>> function level with some order, there is no issue of
'invalidation' -- the
>>> cached order will always be valid. In 2.b),  why bother to revisit
the
>>> nodes?
>>>
>>
>> Indeed. That is the type of doubt for why I was asking this question in
>> the OP.
>>
>>>
>>>
>>>> Speaking more broadly about the algorithm you just described,
did you
>>>> intend to omit an SCC visitation step?
>>>>
>>>
>>> It is independent of this decision.  However we can actually go
back one
>>> step and ask the question: is adding this layer really necessary?
Why not
>>> just traversing the cgraph nodes in reverse topo-order (after
removing the
>>> cycles)?
>>>
>>>
>>>> The goal of the CGSCC pass manager is the ability to visit an
SCC (e.g.
>>>> inliner visits an SCC), then immediately run function passes to
simplify
>>>> the result of inlining.
>>>>
>>>
>>> This is how current implementation is. Is it a fundamental
requirement?
>>>
>>
>> I think this is a good question. Hopefully Chandler will be able to
reply
>> to this thread. From what I can tell, the exact requirements have never
>> really been discussed.
>>
>>
>>>
>>>
>>>
>>>> Only after the simplification has occurred do we visit the
parent SCC.
>>>> By running the simplifications in lock-step with the inliner
SCC visitation
>>>> we give parent SCC's a more accurate view of the real cost
of inlining a
>>>> function from a child SCC.
>>>>
>>>
>>> This works for all bottom up scheme regardless of a SCC layer is
added
>>> or not.
>>>
>>>
>>>>
>>>> Issues similar to 2.a) (i.e. adding edges to the CG) also
affect the
>>>> visitation order of SCC's (not just functions). For
example, we visit an
>>>> SCC with a CGSCC pass (e.g. inliner). Then we run the first
function pass
>>>> on that SCC, which may add an edge (e.g. promote indirect call
to direct
>>>> call) that may enlarge the SCC. Do we continue running the
remaining
>>>> function passes? Do we re-visit this new enlarged SCC? (if so,
when?) These
>>>> are the types of questions that motivated this post (hence the
name
>>>> "Intended behavior of CGSCC pass manager.").
>>>>
>>>> yes, I understand the motivation -- that is way I propose the
agorithm
>>> that uses cached iteration order + worklist based iterative
approach. In
>>> your example, when a new edge is exposed via devirtualization, only
its
>>> caller/ancestor nodes need to be revisited in the next iteration.
>>>
>>
>> Your proposal certainly sounds simpler and easier to reason about.
(e.g.
>> it naturally avoids any issues with deletion)
>>
>> Since there are only 3 non-inliner CGSCC passes (ArgPromotion,
>> PostOrderFunctionAttrsLegacyPass,  PruneEH) it may be feasible to
remove
>> the entire abstraction from LLVM and replace it with a cached
post-order
>> function pass visitation like you suggest. The inliner doesn't seem
to do
>> anything special with the knowledge that it is visiting an SCC (besides
>> moving call sites that call within the SCC to the end of its worklist)
and
>> so this may be fine.
>>
>>
> A) Using cached callgraph walk (nodes or SCC) to avoid cgraph invalidation
> and B) whether SCC based walk has more advantages are two different things
> to answer.
>
> For the later, I now tend to believe that introducing additional SCC layer
> has more advantages than alternative that uses node based bottom-up walk.
> 1) Bottom-up based IPA algorithms can be implemented more cleanly by using
> SCC. For instance the pruneEH or any bottom-up attribute propagations.
> Without using SCC, we will need to defined a generic (using template)
> worklist based bottom-up propagation engine.
> 2) All SCC passes can be nicely grouped together -- for compile time
> reasons or to create synergies for better performance.
>
>
> For A), to repeat myself, SCC DAG caching to avoid mutation for one round
> of CG walk has many advantages. We should probably discuss more on
> pros/cons and various scenarios we want to handle ..
>
One question is what invariants we want to provide for the visitation.

For example, should a CGSCC pass be able to assume that all "child"
SCC's
(SCC's it can reach via direct calls emanating from the SCC being visited)
have already been visited? Currently I don't think it can, and IIRC from
the discussion at the social this is one thing that Chandler is hoping to
fix. The "ref edge" notion in LazyCallGraph ensures that we cannot
create a
call edge (devirtualizing a ref edge) that will point at an SCC that has
not yet been visited.

E.g. consider this graph:

digraph G {
  A -> B; B -> A; // SCC {A,B}
  S -> T; T -> S; // SCC {S,T}
  X -> Y; Y -> X; // SCC {X,Y}

  B -> X;
  B -> S;
  T -> Y [label="Ref edge that is devirtualized\nwhen visiting SCC
{S,T}",constraint=false,style=dashed]
}

(can visualize conveniently at http://www.webgraphviz.com/ or I have put an
image at http://reviews.llvm.org/F2073104)

If we do not consider the ref graph, then it is possible for SCC {S,T} to
be visited before SCC {X,Y}. So after devirtualizing the call T->Y we can
no longer assume that the SCC's are visited in post-order (or must somehow
try to ignore that the call edge T->Y exists).

A more complicated case is when SCC {S,T} and SCC {X,Y} both call into each
other via function pointers. So eventually after devirtualizing the calls
in both directions there will be a single SCC {S,T,X,Y}.

digraph G {
  A -> B; B -> A; // SCC {A,B}
  S -> T; T -> S; // SCC {S,T}
  X -> Y; Y -> X; // SCC {X,Y}

  B -> X;
  B -> S;
  T -> Y [label="Ref edge that is devirtualized\nwhen visiting SCC
{S,T}",constraint=false,style=dashed]
  X -> S [label="Ref edge that is devirtualized\nwhen visiting SCC
{X,Y}",constraint=false,style=dashed]
}

(rendering at: http://reviews.llvm.org/F2073479)

Due to the cyclic dependence there is no SCC visitation order that can
directly provide the invariant above. Indeed, the problem of maintaining
the above invariant is ill-posed in this scenario.
Consider the pipeline `cgscc(function(...simplifications that can
devirtualize...),foo-cgscc-pass)`. A possible visitation is as follows:

1. Visit SCC {S,T} and run `function(...simplifications that can
devirtualize...)`. This reveals the call edge T->Y.
2. We continue visiting SCC {S,T} and run foo-cgscc-pass on SCC {S,T}.
3. Visit SCC {X,Y} and run `function(...simplifications that can
devirtualize...)`. This reveals the call edge X->S.
4. ??? what do we do now.
Alternative 4.a) Should we continue the visitation and call foo-cgscc-pass
on "SCC" {X,Y}?
Alternative 4.b) Should foo-cgscc-pass now run on SCC {S,T,X,Y}?
Alternative 4.c) Should we restart the entire outer `cgscc(...)` visitation
on SCC {S,T,X,Y}?

(Without a cap both 4.b and 4.c could become quadratic on a graph like
http://reviews.llvm.org/F2073607)

-- Sean Silva

>
> thanks,
>
> David
>
>
>
>
>> Sean:~/pg/llvm % git grep 'public CallGraphSCCPass'
>> include/llvm/Transforms/IPO/InlinerPass.h:struct Inliner : public
>> CallGraphSCCPass {
>> lib/Transforms/IPO/ArgumentPromotion.cpp:  struct ArgPromotion : public
>> CallGraphSCCPass {
>> lib/Transforms/IPO/FunctionAttrs.cpp:struct
>> PostOrderFunctionAttrsLegacyPass : public CallGraphSCCPass {
>> lib/Transforms/IPO/PruneEH.cpp:  struct PruneEH : public
CallGraphSCCPass
>> {
>> lib/Analysis/CallGraphSCCPass.cpp:  class PrintCallGraphPass : public
>> CallGraphSCCPass {
>> tools/opt/PassPrinters.cpp:struct CallGraphSCCPassPrinter : public
>> CallGraphSCCPass {
>>
>> CGSCC passes seem to have been added in what is now SVN r8247 (~Aug
2003)
>>
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20030825/006619.html
>> (LLVM appears to have been in CVS at the time).
>>
>> Chris, do you remember the motivation for doing the CGSCC visitation
>> instead of a pure post-order function visitation like David is
mentioning?
>> (or was it just an oversight / hindsight-20-20 thing?) Do you think it
>> would make sense to replace CGSCC visitation with post-order function
>> visitation in the current LLVM?
>>
>> -- Sean Silva
>>
>>
>>>
>>> David
>>>
>>>
>>>> -- Sean Silva
>>>>
>>>>
>>>>>
>>>>> David
>>>>>
>>>>>
>>>>>
>>>>>>
>>>>>> Sorry for the wall of text.
>>>>>>
>>>>>>
>>>>>> -- Sean Silva
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160616/d64591d5/attachment-0001.html>

Hal Finkel via llvm-dev

2016-Jun-16 17:13 UTC

head link

[llvm-dev] Intended behavior of CGSCC pass manager.

----- Original Message -----
> From: "Sean Silva via llvm-dev" <llvm-dev at
lists.llvm.org>
> To: "Xinliang David Li" <davidxl at google.com>
> Cc: "llvm-dev" <llvm-dev at lists.llvm.org>
> Sent: Thursday, June 16, 2016 6:48:30 AM
> Subject: Re: [llvm-dev] Intended behavior of CGSCC pass manager.
> On Tue, Jun 14, 2016 at 11:29 PM, Xinliang David Li <
> davidxl at google.com > wrote:
> > On Thu, Jun 9, 2016 at 3:26 AM, Sean Silva < chisophugis at
gmail.com
> > >
> > wrote:
> 
> > > On Wed, Jun 8, 2016 at 8:14 PM, Xinliang David Li <
> > > davidxl at google.com > wrote:
> > 
> 
> > > > On Wed, Jun 8, 2016 at 4:38 PM, Sean Silva <
> > > > chisophugis at gmail.com
> > > > >
> > > > wrote:
> > > 
> > 
> 
> > > > > On Wed, Jun 8, 2016 at 12:31 PM, Xinliang David Li <
> > > > > davidxl at google.com > wrote:
> > > > 
> > > 
> > 
> 
> > > > > > On Wed, Jun 8, 2016 at 4:19 AM, Sean Silva <
> > > > > > chisophugis at gmail.com
> > > > > > >
> > > > > > wrote:
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > Hi Chandler, Philip, Mehdi, (and llvm-dev,)
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > (this is partially a summary of some
discussions that
> > > > > > > happened
> > > > > > > at
> > > > > > > the
> > > > > > > last LLVM bay area social, and partially a
discussion
> > > > > > > about
> > > > > > > the
> > > > > > > direction of the CGSCC pass manager)
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > A the last LLVM social we discussed the
progress on the
> > > > > > > CGSCC
> > > > > > > pass
> > > > > > > manager. It seems like Chandler has a CGSCC
pass manager
> > > > > > > working,
> > > > > > > but it is still unresolved exactly which
semantics we
> > > > > > > want
> > > > > > > (more
> > > > > > > about this below) that are reasonably
implementable.
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > AFAICT, there has been no public discussion
about what
> > > > > > > exact
> > > > > > > semantics we ultimately want to have. We
should figure
> > > > > > > that
> > > > > > > out.
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > The main difficulty which Chandler described
is the
> > > > > > > apparently
> > > > > > > quite
> > > > > > > complex logic surrounding needing to run
function passes
> > > > > > > nested
> > > > > > > within an SCC pass manager, while providing
some
> > > > > > > guarantees
> > > > > > > about
> > > > > > > exactly what order the function passes are
run. The
> > > > > > > existing
> > > > > > > CGSCC
> > > > > > > pass manager just punts on some of the
problems that
> > > > > > > arise
> > > > > > > (look
> > > > > > > in
> > > > > > > CGPassManager::runOnModule,
> > > > > > > CGPassManager::RunAllPassesOnSCC,
> > > > > > > and
> > > > > > > CGPassManager::RunPassOnSCC in
> > > > > > > llvm/lib/Analysis/CallGraphSCCPass.cpp), and
these are
> > > > > > > the
> > > > > > > problems
> > > > > > > that Chandler has been trying to solve.
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > (
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > Why is this "function passes inside
CGSCC passes" stuff
> > > > > > > interesting?
> > > > > > > Because LLVM can do inlining on an SCC (often
just a
> > > > > > > single
> > > > > > > function) and then run function passes to
simplify the
> > > > > > > function(s)
> > > > > > > in the SCC before it tries to inline into a
parent SCC.
> > > > > > > (the
> > > > > > > SCC
> > > > > > > visitation order is post-order)
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > For example, we may inline a bunch of code,
but after
> > > > > > > inlining
> > > > > > > we
> > > > > > > can
> > > > > > > tremendously simplify the function, and we
want to do so
> > > > > > > before
> > > > > > > considering this function for inlining into
its callers
> > > > > > > so
> > > > > > > that
> > > > > > > we
> > > > > > > get an accurate evaluation of the inline
cost.
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > Based on what Chandler said, it seems that
LLVM is fairly
> > > > > > > unique
> > > > > > > in
> > > > > > > this regard and other compilers don't do
this (which is
> > > > > > > why
> > > > > > > we
> > > > > > > can't
> > > > > > > just look at how other compilers solve this
problem; they
> > > > > > > don't
> > > > > > > have
> > > > > > > this problem (maybe they should? or maybe we
> > > > > > > shouldn't?)).
> > > > > > > For
> > > > > > > example, he described that GCC uses different
inlining
> > > > > > > "phases";
> > > > > > > e.g. it does early inlining on the entire
module, then
> > > > > > > does
> > > > > > > simplifications on the entire module, then
does late
> > > > > > > inlining
> > > > > > > on
> > > > > > > the
> > > > > > > entire module; so it is not able to
incrementally
> > > > > > > simplify
> > > > > > > as
> > > > > > > it
> > > > > > > inlines like LLVM does.
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > )
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > As background for what is below, the
LazyCallGraph tracks
> > > > > > > two
> > > > > > > graphs:
> > > > > > > the "call graph" and the "ref
graph".
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > Conceptually, the call graph is the graph of
direct
> > > > > > > calls,
> > > > > > > where
> > > > > > > indirect calls and calls to external
functions do not
> > > > > > > appear
> > > > > > > (or
> > > > > > > are
> > > > > > > connected to dummy nodes). The ref graph is
basically the
> > > > > > > graph
> > > > > > > of
> > > > > > > all functions transitively accessible based
on the
> > > > > > > globals/constants/etc. referenced by a
function (e.g. if
> > > > > > > a
> > > > > > > function
> > > > > > > `foo` references a vtable that is defined in
the module,
> > > > > > > there
> > > > > > > is
> > > > > > > an
> > > > > > > edge in the ref graph from `foo` to every
function in the
> > > > > > > vtable).
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > The call graph is a strict subset of the ref
graph.
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > Chandler described that he had a major
breakthrough in
> > > > > > > that
> > > > > > > the
> > > > > > > CGSCC
> > > > > > > pass manager only had to deal with 3 classes
of
> > > > > > > modifications
> > > > > > > that
> > > > > > > can occur:
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > - a pass may e.g. propagate a load of a
function pointer
> > > > > > > into
> > > > > > > an
> > > > > > > indirect call, turning it into an direct
call. This
> > > > > > > requires
> > > > > > > adding
> > > > > > > an edge in the CG but not in the ref graph.
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > - a pass may take a direct call and turn it
into an
> > > > > > > indirect
> > > > > > > call.
> > > > > > > This requires removing an edge from the CG,
but not in
> > > > > > > the
> > > > > > > ref
> > > > > > > graph.
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > - a pass may delete a direct call. This
removes an edge
> > > > > > > in
> > > > > > > the
> > > > > > > CG
> > > > > > > and
> > > > > > > also in the ref graph.
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > From the perspective of the CGSCC pass
manager, these
> > > > > > > operations
> > > > > > > can
> > > > > > > affect the SCC structure. Adding an edge
might merge
> > > > > > > SCC's
> > > > > > > and
> > > > > > > deleting an edge might split SCC's.
Chandler mentioned
> > > > > > > that
> > > > > > > apparently the issues of splitting and
merging SCC's
> > > > > > > within
> > > > > > > the
> > > > > > > current infrastructure are actually quite
challenging and
> > > > > > > lead
> > > > > > > to
> > > > > > > e.g. iterator invalidation issues, and that
is what he is
> > > > > > > working
> > > > > > > on.
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > (
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > The ref graph is important to guide the
overall SCC
> > > > > > > visitation
> > > > > > > order
> > > > > > > because it basically represents "the
largest graph that
> > > > > > > the
> > > > > > > CG
> > > > > > > may
> > > > > > > turn into due to our static analysis of this
module".
> > > > > > > I.e.
> > > > > > > no
> > > > > > > transformation we can statically make in the
CGSCC passes
> > > > > > > can
> > > > > > > ever
> > > > > > > cause us to need to merge SCC's in the
ref graph.
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > )
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > I have a couple overall questions/concerns:
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > 1. The ref graph can easily go quadratic.
E.g.
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > typedef void (*fp)();
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > fp funcs[] = {
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > &foo1,
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > &foo2,
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > ...
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > &fooN
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > }
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > void foo1() { funcs[something](); }
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > void foo2() { funcs[something](); }
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > ...
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > void fooN() { funcs[something](); }
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > One real-world case where this might come
about is in the
> > > > > > > presence
> > > > > > > of
> > > > > > > vtables.
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > The existing CGSCC pass manager does not have
this issue
> > > > > > > AFAIK
> > > > > > > because it does not consider the ref graph.
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > Does anybody have any info/experience about
how densely
> > > > > > > connected
> > > > > > > the
> > > > > > > ref graph can get in programs that might
reasonably be
> > > > > > > fed
> > > > > > > to
> > > > > > > the
> > > > > > > compiler?
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > I just did a quick sanity check with LLD/ELF
using
> > > > > > > `--lto-newpm-passes=cgscc(no-op-cgscc)` and
it at least
> > > > > > > seemed
> > > > > > > to
> > > > > > > terminate / not run out of memory. Based on
some rough
> > > > > > > calculations
> > > > > > > looking at the profile, it seem like the
entire run of
> > > > > > > the
> > > > > > > inliner
> > > > > > > in the old LTO pipeline (which is about 5% of
total LTO
> > > > > > > time
> > > > > > > on
> > > > > > > this
> > > > > > > particular example I looked at) is only 2-3x
as expensive
> > > > > > > as
> > > > > > > just
> > > > > > > `--lto-newpm-passes=cgscc(no-op-cgscc)`, so
the
> > > > > > > LazyCallGraph
> > > > > > > construction is certainly not free.
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > Conceptually, reference graph should also include
variable
> > > > > > nodes.
> > > > > > With variable nodes introduced, the quadratic
behavior
> > > > > > mentioned
> > > > > > can
> > > > > > be avoided.
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > Yeah, this is what I was trying to get at with the
statement
> > > > > "
> > > > > In
> > > > > the
> > > > > current LazyCallGraph, this would require adding some
sort of
> > > > > notion
> > > > > of hyperedge. "
> > > > 
> > > 
> > 
> 
> > > > > But you are right: from an implementation perspective
of a
> > > > > call
> > > > > graph
> > > > > data structure that is trying to model the "ref
graph" of
> > > > > LazyCallGraph, it is cleaner to have nodes that are not
> > > > > functions
> > > > > than to introduce a notion of hyperedge.
> > > > 
> > > 
> > 
> 
> > > > > > > 2. What is the intended behavior of CGSCC
passes when
> > > > > > > SCC's
> > > > > > > are
> > > > > > > split
> > > > > > > or merged? E.g. a CGSCC pass runs on an SCC
(e.g. the
> > > > > > > inliner).
> > > > > > > Now
> > > > > > > we run some function passes nested inside the
CGSCC pass
> > > > > > > manager
> > > > > > > (e.g. to simplify things after inlining).
Consider:
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > a) These function passes are e.g. now able to
> > > > > > > devirtualize
> > > > > > > a
> > > > > > > call,
> > > > > > > adding an edge to the CG, forming a larger CG
SCC. Do you
> > > > > > > re-run
> > > > > > > the
> > > > > > > CGSCC pass (say, the inliner) on this larger
SCC?
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > b) These function passes are e.g. able to DCE
a call,
> > > > > > > removing
> > > > > > > an
> > > > > > > edge from the CG. This converts, say, a CG
SCC which is a
> > > > > > > cycle
> > > > > > > graph (like a->b->c->a) into a path
graph (a->b->c, with
> > > > > > > no
> > > > > > > edge
> > > > > > > back to a). The inliner had already visited
a, b, and c
> > > > > > > as
> > > > > > > a
> > > > > > > single
> > > > > > > SCC. Now does it have to re-visit c, then b,
then a, as
> > > > > > > single-node
> > > > > > > SCC's?
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > btw:
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > One way that I have found it useful to think
about this
> > > > > > > is
> > > > > > > in
> > > > > > > terms
> > > > > > > of the visitation during Tarjan's SCC
algorithm. I'll
> > > > > > > reference
> > > > > > > the
> > > > > > > pseudocode in
> > > > > > >
https://en.wikipedia.org/wiki/Tarjan%27s_strongly_connected_components_algorithm
> > > > > > > . Inside the "strongconnect"
routine when we have
> > > > > > > identified
> > > > > > > an
> > > > > > > SCC
> > > > > > > (the true branch of `if (v.lowlink =
v.index)` test ) we
> > > > > > > can
> > > > > > > visit
> > > > > > > stack[v.index:stack.size()] as an SCC. This
may or may
> > > > > > > not
> > > > > > > invalidate some things on the stack (the
variable `S` in
> > > > > > > the
> > > > > > > pseudocode) and we may need to fix it up
(e.g. inlining
> > > > > > > deleted
> > > > > > > a
> > > > > > > function, so we can't have an entry on
the stack). Then,
> > > > > > > we
> > > > > > > can
> > > > > > > run
> > > > > > > function passes as we pop individual
functions off the
> > > > > > > stack,
> > > > > > > but
> > > > > > > it
> > > > > > > is easier to think about IMO than merging of
SCC data
> > > > > > > structures:
> > > > > > > if
> > > > > > > we add edges to the CG then we have to do
more DFS on the
> > > > > > > new
> > > > > > > edges
> > > > > > > and if we delete edges then the DFS order of
the stack
> > > > > > > gives
> > > > > > > us
> > > > > > > certain guarantees.
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > Personally I find this much easier to reason
about than
> > > > > > > the
> > > > > > > description in terms of splitting and merging
SCC's in
> > > > > > > the
> > > > > > > CG
> > > > > > > and
> > > > > > > ref graph (which the LazyCallGraph API makes
one to think
> > > > > > > about
> > > > > > > since it hides the underlying Tarjan's
algorithm).
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > The LazyCallGraph API makes the current loop
in
> > > > > > >
http://reviews.llvm.org/diffusion/L/browse/llvm/trunk/include/llvm/Analysis/CGSCCPassManager.h;272124$100
> > > > > > > very clean, but at least for my thinking
about the
> > > > > > > problem,
> > > > > > > it
> > > > > > > seems
> > > > > > > like the wrong abstraction (and most of the
LazyCallGraph
> > > > > > > API
> > > > > > > seems
> > > > > > > to be unused, so it seems like it may be
overly
> > > > > > > heavyweight).
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > E.g. I think that maybe the easiest thing to
do is to
> > > > > > > turn
> > > > > > > the
> > > > > > > current approach inside out: instead of
having the pass
> > > > > > > manager
> > > > > > > logic be the "normal code" and
forcing the Tarjan
> > > > > > > algorithm
> > > > > > > to
> > > > > > > become a state machine of iterators, use an
open-coded
> > > > > > > Tarjan
> > > > > > > algorithm with some callbacks and make the
pass
> > > > > > > management
> > > > > > > logic
> > > > > > > be
> > > > > > > the state machine.
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > This will also open the door to avoiding the
potentially
> > > > > > > quadratic
> > > > > > > size of the ref graph, since e.g. in the
example I gave
> > > > > > > above,
> > > > > > > we
> > > > > > > can mark the `funcs` array itself as already
having been
> > > > > > > visited
> > > > > > > during the walk. In the current
LazyCallGraph, this would
> > > > > > > require
> > > > > > > adding some sort of notion of hyperedge.
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > Since this is such a high priority (due to
blocking PGO
> > > > > > > inlining),
> > > > > > > I
> > > > > > > will probably try my hand at implementing the
CGSCC pass
> > > > > > > manager
> > > > > > > sometime soon unless somebody beats me to it.
(I'll
> > > > > > > probably
> > > > > > > try
> > > > > > > the
> > > > > > > "open-coded SCC visit" approach).
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > Another possibility is implementing the new
CGSCC pass
> > > > > > > manager
> > > > > > > that
> > > > > > > uses the same visitation semantics as the one
in the old
> > > > > > > PM,
> > > > > > > and
> > > > > > > then we can refactor that as needed. In fact,
that may be
> > > > > > > the
> > > > > > > best
> > > > > > > approach so that porting to the new PM is as
NFC as
> > > > > > > possible
> > > > > > > and
> > > > > > > we
> > > > > > > can isolate the functional (i.e., need
benchmarks,
> > > > > > > measurements
> > > > > > > ...)
> > > > > > > changes in separate commits.
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > A very high level comment: why do we need to
update
> > > > > > callgraph
> > > > > > on
> > > > > > the
> > > > > > fly ? Can we have a more general support of
iterative SCC
> > > > > > pass
> > > > > > invocation?
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > something like:
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > 1) build the callgraph
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > 2) cache the post-order traversal order
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > 3) if the order list is empty -- done
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > 4) traversal: invoke function passes for each
function on
> > > > > > the
> > > > > > order
> > > > > > (step 2 or 5). The call graph gets updated on the
fly (with
> > > > > > new
> > > > > > edges, or new nodes for cloned functions)
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > 5) update the function traversal order from new
nodes and
> > > > > > new
> > > > > > edges
> > > > > > created in 4)
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > 6) go to step 3).
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > (sorry for the delayed reply... this is a very poignant
> > > > > question
> > > > > /
> > > > > example)
> > > > 
> > > 
> > 
> 
> > > > > From the discussion with Chandler, I think he wants to
> > > > > provide
> > > > > more
> > > > > guarantees to function passes about the visitation
order. He
> > > > > will
> > > > > need to explain his exact concerns. But IIRC the
essence of
> > > > > one
> > > > > of
> > > > > the issues is captured in the example I gave in 2.b) in
the
> > > > > OP:
> > > > 
> > > 
> > 
> 
> > > > > " These function passes are e.g. able to DCE a
call, removing
> > > > > an
> > > > > edge
> > > > > from the CG. This converts, say, a CG SCC which is a
cycle
> > > > > graph
> > > > > (like a->b->c->a) into a path graph
(a->b->c, with no edge
> > > > > back
> > > > > to
> > > > > a)."
> > > > 
> > > 
> > 
> 
> > > > > The issue that I remember Chandler brought up is that
before
> > > > > deleting
> > > > > an edge from an SCC, the nodes in the SCC are
"unordered"
> > > > > w.r.t.
> > > > > each other; but after deleting an edge from the CG
which
> > > > > splits
> > > > > the
> > > > > SCC, there is an order.
> > > > 
> > > 
> > 
> 
> > > > > In other words, that the cached post-order may no
longer be a
> > > > > post-order traversal of the new graph after a function
pass
> > > > > runs.
> > > > > For example, in the cached post-order traversal we may
have
> > > > > entered
> > > > > the SCC `a->b->c->a` via an edge `x->b` and
so our cached
> > > > > post-order
> > > > > traversal visits the functions in the order `a, then c,
then
> > > > > b`.
> > > > > If
> > > > > a function pass visits `c` and deletes the call
`c->a`, then
> > > > > "a,
> > > > > then c, then b" is not a valid post-order for
visiting the
> > > > > functions.
> > > > 
> > > 
> > 
> 
> > > > > Looking at the explanation I gave, I think there
isn't really
> > > > > a
> > > > > problem here. The cached post-order can only be
invalidated
> > > > > for
> > > > > functions that have already been visited.
> > > > 
> > > 
> > 
> 
> > > > If we merge CGSCC based transformation passes with an IPA
> > > > analysis
> > > > pass that requires strict/valid bottom up order, then
updating
> > > > the
> > > > SCCs on the fly might be needed (or not if the result will
> > > > still
> > > > be
> > > > conservatively correct) -- but interleaving IPA analyses
with
> > > > IPA
> > > > transformations like this is not the right thing to do also
for
> > > > other reasons.
> > > 
> > 
> 
> > > > Assuming CGSCC pass just does a predefined set of
> > > > transformations
> > > > at
> > > > function level with some order, there is no issue of
> > > > 'invalidation'
> > > > -- the cached order will always be valid. In 2.b), why
bother
> > > > to
> > > > revisit the nodes?
> > > 
> > 
> 
> > > Indeed. That is the type of doubt for why I was asking this
> > > question
> > > in the OP.
> > 
> 
> > > > > Speaking more broadly about the algorithm you just
described,
> > > > > did
> > > > > you
> > > > > intend to omit an SCC visitation step?
> > > > 
> > > 
> > 
> 
> > > > It is independent of this decision. However we can actually
go
> > > > back
> > > > one step and ask the question: is adding this layer really
> > > > necessary? Why not just traversing the cgraph nodes in
reverse
> > > > topo-order (after removing the cycles)?
> > > 
> > 
> 
> > > > > The goal of the CGSCC pass manager is the ability to
visit an
> > > > > SCC
> > > > > (e.g. inliner visits an SCC), then immediately run
function
> > > > > passes
> > > > > to simplify the result of inlining.
> > > > 
> > > 
> > 
> 
> > > > This is how current implementation is. Is it a fundamental
> > > > requirement?
> > > 
> > 
> 
> > > I think this is a good question. Hopefully Chandler will be able
> > > to
> > > reply to this thread. From what I can tell, the exact
> > > requirements
> > > have never really been discussed.
> > 
> 
> > > > > Only after the simplification has occurred do we visit
the
> > > > > parent
> > > > > SCC. By running the simplifications in lock-step with
the
> > > > > inliner
> > > > > SCC visitation we give parent SCC's a more accurate
view of
> > > > > the
> > > > > real
> > > > > cost of inlining a function from a child SCC.
> > > > 
> > > 
> > 
> 
> > > > This works for all bottom up scheme regardless of a SCC
layer
> > > > is
> > > > added or not.
> > > 
> > 
> 
> > > > > Issues similar to 2.a) (i.e. adding edges to the CG)
also
> > > > > affect
> > > > > the
> > > > > visitation order of SCC's (not just functions). For
example,
> > > > > we
> > > > > visit an SCC with a CGSCC pass (e.g. inliner). Then we
run
> > > > > the
> > > > > first
> > > > > function pass on that SCC, which may add an edge (e.g.
> > > > > promote
> > > > > indirect call to direct call) that may enlarge the SCC.
Do we
> > > > > continue running the remaining function passes? Do we
> > > > > re-visit
> > > > > this
> > > > > new enlarged SCC? (if so, when?) These are the types of
> > > > > questions
> > > > > that motivated this post (hence the name "Intended
behavior
> > > > > of
> > > > > CGSCC
> > > > > pass manager.").
> > > > 
> > > 
> > 
> 
> > > > yes, I understand the motivation -- that is way I propose
the
> > > > agorithm that uses cached iteration order + worklist based
> > > > iterative
> > > > approach. In your example, when a new edge is exposed via
> > > > devirtualization, only its caller/ancestor nodes need to be
> > > > revisited in the next iteration.
> > > 
> > 
> 
> > > Your proposal certainly sounds simpler and easier to reason
> > > about.
> > > (e.g. it naturally avoids any issues with deletion)
> > 
> 
> > > Since there are only 3 non-inliner CGSCC passes (ArgPromotion,
> > > PostOrderFunctionAttrsLegacyPass, PruneEH) it may be feasible to
> > > remove the entire abstraction from LLVM and replace it with a
> > > cached
> > > post-order function pass visitation like you suggest. The inliner
> > > doesn't seem to do anything special with the knowledge that
it is
> > > visiting an SCC (besides moving call sites that call within the
> > > SCC
> > > to the end of its worklist) and so this may be fine.
> > 
> 
> > A) Using cached callgraph walk (nodes or SCC) to avoid cgraph
> > invalidation and B) whether SCC based walk has more advantages are
> > two different things to answer.
> 
> > For the later, I now tend to believe that introducing additional
> > SCC
> > layer has more advantages than alternative that uses node based
> > bottom-up walk.
> 
> > 1) Bottom-up based IPA algorithms can be implemented more cleanly
> > by
> > using SCC. For instance the pruneEH or any bottom-up attribute
> > propagations. Without using SCC, we will need to defined a generic
> > (using template) worklist based bottom-up propagation engine.
> 
> > 2) All SCC passes can be nicely grouped together -- for compile
> > time
> > reasons or to create synergies for better performance.
> 
> > For A), to repeat myself, SCC DAG caching to avoid mutation for one
> > round of CG walk has many advantages. We should probably discuss
> > more on pros/cons and various scenarios we want to handle ..
> 
> One question is what invariants we want to provide for the
> visitation.
> For example, should a CGSCC pass be able to assume that all
"child"
> SCC's (SCC's it can reach via direct calls emanating from the SCC
> being visited) have already been visited? Currently I don't think it
> can, and IIRC from the discussion at the social this is one thing
> that Chandler is hoping to fix. The "ref edge" notion in
> LazyCallGraph ensures that we cannot create a call edge
> (devirtualizing a ref edge) that will point at an SCC that has not
> yet been visited.
> E.g. consider this graph:
> digraph G {
> A -> B; B -> A; // SCC {A,B}
> S -> T; T -> S; // SCC {S,T}
> X -> Y; Y -> X; // SCC {X,Y}
> B -> X;
> B -> S;
> T -> Y [label="Ref edge that is devirtualized\nwhen visiting SCC
> {S,T}",constraint=false,style=dashed]
> }
> (can visualize conveniently at http://www.webgraphviz.com/ or I have
> put an image at http://reviews.llvm.org/F2073104 )
> If we do not consider the ref graph, then it is possible for SCC
> {S,T} to be visited before SCC {X,Y}. So after devirtualizing the
> call T->Y we can no longer assume that the SCC's are visited in
> post-order (or must somehow try to ignore that the call edge T->Y
> exists).
> A more complicated case is when SCC {S,T} and SCC {X,Y} both call
> into each other via function pointers. So eventually after
> devirtualizing the calls in both directions there will be a single
> SCC {S,T,X,Y}.
> digraph G {
> A -> B; B -> A; // SCC {A,B}
> S -> T; T -> S; // SCC {S,T}
> X -> Y; Y -> X; // SCC {X,Y}
> B -> X;
> B -> S;
> T -> Y [label="Ref edge that is devirtualized\nwhen visiting SCC
> {S,T}",constraint=false,style=dashed]
> X -> S [label="Ref edge that is devirtualized\nwhen visiting SCC
> {X,Y}",constraint=false,style=dashed]
> }
> (rendering at: http://reviews.llvm.org/F2073479 )
> Due to the cyclic dependence there is no SCC visitation order that
> can directly provide the invariant above. Indeed, the problem of
> maintaining the above invariant is ill-posed in this scenario.To clarify, we're trying to provide this invariant on the "ref"
graph or on the graph with direct calls only? I think the invariant need only
apply to the former if we're relying on this for correctness (i.e. an
analysis must visit all callees before visiting the callers).

-Hal 
> Consider the pipeline `cgscc(function(...simplifications that can
> devirtualize...),foo-cgscc-pass)`. A possible visitation is as
> follows:
> 1. Visit SCC {S,T} and run `function(...simplifications that can
> devirtualize...)`. This reveals the call edge T->Y.
> 2. We continue visiting SCC {S,T} and run foo-cgscc-pass on SCC
> {S,T}.
> 3. Visit SCC {X,Y} and run `function(...simplifications that can
> devirtualize...)`. This reveals the call edge X->S.
> 4. ??? what do we do now.
> Alternative 4.a) Should we continue the visitation and call
> foo-cgscc-pass on "SCC" {X,Y}?
> Alternative 4.b) Should foo-cgscc-pass now run on SCC {S,T,X,Y}?
> Alternative 4.c) Should we restart the entire outer `cgscc(...)`
> visitation on SCC {S,T,X,Y}?
> (Without a cap both 4.b and 4.c could become quadratic on a graph
> like http://reviews.llvm.org/F2073607 )
> -- Sean Silva
> > thanks,
> 
> > David
> 
> > > Sean:~/pg/llvm % git grep 'public CallGraphSCCPass'
> > 
> 
> > > include/llvm/Transforms/IPO/InlinerPass.h:struct Inliner : public
> > > CallGraphSCCPass {
> > 
> 
> > > lib/Transforms/IPO/ArgumentPromotion.cpp: struct ArgPromotion :
> > > public CallGraphSCCPass {
> > 
> 
> > > lib/Transforms/IPO/FunctionAttrs.cpp:struct
> > > PostOrderFunctionAttrsLegacyPass : public CallGraphSCCPass {
> > 
> 
> > > lib/Transforms/IPO/PruneEH.cpp: struct PruneEH : public
> > > CallGraphSCCPass {
> > 
> 
> > > lib/Analysis/CallGraphSCCPass.cpp: class PrintCallGraphPass :
> > > public
> > > CallGraphSCCPass {
> > 
> 
> > > tools/opt/PassPrinters.cpp:struct CallGraphSCCPassPrinter :
> > > public
> > > CallGraphSCCPass {
> > 
> 
> > > CGSCC passes seem to have been added in what is now SVN r8247
> > > (~Aug
> > > 2003)
> > >
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20030825/006619.html
> > > (LLVM appears to have been in CVS at the time).
> > 
> 
> > > Chris, do you remember the motivation for doing the CGSCC
> > > visitation
> > > instead of a pure post-order function visitation like David is
> > > mentioning? (or was it just an oversight / hindsight-20-20
> > > thing?)
> > > Do you think it would make sense to replace CGSCC visitation with
> > > post-order function visitation in the current LLVM?
> > 
> 
> > > -- Sean Silva
> > 
> 
> > > > David
> > > 
> > 
> 
> > > > > -- Sean Silva
> > > > 
> > > 
> > 
> 
> > > > > > David
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > Sorry for the wall of text.
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > -- Sean Silva
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
-- 

Hal Finkel 
Assistant Computational Scientist 
Leadership Computing Facility 
Argonne National Laboratory 
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160616/1ecf876e/attachment-0001.html>

Xinliang David Li via llvm-dev

2016-Jun-16 17:45 UTC

head link

[llvm-dev] Intended behavior of CGSCC pass manager.

> To clarify, we're trying to provide this invariant on the
"ref" graph or
> on the graph with direct calls only? I think the invariant need only apply
> to the former
>
More clarification needed :) What do you mean by 'invariant need only apply
to the former'?


> if we're relying on this for correctness (i.e. an analysis must visit
all
> callees before visiting the callers).
>

Not necessarily. Due to lost edges (from caller to indirect callees), a
callee node may be visited later.  The analysis will just have to punt when
a special edge to 'external' node is seen.

David


>
>  -Hal
>
> Consider the pipeline `cgscc(function(...simplifications that can
> devirtualize...),foo-cgscc-pass)`. A possible visitation is as follows:
>
> 1. Visit SCC {S,T} and run `function(...simplifications that can
> devirtualize...)`. This reveals the call edge T->Y.
> 2. We continue visiting SCC {S,T} and run foo-cgscc-pass on SCC {S,T}.
> 3. Visit SCC {X,Y} and run `function(...simplifications that can
> devirtualize...)`. This reveals the call edge X->S.
> 4. ??? what do we do now.
> Alternative 4.a) Should we continue the visitation and call foo-cgscc-pass
> on "SCC" {X,Y}?
> Alternative 4.b) Should foo-cgscc-pass now run on SCC {S,T,X,Y}?
> Alternative 4.c) Should we restart the entire outer `cgscc(...)`
> visitation on SCC {S,T,X,Y}?
>
> (Without a cap both 4.b and 4.c could become quadratic on a graph like
> http://reviews.llvm.org/F2073607)
>
> -- Sean Silva
>
>
>>
>> thanks,
>>
>> David
>>
>>
>>
>>
>>> Sean:~/pg/llvm % git grep 'public CallGraphSCCPass'
>>> include/llvm/Transforms/IPO/InlinerPass.h:struct Inliner : public
>>> CallGraphSCCPass {
>>> lib/Transforms/IPO/ArgumentPromotion.cpp:  struct ArgPromotion :
public
>>> CallGraphSCCPass {
>>> lib/Transforms/IPO/FunctionAttrs.cpp:struct
>>> PostOrderFunctionAttrsLegacyPass : public CallGraphSCCPass {
>>> lib/Transforms/IPO/PruneEH.cpp:  struct PruneEH : public
>>> CallGraphSCCPass {
>>> lib/Analysis/CallGraphSCCPass.cpp:  class PrintCallGraphPass :
public
>>> CallGraphSCCPass {
>>> tools/opt/PassPrinters.cpp:struct CallGraphSCCPassPrinter : public
>>> CallGraphSCCPass {
>>>
>>> CGSCC passes seem to have been added in what is now SVN r8247 (~Aug
>>> 2003)
>>>
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20030825/006619.html
>>> (LLVM appears to have been in CVS at the time).
>>>
>>> Chris, do you remember the motivation for doing the CGSCC
visitation
>>> instead of a pure post-order function visitation like David is
mentioning?
>>> (or was it just an oversight / hindsight-20-20 thing?) Do you think
it
>>> would make sense to replace CGSCC visitation with post-order
function
>>> visitation in the current LLVM?
>>>
>>> -- Sean Silva
>>>
>>>
>>>>
>>>> David
>>>>
>>>>
>>>>> -- Sean Silva
>>>>>
>>>>>
>>>>>>
>>>>>> David
>>>>>>
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Sorry for the wall of text.
>>>>>>>
>>>>>>>
>>>>>>> -- Sean Silva
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>
>
>
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160616/4c04ea97/attachment.html>

Sanjoy Das via llvm-dev

2016-Jun-16 18:12 UTC

head link

[llvm-dev] Intended behavior of CGSCC pass manager.

Hi Sean,

On Thu, Jun 16, 2016 at 4:48 AM, Sean Silva via llvm-dev
<llvm-dev at lists.llvm.org> wrote:> One question is what invariants we want to provide for the visitation.
>
> For example, should a CGSCC pass be able to assume that all
"child" SCC's
> (SCC's it can reach via direct calls emanating from the SCC being
visited)
> have already been visited? Currently I don't think it can, and IIRC
from the
> discussion at the social this is one thing that Chandler is hoping to fix.
> The "ref edge" notion in LazyCallGraph ensures that we cannot
create a call
> edge (devirtualizing a ref edge) that will point at an SCC that has not yet
> been visited.
>
> E.g. consider this graph:
>
> digraph G {
>   A -> B; B -> A; // SCC {A,B}
>   S -> T; T -> S; // SCC {S,T}
>   X -> Y; Y -> X; // SCC {X,Y}
>
>   B -> X;
>   B -> S;
>   T -> Y [label="Ref edge that is devirtualized\nwhen visiting SCC
> {S,T}",constraint=false,style=dashed]
> }
>
> (can visualize conveniently at http://www.webgraphviz.com/ or I have put an
> image at http://reviews.llvm.org/F2073104)
>
> If we do not consider the ref graph, then it is possible for SCC {S,T} to
be
I'm not sure why you wouldn't consider the ref graph?  I think the
general idea is to visit RefSCCs in bottom up order, and when visiting
a RefSCC, visiting the SCC's inside the RefSCC in bottom up order.

So in your example, given the edges you've shown, we will visit {X,Y}
before visiting {S,T}.
> A more complicated case is when SCC {S,T} and SCC {X,Y} both call into each
> other via function pointers. So eventually after devirtualizing the calls
in
> both directions there will be a single SCC {S,T,X,Y}.
>
> digraph G {
>   A -> B; B -> A; // SCC {A,B}
>   S -> T; T -> S; // SCC {S,T}
>   X -> Y; Y -> X; // SCC {X,Y}
>
>   B -> X;
>   B -> S;
>   T -> Y [label="Ref edge that is devirtualized\nwhen visiting SCC
> {S,T}",constraint=false,style=dashed]
>   X -> S [label="Ref edge that is devirtualized\nwhen visiting SCC
> {X,Y}",constraint=false,style=dashed]
> }
>
> (rendering at: http://reviews.llvm.org/F2073479)
>
> Due to the cyclic dependence there is no SCC visitation order that can
> directly provide the invariant above. Indeed, the problem of maintaining
the
I think the workflow in the above will (roughly) be:

 Visit the RefSCC {X,Y,S,T}
   Visit the SCC {X,Y} // arbitrary
     Optimize({X,Y})
     // Now there's an edge to {S,T}, invalidate
     // the analyses cached for {X,Y} and visit {S,T}
   Visit the SCC {S,T}
     Optimize({S,T})
     // Now {X,Y,S,T} collapses to form a single SCC
   Visit the SCC {S,T,X,Y}
     Optimize({S,T,X,Y})

The difficult bit is to make the inner "// Now.*" bits work well.

-- Sanjoy

Sean Silva via llvm-dev

2016-Jun-17 00:13 UTC

head link

[llvm-dev] Intended behavior of CGSCC pass manager.

On Thu, Jun 16, 2016 at 11:12 AM, Sanjoy Das <sanjoy at
playingwithpointers.com> wrote:
> Hi Sean,
>
> On Thu, Jun 16, 2016 at 4:48 AM, Sean Silva via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
> > One question is what invariants we want to provide for the visitation.
> >
> > For example, should a CGSCC pass be able to assume that all
"child" SCC's
> > (SCC's it can reach via direct calls emanating from the SCC being
> visited)
> > have already been visited? Currently I don't think it can, and
IIRC from
> the
> > discussion at the social this is one thing that Chandler is hoping to
> fix.
> > The "ref edge" notion in LazyCallGraph ensures that we
cannot create a
> call
> > edge (devirtualizing a ref edge) that will point at an SCC that has
not
> yet
> > been visited.
> >
> > E.g. consider this graph:
> >
> > digraph G {
> >   A -> B; B -> A; // SCC {A,B}
> >   S -> T; T -> S; // SCC {S,T}
> >   X -> Y; Y -> X; // SCC {X,Y}
> >
> >   B -> X;
> >   B -> S;
> >   T -> Y [label="Ref edge that is devirtualized\nwhen visiting
SCC
> > {S,T}",constraint=false,style=dashed]
> > }
> >
> > (can visualize conveniently at http://www.webgraphviz.com/ or I have
> put an
> > image at http://reviews.llvm.org/F2073104)
> >
> > If we do not consider the ref graph, then it is possible for SCC {S,T}
> to be
>
> I'm not sure why you wouldn't consider the ref graph?

The simple answer is that this is the current state of things. The SCC
visitation logic in the old PM does not consider the ref graph.

So in some sense the question is why *should* we consider the ref graph?
What is it buying us? Presumably this will take the form of some invariant
on the `run(SCC &)` calls. But I have yet to see any explicit statement of
an invariant that it gives us.

For example, the examples I gave show that without bailing out in the
middle of a cgscc pass manager (e.g. after the `function(...simplifications
that can devirtualize...)`) then we cannot even guarantee that the thing
passed to the `run(SCC &)` function is actually an SCC.

>   I think the
> general idea is to visit RefSCCs in bottom up order, and when visiting
> a RefSCC, visiting the SCC's inside the RefSCC in bottom up order.
>
> So in your example, given the edges you've shown, we will visit {X,Y}
> before visiting {S,T}.
>
> > A more complicated case is when SCC {S,T} and SCC {X,Y} both call into
> each
> > other via function pointers. So eventually after devirtualizing the
> calls in
> > both directions there will be a single SCC {S,T,X,Y}.
> >
> > digraph G {
> >   A -> B; B -> A; // SCC {A,B}
> >   S -> T; T -> S; // SCC {S,T}
> >   X -> Y; Y -> X; // SCC {X,Y}
> >
> >   B -> X;
> >   B -> S;
> >   T -> Y [label="Ref edge that is devirtualized\nwhen visiting
SCC
> > {S,T}",constraint=false,style=dashed]
> >   X -> S [label="Ref edge that is devirtualized\nwhen visiting
SCC
> > {X,Y}",constraint=false,style=dashed]
> > }
> >
> > (rendering at: http://reviews.llvm.org/F2073479)
> >
> > Due to the cyclic dependence there is no SCC visitation order that can
> > directly provide the invariant above. Indeed, the problem of
maintaining
> the
>
> I think the workflow in the above will (roughly) be:
>
>  Visit the RefSCC {X,Y,S,T}
>    Visit the SCC {X,Y} // arbitrary
>      Optimize({X,Y})
>      // Now there's an edge to {S,T}, invalidate
>      // the analyses cached for {X,Y} and visit {S,T}
>    Visit the SCC {S,T}
>      Optimize({S,T})
>      // Now {X,Y,S,T} collapses to form a single SCC
>    Visit the SCC {S,T,X,Y}
>      Optimize({S,T,X,Y})
>
> The difficult bit is to make the inner "// Now.*" bits work well.
>
But consider that Optimize({S,T}) might be of the form:
`cgscc(function(...simplifications
that can devirtualize...),foo-cgscc-pass)`.
After running `function(...simplifications that can devirtualize...)` we
would end up running `foo-cgscc-pass` on {S,T} which is no longer an SCC
anymore.
What is the invariant here? What do we actually guarantee for the `run(SCC
&)` function?

-- Sean Silva

>
> -- Sanjoy
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160616/ca6cb463/attachment.html>

Xinliang David Li via llvm-dev

2016-Jun-17 04:53 UTC

head link

[llvm-dev] Intended behavior of CGSCC pass manager.

On Thu, Jun 16, 2016 at 11:12 AM, Sanjoy Das <sanjoy at
playingwithpointers.com> wrote:
> Hi Sean,
>
> On Thu, Jun 16, 2016 at 4:48 AM, Sean Silva via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
> > One question is what invariants we want to provide for the visitation.
> >
> > For example, should a CGSCC pass be able to assume that all
"child" SCC's
> > (SCC's it can reach via direct calls emanating from the SCC being
> visited)
> > have already been visited? Currently I don't think it can, and
IIRC from
> the
> > discussion at the social this is one thing that Chandler is hoping to
> fix.
> > The "ref edge" notion in LazyCallGraph ensures that we
cannot create a
> call
> > edge (devirtualizing a ref edge) that will point at an SCC that has
not
> yet
> > been visited.
> >
> > E.g. consider this graph:
> >
> > digraph G {
> >   A -> B; B -> A; // SCC {A,B}
> >   S -> T; T -> S; // SCC {S,T}
> >   X -> Y; Y -> X; // SCC {X,Y}
> >
> >   B -> X;
> >   B -> S;
> >   T -> Y [label="Ref edge that is devirtualized\nwhen visiting
SCC
> > {S,T}",constraint=false,style=dashed]
> > }
> >
> > (can visualize conveniently at http://www.webgraphviz.com/ or I have
> put an
> > image at http://reviews.llvm.org/F2073104)
> >
> > If we do not consider the ref graph, then it is possible for SCC {S,T}
> to be
>
> I'm not sure why you wouldn't consider the ref graph?  I think the
> general idea is to visit RefSCCs in bottom up order, and when visiting
> a RefSCC, visiting the SCC's inside the RefSCC in bottom up order.
>
> So in your example, given the edges you've shown, we will visit {X,Y}
> before visiting {S,T}.
>
> > A more complicated case is when SCC {S,T} and SCC {X,Y} both call into
> each
> > other via function pointers. So eventually after devirtualizing the
> calls in
> > both directions there will be a single SCC {S,T,X,Y}.
> >
> > digraph G {
> >   A -> B; B -> A; // SCC {A,B}
> >   S -> T; T -> S; // SCC {S,T}
> >   X -> Y; Y -> X; // SCC {X,Y}
> >
> >   B -> X;
> >   B -> S;
> >   T -> Y [label="Ref edge that is devirtualized\nwhen visiting
SCC
> > {S,T}",constraint=false,style=dashed]
> >   X -> S [label="Ref edge that is devirtualized\nwhen visiting
SCC
> > {X,Y}",constraint=false,style=dashed]
> > }
> >
> > (rendering at: http://reviews.llvm.org/F2073479)
> >
> > Due to the cyclic dependence there is no SCC visitation order that can
> > directly provide the invariant above. Indeed, the problem of
maintaining
> the
>
> I think the workflow in the above will (roughly) be:
>
>  Visit the RefSCC {X,Y,S,T}
>
Are we sure RefSCC has ref edges between {X, Y} and {S, T} in this case?  I
may miss the code handling it.


>    Visit the SCC {X,Y} // arbitrary
>      Optimize({X,Y})
>      // Now there's an edge to {S,T}, invalidate
>      // the analyses cached for {X,Y} and visit {S,T}
>
I am not sure if this makes sense.   If dynamically, the call edge from {X,
Y} to {S, T} does exist, but not discovered by the analysis, then the
cached {X, Y} will still be invalid, but who is going to invalidate it?

David




>    Visit the SCC {S,T}
>      Optimize({S,T})
>      // Now {X,Y,S,T} collapses to form a single SCC
>    Visit the SCC {S,T,X,Y}
>      Optimize({S,T,X,Y})
>
> The difficult bit is to make the inner "// Now.*" bits work well.
>
> -- Sanjoy
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160616/3fb32159/attachment.html>

Maybe Matching Threads

Search for more maybe matching threads

llvm dev - Jun 2016 - Intended behavior of CGSCC pass manager.

[llvm-dev] Intended behavior of CGSCC pass manager.

[llvm-dev] Intended behavior of CGSCC pass manager.

[llvm-dev] Intended behavior of CGSCC pass manager.

[llvm-dev] Intended behavior of CGSCC pass manager.

[llvm-dev] Intended behavior of CGSCC pass manager.

[llvm-dev] Intended behavior of CGSCC pass manager.

Maybe Matching Threads