thr3ads.net - llvm dev - [llvm-dev] RFC: Dynamic dominators [Jun 2017]

If this information is useful, please help other people find it:
Share via:

Jakub (Kuba) Kuderski via llvm-dev

2017-Jun-13 06:47 UTC

[llvm-dev] RFC: Dynamic dominators

Hi Tobias,

1) Daniel and Chandler have for a long time been talking about
computing> dominance and post-dominance in one shot to reduce the cost of
> post-dominance and make it (freely) available everywhere.  Is this
> covered by your current (or planned) work?

I'm not sure what you exactly mean by one shot; I'll ask around
tomorrow.

 I wanted to play a little bit with your work, but run in some
issues:> > bin/llvm-as ../test/Analysis/Dominators/2007-07-11-SplitBlock.ll
> > bin/dominators -to-graph
../test/Analysis/Dominators/2007-07-11-SplitBlock.bc
> | tee graph
> p 5 5 1 1
> a 1 3
> a 1 5
> a 2 3
> a 3 2
> a 3 4
> e
> > bin/dominators graph
> Unknown file format for graph
> Invalid input graph
> I must do something obvious wrong. Any idea what?

You almost got it right -- 'graph' files must end with
".txt"... :P

Best,
Kuba

On Mon, Jun 12, 2017 at 11:24 PM, Tobias Grosser <tobias.grosser at
inf.ethz.ch> wrote:
> Hi  Jakub,
>
> thanks for the update. I was already eagerly waiting to hear what your
> internship on dominance brings. I think the numbers are already very
> encouraging and I believe moving incrementally over to the new algorithm
> is exactly the right approach.
>
> I have a couple of questions:
>
> 1) Daniel and Chandler have for a long time been talking about computing
> dominance and post-dominance in one shot to reduce the cost of
> post-dominance and make it (freely) available everywhere.  Is this
> covered by your current (or planned) work?
>
> 2) I tried to use tools/dominators
>
> I wanted to play a little bit with your work, but run in some issues:
>
> > bin/llvm-as ../test/Analysis/Dominators/2007-07-11-SplitBlock.ll
> > bin/dominators -to-graph
../test/Analysis/Dominators/2007-07-11-SplitBlock.bc
> | tee graph
> p 5 5 1 1
> a 1 3
> a 1 5
> a 2 3
> a 3 2
> a 3 4
> e
> > bin/dominators graph
> Unknown file format for graph
> Invalid input graph
>
> I must do something obvious wrong. Any idea what?
>
> Best,
> Tobias
>
> On Tue, Jun 13, 2017, at 02:12 AM, Jakub (Kuba) Kuderski via llvm-dev
> wrote:
> > Hi folks,
> >
> > This summer I'm working on improving dominators during my
internship at
> > Google. Below is an RFC on switching to dynamic dominators, which you
can
> > also read as a Google Doc
> > <https://docs.google.com/document/d/1wPYeWykeO51YDPLYQEg4KNTlDIGId
> yF65OTfhSMaNHQ/edit?usp=sharing>
> > if you prefer so. Please let us know what you think.
> >
> > ~Kuba
> >
> >
======================================================================> >
> > *1. Context*
> >
> > Currently LLVM uses the Simple Lengauer-Tarjan algorithm to build the
> > (post)dominator tree. The only way to update it is by manually setting
> > IDoms which is not obvious in many cases and can be extremely
> > error-prone.
> > And because updating it manually is so difficult, programmers tend to
> > just
> > recompute it after changing the CFG (by not AU.addPreserved()'ing
the
> > DomTree). This causes DomTree calculation to fire very often even if
only
> > a
> > very small portion of it gets really affected by the changes in CFG.
As
> > an
> > example, when optimizing a Full LTO clang bitcode,
> > DominatorTreeWrapperPass
> > alone calls DT.recalculate over 6.5 million times, which takes 20s on
my
> > machine.
> >
> > Using an incremental algorithm it would be much easier to keep an
> > up-to-date DomTree without custom update logic, which will save us the
> > time
> > currently spent during DomTree recalculations and reduce the number of
> > bugs
> > caused by manual updates. It would also make it feasible to maintain
> > postdominators and use them more broadly, which currently can be too
> > complicated and expensive.
> >
> > *2. Proposal*
> >
> > The proposal is to make dominators use the incremental algorithm that
> > allows to keep (post)dominator tree up to date as CFG changes. To
achieve
> > that, we would implement the dynamic depth based search algorithm
(DBS)
> > described in this paper [1]
<https://arxiv.org/pdf/1604.02711.pdf> and
> > expose 2 main new functions: insertArc(From, To) and deleteArc(From,
To).
> > The algorithm uses SemiNCA under the hood which would replace
> > Lengauer-Tarjan implementation currently used.
> >
> > The second part of the proposal is to gradually deprecate and remove
the
> > existing API for manually manipulating dominator tree
> > (changeImmediateDominator, addNewBlock) and replace its use within
LLVM
> > with the new incremental API.
> >
> > *3. Proof of concept*
> >
> > The prototype implementation can be found in my LLVM fork [2]
> > <https://github.com/kuhar/llvm-dominators>. It comes with
several layers
> > of
> > verification and was tested on clang, llvm test suite and a few open
> > source
> > libraries.
> > The code provides the basic functionality and tries be
‘API-equivalent’
> > with the current DomTree implementation. The NewDomTree is also able
to
> > convert itself to the current one for testing and verification
purposes.
> > Dynamic dominators are hacked into 3 existing passes
(DomTreeWrapperPass,
> > LoopDeletion, LoopSimplifyCFG) to test correctness and experiment with
> > the
> > use of the new API.
> >
> > The prototype also comes with a bunch of testing utilities and a
simple
> > benchmarking tool.
> >
> > *4. Performance*
> >
> > The real life performance of full dynamic DBS-based DomTree
recalculation
> > is between 20% and 2% better on a machine with two Xeon E5-2680v2 CPUs
> > than
> > the existing SLT algorithm, which is consistent with the findings from
> > this
> > thesis [3] <ftp://ftp.cs.princeton.edu/reports/2005/737.pdf>.
The
> > advantage
> > is not that visible on debug builds, where the DBS-algorithm can be up
to
> > 8% slower. It is most like possible to speed up debug builds, but I
> > haven’t
> > looked into that yet.
> > The benchmark performed [4]
> > <https://docs.google.com/document/d/1wPYeWykeO51YDPLYQEg4KNTlDIGId
> yF65OTfhSMaNHQ/edit?usp=sharing>
> > loads of a (full LTO) bitcode file and builds DomTress of all
functions
> > 15
> > times.
> >
> > The current DomTree updates DFS in-out numbers eagerly upon
construction,
> > while the new one postpones is until they are actually needed. To make
> > the
> > benchmark fair, numbers were collected for both eager and lazy
strategy
> > for
> > the new DomTree.
> >
> > The main advantage of the incremental algorithm comes from the fact
that
> > it
> > allows incremental updates without rebuilding the whole tree, not from
> > the
> > slightly faster construction.
> > What’s more, the insertArc / deleteArc API doesn’t force updates to
> > happen
> > immediately -- they can be queued behind the scenes and happen in
batches
> > if we decide to pursue that design later.
> >
> > *5. Transition plan*
> >
> > There are several possibilities when it comes to transition. The
biggest
> > problem is that the current DomTree exposes an interface for manual
> > updates
> > (setIDom, changeImmediateDominator), and for manual construction
> > (addNewBlock). Because of the additional data stored in the
incremental
> > algorithm (relative dominators, preorder parents, levels) it is not
> > really
> > possible to use the old API while keeping this data up-to-date. The
> > incremental algorithm relies on this information when performing fast
arc
> > deletions; It is still able to perform them without it -- deletions
are
> > then just slower in some cases.
> > The most straightforward solutions are:
> >
> > a) Keep the existing DomTree and gradually replace its uses with the
new
> > one. It is possible to convert the DBS-based dominators to the old
ones.
> > b) Replace the existing DomTree with the new, dynamic dominators. Nuke
> > all
> > of the old update functions and replace their uses with the new
> > incremental
> > API in one commit.
> > c) Replace the existing DomTree with the new one, but keep the old API
> > around and mark it as deprecated. If someone invokes one of the old
> > update
> > functions, mark the the additional information as invalid for dynamic
> > deletions. Follow the pessimistic but correct dynamic deletion path if
> > the
> > additional information is marked as invalidated. Gradually modernize
the
> > projects to use the new API instead and then remove the old update
> > functions.
> >
> > Maintaining the old DT and the new one simultaneously seems like a
waste
> > of
> > time as the DBS offers better or similar performance to the existing
> > SLT-based implementation.
> > The problem with replacing the old API with the new one is that it’s
used
> > in many places (~100) across the project and I believe that doing that
> > all
> > at once would be tricky to get correct. Gradual update seems much
easier
> > to
> > ensure correctness and the transitional API (invalid additional data
> > check)
> > is trivial to implement. Because of that, I propose to follow the
option
> > (c).
> >
> >
> >
> > [1] Georgiadis et al., An Experimental Study of Dynamic Dominators,
> > https://arxiv.org/pdf/1604.02711.pdf
> > [2] llvm-dominators LLVM fork on Github,
> > https://github.com/kuhar/llvm-dominators
> > [3] L. Georgiadis, Linear-Time Algorithms for Dominators and Related
> > Problems, ftp://ftp.cs.princeton.edu/reports/2005/737.pdf p. 38
> > [4] Google Doc with the performance numbers,
> > https://docs.google.com/document/d/1wPYeWykeO51YDPLYQEg4KNTlDIGId
> yF65OTfhSMaNHQ/edit?usp=sharing
> > _______________________________________________
> > LLVM Developers mailing list
> > llvm-dev at lists.llvm.org
> > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>


-- 
Jakub Kuderski
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20170612/f7d1d8ce/attachment.html>

Tobias Grosser via llvm-dev

2017-Jun-13 07:05 UTC

head link

[llvm-dev] RFC: Dynamic dominators

On Tue, Jun 13, 2017, at 08:47 AM, Jakub (Kuba) Kuderski via llvm-dev
wrote:> Hi Tobias,
> 
> 1) Daniel and Chandler have for a long time been talking about computing
> > dominance and post-dominance in one shot to reduce the cost of
> > post-dominance and make it (freely) available everywhere.  Is this
> > covered by your current (or planned) work?
> 
> I'm not sure what you exactly mean by one shot; I'll ask around
tomorrow.
Not sure what algorithm Daniel had in mind, but he was talking about
some approach that basically gives post-dominance "for free" when
computing dominance. Not sure how exactly this would work.
 >  I wanted to play a little bit with your work, but run in some issues:
> > > bin/llvm-as ../test/Analysis/Dominators/2007-07-11-SplitBlock.ll
> > > bin/dominators -to-graph
../test/Analysis/Dominators/2007-07-11-SplitBlock.bc
> > | tee graph
> > p 5 5 1 1
> > a 1 3
> > a 1 5
> > a 2 3
> > a 3 2
> > a 3 4
> > e
> > > bin/dominators graph
> > Unknown file format for graph
> > Invalid input graph
> > I must do something obvious wrong. Any idea what?
> 
> 
> You almost got it right -- 'graph' files must end with
".txt"... :P
I got it. Thank you! I also realized I can directly read in .bc files.

I have a couple of immediate (but rather minor) comments:

-  It would be convenient to allow .ll files to be read.
- You do not use the BB names in the output, right? This would certainly
   improve readability!
- My output prints "New Dominator Tree" to times in a row. What is the
difference? What is the difference to Inorder Dominator Tree?
- How do I get the post-dominator tree with your tool? Do you already
have test cases? In fact, normal IR test cases would be really cool. We
do not have a lot of test coverage for all the dominance stuff.

Best,
Tobias
> 
> Best,
> Kuba
> 
> On Mon, Jun 12, 2017 at 11:24 PM, Tobias Grosser
> <tobias.grosser at inf.ethz.ch
> > wrote:
> 
> > Hi  Jakub,
> >
> > thanks for the update. I was already eagerly waiting to hear what your
> > internship on dominance brings. I think the numbers are already very
> > encouraging and I believe moving incrementally over to the new
algorithm
> > is exactly the right approach.
> >
> > I have a couple of questions:
> >
> > 1) Daniel and Chandler have for a long time been talking about
computing
> > dominance and post-dominance in one shot to reduce the cost of
> > post-dominance and make it (freely) available everywhere.  Is this
> > covered by your current (or planned) work?
> >
> > 2) I tried to use tools/dominators
> >
> > I wanted to play a little bit with your work, but run in some issues:
> >
> > > bin/llvm-as ../test/Analysis/Dominators/2007-07-11-SplitBlock.ll
> > > bin/dominators -to-graph
../test/Analysis/Dominators/2007-07-11-SplitBlock.bc
> > | tee graph
> > p 5 5 1 1
> > a 1 3
> > a 1 5
> > a 2 3
> > a 3 2
> > a 3 4
> > e
> > > bin/dominators graph
> > Unknown file format for graph
> > Invalid input graph
> >
> > I must do something obvious wrong. Any idea what?
> >
> > Best,
> > Tobias
> >
> > On Tue, Jun 13, 2017, at 02:12 AM, Jakub (Kuba) Kuderski via llvm-dev
> > wrote:
> > > Hi folks,
> > >
> > > This summer I'm working on improving dominators during my
internship at
> > > Google. Below is an RFC on switching to dynamic dominators, which
you can
> > > also read as a Google Doc
> > >
<https://docs.google.com/document/d/1wPYeWykeO51YDPLYQEg4KNTlDIGId
> > yF65OTfhSMaNHQ/edit?usp=sharing>
> > > if you prefer so. Please let us know what you think.
> > >
> > > ~Kuba
> > >
> > >
======================================================================> >
>
> > > *1. Context*
> > >
> > > Currently LLVM uses the Simple Lengauer-Tarjan algorithm to build
the
> > > (post)dominator tree. The only way to update it is by manually
setting
> > > IDoms which is not obvious in many cases and can be extremely
> > > error-prone.
> > > And because updating it manually is so difficult, programmers
tend to
> > > just
> > > recompute it after changing the CFG (by not
AU.addPreserved()'ing the
> > > DomTree). This causes DomTree calculation to fire very often even
if only
> > > a
> > > very small portion of it gets really affected by the changes in
CFG. As
> > > an
> > > example, when optimizing a Full LTO clang bitcode,
> > > DominatorTreeWrapperPass
> > > alone calls DT.recalculate over 6.5 million times, which takes
20s on my
> > > machine.
> > >
> > > Using an incremental algorithm it would be much easier to keep an
> > > up-to-date DomTree without custom update logic, which will save
us the
> > > time
> > > currently spent during DomTree recalculations and reduce the
number of
> > > bugs
> > > caused by manual updates. It would also make it feasible to
maintain
> > > postdominators and use them more broadly, which currently can be
too
> > > complicated and expensive.
> > >
> > > *2. Proposal*
> > >
> > > The proposal is to make dominators use the incremental algorithm
that
> > > allows to keep (post)dominator tree up to date as CFG changes. To
achieve
> > > that, we would implement the dynamic depth based search algorithm
(DBS)
> > > described in this paper [1]
<https://arxiv.org/pdf/1604.02711.pdf> and
> > > expose 2 main new functions: insertArc(From, To) and
deleteArc(From, To).
> > > The algorithm uses SemiNCA under the hood which would replace
> > > Lengauer-Tarjan implementation currently used.
> > >
> > > The second part of the proposal is to gradually deprecate and
remove the
> > > existing API for manually manipulating dominator tree
> > > (changeImmediateDominator, addNewBlock) and replace its use
within LLVM
> > > with the new incremental API.
> > >
> > > *3. Proof of concept*
> > >
> > > The prototype implementation can be found in my LLVM fork [2]
> > > <https://github.com/kuhar/llvm-dominators>. It comes with
several layers
> > > of
> > > verification and was tested on clang, llvm test suite and a few
open
> > > source
> > > libraries.
> > > The code provides the basic functionality and tries be
‘API-equivalent’
> > > with the current DomTree implementation. The NewDomTree is also
able to
> > > convert itself to the current one for testing and verification
purposes.
> > > Dynamic dominators are hacked into 3 existing passes
(DomTreeWrapperPass,
> > > LoopDeletion, LoopSimplifyCFG) to test correctness and experiment
with
> > > the
> > > use of the new API.
> > >
> > > The prototype also comes with a bunch of testing utilities and a
simple
> > > benchmarking tool.
> > >
> > > *4. Performance*
> > >
> > > The real life performance of full dynamic DBS-based DomTree
recalculation
> > > is between 20% and 2% better on a machine with two Xeon E5-2680v2
CPUs
> > > than
> > > the existing SLT algorithm, which is consistent with the findings
from
> > > this
> > > thesis [3]
<ftp://ftp.cs.princeton.edu/reports/2005/737.pdf>. The
> > > advantage
> > > is not that visible on debug builds, where the DBS-algorithm can
be up to
> > > 8% slower. It is most like possible to speed up debug builds, but
I
> > > haven’t
> > > looked into that yet.
> > > The benchmark performed [4]
> > >
<https://docs.google.com/document/d/1wPYeWykeO51YDPLYQEg4KNTlDIGId
> > yF65OTfhSMaNHQ/edit?usp=sharing>
> > > loads of a (full LTO) bitcode file and builds DomTress of all
functions
> > > 15
> > > times.
> > >
> > > The current DomTree updates DFS in-out numbers eagerly upon
construction,
> > > while the new one postpones is until they are actually needed. To
make
> > > the
> > > benchmark fair, numbers were collected for both eager and lazy
strategy
> > > for
> > > the new DomTree.
> > >
> > > The main advantage of the incremental algorithm comes from the
fact that
> > > it
> > > allows incremental updates without rebuilding the whole tree, not
from
> > > the
> > > slightly faster construction.
> > > What’s more, the insertArc / deleteArc API doesn’t force updates
to
> > > happen
> > > immediately -- they can be queued behind the scenes and happen in
batches
> > > if we decide to pursue that design later.
> > >
> > > *5. Transition plan*
> > >
> > > There are several possibilities when it comes to transition. The
biggest
> > > problem is that the current DomTree exposes an interface for
manual
> > > updates
> > > (setIDom, changeImmediateDominator), and for manual construction
> > > (addNewBlock). Because of the additional data stored in the
incremental
> > > algorithm (relative dominators, preorder parents, levels) it is
not
> > > really
> > > possible to use the old API while keeping this data up-to-date.
The
> > > incremental algorithm relies on this information when performing
fast arc
> > > deletions; It is still able to perform them without it --
deletions are
> > > then just slower in some cases.
> > > The most straightforward solutions are:
> > >
> > > a) Keep the existing DomTree and gradually replace its uses with
the new
> > > one. It is possible to convert the DBS-based dominators to the
old ones.
> > > b) Replace the existing DomTree with the new, dynamic dominators.
Nuke
> > > all
> > > of the old update functions and replace their uses with the new
> > > incremental
> > > API in one commit.
> > > c) Replace the existing DomTree with the new one, but keep the
old API
> > > around and mark it as deprecated. If someone invokes one of the
old
> > > update
> > > functions, mark the the additional information as invalid for
dynamic
> > > deletions. Follow the pessimistic but correct dynamic deletion
path if
> > > the
> > > additional information is marked as invalidated. Gradually
modernize the
> > > projects to use the new API instead and then remove the old
update
> > > functions.
> > >
> > > Maintaining the old DT and the new one simultaneously seems like
a waste
> > > of
> > > time as the DBS offers better or similar performance to the
existing
> > > SLT-based implementation.
> > > The problem with replacing the old API with the new one is that
it’s used
> > > in many places (~100) across the project and I believe that doing
that
> > > all
> > > at once would be tricky to get correct. Gradual update seems much
easier
> > > to
> > > ensure correctness and the transitional API (invalid additional
data
> > > check)
> > > is trivial to implement. Because of that, I propose to follow the
option
> > > (c).
> > >
> > >
> > >
> > > [1] Georgiadis et al., An Experimental Study of Dynamic
Dominators,
> > > https://arxiv.org/pdf/1604.02711.pdf
> > > [2] llvm-dominators LLVM fork on Github,
> > > https://github.com/kuhar/llvm-dominators
> > > [3] L. Georgiadis, Linear-Time Algorithms for Dominators and
Related
> > > Problems, ftp://ftp.cs.princeton.edu/reports/2005/737.pdf p. 38
> > > [4] Google Doc with the performance numbers,
> > > https://docs.google.com/document/d/1wPYeWykeO51YDPLYQEg4KNTlDIGId
> > yF65OTfhSMaNHQ/edit?usp=sharing
> > > _______________________________________________
> > > LLVM Developers mailing list
> > > llvm-dev at lists.llvm.org
> > > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> >
> 
> 
> 
> -- 
> Jakub Kuderski
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Jakub (Kuba) Kuderski via llvm-dev

2017-Jun-13 07:22 UTC

head link

[llvm-dev] RFC: Dynamic dominators

Tobias,

Thanks for the comments and taking a look at my fork.

>  -  It would be convenient to allow .ll files to be read.
Sure, I can add it tomorrow.

- You do not use the BB names in the output, right? This would
certainly>    improve readability!
You actually don't have to convert you bitcode files to 'graph'
files (as
long as they contain a single function) -- then names should be preserved,
but you don't get to play with updates. The format I'm using isn't
very
good, but it's compatible with some other implementation by Loucas -- the
author of the algorithm. Do you think that having a format like that in
LLVM would make sense? Danny and I though about in the context of quickly
writing and modifying tests for dominators and things like the NewGVN.

- My output prints "New Dominator Tree" to times in a row. What is
the> difference? What is the difference to Inorder Dominator Tree?
> When you graph files have updates (i numFrom numTo and d numFrom numTo)
then the first one is the original one and the second one is the one after
applying all the updates. Inorder Dominator Tree is the existing DomTree --
I used to use it just for visual comparison.

- How do I get the post-dominator tree with your tool? Do you
already> have test cases? In fact, normal IR test cases would be really cool. We
> do not have a lot of test coverage for all the dominance stuff.
 I haven't really played with postdominators yet. Do you have any specific
ideas on how to test it in mind? And I definitely agree, dominators need to
be tested more thoroughly. I think that because of the manual updates (of
questionable correctness) and frequent recalculations there must be many
undiscovered bugs that we just haven't had a chance to observe yet.

Best,
Kuba

On Tue, Jun 13, 2017 at 12:05 AM, Tobias Grosser <tobias.grosser at
inf.ethz.ch> wrote:
>
>
> On Tue, Jun 13, 2017, at 08:47 AM, Jakub (Kuba) Kuderski via llvm-dev
> wrote:
> > Hi Tobias,
> >
> > 1) Daniel and Chandler have for a long time been talking about
computing
> > > dominance and post-dominance in one shot to reduce the cost of
> > > post-dominance and make it (freely) available everywhere.  Is
this
> > > covered by your current (or planned) work?
> >
> > I'm not sure what you exactly mean by one shot; I'll ask
around tomorrow.
>
> Not sure what algorithm Daniel had in mind, but he was talking about
> some approach that basically gives post-dominance "for free" when
> computing dominance. Not sure how exactly this would work.
>
> >  I wanted to play a little bit with your work, but run in some issues:
> > > > bin/llvm-as
../test/Analysis/Dominators/2007-07-11-SplitBlock.ll
> > > > bin/dominators -to-graph ../test/Analysis/Dominators/
> 2007-07-11-SplitBlock.bc
> > > | tee graph
> > > p 5 5 1 1
> > > a 1 3
> > > a 1 5
> > > a 2 3
> > > a 3 2
> > > a 3 4
> > > e
> > > > bin/dominators graph
> > > Unknown file format for graph
> > > Invalid input graph
> > > I must do something obvious wrong. Any idea what?
> >
> >
> > You almost got it right -- 'graph' files must end with
".txt"... :P
>
> I got it. Thank you! I also realized I can directly read in .bc files.
>
> I have a couple of immediate (but rather minor) comments:
>
> -  It would be convenient to allow .ll files to be read.
> - You do not use the BB names in the output, right? This would certainly
>    improve readability!
> - My output prints "New Dominator Tree" to times in a row. What
is the
> difference? What is the difference to Inorder Dominator Tree?
> - How do I get the post-dominator tree with your tool? Do you already
> have test cases? In fact, normal IR test cases would be really cool. We
> do not have a lot of test coverage for all the dominance stuff.
>
> Best,
> Tobias
>
> >
> > Best,
> > Kuba
> >
> > On Mon, Jun 12, 2017 at 11:24 PM, Tobias Grosser
> > <tobias.grosser at inf.ethz.ch
> > > wrote:
> >
> > > Hi  Jakub,
> > >
> > > thanks for the update. I was already eagerly waiting to hear what
your
> > > internship on dominance brings. I think the numbers are already
very
> > > encouraging and I believe moving incrementally over to the new
> algorithm
> > > is exactly the right approach.
> > >
> > > I have a couple of questions:
> > >
> > > 1) Daniel and Chandler have for a long time been talking about
> computing
> > > dominance and post-dominance in one shot to reduce the cost of
> > > post-dominance and make it (freely) available everywhere.  Is
this
> > > covered by your current (or planned) work?
> > >
> > > 2) I tried to use tools/dominators
> > >
> > > I wanted to play a little bit with your work, but run in some
issues:
> > >
> > > > bin/llvm-as
../test/Analysis/Dominators/2007-07-11-SplitBlock.ll
> > > > bin/dominators -to-graph ../test/Analysis/Dominators/
> 2007-07-11-SplitBlock.bc
> > > | tee graph
> > > p 5 5 1 1
> > > a 1 3
> > > a 1 5
> > > a 2 3
> > > a 3 2
> > > a 3 4
> > > e
> > > > bin/dominators graph
> > > Unknown file format for graph
> > > Invalid input graph
> > >
> > > I must do something obvious wrong. Any idea what?
> > >
> > > Best,
> > > Tobias
> > >
> > > On Tue, Jun 13, 2017, at 02:12 AM, Jakub (Kuba) Kuderski via
llvm-dev
> > > wrote:
> > > > Hi folks,
> > > >
> > > > This summer I'm working on improving dominators during
my internship
> at
> > > > Google. Below is an RFC on switching to dynamic dominators,
which
> you can
> > > > also read as a Google Doc
> > > >
<https://docs.google.com/document/d/1wPYeWykeO51YDPLYQEg4KNTlDIGId
> > > yF65OTfhSMaNHQ/edit?usp=sharing>
> > > > if you prefer so. Please let us know what you think.
> > > >
> > > > ~Kuba
> > > >
> > > >
===========================================================> ==========>
> > >
> > > > *1. Context*
> > > >
> > > > Currently LLVM uses the Simple Lengauer-Tarjan algorithm to
build the
> > > > (post)dominator tree. The only way to update it is by
manually
> setting
> > > > IDoms which is not obvious in many cases and can be
extremely
> > > > error-prone.
> > > > And because updating it manually is so difficult,
programmers tend to
> > > > just
> > > > recompute it after changing the CFG (by not
AU.addPreserved()'ing the
> > > > DomTree). This causes DomTree calculation to fire very often
even if
> only
> > > > a
> > > > very small portion of it gets really affected by the changes
in CFG.
> As
> > > > an
> > > > example, when optimizing a Full LTO clang bitcode,
> > > > DominatorTreeWrapperPass
> > > > alone calls DT.recalculate over 6.5 million times, which
takes 20s
> on my
> > > > machine.
> > > >
> > > > Using an incremental algorithm it would be much easier to
keep an
> > > > up-to-date DomTree without custom update logic, which will
save us
> the
> > > > time
> > > > currently spent during DomTree recalculations and reduce the
number
> of
> > > > bugs
> > > > caused by manual updates. It would also make it feasible to
maintain
> > > > postdominators and use them more broadly, which currently
can be too
> > > > complicated and expensive.
> > > >
> > > > *2. Proposal*
> > > >
> > > > The proposal is to make dominators use the incremental
algorithm that
> > > > allows to keep (post)dominator tree up to date as CFG
changes. To
> achieve
> > > > that, we would implement the dynamic depth based search
algorithm
> (DBS)
> > > > described in this paper [1]
<https://arxiv.org/pdf/1604.02711.pdf>
> and
> > > > expose 2 main new functions: insertArc(From, To) and
deleteArc(From,
> To).
> > > > The algorithm uses SemiNCA under the hood which would
replace
> > > > Lengauer-Tarjan implementation currently used.
> > > >
> > > > The second part of the proposal is to gradually deprecate
and remove
> the
> > > > existing API for manually manipulating dominator tree
> > > > (changeImmediateDominator, addNewBlock) and replace its use
within
> LLVM
> > > > with the new incremental API.
> > > >
> > > > *3. Proof of concept*
> > > >
> > > > The prototype implementation can be found in my LLVM fork
[2]
> > > > <https://github.com/kuhar/llvm-dominators>. It comes
with several
> layers
> > > > of
> > > > verification and was tested on clang, llvm test suite and a
few open
> > > > source
> > > > libraries.
> > > > The code provides the basic functionality and tries be
> ‘API-equivalent’
> > > > with the current DomTree implementation. The NewDomTree is
also able
> to
> > > > convert itself to the current one for testing and
verification
> purposes.
> > > > Dynamic dominators are hacked into 3 existing passes
> (DomTreeWrapperPass,
> > > > LoopDeletion, LoopSimplifyCFG) to test correctness and
experiment
> with
> > > > the
> > > > use of the new API.
> > > >
> > > > The prototype also comes with a bunch of testing utilities
and a
> simple
> > > > benchmarking tool.
> > > >
> > > > *4. Performance*
> > > >
> > > > The real life performance of full dynamic DBS-based DomTree
> recalculation
> > > > is between 20% and 2% better on a machine with two Xeon
E5-2680v2
> CPUs
> > > > than
> > > > the existing SLT algorithm, which is consistent with the
findings
> from
> > > > this
> > > > thesis [3]
<ftp://ftp.cs.princeton.edu/reports/2005/737.pdf>. The
> > > > advantage
> > > > is not that visible on debug builds, where the DBS-algorithm
can be
> up to
> > > > 8% slower. It is most like possible to speed up debug
builds, but I
> > > > haven’t
> > > > looked into that yet.
> > > > The benchmark performed [4]
> > > >
<https://docs.google.com/document/d/1wPYeWykeO51YDPLYQEg4KNTlDIGId
> > > yF65OTfhSMaNHQ/edit?usp=sharing>
> > > > loads of a (full LTO) bitcode file and builds DomTress of
all
> functions
> > > > 15
> > > > times.
> > > >
> > > > The current DomTree updates DFS in-out numbers eagerly upon
> construction,
> > > > while the new one postpones is until they are actually
needed. To
> make
> > > > the
> > > > benchmark fair, numbers were collected for both eager and
lazy
> strategy
> > > > for
> > > > the new DomTree.
> > > >
> > > > The main advantage of the incremental algorithm comes from
the fact
> that
> > > > it
> > > > allows incremental updates without rebuilding the whole
tree, not
> from
> > > > the
> > > > slightly faster construction.
> > > > What’s more, the insertArc / deleteArc API doesn’t force
updates to
> > > > happen
> > > > immediately -- they can be queued behind the scenes and
happen in
> batches
> > > > if we decide to pursue that design later.
> > > >
> > > > *5. Transition plan*
> > > >
> > > > There are several possibilities when it comes to transition.
The
> biggest
> > > > problem is that the current DomTree exposes an interface for
manual
> > > > updates
> > > > (setIDom, changeImmediateDominator), and for manual
construction
> > > > (addNewBlock). Because of the additional data stored in the
> incremental
> > > > algorithm (relative dominators, preorder parents, levels) it
is not
> > > > really
> > > > possible to use the old API while keeping this data
up-to-date. The
> > > > incremental algorithm relies on this information when
performing
> fast arc
> > > > deletions; It is still able to perform them without it --
deletions
> are
> > > > then just slower in some cases.
> > > > The most straightforward solutions are:
> > > >
> > > > a) Keep the existing DomTree and gradually replace its uses
with the
> new
> > > > one. It is possible to convert the DBS-based dominators to
the old
> ones.
> > > > b) Replace the existing DomTree with the new, dynamic
dominators.
> Nuke
> > > > all
> > > > of the old update functions and replace their uses with the
new
> > > > incremental
> > > > API in one commit.
> > > > c) Replace the existing DomTree with the new one, but keep
the old
> API
> > > > around and mark it as deprecated. If someone invokes one of
the old
> > > > update
> > > > functions, mark the the additional information as invalid
for dynamic
> > > > deletions. Follow the pessimistic but correct dynamic
deletion path
> if
> > > > the
> > > > additional information is marked as invalidated. Gradually
modernize
> the
> > > > projects to use the new API instead and then remove the old
update
> > > > functions.
> > > >
> > > > Maintaining the old DT and the new one simultaneously seems
like a
> waste
> > > > of
> > > > time as the DBS offers better or similar performance to the
existing
> > > > SLT-based implementation.
> > > > The problem with replacing the old API with the new one is
that it’s
> used
> > > > in many places (~100) across the project and I believe that
doing
> that
> > > > all
> > > > at once would be tricky to get correct. Gradual update seems
much
> easier
> > > > to
> > > > ensure correctness and the transitional API (invalid
additional data
> > > > check)
> > > > is trivial to implement. Because of that, I propose to
follow the
> option
> > > > (c).
> > > >
> > > >
> > > >
> > > > [1] Georgiadis et al., An Experimental Study of Dynamic
Dominators,
> > > > https://arxiv.org/pdf/1604.02711.pdf
> > > > [2] llvm-dominators LLVM fork on Github,
> > > > https://github.com/kuhar/llvm-dominators
> > > > [3] L. Georgiadis, Linear-Time Algorithms for Dominators and
Related
> > > > Problems, ftp://ftp.cs.princeton.edu/reports/2005/737.pdf p.
38
> > > > [4] Google Doc with the performance numbers,
> > > >
https://docs.google.com/document/d/1wPYeWykeO51YDPLYQEg4KNTlDIGId
> > > yF65OTfhSMaNHQ/edit?usp=sharing
> > > > _______________________________________________
> > > > LLVM Developers mailing list
> > > > llvm-dev at lists.llvm.org
> > > > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> > >
> >
> >
> >
> > --
> > Jakub Kuderski
> > _______________________________________________
> > LLVM Developers mailing list
> > llvm-dev at lists.llvm.org
> > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>


-- 
Jakub Kuderski
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20170613/d8a1a1ed/attachment.html>

llvm dev - Jun 2017 - RFC: Dynamic dominators

[llvm-dev] RFC: Dynamic dominators

[llvm-dev] RFC: Dynamic dominators

[llvm-dev] RFC: Dynamic dominators