Jakub (Kuba) Kuderski via llvm-dev
2017-Jun-13 06:47 UTC
[llvm-dev] RFC: Dynamic dominators
Hi Tobias, 1) Daniel and Chandler have for a long time been talking about computing> dominance and post-dominance in one shot to reduce the cost of > post-dominance and make it (freely) available everywhere. Is this > covered by your current (or planned) work?I'm not sure what you exactly mean by one shot; I'll ask around tomorrow. I wanted to play a little bit with your work, but run in some issues:> > bin/llvm-as ../test/Analysis/Dominators/2007-07-11-SplitBlock.ll > > bin/dominators -to-graph ../test/Analysis/Dominators/2007-07-11-SplitBlock.bc > | tee graph > p 5 5 1 1 > a 1 3 > a 1 5 > a 2 3 > a 3 2 > a 3 4 > e > > bin/dominators graph > Unknown file format for graph > Invalid input graph > I must do something obvious wrong. Any idea what?You almost got it right -- 'graph' files must end with ".txt"... :P Best, Kuba On Mon, Jun 12, 2017 at 11:24 PM, Tobias Grosser <tobias.grosser at inf.ethz.ch> wrote:> Hi Jakub, > > thanks for the update. I was already eagerly waiting to hear what your > internship on dominance brings. I think the numbers are already very > encouraging and I believe moving incrementally over to the new algorithm > is exactly the right approach. > > I have a couple of questions: > > 1) Daniel and Chandler have for a long time been talking about computing > dominance and post-dominance in one shot to reduce the cost of > post-dominance and make it (freely) available everywhere. Is this > covered by your current (or planned) work? > > 2) I tried to use tools/dominators > > I wanted to play a little bit with your work, but run in some issues: > > > bin/llvm-as ../test/Analysis/Dominators/2007-07-11-SplitBlock.ll > > bin/dominators -to-graph ../test/Analysis/Dominators/2007-07-11-SplitBlock.bc > | tee graph > p 5 5 1 1 > a 1 3 > a 1 5 > a 2 3 > a 3 2 > a 3 4 > e > > bin/dominators graph > Unknown file format for graph > Invalid input graph > > I must do something obvious wrong. Any idea what? > > Best, > Tobias > > On Tue, Jun 13, 2017, at 02:12 AM, Jakub (Kuba) Kuderski via llvm-dev > wrote: > > Hi folks, > > > > This summer I'm working on improving dominators during my internship at > > Google. Below is an RFC on switching to dynamic dominators, which you can > > also read as a Google Doc > > <https://docs.google.com/document/d/1wPYeWykeO51YDPLYQEg4KNTlDIGId > yF65OTfhSMaNHQ/edit?usp=sharing> > > if you prefer so. Please let us know what you think. > > > > ~Kuba > > > > ======================================================================> > > > *1. Context* > > > > Currently LLVM uses the Simple Lengauer-Tarjan algorithm to build the > > (post)dominator tree. The only way to update it is by manually setting > > IDoms which is not obvious in many cases and can be extremely > > error-prone. > > And because updating it manually is so difficult, programmers tend to > > just > > recompute it after changing the CFG (by not AU.addPreserved()'ing the > > DomTree). This causes DomTree calculation to fire very often even if only > > a > > very small portion of it gets really affected by the changes in CFG. As > > an > > example, when optimizing a Full LTO clang bitcode, > > DominatorTreeWrapperPass > > alone calls DT.recalculate over 6.5 million times, which takes 20s on my > > machine. > > > > Using an incremental algorithm it would be much easier to keep an > > up-to-date DomTree without custom update logic, which will save us the > > time > > currently spent during DomTree recalculations and reduce the number of > > bugs > > caused by manual updates. It would also make it feasible to maintain > > postdominators and use them more broadly, which currently can be too > > complicated and expensive. > > > > *2. Proposal* > > > > The proposal is to make dominators use the incremental algorithm that > > allows to keep (post)dominator tree up to date as CFG changes. To achieve > > that, we would implement the dynamic depth based search algorithm (DBS) > > described in this paper [1] <https://arxiv.org/pdf/1604.02711.pdf> and > > expose 2 main new functions: insertArc(From, To) and deleteArc(From, To). > > The algorithm uses SemiNCA under the hood which would replace > > Lengauer-Tarjan implementation currently used. > > > > The second part of the proposal is to gradually deprecate and remove the > > existing API for manually manipulating dominator tree > > (changeImmediateDominator, addNewBlock) and replace its use within LLVM > > with the new incremental API. > > > > *3. Proof of concept* > > > > The prototype implementation can be found in my LLVM fork [2] > > <https://github.com/kuhar/llvm-dominators>. It comes with several layers > > of > > verification and was tested on clang, llvm test suite and a few open > > source > > libraries. > > The code provides the basic functionality and tries be ‘API-equivalent’ > > with the current DomTree implementation. The NewDomTree is also able to > > convert itself to the current one for testing and verification purposes. > > Dynamic dominators are hacked into 3 existing passes (DomTreeWrapperPass, > > LoopDeletion, LoopSimplifyCFG) to test correctness and experiment with > > the > > use of the new API. > > > > The prototype also comes with a bunch of testing utilities and a simple > > benchmarking tool. > > > > *4. Performance* > > > > The real life performance of full dynamic DBS-based DomTree recalculation > > is between 20% and 2% better on a machine with two Xeon E5-2680v2 CPUs > > than > > the existing SLT algorithm, which is consistent with the findings from > > this > > thesis [3] <ftp://ftp.cs.princeton.edu/reports/2005/737.pdf>. The > > advantage > > is not that visible on debug builds, where the DBS-algorithm can be up to > > 8% slower. It is most like possible to speed up debug builds, but I > > haven’t > > looked into that yet. > > The benchmark performed [4] > > <https://docs.google.com/document/d/1wPYeWykeO51YDPLYQEg4KNTlDIGId > yF65OTfhSMaNHQ/edit?usp=sharing> > > loads of a (full LTO) bitcode file and builds DomTress of all functions > > 15 > > times. > > > > The current DomTree updates DFS in-out numbers eagerly upon construction, > > while the new one postpones is until they are actually needed. To make > > the > > benchmark fair, numbers were collected for both eager and lazy strategy > > for > > the new DomTree. > > > > The main advantage of the incremental algorithm comes from the fact that > > it > > allows incremental updates without rebuilding the whole tree, not from > > the > > slightly faster construction. > > What’s more, the insertArc / deleteArc API doesn’t force updates to > > happen > > immediately -- they can be queued behind the scenes and happen in batches > > if we decide to pursue that design later. > > > > *5. Transition plan* > > > > There are several possibilities when it comes to transition. The biggest > > problem is that the current DomTree exposes an interface for manual > > updates > > (setIDom, changeImmediateDominator), and for manual construction > > (addNewBlock). Because of the additional data stored in the incremental > > algorithm (relative dominators, preorder parents, levels) it is not > > really > > possible to use the old API while keeping this data up-to-date. The > > incremental algorithm relies on this information when performing fast arc > > deletions; It is still able to perform them without it -- deletions are > > then just slower in some cases. > > The most straightforward solutions are: > > > > a) Keep the existing DomTree and gradually replace its uses with the new > > one. It is possible to convert the DBS-based dominators to the old ones. > > b) Replace the existing DomTree with the new, dynamic dominators. Nuke > > all > > of the old update functions and replace their uses with the new > > incremental > > API in one commit. > > c) Replace the existing DomTree with the new one, but keep the old API > > around and mark it as deprecated. If someone invokes one of the old > > update > > functions, mark the the additional information as invalid for dynamic > > deletions. Follow the pessimistic but correct dynamic deletion path if > > the > > additional information is marked as invalidated. Gradually modernize the > > projects to use the new API instead and then remove the old update > > functions. > > > > Maintaining the old DT and the new one simultaneously seems like a waste > > of > > time as the DBS offers better or similar performance to the existing > > SLT-based implementation. > > The problem with replacing the old API with the new one is that it’s used > > in many places (~100) across the project and I believe that doing that > > all > > at once would be tricky to get correct. Gradual update seems much easier > > to > > ensure correctness and the transitional API (invalid additional data > > check) > > is trivial to implement. Because of that, I propose to follow the option > > (c). > > > > > > > > [1] Georgiadis et al., An Experimental Study of Dynamic Dominators, > > https://arxiv.org/pdf/1604.02711.pdf > > [2] llvm-dominators LLVM fork on Github, > > https://github.com/kuhar/llvm-dominators > > [3] L. Georgiadis, Linear-Time Algorithms for Dominators and Related > > Problems, ftp://ftp.cs.princeton.edu/reports/2005/737.pdf p. 38 > > [4] Google Doc with the performance numbers, > > https://docs.google.com/document/d/1wPYeWykeO51YDPLYQEg4KNTlDIGId > yF65OTfhSMaNHQ/edit?usp=sharing > > _______________________________________________ > > LLVM Developers mailing list > > llvm-dev at lists.llvm.org > > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >-- Jakub Kuderski -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170612/f7d1d8ce/attachment.html>
On Tue, Jun 13, 2017, at 08:47 AM, Jakub (Kuba) Kuderski via llvm-dev wrote:> Hi Tobias, > > 1) Daniel and Chandler have for a long time been talking about computing > > dominance and post-dominance in one shot to reduce the cost of > > post-dominance and make it (freely) available everywhere. Is this > > covered by your current (or planned) work? > > I'm not sure what you exactly mean by one shot; I'll ask around tomorrow.Not sure what algorithm Daniel had in mind, but he was talking about some approach that basically gives post-dominance "for free" when computing dominance. Not sure how exactly this would work.> I wanted to play a little bit with your work, but run in some issues: > > > bin/llvm-as ../test/Analysis/Dominators/2007-07-11-SplitBlock.ll > > > bin/dominators -to-graph ../test/Analysis/Dominators/2007-07-11-SplitBlock.bc > > | tee graph > > p 5 5 1 1 > > a 1 3 > > a 1 5 > > a 2 3 > > a 3 2 > > a 3 4 > > e > > > bin/dominators graph > > Unknown file format for graph > > Invalid input graph > > I must do something obvious wrong. Any idea what? > > > You almost got it right -- 'graph' files must end with ".txt"... :PI got it. Thank you! I also realized I can directly read in .bc files. I have a couple of immediate (but rather minor) comments: - It would be convenient to allow .ll files to be read. - You do not use the BB names in the output, right? This would certainly improve readability! - My output prints "New Dominator Tree" to times in a row. What is the difference? What is the difference to Inorder Dominator Tree? - How do I get the post-dominator tree with your tool? Do you already have test cases? In fact, normal IR test cases would be really cool. We do not have a lot of test coverage for all the dominance stuff. Best, Tobias> > Best, > Kuba > > On Mon, Jun 12, 2017 at 11:24 PM, Tobias Grosser > <tobias.grosser at inf.ethz.ch > > wrote: > > > Hi Jakub, > > > > thanks for the update. I was already eagerly waiting to hear what your > > internship on dominance brings. I think the numbers are already very > > encouraging and I believe moving incrementally over to the new algorithm > > is exactly the right approach. > > > > I have a couple of questions: > > > > 1) Daniel and Chandler have for a long time been talking about computing > > dominance and post-dominance in one shot to reduce the cost of > > post-dominance and make it (freely) available everywhere. Is this > > covered by your current (or planned) work? > > > > 2) I tried to use tools/dominators > > > > I wanted to play a little bit with your work, but run in some issues: > > > > > bin/llvm-as ../test/Analysis/Dominators/2007-07-11-SplitBlock.ll > > > bin/dominators -to-graph ../test/Analysis/Dominators/2007-07-11-SplitBlock.bc > > | tee graph > > p 5 5 1 1 > > a 1 3 > > a 1 5 > > a 2 3 > > a 3 2 > > a 3 4 > > e > > > bin/dominators graph > > Unknown file format for graph > > Invalid input graph > > > > I must do something obvious wrong. Any idea what? > > > > Best, > > Tobias > > > > On Tue, Jun 13, 2017, at 02:12 AM, Jakub (Kuba) Kuderski via llvm-dev > > wrote: > > > Hi folks, > > > > > > This summer I'm working on improving dominators during my internship at > > > Google. Below is an RFC on switching to dynamic dominators, which you can > > > also read as a Google Doc > > > <https://docs.google.com/document/d/1wPYeWykeO51YDPLYQEg4KNTlDIGId > > yF65OTfhSMaNHQ/edit?usp=sharing> > > > if you prefer so. Please let us know what you think. > > > > > > ~Kuba > > > > > > ======================================================================> > > > > > *1. Context* > > > > > > Currently LLVM uses the Simple Lengauer-Tarjan algorithm to build the > > > (post)dominator tree. The only way to update it is by manually setting > > > IDoms which is not obvious in many cases and can be extremely > > > error-prone. > > > And because updating it manually is so difficult, programmers tend to > > > just > > > recompute it after changing the CFG (by not AU.addPreserved()'ing the > > > DomTree). This causes DomTree calculation to fire very often even if only > > > a > > > very small portion of it gets really affected by the changes in CFG. As > > > an > > > example, when optimizing a Full LTO clang bitcode, > > > DominatorTreeWrapperPass > > > alone calls DT.recalculate over 6.5 million times, which takes 20s on my > > > machine. > > > > > > Using an incremental algorithm it would be much easier to keep an > > > up-to-date DomTree without custom update logic, which will save us the > > > time > > > currently spent during DomTree recalculations and reduce the number of > > > bugs > > > caused by manual updates. It would also make it feasible to maintain > > > postdominators and use them more broadly, which currently can be too > > > complicated and expensive. > > > > > > *2. Proposal* > > > > > > The proposal is to make dominators use the incremental algorithm that > > > allows to keep (post)dominator tree up to date as CFG changes. To achieve > > > that, we would implement the dynamic depth based search algorithm (DBS) > > > described in this paper [1] <https://arxiv.org/pdf/1604.02711.pdf> and > > > expose 2 main new functions: insertArc(From, To) and deleteArc(From, To). > > > The algorithm uses SemiNCA under the hood which would replace > > > Lengauer-Tarjan implementation currently used. > > > > > > The second part of the proposal is to gradually deprecate and remove the > > > existing API for manually manipulating dominator tree > > > (changeImmediateDominator, addNewBlock) and replace its use within LLVM > > > with the new incremental API. > > > > > > *3. Proof of concept* > > > > > > The prototype implementation can be found in my LLVM fork [2] > > > <https://github.com/kuhar/llvm-dominators>. It comes with several layers > > > of > > > verification and was tested on clang, llvm test suite and a few open > > > source > > > libraries. > > > The code provides the basic functionality and tries be ‘API-equivalent’ > > > with the current DomTree implementation. The NewDomTree is also able to > > > convert itself to the current one for testing and verification purposes. > > > Dynamic dominators are hacked into 3 existing passes (DomTreeWrapperPass, > > > LoopDeletion, LoopSimplifyCFG) to test correctness and experiment with > > > the > > > use of the new API. > > > > > > The prototype also comes with a bunch of testing utilities and a simple > > > benchmarking tool. > > > > > > *4. Performance* > > > > > > The real life performance of full dynamic DBS-based DomTree recalculation > > > is between 20% and 2% better on a machine with two Xeon E5-2680v2 CPUs > > > than > > > the existing SLT algorithm, which is consistent with the findings from > > > this > > > thesis [3] <ftp://ftp.cs.princeton.edu/reports/2005/737.pdf>. The > > > advantage > > > is not that visible on debug builds, where the DBS-algorithm can be up to > > > 8% slower. It is most like possible to speed up debug builds, but I > > > haven’t > > > looked into that yet. > > > The benchmark performed [4] > > > <https://docs.google.com/document/d/1wPYeWykeO51YDPLYQEg4KNTlDIGId > > yF65OTfhSMaNHQ/edit?usp=sharing> > > > loads of a (full LTO) bitcode file and builds DomTress of all functions > > > 15 > > > times. > > > > > > The current DomTree updates DFS in-out numbers eagerly upon construction, > > > while the new one postpones is until they are actually needed. To make > > > the > > > benchmark fair, numbers were collected for both eager and lazy strategy > > > for > > > the new DomTree. > > > > > > The main advantage of the incremental algorithm comes from the fact that > > > it > > > allows incremental updates without rebuilding the whole tree, not from > > > the > > > slightly faster construction. > > > What’s more, the insertArc / deleteArc API doesn’t force updates to > > > happen > > > immediately -- they can be queued behind the scenes and happen in batches > > > if we decide to pursue that design later. > > > > > > *5. Transition plan* > > > > > > There are several possibilities when it comes to transition. The biggest > > > problem is that the current DomTree exposes an interface for manual > > > updates > > > (setIDom, changeImmediateDominator), and for manual construction > > > (addNewBlock). Because of the additional data stored in the incremental > > > algorithm (relative dominators, preorder parents, levels) it is not > > > really > > > possible to use the old API while keeping this data up-to-date. The > > > incremental algorithm relies on this information when performing fast arc > > > deletions; It is still able to perform them without it -- deletions are > > > then just slower in some cases. > > > The most straightforward solutions are: > > > > > > a) Keep the existing DomTree and gradually replace its uses with the new > > > one. It is possible to convert the DBS-based dominators to the old ones. > > > b) Replace the existing DomTree with the new, dynamic dominators. Nuke > > > all > > > of the old update functions and replace their uses with the new > > > incremental > > > API in one commit. > > > c) Replace the existing DomTree with the new one, but keep the old API > > > around and mark it as deprecated. If someone invokes one of the old > > > update > > > functions, mark the the additional information as invalid for dynamic > > > deletions. Follow the pessimistic but correct dynamic deletion path if > > > the > > > additional information is marked as invalidated. Gradually modernize the > > > projects to use the new API instead and then remove the old update > > > functions. > > > > > > Maintaining the old DT and the new one simultaneously seems like a waste > > > of > > > time as the DBS offers better or similar performance to the existing > > > SLT-based implementation. > > > The problem with replacing the old API with the new one is that it’s used > > > in many places (~100) across the project and I believe that doing that > > > all > > > at once would be tricky to get correct. Gradual update seems much easier > > > to > > > ensure correctness and the transitional API (invalid additional data > > > check) > > > is trivial to implement. Because of that, I propose to follow the option > > > (c). > > > > > > > > > > > > [1] Georgiadis et al., An Experimental Study of Dynamic Dominators, > > > https://arxiv.org/pdf/1604.02711.pdf > > > [2] llvm-dominators LLVM fork on Github, > > > https://github.com/kuhar/llvm-dominators > > > [3] L. Georgiadis, Linear-Time Algorithms for Dominators and Related > > > Problems, ftp://ftp.cs.princeton.edu/reports/2005/737.pdf p. 38 > > > [4] Google Doc with the performance numbers, > > > https://docs.google.com/document/d/1wPYeWykeO51YDPLYQEg4KNTlDIGId > > yF65OTfhSMaNHQ/edit?usp=sharing > > > _______________________________________________ > > > LLVM Developers mailing list > > > llvm-dev at lists.llvm.org > > > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > > > > > > -- > Jakub Kuderski > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Jakub (Kuba) Kuderski via llvm-dev
2017-Jun-13 07:22 UTC
[llvm-dev] RFC: Dynamic dominators
Tobias, Thanks for the comments and taking a look at my fork.> - It would be convenient to allow .ll files to be read.Sure, I can add it tomorrow. - You do not use the BB names in the output, right? This would certainly> improve readability!You actually don't have to convert you bitcode files to 'graph' files (as long as they contain a single function) -- then names should be preserved, but you don't get to play with updates. The format I'm using isn't very good, but it's compatible with some other implementation by Loucas -- the author of the algorithm. Do you think that having a format like that in LLVM would make sense? Danny and I though about in the context of quickly writing and modifying tests for dominators and things like the NewGVN. - My output prints "New Dominator Tree" to times in a row. What is the> difference? What is the difference to Inorder Dominator Tree? >When you graph files have updates (i numFrom numTo and d numFrom numTo) then the first one is the original one and the second one is the one after applying all the updates. Inorder Dominator Tree is the existing DomTree -- I used to use it just for visual comparison. - How do I get the post-dominator tree with your tool? Do you already> have test cases? In fact, normal IR test cases would be really cool. We > do not have a lot of test coverage for all the dominance stuff.I haven't really played with postdominators yet. Do you have any specific ideas on how to test it in mind? And I definitely agree, dominators need to be tested more thoroughly. I think that because of the manual updates (of questionable correctness) and frequent recalculations there must be many undiscovered bugs that we just haven't had a chance to observe yet. Best, Kuba On Tue, Jun 13, 2017 at 12:05 AM, Tobias Grosser <tobias.grosser at inf.ethz.ch> wrote:> > > On Tue, Jun 13, 2017, at 08:47 AM, Jakub (Kuba) Kuderski via llvm-dev > wrote: > > Hi Tobias, > > > > 1) Daniel and Chandler have for a long time been talking about computing > > > dominance and post-dominance in one shot to reduce the cost of > > > post-dominance and make it (freely) available everywhere. Is this > > > covered by your current (or planned) work? > > > > I'm not sure what you exactly mean by one shot; I'll ask around tomorrow. > > Not sure what algorithm Daniel had in mind, but he was talking about > some approach that basically gives post-dominance "for free" when > computing dominance. Not sure how exactly this would work. > > > I wanted to play a little bit with your work, but run in some issues: > > > > bin/llvm-as ../test/Analysis/Dominators/2007-07-11-SplitBlock.ll > > > > bin/dominators -to-graph ../test/Analysis/Dominators/ > 2007-07-11-SplitBlock.bc > > > | tee graph > > > p 5 5 1 1 > > > a 1 3 > > > a 1 5 > > > a 2 3 > > > a 3 2 > > > a 3 4 > > > e > > > > bin/dominators graph > > > Unknown file format for graph > > > Invalid input graph > > > I must do something obvious wrong. Any idea what? > > > > > > You almost got it right -- 'graph' files must end with ".txt"... :P > > I got it. Thank you! I also realized I can directly read in .bc files. > > I have a couple of immediate (but rather minor) comments: > > - It would be convenient to allow .ll files to be read. > - You do not use the BB names in the output, right? This would certainly > improve readability! > - My output prints "New Dominator Tree" to times in a row. What is the > difference? What is the difference to Inorder Dominator Tree? > - How do I get the post-dominator tree with your tool? Do you already > have test cases? In fact, normal IR test cases would be really cool. We > do not have a lot of test coverage for all the dominance stuff. > > Best, > Tobias > > > > > Best, > > Kuba > > > > On Mon, Jun 12, 2017 at 11:24 PM, Tobias Grosser > > <tobias.grosser at inf.ethz.ch > > > wrote: > > > > > Hi Jakub, > > > > > > thanks for the update. I was already eagerly waiting to hear what your > > > internship on dominance brings. I think the numbers are already very > > > encouraging and I believe moving incrementally over to the new > algorithm > > > is exactly the right approach. > > > > > > I have a couple of questions: > > > > > > 1) Daniel and Chandler have for a long time been talking about > computing > > > dominance and post-dominance in one shot to reduce the cost of > > > post-dominance and make it (freely) available everywhere. Is this > > > covered by your current (or planned) work? > > > > > > 2) I tried to use tools/dominators > > > > > > I wanted to play a little bit with your work, but run in some issues: > > > > > > > bin/llvm-as ../test/Analysis/Dominators/2007-07-11-SplitBlock.ll > > > > bin/dominators -to-graph ../test/Analysis/Dominators/ > 2007-07-11-SplitBlock.bc > > > | tee graph > > > p 5 5 1 1 > > > a 1 3 > > > a 1 5 > > > a 2 3 > > > a 3 2 > > > a 3 4 > > > e > > > > bin/dominators graph > > > Unknown file format for graph > > > Invalid input graph > > > > > > I must do something obvious wrong. Any idea what? > > > > > > Best, > > > Tobias > > > > > > On Tue, Jun 13, 2017, at 02:12 AM, Jakub (Kuba) Kuderski via llvm-dev > > > wrote: > > > > Hi folks, > > > > > > > > This summer I'm working on improving dominators during my internship > at > > > > Google. Below is an RFC on switching to dynamic dominators, which > you can > > > > also read as a Google Doc > > > > <https://docs.google.com/document/d/1wPYeWykeO51YDPLYQEg4KNTlDIGId > > > yF65OTfhSMaNHQ/edit?usp=sharing> > > > > if you prefer so. Please let us know what you think. > > > > > > > > ~Kuba > > > > > > > > ===========================================================> ==========> > > > > > > > *1. Context* > > > > > > > > Currently LLVM uses the Simple Lengauer-Tarjan algorithm to build the > > > > (post)dominator tree. The only way to update it is by manually > setting > > > > IDoms which is not obvious in many cases and can be extremely > > > > error-prone. > > > > And because updating it manually is so difficult, programmers tend to > > > > just > > > > recompute it after changing the CFG (by not AU.addPreserved()'ing the > > > > DomTree). This causes DomTree calculation to fire very often even if > only > > > > a > > > > very small portion of it gets really affected by the changes in CFG. > As > > > > an > > > > example, when optimizing a Full LTO clang bitcode, > > > > DominatorTreeWrapperPass > > > > alone calls DT.recalculate over 6.5 million times, which takes 20s > on my > > > > machine. > > > > > > > > Using an incremental algorithm it would be much easier to keep an > > > > up-to-date DomTree without custom update logic, which will save us > the > > > > time > > > > currently spent during DomTree recalculations and reduce the number > of > > > > bugs > > > > caused by manual updates. It would also make it feasible to maintain > > > > postdominators and use them more broadly, which currently can be too > > > > complicated and expensive. > > > > > > > > *2. Proposal* > > > > > > > > The proposal is to make dominators use the incremental algorithm that > > > > allows to keep (post)dominator tree up to date as CFG changes. To > achieve > > > > that, we would implement the dynamic depth based search algorithm > (DBS) > > > > described in this paper [1] <https://arxiv.org/pdf/1604.02711.pdf> > and > > > > expose 2 main new functions: insertArc(From, To) and deleteArc(From, > To). > > > > The algorithm uses SemiNCA under the hood which would replace > > > > Lengauer-Tarjan implementation currently used. > > > > > > > > The second part of the proposal is to gradually deprecate and remove > the > > > > existing API for manually manipulating dominator tree > > > > (changeImmediateDominator, addNewBlock) and replace its use within > LLVM > > > > with the new incremental API. > > > > > > > > *3. Proof of concept* > > > > > > > > The prototype implementation can be found in my LLVM fork [2] > > > > <https://github.com/kuhar/llvm-dominators>. It comes with several > layers > > > > of > > > > verification and was tested on clang, llvm test suite and a few open > > > > source > > > > libraries. > > > > The code provides the basic functionality and tries be > ‘API-equivalent’ > > > > with the current DomTree implementation. The NewDomTree is also able > to > > > > convert itself to the current one for testing and verification > purposes. > > > > Dynamic dominators are hacked into 3 existing passes > (DomTreeWrapperPass, > > > > LoopDeletion, LoopSimplifyCFG) to test correctness and experiment > with > > > > the > > > > use of the new API. > > > > > > > > The prototype also comes with a bunch of testing utilities and a > simple > > > > benchmarking tool. > > > > > > > > *4. Performance* > > > > > > > > The real life performance of full dynamic DBS-based DomTree > recalculation > > > > is between 20% and 2% better on a machine with two Xeon E5-2680v2 > CPUs > > > > than > > > > the existing SLT algorithm, which is consistent with the findings > from > > > > this > > > > thesis [3] <ftp://ftp.cs.princeton.edu/reports/2005/737.pdf>. The > > > > advantage > > > > is not that visible on debug builds, where the DBS-algorithm can be > up to > > > > 8% slower. It is most like possible to speed up debug builds, but I > > > > haven’t > > > > looked into that yet. > > > > The benchmark performed [4] > > > > <https://docs.google.com/document/d/1wPYeWykeO51YDPLYQEg4KNTlDIGId > > > yF65OTfhSMaNHQ/edit?usp=sharing> > > > > loads of a (full LTO) bitcode file and builds DomTress of all > functions > > > > 15 > > > > times. > > > > > > > > The current DomTree updates DFS in-out numbers eagerly upon > construction, > > > > while the new one postpones is until they are actually needed. To > make > > > > the > > > > benchmark fair, numbers were collected for both eager and lazy > strategy > > > > for > > > > the new DomTree. > > > > > > > > The main advantage of the incremental algorithm comes from the fact > that > > > > it > > > > allows incremental updates without rebuilding the whole tree, not > from > > > > the > > > > slightly faster construction. > > > > What’s more, the insertArc / deleteArc API doesn’t force updates to > > > > happen > > > > immediately -- they can be queued behind the scenes and happen in > batches > > > > if we decide to pursue that design later. > > > > > > > > *5. Transition plan* > > > > > > > > There are several possibilities when it comes to transition. The > biggest > > > > problem is that the current DomTree exposes an interface for manual > > > > updates > > > > (setIDom, changeImmediateDominator), and for manual construction > > > > (addNewBlock). Because of the additional data stored in the > incremental > > > > algorithm (relative dominators, preorder parents, levels) it is not > > > > really > > > > possible to use the old API while keeping this data up-to-date. The > > > > incremental algorithm relies on this information when performing > fast arc > > > > deletions; It is still able to perform them without it -- deletions > are > > > > then just slower in some cases. > > > > The most straightforward solutions are: > > > > > > > > a) Keep the existing DomTree and gradually replace its uses with the > new > > > > one. It is possible to convert the DBS-based dominators to the old > ones. > > > > b) Replace the existing DomTree with the new, dynamic dominators. > Nuke > > > > all > > > > of the old update functions and replace their uses with the new > > > > incremental > > > > API in one commit. > > > > c) Replace the existing DomTree with the new one, but keep the old > API > > > > around and mark it as deprecated. If someone invokes one of the old > > > > update > > > > functions, mark the the additional information as invalid for dynamic > > > > deletions. Follow the pessimistic but correct dynamic deletion path > if > > > > the > > > > additional information is marked as invalidated. Gradually modernize > the > > > > projects to use the new API instead and then remove the old update > > > > functions. > > > > > > > > Maintaining the old DT and the new one simultaneously seems like a > waste > > > > of > > > > time as the DBS offers better or similar performance to the existing > > > > SLT-based implementation. > > > > The problem with replacing the old API with the new one is that it’s > used > > > > in many places (~100) across the project and I believe that doing > that > > > > all > > > > at once would be tricky to get correct. Gradual update seems much > easier > > > > to > > > > ensure correctness and the transitional API (invalid additional data > > > > check) > > > > is trivial to implement. Because of that, I propose to follow the > option > > > > (c). > > > > > > > > > > > > > > > > [1] Georgiadis et al., An Experimental Study of Dynamic Dominators, > > > > https://arxiv.org/pdf/1604.02711.pdf > > > > [2] llvm-dominators LLVM fork on Github, > > > > https://github.com/kuhar/llvm-dominators > > > > [3] L. Georgiadis, Linear-Time Algorithms for Dominators and Related > > > > Problems, ftp://ftp.cs.princeton.edu/reports/2005/737.pdf p. 38 > > > > [4] Google Doc with the performance numbers, > > > > https://docs.google.com/document/d/1wPYeWykeO51YDPLYQEg4KNTlDIGId > > > yF65OTfhSMaNHQ/edit?usp=sharing > > > > _______________________________________________ > > > > LLVM Developers mailing list > > > > llvm-dev at lists.llvm.org > > > > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > > > > > > > > > > > -- > > Jakub Kuderski > > _______________________________________________ > > LLVM Developers mailing list > > llvm-dev at lists.llvm.org > > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >-- Jakub Kuderski -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170613/d8a1a1ed/attachment.html>