Alexandre Isoard via llvm-dev
2018-Mar-15 15:09 UTC
[llvm-dev] Commit module to Git after each Pass
Does -git-commit-after-all print correctly after all the passes? Maybe I messed it up and it skips some passes, and therefore has less to do?

Either that, or piping has a higher cost than writing to a file. Surprisingly, it looks like it spends much less time in system mode when going through a file. Maybe that's because the file is consistently around the same size and is mmapped into memory continuously, while piping requires regular (more than once per module) context switches between the two processes?

Honestly, I would say something is wrong (i.e. the first paragraph). I didn't build that with efficiency in mind in any way...

On Thu, Mar 15, 2018, 07:47 Fedor Sergeev via llvm-dev <llvm-dev at lists.llvm.org> wrote:

> Hmm...
>
> I tried Alexandre's fix from D44244 and surprisingly it appears that
> just using -print-module-scope w/o any additional git actions is
> waaaay slower on my testcase than -git-commit-after-all.
>
> Hell, even a plain -print-after-all is slower:
>
> ] time R/bin/opt -O3 some-ir.ll -disable-output -git-commit-after-all 2>/dev/null
> real 0m8.041s
> user 0m7.133s
> sys 0m0.936s
> ] time R/bin/opt -O3 some-ir.ll -disable-output -print-after-all 2>/dev/null
> real 0m13.575s
> user 0m6.179s
> sys 0m7.394s
>
> I can't really explain that...
>
> regards,
> Fedor.
>
> On 03/15/2018 04:30 PM, Fedor Sergeev via llvm-dev wrote:
> > On 03/15/2018 01:32 PM, Fedor Sergeev via llvm-dev wrote:
> >> For this to be really usable in this setup we need additionally to:
> >> - extend -print-module-scope to cover basic block passes
> >> - introduce a clear way to separate module IRs as those are being
> >>   printed by -print-after-all
> >>
> >> But yes, it should work, and a wrapper that pipes to git fast-import
> >> seems to be the best way to handle it.
> >
> > A simple 20-line perl script does the trick pretty easily:
> > https://pastebin.com/4J0b5Tr8
> >
> > (this assumes my local modification to introduce the *** END OF ** IR
> > DUMP marker at the end of -print-module-scope's IR module dump)
> >
> > ] git init
> > ] RA/bin/opt -O3 some-ir.ll -disable-output -print-after-all -print-module-scope 2>&1 | filter-LLVM-ir-print.pl | git fast-import --done --date-format=now
> > ....
> >
> > The majority of the time is spent actually printing the IR (~2m for my
> > testcase). Fast-import takes just a second.
> >
> > regards,
> > Fedor.
> >
> >> regards,
> >> Fedor.
> >>
> >> On 03/15/2018 12:31 AM, Daniel Neilson via llvm-dev wrote:
> >>> The print-module-after-all type of option exists in upstream:
> >>>   -print-module-scope - When printing IR for
> >>>   print-[before|after]{-all} always print a module IR
> >>>
> >>> commit 7d160f714357f6784ead669ce516e94991c12e5a
> >>> Author: Fedor Sergeev <fedor.sergeev at azul.com>
> >>> Date: Fri Dec 1 17:42:46 2017 +0000
> >>>
> >>>     IR printing improvement for function passes - introducing
> >>>     -print-module-scope
> >>>
> >>>     Summary:
> >>>     When debugging function passes it happens to be rather useful to dump
> >>>     the whole module before the transformation and then use this dump
> >>>     to analyze this single transformation by running it separately
> >>>     on that particular module state.
> >>>
> >>>     Introducing
> >>>       -print-module-scope
> >>>     debugging option that forces all the function-level IR dumps
> >>>     to become whole-module dumps.
> >>>
> >>>     This option builds on top of normal dumping controls like
> >>>       -print-before/after
> >>>       -filter-print-funcs
> >>>
> >>>     The plan is to eventually extend this option to cover other local passes
> >>>     (at least loop passes) but that should go as a separate change.
> >>>
> >>> Loop passes here:
> >>> commit 5608259c999fb77c5d6093895696f4daebe6b8cd
> >>> Author: Fedor Sergeev <fedor.sergeev at azul.com>
> >>> Date: Fri Dec 1 18:33:58 2017 +0000
> >>>
> >>>     IR printing improvement for loop passes - handle -print-module-scope
> >>>
> >>> -Daniel
> >>>
> >>>> On Mar 14, 2018, at 3:51 PM, Philip Reames via llvm-dev
> >>>> <llvm-dev at lists.llvm.org> wrote:
> >>>>
> >>>> This is interesting, and might be useful. I don't know that this
> >>>> is broadly useful enough for upstream inclusion, but if you could
> >>>> post this to github somewhere, I might play with it.
> >>>>
> >>>> There might also be room to factor out common functionality. We've
> >>>> also run into the need to print whole-module instead of containing
> >>>> construct (i.e. this loop). If we added upstream support for
> >>>> something along the lines of -print-module-after-all, building the
> >>>> git history could easily be done as a post processing step.
> >>>>
> >>>> Philip
> >>>>
> >>>> On 03/06/2018 10:43 AM, Alexandre Isoard via llvm-dev wrote:
> >>>>> Hello,
> >>>>>
> >>>>> I had a stupid idea recently that turned out not so stupid after
> >>>>> all. I wanted to be able to "see" an entire pass pipeline in
> >>>>> action to find unnecessary transformations and/or missed
> >>>>> opportunities and generally improve the debug-ability of LLVM.
> >>>>>
> >>>>> So as the title suggests, I implemented an equivalent of
> >>>>> "-print-after-all" but instead of printing to stdout I dump into
> >>>>> a file that gets committed into a temporary git repository. There
> >>>>> are some quirks with it but it's working and is actually awesome.
> >>>>> For example, at first sight, I see lcssa and instcombine
> >>>>> cancelling each other's work multiple times.
> >>>>>
> >>>>> Of course, that has a big impact on compile time when enabled, but
> >>>>> that's still practical (git being quite good at its job) when
> >>>>> debugging.
> >>>>>
> >>>>> There are improvements I can make, but would you guys be interested
> >>>>> in such a feature?
> >>>>>
> >>>>> --
> >>>>> *Alexandre Isoard*
> >>>>>
> >>>>> _______________________________________________
> >>>>> LLVM Developers mailing list
> >>>>> llvm-dev at lists.llvm.org
> >>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
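The pastebin script quoted above is not reproduced in the thread, so the following is only a rough Python sketch of the same idea, not Fedor's script: it splits a stream of whole-module dumps into one git fast-import commit per pass. The helper names (data_block, to_fast_import), the committer identity, and the exact spelling of both banners are assumptions. In particular, the end marker comes from Fedor's local modification and is not emitted by upstream opt.

```python
import re

# Banner printed before each dump, e.g. "*** IR Dump After Dead Code Elimination ***".
# Some dumps prefix it with "; ". The exact banner text is an assumption.
DUMP_HEADER = re.compile(r"^(?:; )?\*\*\* IR Dump After (.+?) \*\*\*")
# End marker from Fedor's local modification (not upstream); spelling assumed.
DUMP_END = "*** END OF ** IR DUMP"


def data_block(payload: bytes) -> bytes:
    """Encode payload as a fast-import 'data' command with an exact byte count."""
    return b"data %d\n" % len(payload) + payload + b"\n"


def to_fast_import(lines, path="some-ir.ll", ref="refs/heads/master"):
    """Turn an iterable of dump lines into a single fast-import stream (bytes)."""
    out = []
    title, body = None, []
    for line in lines:
        header = DUMP_HEADER.match(line)
        if header:
            # Start collecting a new module dump; the pass name becomes
            # the commit message.
            title, body = header.group(1), []
        elif line.startswith(DUMP_END) and title is not None:
            out.append(b"commit " + ref.encode() + b"\n")
            # The literal "now" is what --date-format=now expects as the date.
            out.append(b"committer opt <opt@localhost> now\n")
            out.append(data_block(title.encode()))
            out.append(b"M 100644 inline " + path.encode() + b"\n")
            out.append(data_block("".join(body).encode()))
            title = None
        elif title is not None:
            body.append(line)
    out.append(b"done\n")  # needed because fast-import is run with --done
    return b"".join(out)
```

To use it in place of filter-LLVM-ir-print.pl, one would read the dump from standard input and write the returned bytes to sys.stdout.buffer, keeping the rest of the pipeline as in the quoted command.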
Fedor Sergeev via llvm-dev
2018-Mar-15 15:25 UTC
[llvm-dev] Commit module to Git after each Pass
On 03/15/2018 06:09 PM, Alexandre Isoard wrote:
> Does -git-commit-after-all print correctly after all the passes? Maybe
> I messed it up and it skips some passes, and therefore has less to do?

I did verify that the total number of lines committed to git is reasonably high:

] git rev-list master | while read cmt; do git show $cmt:some-ir.ll; done | wc -l
1587532

The corresponding number for -print-after-all (w/o -print-module-scope):

] time R/bin/opt -O3 some-ir.ll -disable-output -print-after-all 2>&1 | wc -l
219328

The number of commits seems to be right as well.

> Either that, or piping has a higher cost than writing to a file.
> Surprisingly, it looks like it spends much less time in system mode
> when going through a file. Maybe that's because the file is
> consistently around the same size and is mmapped into memory
> continuously, while piping requires regular (more than once per
> module) context switches between the two processes?
>
> Honestly, I would say something is wrong (i.e. the first paragraph). I
> didn't build that with efficiency in mind in any way...

Well, git by itself is so focused on performance that it's not surprising to me that even using git add/git commit does not cause performance penalties.

regards,
Fedor.
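The commit-count check mentioned above can be made a little more explicit. A minimal sketch, assuming each dump starts with a banner containing "*** IR Dump After " (the helper name summarize_dumps is hypothetical): count the banners in the -print-after-all output and compare that number against the number of commits.

```python
def summarize_dumps(lines):
    """Return (number of IR dumps, total number of lines) for a dump log."""
    dumps = 0
    total = 0
    for line in lines:
        total += 1
        # Every dump is assumed to open with a banner such as
        # "*** IR Dump After <pass name> ***".
        if "*** IR Dump After " in line:
            dumps += 1
    return dumps, total
```

The first value should match the output of `git rev-list --count master` if every pass produced exactly one commit. The second corresponds to the `wc -l` figure.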
Alexandre Isoard via llvm-dev
2018-Mar-15 15:45 UTC
[llvm-dev] Commit module to Git after each Pass
Huh. Great! 😁

I don't believe my poor excuse from earlier (else we should map all pipes into files!), but I'm curious why we spend less time in system mode when going through a file than through a pipe. Maybe /dev/null is not as efficient as we might think? I can't believe I'm saying that...

On Thu, Mar 15, 2018, 08:25 Fedor Sergeev <fedor.sergeev at azul.com> wrote:

> Well, git by itself is so focused on performance that it's not
> surprising to me that even using git add/git commit does not cause
> performance penalties.

Sure, but still, I write more stuff (the entire module) into a slower destination (a file). Even ignoring git execution time, it's counterintuitive. The only difference is that while I write more, it overwrites itself continuously instead of being a long linear stream.

I was thinking of mmapping the file instead of going through our raw_ostream, but maybe that's unnecessary then...
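One way to probe where the system time goes is to write the same payload to the same destination at different write granularities and compare the counters. The sketch below is an assumption about what might be worth measuring, not a diagnosis: nobody in the thread established that the number of write calls is the cause. The helper name timed_write is hypothetical.

```python
import os


def timed_write(path, payload, chunk_size):
    """Write payload to path in chunk_size pieces.

    Returns (number of write calls, user seconds, system seconds) as
    measured for this process with os.times().
    """
    before = os.times()
    fd = os.open(path, os.O_WRONLY)
    calls = 0
    try:
        for off in range(0, len(payload), chunk_size):
            view = payload[off:off + chunk_size]
            # os.write may write fewer bytes than requested, so loop
            # until the whole chunk has been handed to the kernel.
            while view:
                written = os.write(fd, view)
                view = view[written:]
                calls += 1
    finally:
        os.close(fd)
    after = os.times()
    return calls, after.user - before.user, after.system - before.system
```

Calling it on os.devnull with a small and a large chunk_size, then on a regular file and on a pipe, would show whether the system-time gap follows the destination or the number of write calls.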