Fedor Sergeev via llvm-dev
2018-Mar-15 22:21 UTC
[llvm-dev] Commit module to Git after each Pass
The git-commit-after-all solution has one serious issue: its git handling is hardcoded, which makes it look problematic from many angles (picking a proper git, selecting the exact way of storing information, creating the repository, replacing the file, etc.).

Just dumping the information in a way that allows easy subsequent machine processing seems to be a more flexible, less cluttered and overall cleaner solution that avoids making any of the "user interface" decisions mentioned above.

We need to understand why git-commit-after-all works faster than print-after-all.
I don't believe in magic... yet :)

And, btw, thanks for both the idea and the patch.

regards,
  Fedor.

On 03/16/2018 12:03 AM, Alexandre Isoard wrote:
> If this is faster than -print-after-all we may actually consider pushing that in the code base then? (after diligent code review of course)
>
> Note that it uses the same printing method as -print-after-all:
> - create a pass of the same pass kind as the pass we just ran
> - use Module::print(raw_ostream) to print (except -print-after-all only prints the concerned part and into stdout)
>
> If there is improvement to be done to print-after-all it might also improve git-commit-after-all. (unless that only improves speed when printing constructs smaller than module)
>
> In any case, it is, to me, much more usable (and extensible) than -print-after-all. But it requires git to be in PATH (I'm curious if that works on Windows).
>
> On Thu, Mar 15, 2018 at 1:35 PM, Daniel Sanders <daniel_l_sanders at apple.com> wrote:
>
>     Does https://reviews.llvm.org/D44132 help at all?
>
>> On 15 Mar 2018, at 09:16, Philip Reames via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>
>> The most likely answer is that the printer used by print-after-all is slow. I know there were some changes made around passing in some form of state cache (metadata related?) and that running printers without doing so works, but is dog slow. I suspect the print-after-all support was never updated. Look at what we do for the normal IR emission "-S" and see if print-after-all is out of sync.
>>
>> Philip
>>
>> On 03/15/2018 08:45 AM, Alexandre Isoard via llvm-dev wrote:
>>> Huh. Great! 😁
>>>
>>> I don't believe my poor excuse from earlier (else we should map all pipes into files!), but I'm curious why we spend less time in system mode when going through a file than through a pipe. Maybe /dev/null is not as efficient as we might think? I can't believe I'm saying that...
>>>
>>> On Thu, Mar 15, 2018, 08:25 Fedor Sergeev <fedor.sergeev at azul.com> wrote:
>>>
>>>     Well, git by itself is so focused on performance, so it's not surprising to me that even using git add/git commit does not cause performance penalties.
>>>
>>> Sure, but still, I write more stuff (entire module) into a slower destination (file). Even ignoring git execution time it's counterintuitive.
>>>
>>> The only difference is that while I write more, it overwrites itself continuously, instead of being a long linear stream. I was thinking of mmapping the file instead of going through our raw_stream, but maybe that's unnecessary then...
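A minimal sketch of the "dump, then post-process" idea, in Python. It assumes the dump is a plain concatenation of full-module prints and that each one begins with a `; ModuleID` line, which is how LLVM's textual IR printer starts a module. The name `split_module_dumps` is made up for this illustration and is not part of LLVM.

```python
MODULE_MARKER = "; ModuleID"  # first line of every module printed as textual IR


def split_module_dumps(lines):
    """Yield one string per module snapshot found in an iterable of lines.

    Anything before the first marker is ignored; anything between two
    markers is attributed to the snapshot that precedes it.
    """
    current = []
    started = False
    for line in lines:
        if line.startswith(MODULE_MARKER):
            if started:
                yield "".join(current)
            current = []
            started = True
        if started:
            current.append(line)
    if started:
        yield "".join(current)
```

Each snapshot could then be diffed against the previous one, or stored in whatever way the user prefers (git or otherwise), by a script outside of opt.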
Alexandre Isoard via llvm-dev
2018-Mar-16 02:32 UTC
[llvm-dev] Commit module to Git after each Pass
On Thu, Mar 15, 2018 at 3:21 PM, Fedor Sergeev <fedor.sergeev at azul.com> wrote:
> git-commit-after-all solution has one serious issue - it has a hardcoded git handling which
> makes it look problematic from many angles (picking a proper git, selecting exact way of storing information, creating repository, replacing the file etc etc).

True. Although that can be said of our 'dot' printing passes too (they create uncontrollable file names in an uncontrollable directory) and of the 'view' passes (they call the 'dot' and 'gv' commands).

> Just dumping information in a way that allows easy subsequent machine processing
> seems to be a more flexible, less cluttered and overall clean solution that allows to avoid
> making any of "user interface" decisions mentioned above.

Maybe those 'dot' and 'view' passes are bad examples too and we should print to stdout and rely on some post-processing? I never liked the fact that they pollute my directory. (Also, I much prefer xdot to gv+ps as a viewer.)

> We need to understand why git-commit-after-all works faster than print-after-all.
> I don't believe in magic... yet :)

Yes, really curious too! :-)

> And, btw, thanks for both the idea and the patch.

You are welcome. Glad it's of use.

> regards,
> Fedor.

--
Alexandre Isoard
Fedor Sergeev via llvm-dev
2018-Mar-21 10:08 UTC
[llvm-dev] Commit module to Git after each Pass
On 03/16/2018 01:21 AM, Fedor Sergeev via llvm-dev wrote:
> We need to understand why git-commit-after-all works faster than print-after-all.

I made an interesting experiment today and extended your git-commit-after-all to avoid issuing any git commands if git-repo starts with "/dev/".

With git-repo=/dev/stderr it becomes functionally equivalent to print-after-all + print-module-scope, dumping the module into stderr after each pass.

On my testcase:

# first, a normal git-commit-after-all execution
] rm -rf test-git; time $RR/bin/opt -O1 some-ir.ll -disable-output -git-commit-after-all -git-repo=./test-git

real    0m7.172s
user    0m6.303s
sys     0m0.902s

# then the "printing" git-commit-after-all execution
] time $RR/bin/opt -O1 some-ir.ll -disable-output -git-commit-after-all -git-repo=/dev/stderr 2>&1 | grep -c '^; ModuleID'
615

real    0m2.893s
user    0m2.859s
sys     0m0.356s

# and finally print-after-all
] time $RR/bin/opt -O1 some-ir.ll -disable-output -print-after-all -print-module-scope 2>&1 | grep -c "^; ModuleID"
526

real    2m8.024s
user    0m55.933s
sys     3m19.253s
]

Ugh... 60x???
Now I'm set to analyze this astonishing difference that threatens my sanity (while I'm still sane... hopefully).

regards,
  Fedor.

PS btw, I checked /dev/null, and it works faster than /dev/stderr, as expected :)
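For reference, the slowdown in the timings above can be worked out with a few lines of Python. This is only arithmetic on the numbers quoted in the message; the helper names `to_seconds` and `ratio` are made up for this illustration. By wall-clock time the print-after-all run comes out roughly 44x slower than the /dev/stderr variant, and by CPU time (user + sys) roughly 79x, so "60x" sits between the two.

```python
import re

# Matches the shell's `time` format, e.g. "2m8.024s" or "0m0.356s".
_TIME_RE = re.compile(r"^(?:(\d+)m)?(\d+(?:\.\d+)?)s$")


def to_seconds(text):
    """Convert a string such as '2m8.024s' into seconds."""
    match = _TIME_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized time: {text!r}")
    minutes = int(match.group(1) or 0)
    return minutes * 60 + float(match.group(2))


def ratio(slow, fast, keys):
    """Ratio of the summed `keys` timings of `slow` over those of `fast`."""
    slow_total = sum(to_seconds(slow[k]) for k in keys)
    fast_total = sum(to_seconds(fast[k]) for k in keys)
    return slow_total / fast_total


# Numbers copied from the message above.
print_after_all = {"real": "2m8.024s", "user": "0m55.933s", "sys": "3m19.253s"}
stderr_variant = {"real": "0m2.893s", "user": "0m2.859s", "sys": "0m0.356s"}

wall_clock = ratio(print_after_all, stderr_variant, ["real"])
cpu_time = ratio(print_after_all, stderr_variant, ["user", "sys"])
print(f"wall-clock: {wall_clock:.1f}x, CPU: {cpu_time:.1f}x")
```

Note how much of the slow run is system time (3m19s out of about 4m15s of CPU time), which hints that the cost lies in system calls rather than in the IR printer itself.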
Jeremy Lakeman via llvm-dev
2018-Mar-21 12:23 UTC
[llvm-dev] Commit module to Git after each Pass
Do you really need to write the entire module to a single file? (Hence my earlier hint...)
Why not write out a separate file for each def, so you don't need to dump functions that haven't changed?
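A rough sketch of the "one file per def" idea, done as text post-processing in Python rather than inside opt. It is an approximation under stated assumptions: it only recognizes function definitions whose `define` line starts at column 0 and whose body ends with a `}` at column 0, which matches the usual textual IR layout but is not a real IR parser. The names `split_functions`, `changed_functions` and `write_functions`, and the `<name>.ll` file scheme, are hypothetical.

```python
import re
from pathlib import Path

# `define ... @name(` at the start of a line; the name may be quoted.
_DEFINE_RE = re.compile(r'^define\b[^@]*@("[^"]+"|[\w.$-]+)\s*\(')


def split_functions(module_text):
    """Return {function name: definition text} for each `define` found."""
    functions = {}
    name, body = None, []
    for line in module_text.splitlines(keepends=True):
        if name is None:
            match = _DEFINE_RE.match(line)
            if match:
                name, body = match.group(1).strip('"'), [line]
        else:
            body.append(line)
            if line.rstrip() == "}":  # closing brace at column 0
                functions[name] = "".join(body)
                name = None
    return functions


def changed_functions(old, new):
    """Names of functions in `new` that are absent from or differ in `old`."""
    return sorted(n for n, text in new.items() if old.get(n) != text)


def write_functions(functions, names, out_dir):
    """Write the selected functions to out_dir/<sanitized name>.ll."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for name in names:
        safe = re.sub(r"[^\w.-]", "_", name)  # keep file names portable
        (out / f"{safe}.ll").write_text(functions[name])
```

After each pass, only the names returned by `changed_functions(previous, current)` would need to be rewritten, so a pass that touches one function no longer rewrites the whole module.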
Fedor Sergeev via llvm-dev
2018-Mar-22 15:04 UTC
[llvm-dev] Commit module to Git after each Pass
Oh, well... as usual, the answer appears to be pretty obvious.
99% of the time is spent inside the plain write.

-print-after-all prints into llvm::errs(), which is an *unbuffered* raw_fd_ostream.
And -git-commit-after-all opens a *buffered* raw_fd_ostream.

As soon as I hacked -print-after-all to use a buffered stream to stderr, performance went back to the normal, expected values:

] time bin/opt -O1 big-ir.ll -disable-output -print-after-all -print-module-scope 2>&1 | grep -c "^; ModuleID"
526

real    0m2.363s
user    0m2.373s
sys     0m0.271s
]

So the moral of this story is: we should not be printing module IR into dbgs()/errs().
And so the idea of streaming IR module dumps into a buffered stream and then postprocessing them seems to be the right one.

regards,
  Fedor.
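The effect of buffering can be illustrated outside LLVM. The sketch below is a Python analogy, not LLVM's raw_ostream code: a hypothetical `CountingSink` stands in for the file descriptor and counts how many write calls reach it, once when every small piece is written directly and once through a 64 KiB buffer. The piece contents and the buffer size are arbitrary choices for the illustration.

```python
import io


class CountingSink(io.RawIOBase):
    """Raw sink that discards data but counts the write calls reaching it."""

    def __init__(self):
        super().__init__()
        self.calls = 0
        self.bytes_written = 0

    def writable(self):
        return True

    def write(self, data):
        self.calls += 1
        self.bytes_written += len(data)
        return len(data)


def emit(stream, pieces):
    """Write many small pieces, the way a printer emits short fragments."""
    for piece in pieces:
        stream.write(piece)
    stream.flush()


pieces = [b"  %x = add i32 %a, %b\n"] * 10000

# Unbuffered: every small piece becomes its own call on the sink.
unbuffered = CountingSink()
emit(unbuffered, pieces)

# Buffered: pieces are collected and handed to the sink in large chunks.
buffered_sink = CountingSink()
buffered_stream = io.BufferedWriter(buffered_sink, buffer_size=64 * 1024)
emit(buffered_stream, pieces)

print("unbuffered write calls:", unbuffered.calls)
print("buffered write calls:  ", buffered_sink.calls)
```

With a real file descriptor each of those calls would be a system call, which is consistent with the large sys time seen in the slow print-after-all run.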