Justin Lebar via llvm-dev
2017-Dec-06  05:59 UTC
[llvm-dev] Who wants faster LLVM/Clang builds?
> To find which header files could be removed it scanned the file for"#include" lines and tried to remove them one by one (checking if the file still compiles after the removal). When there were no more include lines to remove, we verified the change with ninja+ninja check. It looks like this makes us rely heavily on transitive header includes -- is that right? Specifically what I mean is, suppose foo.cc uses class1 and class2. class2.h #includes "class1.h". If foo.cc had a #include for class1.h and class2.h, this tool will remove the include of class1.h. Is that right? If so, seems like probably more aggressive than we want to be, because it makes our codebase fragile: If someone removes class1.h from class2.h, it may break foo.cc. I'm sure we can find many situations like this today, but this approach seems it would make that situation much more common. IWYU would avoid this pitfall, although I know IWYU has its own problems. Alternatively maybe if the tool only removed headers whose removal significantly decreased the preprocessed size of the file, we'd know that we actually removed "something interesting". -Justin On Tue, Dec 5, 2017 at 9:38 PM Chris Lattner via llvm-dev < llvm-dev at lists.llvm.org> wrote:> I, for one, want faster builds. > > Beyond that though, this seems like obvious goodness to reduce coupling in > the codebase. I’ve only skimmed the patch, but this seems like a clearly > amazingly great ideas. Did you use the IWYU tool or something else? > > -Chris > > > On Dec 5, 2017, at 3:40 PM, Mikhail Zolotukhin via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > > Hi, > > Recently I've done some experiments on the LLVM/Clang code and discovered > that many of our source files often include unnecessary header files. I > wrote a simple tool that eliminates redundant includes and estimates > benefits of doing it, and the results were quite nice: for some files we > were able to save 90% of compile time! I think we want to apply some of the > cleanups I found, but I'm not sure how to better do it: the total patches > are 8k lines of code for LLVM and 3k lines of code for clang (I'll attach > them for reference). My suggestion would be that people take a look at the > list of changed files and pick the changes for the piece of code they are > working on if the changes look sane (the changes do need some checking > before committing). Does it sound like a good idea? I'd appreciate any > feedback on what can we do here. > > The list of files for which removing redundant headers improved compile > time (the numbers are compile time in seconds for a Debug build): > > *LLVM top 10* > *Filename Old New Delta* > lib/CodeGen/GlobalISel/GlobalISel.cpp 0.26 0.02 -91.9% > lib/MC/MCLabel.cpp 0.19 0.02 -88.2% > tools/llvm-readobj/ObjDumper.cpp 0.43 0.10 -76.5% > lib/MC/MCWinEH.cpp 0.51 0.13 -74.3% > lib/Transforms/Vectorize/Vectorize.cpp 0.72 0.29 -59.7% > tools/llvm-diff/DiffLog.cpp 0.58 0.26 -54.6% > lib/Target/ARM/MCTargetDesc/ARMMachORelocationInfo.cpp 0.46 0.26 -44.1% > lib/DebugInfo/DWARF/DWARFExpression.cpp 0.68 0.38 -43.3% > lib/LTO/LTOModule.cpp 2.25 1.33 -41.1% > lib/Target/TargetMachine.cpp 1.76 1.10 -37.8% > > Full list: > > <llvm.txt> > > > > *Clang top 10* > *Filename Old New Delta* > tools/libclang/CXString.cpp 1.70 0.25 -85.2% > lib/Tooling/CommonOptionsParser.cpp 1.69 0.55 -67.3% > lib/AST/StmtViz.cpp 1.02 0.44 -57.4% > tools/driver/cc1_main.cpp 2.26 0.97 -57.1% > unittests/CodeGen/BufferSourceTest.cpp 3.08 1.83 -40.6% > lib/CodeGen/CGLoopInfo.cpp 1.91 1.34 -29.9% > unittests/Tooling/RefactoringActionRulesTest.cpp 2.46 1.79 -27.0% > unittests/CodeGen/CodeGenExternalTest.cpp 3.43 2.52 -26.5% > tools/libclang/CXStoredDiagnostic.cpp 1.67 1.26 -24.8% > tools/clang-func-mapping/ClangFnMapGen.cpp 2.48 1.89 -23.8% > > Full list: > > <clang.txt> > > > The corresponding patches (careful, they are big): > > <llvm_redundant_headers.patch> > <clang_redundant_headers.patch> > > > *Methodology* > My tool took the compile_commands.json from LLVM build and iterated over > files trying to remove redundant headers. To find which header files could > be removed it scanned the file for "#include" lines and tried to remove > them one by one (checking if the file still compiles after the removal). > When there were no more include lines to remove, we verified the change > with ninja+ninja check. After it we compared preprocessed file size before > and after the change hoping to see that it dropped and then checked the > compile time impact. > NB: As a side effect of this approach we removed all include-lines from > inactive "ifdef" sections, which means that the patches **will** break > other configurations if applied as-is. > > Thanks, > Michael > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20171206/52155011/attachment.html>
Michael Zolotukhin via llvm-dev
2017-Dec-06  06:08 UTC
[llvm-dev] Who wants faster LLVM/Clang builds?
> On Dec 5, 2017, at 9:59 PM, Justin Lebar <jlebar at google.com> wrote: > > > To find which header files could be removed it scanned the file for "#include" lines and tried to remove them one by one (checking if the file still compiles after the removal). When there were no more include lines to remove, we verified the change with ninja+ninja check. > > It looks like this makes us rely heavily on transitive header includes -- is that right?Correct.> > Specifically what I mean is, suppose foo.cc uses class1 and class2. class2.h #includes "class1.h". If foo.cc had a #include for class1.h and class2.h, this tool will remove the include of class1.h. Is that right?Correct.> > If so, seems like probably more aggressive than we want to be, because it makes our codebase fragile: If someone removes class1.h from class2.h, it may break foo.cc.I completely agree, that’s why I intended this patch to be a collection of hints for real people rather than something we want to commit as-is. I think I can make the tool check the patch again and try to restore #include lines that don’t contribute to the speedup.> > I'm sure we can find many situations like this today, but this approach seems it would make that situation much more common. > > IWYU would avoid this pitfall, although I know IWYU has its own problems. Alternatively maybe if the tool only removed headers whose removal significantly decreased the preprocessed size of the file, we'd know that we actually removed "something interesting”.That’s a good idea. Thanks for the feedback! Michael> > -Justin > > On Tue, Dec 5, 2017 at 9:38 PM Chris Lattner via llvm-dev <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote: > I, for one, want faster builds. > > Beyond that though, this seems like obvious goodness to reduce coupling in the codebase. I’ve only skimmed the patch, but this seems like a clearly amazingly great ideas. Did you use the IWYU tool or something else? > > -Chris > > > >> On Dec 5, 2017, at 3:40 PM, Mikhail Zolotukhin via llvm-dev <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote: >> > >> Hi, >> >> Recently I've done some experiments on the LLVM/Clang code and discovered that many of our source files often include unnecessary header files. I wrote a simple tool that eliminates redundant includes and estimates benefits of doing it, and the results were quite nice: for some files we were able to save 90% of compile time! I think we want to apply some of the cleanups I found, but I'm not sure how to better do it: the total patches are 8k lines of code for LLVM and 3k lines of code for clang (I'll attach them for reference). My suggestion would be that people take a look at the list of changed files and pick the changes for the piece of code they are working on if the changes look sane (the changes do need some checking before committing). Does it sound like a good idea? I'd appreciate any feedback on what can we do here. >> >> The list of files for which removing redundant headers improved compile time (the numbers are compile time in seconds for a Debug build): >> >> LLVM top 10 >> Filename Old New Delta >> lib/CodeGen/GlobalISel/GlobalISel.cpp 0.26 0.02 -91.9% >> lib/MC/MCLabel.cpp 0.19 0.02 -88.2% >> tools/llvm-readobj/ObjDumper.cpp 0.43 0.10 -76.5% >> lib/MC/MCWinEH.cpp 0.51 0.13 -74.3% >> lib/Transforms/Vectorize/Vectorize.cpp 0.72 0.29 -59.7% >> tools/llvm-diff/DiffLog.cpp 0.58 0.26 -54.6% >> lib/Target/ARM/MCTargetDesc/ARMMachORelocationInfo.cpp 0.46 0.26 -44.1% >> lib/DebugInfo/DWARF/DWARFExpression.cpp 0.68 0.38 -43.3% >> lib/LTO/LTOModule.cpp 2.25 1.33 -41.1% >> lib/Target/TargetMachine.cpp 1.76 1.10 -37.8% >> >> Full list: > >> <llvm.txt> > >> >> >> Clang top 10 >> Filename Old New Delta >> tools/libclang/CXString.cpp 1.70 0.25 -85.2% >> lib/Tooling/CommonOptionsParser.cpp 1.69 0.55 -67.3% >> lib/AST/StmtViz.cpp 1.02 0.44 -57.4% >> tools/driver/cc1_main.cpp 2.26 0.97 -57.1% >> unittests/CodeGen/BufferSourceTest.cpp 3.08 1.83 -40.6% >> lib/CodeGen/CGLoopInfo.cpp 1.91 1.34 -29.9% >> unittests/Tooling/RefactoringActionRulesTest.cpp 2.46 1.79 -27.0% >> unittests/CodeGen/CodeGenExternalTest.cpp 3.43 2.52 -26.5% >> tools/libclang/CXStoredDiagnostic.cpp 1.67 1.26 -24.8% >> tools/clang-func-mapping/ClangFnMapGen.cpp 2.48 1.89 -23.8% >> >> Full list: > >> <clang.txt> > >> >> The corresponding patches (careful, they are big): > >> <llvm_redundant_headers.patch> >> <clang_redundant_headers.patch> > >> >> Methodology >> My tool took the compile_commands.json from LLVM build and iterated over files trying to remove redundant headers. To find which header files could be removed it scanned the file for "#include" lines and tried to remove them one by one (checking if the file still compiles after the removal). When there were no more include lines to remove, we verified the change with ninja+ninja check. After it we compared preprocessed file size before and after the change hoping to see that it dropped and then checked the compile time impact. >> NB: As a side effect of this approach we removed all include-lines from inactive "ifdef" sections, which means that the patches *will* break other configurations if applied as-is. >> >> Thanks, >> Michael >> _______________________________________________ >> LLVM Developers mailing list >> llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org> >> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev <http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev> > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org> > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev <http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev>-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20171205/30b58629/attachment-0001.html>
Kim Gräsman via llvm-dev
2017-Dec-06  07:15 UTC
[llvm-dev] [cfe-dev] Who wants faster LLVM/Clang builds?
Hey all, IWYU maintainer here. I wanted to make a small observation. Surprisingly, IWYU will most often *add* includes to a reasonably well-factored codebase, and this ties into Chris' comment:> Beyond that though, this seems like obvious > goodness to reduce coupling in the codebase.Just blindly removing includes will probably increase coupling, not reduce it, because it optimizes for transitive dependencies. So if a refactor makes it possible to remove an include in a faraway upstream header, that will potentially break lots of code. One of IWYU's benefits, imho, is that it works against this. But it's a much harder problem, which is why it's less immediately effective :) Good to see some numbers below! - Kim Den 6 dec. 2017 7:09 fm skrev "Michael Zolotukhin via cfe-dev" < cfe-dev at lists.llvm.org>:> > > On Dec 5, 2017, at 9:59 PM, Justin Lebar <jlebar at google.com> wrote: > > > To find which header files could be removed it scanned the file for > "#include" lines and tried to remove them one by one (checking if the file > still compiles after the removal). When there were no more include lines to > remove, we verified the change with ninja+ninja check. > > It looks like this makes us rely heavily on transitive header includes -- > is that right? > > Correct. > > > Specifically what I mean is, suppose foo.cc uses class1 and class2. > class2.h #includes "class1.h". If foo.cc had a #include for class1.h and > class2.h, this tool will remove the include of class1.h. Is that right? > > Correct. > > > If so, seems like probably more aggressive than we want to be, because it > makes our codebase fragile: If someone removes class1.h from class2.h, it > may break foo.cc. > > I completely agree, that’s why I intended this patch to be a collection of > hints for real people rather than something we want to commit as-is. I > think I can make the tool check the patch again and try to restore #include > lines that don’t contribute to the speedup. > > > I'm sure we can find many situations like this today, but this approach > seems it would make that situation much more common. > > IWYU would avoid this pitfall, although I know IWYU has its own problems. > Alternatively maybe if the tool only removed headers whose removal > significantly decreased the preprocessed size of the file, we'd know that > we actually removed "something interesting”. > > That’s a good idea. Thanks for the feedback! > > Michael > > > -Justin > > On Tue, Dec 5, 2017 at 9:38 PM Chris Lattner via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > >> I, for one, want faster builds. >> >> Beyond that though, this seems like obvious goodness to reduce coupling >> in the codebase. I’ve only skimmed the patch, but this seems like a >> clearly amazingly great ideas. Did you use the IWYU tool or something else? >> >> -Chris >> >> >> On Dec 5, 2017, at 3:40 PM, Mikhail Zolotukhin via llvm-dev < >> llvm-dev at lists.llvm.org> wrote: >> >> Hi, >> >> Recently I've done some experiments on the LLVM/Clang code and discovered >> that many of our source files often include unnecessary header files. I >> wrote a simple tool that eliminates redundant includes and estimates >> benefits of doing it, and the results were quite nice: for some files we >> were able to save 90% of compile time! I think we want to apply some of the >> cleanups I found, but I'm not sure how to better do it: the total patches >> are 8k lines of code for LLVM and 3k lines of code for clang (I'll attach >> them for reference). My suggestion would be that people take a look at the >> list of changed files and pick the changes for the piece of code they are >> working on if the changes look sane (the changes do need some checking >> before committing). Does it sound like a good idea? I'd appreciate any >> feedback on what can we do here. >> >> The list of files for which removing redundant headers improved compile >> time (the numbers are compile time in seconds for a Debug build): >> >> *LLVM top 10* >> *Filename Old New Delta* >> lib/CodeGen/GlobalISel/GlobalISel.cpp 0.26 0.02 -91.9% >> lib/MC/MCLabel.cpp 0.19 0.02 -88.2% >> tools/llvm-readobj/ObjDumper.cpp 0.43 0.10 -76.5% >> lib/MC/MCWinEH.cpp 0.51 0.13 -74.3% >> lib/Transforms/Vectorize/Vectorize.cpp 0.72 0.29 -59.7% >> tools/llvm-diff/DiffLog.cpp 0.58 0.26 -54.6% >> lib/Target/ARM/MCTargetDesc/ARMMachORelocationInfo.cpp 0.46 0.26 -44.1% >> lib/DebugInfo/DWARF/DWARFExpression.cpp 0.68 0.38 -43.3% >> lib/LTO/LTOModule.cpp 2.25 1.33 -41.1% >> lib/Target/TargetMachine.cpp 1.76 1.10 -37.8% >> >> Full list: >> >> <llvm.txt> >> >> >> >> *Clang top 10* >> *Filename Old New Delta* >> tools/libclang/CXString.cpp 1.70 0.25 -85.2% >> lib/Tooling/CommonOptionsParser.cpp 1.69 0.55 -67.3% >> lib/AST/StmtViz.cpp 1.02 0.44 -57.4% >> tools/driver/cc1_main.cpp 2.26 0.97 -57.1% >> unittests/CodeGen/BufferSourceTest.cpp 3.08 1.83 -40.6% >> lib/CodeGen/CGLoopInfo.cpp 1.91 1.34 -29.9% >> unittests/Tooling/RefactoringActionRulesTest.cpp 2.46 1.79 -27.0% >> unittests/CodeGen/CodeGenExternalTest.cpp 3.43 2.52 -26.5% >> tools/libclang/CXStoredDiagnostic.cpp 1.67 1.26 -24.8% >> tools/clang-func-mapping/ClangFnMapGen.cpp 2.48 1.89 -23.8% >> >> Full list: >> >> <clang.txt> >> >> >> The corresponding patches (careful, they are big): >> >> <llvm_redundant_headers.patch> >> <clang_redundant_headers.patch> >> >> >> *Methodology* >> My tool took the compile_commands.json from LLVM build and iterated over >> files trying to remove redundant headers. To find which header files could >> be removed it scanned the file for "#include" lines and tried to remove >> them one by one (checking if the file still compiles after the removal). >> When there were no more include lines to remove, we verified the change >> with ninja+ninja check. After it we compared preprocessed file size before >> and after the change hoping to see that it dropped and then checked the >> compile time impact. >> NB: As a side effect of this approach we removed all include-lines from >> inactive "ifdef" sections, which means that the patches **will** break >> other configurations if applied as-is. >> >> Thanks, >> Michael >> _______________________________________________ >> LLVM Developers mailing list >> llvm-dev at lists.llvm.org >> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >> >> _______________________________________________ >> LLVM Developers mailing list >> llvm-dev at lists.llvm.org >> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >> > > > _______________________________________________ > cfe-dev mailing list > cfe-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev > >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20171206/e8b23d06/attachment.html>