Michael Zolotukhin via llvm-dev
2017-Dec-06 19:38 UTC
[llvm-dev] [cfe-dev] Who wants faster LLVM/Clang builds?
> On Dec 6, 2017, at 9:00 AM, mats petersson via cfe-dev <cfe-dev at lists.llvm.org> wrote:
>
> In my experience, a lot of time is spent on optimizing the code (assuming it's not a "-O0" build).

The numbers were actually for the debug build (-O0 -g), so for a Release build they would be different (presumably lower).

> Also redundant includes are largely fixed by header guards, and I believe Clang [and gcc as well as MS Compilers, and probably most others too] have an include-guard cache that determines that "we've already included foo.h, and it has include guards around the whole actual content of the file, so we can just skip it".

By redundant here I meant that we included a file but didn’t use any of its content (rather than that we included the same file twice).

> So I'm slightly dubious as to this being an efficient way of significantly reducing the total compilation time for the overall project - even if there are SOME cases where there is a significant improvement in a single file. The total time for a clean build [in wall-clock-time, not CPU-time] should be measured, making sure that there is enough memory. Doing a run of, say, five complete builds of the same thing [with suitable "clean" between to redo the whole build], take away the worst and the best, and perhaps also "modify one of the more common header files" (llvm/IR/Type.h for example) and build again.

On full builds the benefit is not big (around 1%, but the noise is high), but:
1) if we only take gains of more than, say, 5%, we’ll probably never see any,
2) I aim at changes that make the code strictly better (modulo David’s point about disk cache).
If any change is questionable from a maintenance or any other point of view, I’m all for dropping it.

Thanks,
Michael

> As Chris says, a benefit of "don't rebuild so much when editing a header file" is clearly a good benefit.
>
> --
> Mats
>
> On 6 December 2017 at 15:05, Bruce Hoult via cfe-dev <cfe-dev at lists.llvm.org> wrote:
> It's also likely that a lot of '#include "foo.h"' can be replaced with 'class foo;'
>
> Especially in the transitive inclusion case, instead of removing the #include entirely.
>
> On Wed, Dec 6, 2017 at 8:38 AM, Chris Lattner via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> I, for one, want faster builds.
>
> Beyond that though, this seems like obvious goodness to reduce coupling in the codebase. I’ve only skimmed the patch, but this seems like a clearly amazingly great idea. Did you use the IWYU tool or something else?
>
> -Chris
>
>> On Dec 5, 2017, at 3:40 PM, Mikhail Zolotukhin via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>
>> Hi,
>>
>> Recently I've done some experiments on the LLVM/Clang code and discovered that many of our source files include unnecessary header files. I wrote a simple tool that eliminates redundant includes and estimates the benefit of doing so, and the results were quite nice: for some files we were able to save 90% of compile time! I think we want to apply some of the cleanups I found, but I'm not sure how best to do it: the total patches are 8k lines of code for LLVM and 3k lines of code for clang (I'll attach them for reference). My suggestion would be that people take a look at the list of changed files and pick the changes for the piece of code they are working on if the changes look sane (the changes do need some checking before committing). Does it sound like a good idea? I'd appreciate any feedback on what we can do here.
>>
>> The list of files for which removing redundant headers improved compile time (the numbers are compile time in seconds for a Debug build):
>>
>> LLVM top 10
>> Filename                                                Old (s)  New (s)  Delta
>> lib/CodeGen/GlobalISel/GlobalISel.cpp                   0.26     0.02     -91.9%
>> lib/MC/MCLabel.cpp                                      0.19     0.02     -88.2%
>> tools/llvm-readobj/ObjDumper.cpp                        0.43     0.10     -76.5%
>> lib/MC/MCWinEH.cpp                                      0.51     0.13     -74.3%
>> lib/Transforms/Vectorize/Vectorize.cpp                  0.72     0.29     -59.7%
>> tools/llvm-diff/DiffLog.cpp                             0.58     0.26     -54.6%
>> lib/Target/ARM/MCTargetDesc/ARMMachORelocationInfo.cpp  0.46     0.26     -44.1%
>> lib/DebugInfo/DWARF/DWARFExpression.cpp                 0.68     0.38     -43.3%
>> lib/LTO/LTOModule.cpp                                   2.25     1.33     -41.1%
>> lib/Target/TargetMachine.cpp                            1.76     1.10     -37.8%
>>
>> Full list:
>> <llvm.txt>
>>
>> Clang top 10
>> Filename                                                Old (s)  New (s)  Delta
>> tools/libclang/CXString.cpp                             1.70     0.25     -85.2%
>> lib/Tooling/CommonOptionsParser.cpp                     1.69     0.55     -67.3%
>> lib/AST/StmtViz.cpp                                     1.02     0.44     -57.4%
>> tools/driver/cc1_main.cpp                               2.26     0.97     -57.1%
>> unittests/CodeGen/BufferSourceTest.cpp                  3.08     1.83     -40.6%
>> lib/CodeGen/CGLoopInfo.cpp                              1.91     1.34     -29.9%
>> unittests/Tooling/RefactoringActionRulesTest.cpp        2.46     1.79     -27.0%
>> unittests/CodeGen/CodeGenExternalTest.cpp               3.43     2.52     -26.5%
>> tools/libclang/CXStoredDiagnostic.cpp                   1.67     1.26     -24.8%
>> tools/clang-func-mapping/ClangFnMapGen.cpp              2.48     1.89     -23.8%
>>
>> Full list:
>> <clang.txt>
>>
>> The corresponding patches (careful, they are big):
>> <llvm_redundant_headers.patch>
>> <clang_redundant_headers.patch>
>>
>> Methodology
>> My tool took the compile_commands.json from LLVM build and iterated over files trying to remove redundant headers. To find which header files could be removed it scanned the file for "#include" lines and tried to remove them one by one (checking if the file still compiles after the removal). When there were no more include lines to remove, we verified the change with ninja+ninja check.
>> After that we compared the preprocessed file size before and after the change, hoping to see that it had dropped, and then checked the compile-time impact.
>> NB: As a side effect of this approach we removed all include lines from inactive "ifdef" sections, which means that the patches *will* break other configurations if applied as-is.
>>
>> Thanks,
>> Michael
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
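
The remove-one-include-at-a-time loop from the Methodology paragraph of the original post can be sketched roughly as follows. This is not Michael's tool, only a minimal illustration under stated assumptions: the file is held as a vector of lines, and `StillCompiles` is a hypothetical callback standing in for "re-run the compile command from compile_commands.json on the reduced file". The names `Lines`, `isIncludeLine` and `removeRedundantIncludes` are invented for this sketch.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

using Lines = std::vector<std::string>;

// Returns true if the line looks like an #include directive. Leading
// whitespace and whitespace after '#' are allowed. This is a prefix check,
// so it also matches directives such as #include_next.
bool isIncludeLine(const std::string &Line) {
  std::size_t Pos = Line.find_first_not_of(" \t");
  if (Pos == std::string::npos || Line[Pos] != '#')
    return false;
  Pos = Line.find_first_not_of(" \t", Pos + 1);
  return Pos != std::string::npos && Line.compare(Pos, 7, "include") == 0;
}

// Greedily tries to drop each #include line in turn. A removal is kept only
// if the hypothetical StillCompiles callback accepts the reduced file.
Lines removeRedundantIncludes(
    Lines Source, const std::function<bool(const Lines &)> &StillCompiles) {
  std::size_t I = 0;
  while (I < Source.size()) {
    if (!isIncludeLine(Source[I])) {
      ++I;
      continue;
    }
    Lines Candidate = Source;
    Candidate.erase(Candidate.begin() + static_cast<std::ptrdiff_t>(I));
    if (StillCompiles(Candidate))
      Source = std::move(Candidate); // include was not needed, keep removal
    else
      ++I; // include is required, leave it in place and move on
  }
  return Source;
}
```

Because the only acceptance test is "does it still compile in this configuration", a loop like this will also drop includes inside inactive "ifdef" sections, which is exactly the caveat raised in the NB of the original post.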
serge guelton via llvm-dev
2017-Dec-06 20:28 UTC
[llvm-dev] [cfe-dev] Who wants faster LLVM/Clang builds?
On Wed, Dec 06, 2017 at 11:38:54AM -0800, Michael Zolotukhin via llvm-dev wrote:
>
> > On Dec 6, 2017, at 9:00 AM, mats petersson via cfe-dev <cfe-dev at lists.llvm.org> wrote:
> >
> > In my experience, a lot of time is spent on optimizing the code (assuming it's not a "-O0" build).
> The numbers were actually for the debug build (-O0 -g), so for Release build they would be different (presumably lower).
> > Also redundant includes are largely fixed by header guards, and I believe Clang [and gcc as well as MS Compilers, and probably most others too] have an include guards-cache that determines that "we've already included foo.h, and it has include guards around the whole actual content of the file, so we can just skip it".
> By redundant here I meant that we included a file, but we didn’t use any of its content (rather than we included the same file twice).
> >
> > So I'm slightly dubious as to this being an efficient way of significantly reducing the total compilation time for the overall project - even if there are SOME cases where there is a significant improvement in a single file. The total time for a clean build [in wall-clock-time, not CPU-time] should be measured, making sure that there is enough memory. Doing a run of, say, five complete builds of the same thing [with suitable "clean" between to redo the whole build], take away the worst and the best, and perhaps also "modify one of the more common header files" (llvm/IR/Type.h for example) and build again.
> On full builds the benefit is not big (around 1%, but the noise is high), but: 1) if we only take gains more than, say, 5%, we’ll probably never see any, 2) I aim at changes that make the code strictly better (modulo David’s point about disk cache). If any change is questionable from maintenance or whatever other point of view, I’m all for dropping it.

my 2¢

+1 for point 2). Even leaving aside the speed gain, removing unused include files just looks like good coding practice to me.
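
The measurement protocol Mats suggests in the quoted text (several clean builds, discarding the best and the worst) amounts to a trimmed mean of wall-clock times. A minimal sketch, assuming the build times have already been collected in seconds; the function name `trimmedMeanSeconds` is invented here and nothing in the thread prescribes this exact computation:

```cpp
#include <algorithm>
#include <numeric>
#include <stdexcept>
#include <vector>

// Mean of the samples after discarding the single fastest and the single
// slowest run. Needs at least three samples so that something remains.
double trimmedMeanSeconds(std::vector<double> WallClockSeconds) {
  if (WallClockSeconds.size() < 3)
    throw std::invalid_argument("need at least three timed builds");
  std::sort(WallClockSeconds.begin(), WallClockSeconds.end());
  double Sum = std::accumulate(WallClockSeconds.begin() + 1,
                               WallClockSeconds.end() - 1, 0.0);
  return Sum / static_cast<double>(WallClockSeconds.size() - 2);
}
```

With five runs this averages the middle three. Given how small the reported full-build gain is relative to the noise, comparing two such trimmed means (before and after the patch) is only a rough indicator.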
Matthias Braun via llvm-dev
2017-Dec-06 21:17 UTC
[llvm-dev] [cfe-dev] Who wants faster LLVM/Clang builds?
- We do indeed have a lot of unnecessary includes around in llvm (or pretty much any other C++ project for that matter).
- I want faster builds.
- The only way to reliably fight this is indeed automatic tools.
- Having the right amount of includes also has documentation value and ideally lets you understand the structure of your project.
- However, relying on transitive includes works contrary to the last "understanding/documentation" point.
- (And as stated earlier, to have things really clean we want `class XXX;` instead of `#include "XXX.h"` wherever possible. And if you are serious about that, we also often have to reduce the amount of include code in the headers so we can move the `#include "XXX.h"` from the header to the implementation.)

For me personally, I think the documentation/understandability we lose when relying on transitive includes weighs heavier than my desire to get a faster build...

- Matthias

> On Dec 6, 2017, at 12:28 PM, serge guelton via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> On Wed, Dec 06, 2017 at 11:38:54AM -0800, Michael Zolotukhin via llvm-dev wrote:
>>
>>> On Dec 6, 2017, at 9:00 AM, mats petersson via cfe-dev <cfe-dev at lists.llvm.org> wrote:
>>>
>>> In my experience, a lot of time is spent on optimizing the code (assuming it's not a "-O0" build).
>> The numbers were actually for the debug build (-O0 -g), so for Release build they would be different (presumably lower).
>>> Also redundant includes are largely fixed by header guards, and I believe Clang [and gcc as well as MS Compilers, and probably most others too] have an include guards-cache that determines that "we've already included foo.h, and it has include guards around the whole actual content of the file, so we can just skip it".
>> By redundant here I meant that we included a file, but we didn’t use any of its content (rather than we included the same file twice).
>>>
>>> So I'm slightly dubious as to this being an efficient way of significantly reducing the total compilation time for the overall project - even if there are SOME cases where there is a significant improvement in a single file. The total time for a clean build [in wall-clock-time, not CPU-time] should be measured, making sure that there is enough memory. Doing a run of, say, five complete builds of the same thing [with suitable "clean" between to redo the whole build], take away the worst and the best, and perhaps also "modify one of the more common header files" (llvm/IR/Type.h for example) and build again.
>> On full builds the benefit is not big (around 1%, but the noise is high), but: 1) if we only take gains more than, say, 5%, we’ll probably never see any, 2) I aim at changes that make the code strictly better (modulo David’s point about disk cache). If any change is questionable from maintenance or whatever other point of view, I’m all for dropping it.
>
> my 2¢
>
> +1 for point 2). Even leaving aside the speed gain, removing unused
> includes file just looks like good coding practice to me.
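
The `class XXX;` versus `#include "XXX.h"` point raised by Bruce and Matthias can be illustrated with a small header-style sketch. `Foo`, `Foo.h` and `FooUser` are hypothetical names chosen for illustration and do not refer to anything in LLVM. The idea is that a class which only stores or passes a pointer (or reference) to `Foo` does not need the full definition, so a forward declaration is enough:

```cpp
// Forward declaration: sufficient for pointers and references to Foo, so
// this header does not need #include "Foo.h" (a hypothetical header).
class Foo;

class FooUser {
public:
  explicit FooUser(const Foo *F) : F(F) {}

  // Only the pointer value is inspected; no member of Foo is accessed.
  bool hasFoo() const { return F != nullptr; }
  const Foo *foo() const { return F; }

private:
  const Foo *F; // a pointer to an incomplete type is allowed
};
```

Any code that calls a member of `Foo`, needs its size, or derives from it still has to see the full definition, which is why the `#include` typically moves from the header into the implementation file rather than disappearing.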