Rafael Espíndola via llvm-dev
2016-Mar-08 16:13 UTC
[llvm-dev] llvm and clang are getting slower
I have just benchmarked building trunk llvm and clang in Debug, Release and LTO modes (see the attached script for the cmake lines).

The compilers used were clang 3.5, 3.6, 3.7, 3.8 and trunk. In all cases I used the system libgcc and libstdc++.

For release builds there is a monotonic increase with each version: from 163 minutes with 3.5 to 212 minutes with trunk. For comparison, gcc 5.3.2 takes 205 minutes.

Debug and LTO show an improvement in 3.7, but have regressed again in 3.8.

Cheers,
Rafael

Attachments:
run.sh <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160308/60ca8a99/attachment.sh>
LTO.time <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160308/60ca8a99/attachment.obj>
Debug.time <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160308/60ca8a99/attachment-0001.obj>
Release.time <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160308/60ca8a99/attachment-0002.obj>
Mehdi Amini via llvm-dev
2016-Mar-08 17:40 UTC
[llvm-dev] llvm and clang are getting slower
Hi Rafael,

CC: cfe-dev

Thanks for sharing. We also noticed this internally, and I know that Bruno and Chris are working on some infrastructure and tooling to help closely track compile-time regressions.

We had this conversation internally about the tradeoff between compile time and runtime performance, and I planned to bring up the topic on the list in the coming months; this looks like a good occasion to plant the seed. Apparently in the past (years/decade ago?) the project was very conservative about adding any optimization that would impact compile time; however, there is no explicit policy (that I know of) to address this tradeoff.

The closest I could find is what Chandler wrote in http://reviews.llvm.org/D12826 ; for instance, for O2 he stated that "if an optimization increases compile time by 5% or increases code size by 5% for a particular benchmark, that benchmark should also be one which sees a 5% runtime improvement".

My hope is that with better tooling for tracking compile time in the future, we'll reach a state where we can treat "breaking" the compile-time regression test as seriously as breaking any other test: i.e. the offending commit should be reverted unless it has been shown to significantly (hand wavy...) improve the runtime performance.

<troll>
With the current trend, the Polly developers don't have to worry about improving their compile time, we'll catch up with them ;)
</troll>

--
Mehdi
Jonas Paulsson via llvm-dev
2016-Mar-08 17:41 UTC
[llvm-dev] llvm and clang are getting slower
Hi,

There is a possibility that r259673 could play a role here.

For the buildSchedGraph() method, there is the -dag-maps-huge-region option, which has a default value of 1000. When I committed the patch, I expected people to lower this value as needed, and I also suggested doing so, but this has not happened. 1000 is very high, basically "unlimited".

It would be interesting to see what results you get with e.g. -mllvm -dag-maps-huge-region=50. Of course, since this is a trade-off between compile time and scheduler freedom, some care should be taken before lowering this in trunk.

Just a thought,
Jonas
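One way to try the suggestion above is to time the same compile at a few values of the limit. This is only a minimal sketch: it assumes a `clang++` on PATH and a Unix-like system, the file name `big.cpp` and the helper names are hypothetical, and only the `-mllvm -dag-maps-huge-region=N` spelling comes from the message itself.

```python
import subprocess
import time


def compile_command(compiler, source, huge_region):
    """Build an argv that forwards the scheduler limit to LLVM via -mllvm."""
    return [
        compiler, "-O2", "-c", source, "-o", "/dev/null",
        "-mllvm", f"-dag-maps-huge-region={huge_region}",
    ]


def time_compile(compiler, source, huge_region):
    """Return wall-clock seconds for one compile; raises if the compile fails."""
    start = time.perf_counter()
    subprocess.run(compile_command(compiler, source, huge_region), check=True)
    return time.perf_counter() - start


def sweep(compiler, source, limits):
    """Time one compile per limit and return {limit: seconds}."""
    return {limit: time_compile(compiler, source, limit) for limit in limits}
```

Calling `sweep("clang++", "big.cpp", [1000, 200, 50])` on a translation unit with very large scheduling regions would show whether the limit matters for that input; a single run per value is noisy, so repeated runs would be needed before drawing conclusions.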
Hal Finkel via llvm-dev
2016-Mar-08 17:55 UTC
[llvm-dev] [cfe-dev] llvm and clang are getting slower
----- Original Message -----
> From: "Mehdi Amini via cfe-dev" <cfe-dev at lists.llvm.org>
> To: "Rafael Espíndola" <rafael.espindola at gmail.com>
> Cc: "llvm-dev" <llvm-dev at lists.llvm.org>, "cfe-dev" <cfe-dev at lists.llvm.org>
> Sent: Tuesday, March 8, 2016 11:40:47 AM
> Subject: Re: [cfe-dev] [llvm-dev] llvm and clang are getting slower
>
> Hi Rafael,
>
> CC: cfe-dev
>
> Thanks for sharing. We also noticed this internally, and I know that
> Bruno and Chris are working on some infrastructure and tooling to
> help tracking closely compile time regressions.
>
> We had this conversation internally about the tradeoff between
> compile-time and runtime performance, and I planned to bring-up the
> topic on the list in the coming months, this looks like a good
> occasion to plant the seed. Apparently in the past (years/decade
> ago?) the project was very conservative on adding any optimizations
> that would impact compile time, however there is no explicit policy
> (that I know of) to address this tradeoff.
> The closest I could find would be what Chandler wrote in:
> http://reviews.llvm.org/D12826 ; for instance for O2 he stated that
> "if an optimization increases compile time by 5% or increases code
> size by 5% for a particular benchmark, that benchmark should also be
> one which sees a 5% runtime improvement".
>
> My hope is that with better tooling for tracking compile time in the
> future, we'll reach a state where we'll be able to consider
> "breaking" the compile-time regression test as important as breaking
> any test: i.e. the offending commit should be reverted unless it has
> been shown to significantly (hand wavy...) improve the runtime
> performance.
>
> <troll>
> With the current trend, the Polly developers don't have to worry
> about improving their compile time, we'll catch up with them ;)
> </troll>

My two largest pet peeves in this area are:

1. We often use functions from ValueTracking (to get known bits, the number of sign bits, etc.) as though they're low cost. They're not really low cost. The problem is that they *should* be. These functions do bottom-up walks, and could cache their results. Instead, they do a limited walk and recompute everything each time. This is expensive, and a significant amount of our InstCombine time goes to ValueTracking, and that shouldn't be the case. The more we add to InstCombine (and related passes), and the more we run InstCombine, the worse this gets. On the other hand, fixing this will help both compile time and code quality. Furthermore, BasicAA has the same problem.

2. We have "cleanup" passes in the pipeline, such as those that run after loop unrolling and/or vectorization, that run regardless of whether the preceding pass actually did anything. We've been adding more of these, and they catch important use cases, but we need better infrastructure for this (either with the new pass manager or otherwise).

Also, I'm very hopeful that as our new MemorySSA and GVN improvements materialize, we'll see large compile-time improvements from that work. We spend a huge amount of time in GVN computing memory-dependency information (this dwarfs the time spent by GVN doing actual value numbering work by an order of magnitude or more).

-Hal

--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory
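The first point above (a depth-limited walk that recomputes everything on each query, versus a bottom-up walk that caches) can be illustrated with a toy model. This is a sketch only and is not LLVM's ValueTracking: the `Node` type, the 8-bit width, the two operators, and the depth limit of 6 are all assumptions chosen for illustration, and a real cache would also need invalidation when the IR changes.

```python
from dataclasses import dataclass
from typing import Optional

MASK = 0xFF  # toy 8-bit values


@dataclass(eq=False)  # eq=False keeps identity hashing, so nodes can be dict keys
class Node:
    op: str                        # "const", "unknown", "and", "or"
    value: int = 0
    lhs: Optional["Node"] = None
    rhs: Optional["Node"] = None


def leaf_bits(node):
    """Return (known_zero, known_one) masks for a leaf."""
    if node.op == "const":
        return (~node.value & MASK, node.value & MASK)
    return (0, 0)                  # "unknown": nothing is known


def combine(op, left, right):
    """Merge operand facts for a bitwise "and" or "or"."""
    (lz, lo), (rz, ro) = left, right
    if op == "and":
        return (lz | rz, lo & ro)
    return (lz & rz, lo | ro)      # "or"


def known_bits_limited(node, depth, stats):
    """Recompute on every query and give up once the depth budget is spent."""
    stats["visits"] += 1
    if node.op in ("const", "unknown"):
        return leaf_bits(node)
    if depth == 0:
        return (0, 0)
    return combine(node.op,
                   known_bits_limited(node.lhs, depth - 1, stats),
                   known_bits_limited(node.rhs, depth - 1, stats))


def known_bits_cached(node, cache, stats):
    """Compute each node once and reuse the stored result afterwards."""
    if node in cache:
        return cache[node]
    stats["visits"] += 1
    if node.op in ("const", "unknown"):
        result = leaf_bits(node)
    else:
        result = combine(node.op,
                         known_bits_cached(node.lhs, cache, stats),
                         known_bits_cached(node.rhs, cache, stats))
    cache[node] = result
    return result


# A chain x0 = unknown & 0x0F, x(i+1) = x(i) | 0x01, queried node by node
# the way a pass that visits every instruction would.
chain = [Node("and", lhs=Node("unknown"), rhs=Node("const", value=0x0F))]
for _ in range(20):
    chain.append(Node("or", lhs=chain[-1], rhs=Node("const", value=0x01)))

limited_stats, cached_stats, cache = {"visits": 0}, {"visits": 0}, {}
for node in chain:
    limited_result = known_bits_limited(node, 6, limited_stats)
    cached_result = known_bits_cached(node, cache, cached_stats)
```

In this toy, the cached walk touches each node once, while the limited walk revisits operands on every query and, for the last node, also loses the fact that the high four bits are zero because the defining `and` lies beyond its depth budget. That mirrors the claim that fixing this could help both compile time and code quality.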
Richard Smith via llvm-dev
2016-Mar-08 18:42 UTC
[llvm-dev] llvm and clang are getting slower
On Tue, Mar 8, 2016 at 8:13 AM, Rafael Espíndola <llvm-dev at lists.llvm.org> wrote:
> I have just benchmarked building trunk llvm and clang in Debug,
> Release and LTO modes (see the attached script for the cmake lines).
>
> The compilers used were clang 3.5, 3.6, 3.7, 3.8 and trunk. In all
> cases I used the system libgcc and libstdc++.
>
> For release builds there is a monotonic increase in each version. From
> 163 minutes with 3.5 to 212 minutes with trunk. For comparison, gcc
> 5.3.2 takes 205 minutes.
>
> Debug and LTO show an improvement in 3.7, but have regressed again in 3.8.

I'm curious how these times divide across Clang and various parts of LLVM; rerunning with -ftime-report and summing the numbers across all compiles could be interesting.
mats petersson via llvm-dev
2016-Mar-08 18:55 UTC
[llvm-dev] llvm and clang are getting slower
I have noticed that LLVM doesn't seem to "like" large functions, as a general rule. Admittedly, my experience is similar with gcc, so I'm not sure it's something that can be easily fixed. And I'm probably sounding like a broken record, because I have said this before.

My experience is that the time it takes to compile something grows faster than linearly with the size of the function.

Of course, the LLVM code is growing over time, both to support more features and to support more architectures, new processor types and instruction sets, at least some of which will lead to larger functions in general [and this is the function "after inlining", so splitting small 'called once' functions out doesn't really help much].

I will have a little play to see if I can identify more of a culprit [at the very least whether it's "large basic blocks" or "large functions" that is the problem] - of course, this could be unrelated and irrelevant to the problem Daniel is pointing at, and it may or may not be easily resolved...

--
Mats
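A claim like "grows faster than linearly" can be checked by fitting an exponent to measured (function size, compile time) pairs. The sketch below is an assumption about how one might do that, not something from this thread: it takes the least-squares slope of log(time) against log(size), and the sample numbers are synthetic, not measurements.

```python
import math


def growth_exponent(sizes, times):
    """Least-squares slope of log(time) against log(size).

    A slope near 1 suggests linear scaling; a slope above 1 suggests
    superlinear scaling.
    """
    xs = [math.log(s) for s in sizes]
    ys = [math.log(t) for t in times]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


# Synthetic check only (not measured data): times that grow as size squared.
sizes = [100, 200, 400, 800]
quadratic_times = [s ** 2 * 1e-6 for s in sizes]
exponent = growth_exponent(sizes, quadratic_times)
```

With real data, the sizes could be statement or instruction counts of generated functions and the times could come from timing single compiles; separating "large basic blocks" from "large functions" would need two series, one for each shape.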
Sean Silva via llvm-dev
2016-Mar-08 19:43 UTC
[llvm-dev] llvm and clang are getting slower
In case someone finds it useful, this is some indication of the breakdown of where time is spent during a build of Clang.

tl;dr: in Debug+Asserts about 10% of time is spent in the backend, and in Release without asserts (and without debug info IIRC) about 33% of time is spent in the backend.

These are the charts I collected a while back breaking down the time it takes clang to compile itself. See the thread "[cfe-dev] Some DTrace probes for measuring per-file time" for how I collected this information. The raw data is basically aggregated CPU time spent textually parsing each header (and IRGen'ing them, since clang does that as it parses). There are also a couple of "phony" headers to cover stuff like the backend/optimizer.

Since there are a large number of files, the pie charts below are grouped into rough categories. E.g. the "llvm headers" slice includes the time spent on include/llvm/Support/raw_ostream.h and all other headers in include/llvm. The "libc++" pie slice contains the time spent in the libc++ system headers (this data was collected on a mac, so libc++ was the C++ standard library). "system" are C system headers. All time spent inside the LLVM optimizer is in the "after parsing" pie slice.

Debug with asserts:
[image: Inline image 1]

Release without asserts (and without debug info IIRC):
[image: Inline image 2]

-- Sean Silva

Attachments:
LLVM release without asserts.png <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160308/c020a549/attachment-0002.png>
LLVM for_zygoloid (default CMake config).png <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160308/c020a549/attachment-0003.png>
Sean Silva via llvm-dev
2016-Mar-08 21:09 UTC
[llvm-dev] llvm and clang are getting slower
On Tue, Mar 8, 2016 at 10:42 AM, Richard Smith via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> On Tue, Mar 8, 2016 at 8:13 AM, Rafael Espíndola
> <llvm-dev at lists.llvm.org> wrote:
> > For release builds there is a monotonic increase in each version. From
> > 163 minutes with 3.5 to 212 minutes with trunk. For comparison, gcc
> > 5.3.2 takes 205 minutes.
> >
> > Debug and LTO show an improvement in 3.7, but have regressed again in
> > 3.8.
>
> I'm curious how these times divide across Clang and various parts of
> LLVM; rerunning with -ftime-report and summing the numbers across all
> compiles could be interesting.

Based on the results I posted upthread about the relative time spent in the backend for debug vs release, we can estimate this.

To summarize:
10% of time spent in LLVM for Debug
33% of time spent in LLVM for Release
(I'll abbreviate "in LLVM" as just "backend"; this is "backend" from clang's perspective)

Let's look at the difference between 3.5 and trunk.

For debug, the user time jumps from 174m50.251s to 197m9.932s. That's {10490.3, 11829.9} seconds, respectively.
For release, the corresponding numbers are {9826.71, 12714.3} seconds.

    debug35 = 10490.251
    debugTrunk = 11829.932
    debugRatio = debugTrunk/debug35 = 1.12771

    release35 = 9826.705
    releaseTrunk = 12714.288
    releaseRatio = releaseTrunk/release35 = 1.29385

For simplicity, let's use a simple linear model for the distribution of slowdown between the frontend and backend: a constant factor slowdown for the backend, and an independent constant factor slowdown for the frontend. This gives the following linear system:

    debugRatio = .1 * backendRatio + (1 - .1) * frontendRatio
    releaseRatio = .33 * backendRatio + (1 - .33) * frontendRatio

Solving this linear system, we find that under this simple model the expected slowdown factors are:

    backendRatio = 1.77783
    frontendRatio = 1.05547

Intuitively, backendRatio comes out larger in this comparison because we see the biggest slowdown during release (1.29 vs 1.12), and during release we are spending a larger fraction of time in the backend (33% vs 10%).

Applying this same model across Rafael's data, we find the following (numbers have been rounded for clarity):

    transition    backendRatio    frontendRatio
    3.5->3.6      1.08            1.03
    3.6->3.7      1.30            0.95
    3.7->3.8      1.34            1.07
    3.8->trunk    0.98            1.02

Note that in Rafael's measurements LTO is pretty similar to Release from a CPU time (user time) standpoint. While the final LTO link takes a large amount of real time, it is single threaded. Based on the real time numbers, the LTO link was only spending about 20 minutes single-threaded (i.e. about 20 minutes of CPU time), which is pretty small compared to the 300-400 minutes of total CPU time. It would be interesting to see the numbers for -O0 or -O1 per-TU together with LTO.

-- Sean Silva
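The two-equation model above can be solved mechanically. The following is a minimal sketch using Cramer's rule in plain Python; the function names are illustrative, while the timings and the 10% / 33% backend fractions are the ones quoted in this message.

```python
def solve_2x2(a11, a12, b1, a21, a22, b2):
    """Solve the system a11*x + a12*y = b1, a21*x + a22*y = b2 by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("system is singular")
    x = (b1 * a22 - a12 * b2) / det
    y = (a11 * b2 - b1 * a21) / det
    return x, y


def split_slowdown(debug_ratio, release_ratio,
                   backend_frac_debug=0.10, backend_frac_release=0.33):
    """Return (backendRatio, frontendRatio) under the linear model."""
    return solve_2x2(
        backend_frac_debug, 1 - backend_frac_debug, debug_ratio,
        backend_frac_release, 1 - backend_frac_release, release_ratio,
    )


# User times in seconds for clang 3.5 and trunk, as quoted in the message.
debug_ratio = 11829.932 / 10490.251
release_ratio = 12714.288 / 9826.705

backend_ratio, frontend_ratio = split_slowdown(debug_ratio, release_ratio)
print(f"backendRatio = {backend_ratio:.5f}")
print(f"frontendRatio = {frontend_ratio:.5f}")
```

Feeding the per-release ratios from the Debug and Release timing files into `split_slowdown` one transition at a time would give the rows of the table. The result is sensitive to the assumed backend fractions, so it is best read as a rough attribution rather than a measurement.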
Bruno Cardoso Lopes via llvm-dev
2016-Mar-14 19:14 UTC
[llvm-dev] llvm and clang are getting slower
Hi,

> There is a possibility that r259673 could play a role here.
>
> For the buildSchedGraph() method, there is the -dag-maps-huge-region that
> has the default value of 1000. When I committed the patch, I was expecting
> people to lower this value as needed and also suggested this, but this has
> not happened. 1000 is very high, basically "unlimited".
>
> It would be interesting to see what results you get with e.g. -mllvm
> -dag-maps-huge-region=50. Of course, since this is a trade-off between
> compile time and scheduler freedom, some care should be taken before
> lowering this in trunk.

Indeed, we hit this internally and filed a PR: https://llvm.org/bugs/show_bug.cgi?id=26940

As a general comment on this thread, and as mentioned by Mehdi, we care a lot about compile time and we're looking forward to contributing more in this area in the following months; by collecting compile-time testcases into a testsuite and publicly tracking results on those, we should be able to start an RFC on a tradeoff policy.

--
Bruno Cardoso Lopes
http://www.brunocardoso.cc
Jack Howarth via llvm-dev
2016-Mar-23 15:39 UTC
[llvm-dev] llvm and clang are getting slower
Honza recently posted some benchmarks for building libreoffice with GCC 6 and LTO and found a similar compile time regression for recent llvm trunk...

http://hubicka.blogspot.nl/2016/03/building-libreoffice-with-gcc-6-and-lto.html#more

Compared to llvm 3.5.0, the builds with llvm 3.9.0 svn were 24% slower.