Mehdi Amini via llvm-dev
2016-Mar-31 20:41 UTC
[llvm-dev] RFC: Large, especially super-linear, compile time regressions are bugs.
Hi,

TLDR: I totally support considering compile time regressions as bugs. I'm glad you brought this topic up.

It is also worth pointing at this recent thread:
http://lists.llvm.org/pipermail/llvm-dev/2016-March/096488.html
and also this blog post comparing the evolution of clang and gcc in this respect:
http://hubicka.blogspot.nl/2016/03/building-libreoffice-with-gcc-6-and-lto.html

I will repeat myself here, since we have also noticed internally that compile time is slowly degrading over time. Bruno and Chris are working on some infrastructure and tooling to help track compile time regressions closely. We had this conversation internally about the tradeoff between compile-time and runtime performance, and I planned to bring up the topic on the list in the coming months, but was waiting for more tooling to be ready.

Apparently in the past (years/a decade ago?) the project was very conservative about adding any optimization that would impact compile time; however, there is no explicit policy (that I know of) addressing this tradeoff. The closest I could find is what Chandler wrote in http://reviews.llvm.org/D12826; for instance, for O2 he stated that "if an optimization increases compile time by 5% or increases code size by 5% for a particular benchmark, that benchmark should also be one which sees a 5% runtime improvement".

My hope is that with better tooling for tracking compile time in the future, we'll reach a state where we'll be able to consider "breaking" the compile-time regression test as important as breaking any test: i.e. the offending commit should be reverted unless it has been shown to significantly (hand wavy...) improve the runtime performance.

Since you raise the discussion now, I'll take the opportunity to push on the "more aggressive" side: I think the policy should be a balance between the improvement a commit brings and the compile time slowdown it introduces, something along the lines of what you wrote in the quote above. You are referring to "large" compile time regressions (aside: what is "large"?), while Bruno has graphs showing that compile time regressions mostly come as many 1-3% regressions spread over tens of commits (and small regressions compound: thirty 2% regressions add up to roughly an 80% slowdown). Also (and this is where we need better tooling), *unexpected* compile-time slowdowns are what worry me: i.e. the author of a commit adds something but doesn't expect compile time to be "significantly" impacted. This is motivated by Bruno's and Chris's data.

Tracking this more closely may also help to triage things between O2 and O3 when a commit introduces a compile-time slowdown but also brings a significant enough runtime improvement.

--
Mehdi

> On Mar 31, 2016, at 1:28 PM, Chandler Carruth via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> LLVM has a wonderful policy regarding broken commits: we revert to green. We ask that a test case be available within a reasonable time frame (preferably before, but some exceptions can be made), but otherwise we revert the offending patch, even if it contains nice features that people want, and keep the tree green. This is an awesome policy.
>
> I would like to suggest we adopt and follow the same policy for compile time regressions that are large, and especially for ones that are super-linear. As an example from the previous thread:
>
> On Mon, Mar 14, 2016 at 12:14 PM Bruno Cardoso Lopes via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> > There is a possibility that r259673 could play a role here.
> >
> > For the buildSchedGraph() method, there is the -dag-maps-huge-region that
> > has the default value of 1000. When I committed the patch, I was expecting
> > people to lower this value as needed and also suggested this, but this has
> > not happened. 1000 is very high, basically "unlimited".
> >
> > It would be interesting to see what results you get with e.g. -mllvm
> > -dag-maps-huge-region=50. Of course, since this is a trade-off between
> > compile time and scheduler freedom, some care should be taken before
> > lowering this in trunk.
>
> Indeed we hit this internally, filed a PR:
> https://llvm.org/bugs/show_bug.cgi?id=26940
>
> I think we should have rolled back r259673 as soon as the test case was available.
>
> Thoughts?
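For readers not familiar with the knob quoted above: thresholds like -dag-maps-huge-region are ordinarily defined with LLVM's cl::opt command-line machinery and checked before the expensive (potentially super-linear) work runs, which is why it can be adjusted with -mllvm on the clang command line. A minimal sketch of that pattern follows; the option name and function below are hypothetical placeholders for illustration, not the actual buildSchedGraph()/r259673 code.

#include "llvm/Support/CommandLine.h"

using namespace llvm;

// Hypothetical illustration only: a threshold flag that lets users trade
// analysis/scheduler quality for compile time from the command line,
// e.g. "-mllvm -example-huge-region=50".
static cl::opt<unsigned> HugeRegionThreshold(
    "example-huge-region", cl::Hidden, cl::init(1000),
    cl::desc("Region size above which the expensive analysis is skipped"));

// Guard the potentially super-linear work behind the threshold so that
// pathological inputs fall back to a cheaper, roughly linear path.
static bool shouldRunExpensiveAnalysis(unsigned RegionSize) {
  return RegionSize <= HugeRegionThreshold;
}

Lowering the default (as suggested with -mllvm -dag-maps-huge-region=50 above) simply widens the set of regions that take the cheap path, which is exactly the compile time versus scheduler freedom trade-off described in the quote.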
Renato Golin via llvm-dev
2016-Mar-31 21:46 UTC
[llvm-dev] RFC: Large, especially super-linear, compile time regressions are bugs.
On 31 March 2016 at 21:41, Mehdi Amini via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> TLDR: I totally support considering compile time regressions as bugs.

Me too.

I also agree that reverting fresh and reapplying is *much* easier than trying to revert late.

But I'd like to avoid dubious metrics.

> The closest I could find is what Chandler wrote in http://reviews.llvm.org/D12826;
> for instance, for O2 he stated that "if an optimization increases compile time
> by 5% or increases code size by 5% for a particular benchmark, that benchmark
> should also be one which sees a 5% runtime improvement".

I think this is a bit limited and can lead to witch hunts, especially wrt performance measurements.

Chandler's title is perfect, though... "Large" can be vague, but "super-linear" is not. We used to have the concept that any large super-linear (quadratic+) compile time introduction had to be in O3 or, for really bad cases, behind additional flags. I think we should keep that mindset.

> My hope is that with better tooling for tracking compile time in the future,
> we'll reach a state where we'll be able to consider "breaking" the
> compile-time regression test as important as breaking any test: i.e. the
> offending commit should be reverted unless it has been shown to
> significantly (hand wavy...) improve the runtime performance.

In order to have any kind of threshold, we'd have to monitor with some accuracy the performance of both the compiler and the compiled code for the main platforms. We do that to a certain extent with the test-suite bots, but that's very far from ideal.

So, I'd recommend we steer away from any kind of percentage or ratio and keep at least the quadratic changes and beyond behind special flags (n log n is OK for most cases).

> Since you raise the discussion now, I'll take the opportunity to push on the
> "more aggressive" side: I think the policy should be a balance between the
> improvement a commit brings and the compile time slowdown it introduces.

This is a fallacy.

Compile time often regresses across all targets, while execution improvements are focused on specific targets and can have negative effects on those that were not benchmarked. Overall, compile time regressions dilute over the improvements, but not on a commit-per-commit basis. That's what I meant by witch hunt.

I think we should keep an eye on those changes, ask for numbers in code review, and maybe even do some benchmarking of our own before accepting them. Also, we should not commit code that we know hurts performance that badly, even if we believe people will replace it in the future. It always takes too long. I did that myself last year, and I learnt my lesson.

Metrics are often more dangerous than helpful, as they tend to be used as a substitute for thinking.

My tuppence.

--renato
Mehdi Amini via llvm-dev
2016-Mar-31 22:34 UTC
[llvm-dev] RFC: Large, especially super-linear, compile time regressions are bugs.
Hi Renato,

> On Mar 31, 2016, at 2:46 PM, Renato Golin <renato.golin at linaro.org> wrote:
>
> On 31 March 2016 at 21:41, Mehdi Amini via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
>> TLDR: I totally support considering compile time regressions as bugs.
>
> Me too.
>
> I also agree that reverting fresh and reapplying is *much* easier than
> trying to revert late.
>
> But I'd like to avoid dubious metrics.

I'm not sure how "this commit regresses compile time by 2%" is a dubious metric. The metric is not dubious IMO; it is what it is: a measurement. You just have to build a good process around it to exploit this measurement in a useful way for the project.

>> The closest I could find is what Chandler wrote in http://reviews.llvm.org/D12826;
>> for instance, for O2 he stated that "if an optimization increases compile time
>> by 5% or increases code size by 5% for a particular benchmark, that benchmark
>> should also be one which sees a 5% runtime improvement".
>
> I think this is a bit limited and can lead to witch hunts, especially
> wrt performance measurements.
>
> Chandler's title is perfect, though... "Large" can be vague, but
> "super-linear" is not. We used to have the concept that any large
> super-linear (quadratic+) compile time introduction had to be in O3
> or, for really bad cases, behind additional flags. I think we should
> keep that mindset.
>
>> My hope is that with better tooling for tracking compile time in the future,
>> we'll reach a state where we'll be able to consider "breaking" the
>> compile-time regression test as important as breaking any test: i.e. the
>> offending commit should be reverted unless it has been shown to
>> significantly (hand wavy...) improve the runtime performance.
>
> In order to have any kind of threshold, we'd have to monitor with some
> accuracy the performance of both the compiler and the compiled code for the
> main platforms. We do that to a certain extent with the test-suite bots,
> but that's very far from ideal.

I agree. Did you read the part where I mentioned that we're working on the tooling and that I was waiting for it to be ready before starting this thread?

> So, I'd recommend we steer away from any kind of percentage or ratio
> and keep at least the quadratic changes and beyond behind special flags
> (n log n is OK for most cases).

How do you suggest we address the long trail of 1-3% slowdowns that led to the current situation (cf. the two links I posted in my previous email)? Because there *is* a problem here, and I'd really like someone to come up with a solution for it.

>> Since you raise the discussion now, I'll take the opportunity to push on the
>> "more aggressive" side: I think the policy should be a balance between the
>> improvement a commit brings and the compile time slowdown it introduces.
>
> This is a fallacy.

I'm not sure why, or what you mean? The fact that an optimization improves only some targets does not invalidate the point.

> Compile time often regresses across all targets, while execution
> improvements are focused on specific targets and can have negative
> effects on those that were not benchmarked.

Yeah, as usual in LLVM: if you care about something on your platform, set up a bot and track trunk closely; otherwise you're less of a priority.

> Overall, compile time regressions dilute over the improvements, but not on a
> commit-per-commit basis. That's what I meant by witch hunt.

There is no "witch hunt", at least that's not my objective.

I think everyone is pretty enthusiastic about every new perf improvement (I am), but just as we would unintentionally break things all the time without bots (and a policy) in general, the same is true for compile time. I'm talking about chasing and tracking every single commit where a developer regresses compile time *without even being aware of it*. I'd personally love to have a bot (or someone) emailing me about any compile time regression I introduce.

> I think we should keep an eye on those changes, ask for numbers in
> code review, and maybe even do some benchmarking of our own before
> accepting them. Also, we should not commit code that we know hurts
> performance that badly, even if we believe people will replace it in
> the future. It always takes too long. I did that myself last year,
> and I learnt my lesson.

Agree.

> Metrics are often more dangerous than helpful, as they tend to be used
> as a substitute for thinking.

I can't relate this sentence to anything concrete at stake here. I think this list is full of people who are very good at thinking and won't substitute metrics for it :)

Best,

--
Mehdi
Sean Silva via llvm-dev
2016-Apr-01 02:14 UTC
[llvm-dev] RFC: Large, especially super-linear, compile time regressions are bugs.
On Thu, Mar 31, 2016 at 2:46 PM, Renato Golin via llvm-dev <llvm-dev at lists.llvm.org> wrote:

> [...]
>
> Metrics are often more dangerous than helpful, as they tend to be used
> as a substitute for thinking.

One of my favorite quotes:

"The results are definitely numbers, but do they have very much at all to do with the problem?" - Forman Acton, "Numerical Methods that (usually) Work"

-- Sean Silva