On Wed, Jul 8, 2015 at 1:46 PM Hal Finkel <hfinkel at anl.gov> wrote:> ----- Original Message ----- > > From: "Xinliang David Li" <davidxl at google.com> > > To: "Chandler Carruth" <chandlerc at gmail.com> > > Cc: cfe-commits at cs.uiuc.edu, "<llvmdev at cs.uiuc.edu> List" < > llvmdev at cs.uiuc.edu> > > Sent: Wednesday, July 8, 2015 12:25:18 AM > > Subject: Re: [LLVMdev] Inline hint for methods defined in-class > > > > On Tue, Jul 7, 2015 at 6:06 PM, Chandler Carruth > > <chandlerc at gmail.com> wrote: > > > On Tue, Jul 7, 2015 at 4:11 PM Easwaran Raman <eraman at google.com> > > > wrote: > > >> > > >> I'm reviving this thread after a while and CCing cfe-commits as > > >> suggested by David Blaikie. I've also collected numbers building > > >> chrome (from chromium, on Linux) with and without this patch as > > >> suggested by David. I've re-posted the proposed patch and > > >> performance/size numbers collected at the top to make it easily > > >> readable for those reading it through cfe-commits. > > > > > > > > > First off, thanks for collecting the numbers and broadening the > > > distribution. Also, sorry it took me so long to get over to this > > > thread. > > > > > > I want to lay out my stance on this issue at a theoretical and > > > practical > > > level first. I'll follow up with thoughts on the numbers as well > > > after that. > > > > > > I think that tying *any* optimizer behavior to the 'inline' keyword > > > is > > > fundamentally the wrong direction. > > > > Chandler, thanks for sharing your thought -- however I don't think it > > is wrong, let alone 'fundamentally wrong'. Despite all the analysis > > that can be done, the inliner is in the end heuristic based. In lack > > of the profile data, when inlining two calls yield the same static > > benefit and size cost, it is reasonable for the inliner to think the > > call to the function with inline hint to yield more high > > dynamic/runtime benefit -- thus it has a higher static size budget to > > burn. > > > > >We have reasons why we have done this > > > historically, and we can't just do an immediate about face, but we > > > should be > > > actively looking for ways to *reduce* the optimizer's reliance on > > > this > > > keyword to convey any meaning whatsoever. > > > > yes those additional things will be done, but they are really > > orthogonal. > > > > > > > > The reason I think that is the correct direction is because, for > > > better or > > > worse, the 'inline' keyword in C++ is not about optimization, but > > > about > > > linkage. > > > > It is about both optimization and linkage. In fact the linkage simply > > serves as an implementation detail. In C++ standard 7.1.2, paragraph > > 2 says: > > The fact that C++ combines, into one keyword, a change in semantics > (linkage) and an optimization hint is quite unfortunate. I wish it were > otherwise.We could work to change it? I specifically proposed adding a way to move away from this unfortunate design.> However, as it stands, I support this change. The benchmark numbers are > encouraging, and it replaces an implementation quirk with the underlying > (unfortunate) language design choice. The implementation quirk is that > putting the inline keyword on an in-class function definition changes the > behavior of the optimizer. However, according to the language > specification, that definition should have implied that keyword. While an > implementation is certainly free to do arbitrary things with hints, this > behavior violates the spirit of the language specification.I strongly disagree that this is the spirit of the language specification. Even if it was historically, I think we should move away from that. The language shouldn't be trying to do this with a language keyword, and it shouldn't be coupling semantics to hints. I'm very happy to take this up with the committee, but I don't see why we shouldn't push Clang in that direction here when there is no issue of conformance. To see how broken this is, let's look at how miserably small the difference is between the un-hinted and hinted thresholds. We've ended up shrinking this difference over time in LLVM because increasing the hinted threshold caused lots of performance regressions and size regressions.> It makes a meaningless use of a standardized keyword meaningful, and > that's the greater transgression.So here is what I want to do: 1) Add a non-semantic attribute that conveys this hint. We could even convey a much *stronger* hint with this rather than just a tiny hint the way it is today because it wouldn't end up being forced onto every template regardless of whether that makes sense. 2) Start lobbying to remove the hint from the 'inline' keyword by working with the people who see regressions from this to use the new annotation to recover the performance. 3) Completely remove the semantic coupling of the optimizer hint and fix the meaningless use of the standardized keyword at the same time. But the more places where we use the inline hint today, the harder #2 will become. I've already tried once before to remove the hint and couldn't because of benchmarks that had been tightly tuned and coupled to the existing (quirky) behavior. I really think that doing this more will make getting to #3 harder. Making progress toward a cleaner design harder seems worse than coping with the quirks that have existed in Clang for over 5 years for another few years.> In addition, it does tend to be the case that in-class function > definitions are small and suitable for inlining. >But if they are small and suitable for inlining, won't the existing threshold work just fine?> > I agree. A 1% performance increase is worth a 4% code-size increase when > not optimizing for size. >I really don't. Maybe in -O3 or something, but a 4% code-size increase is a hard regression to swallow. Especially in a single benchmark, and where many other benchmarks show no benefit at all. This isn't a matter of "most code gets better, so we need to tolerate the unfortunate variance of size". I know its not really "fair" to view regressions as more important than missed opportunities, but the reality is that regressions *are* more problematic than missed opportunities. -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150709/13b35053/attachment.html>
Xinliang David Li
2015-Jul-09 07:00 UTC
[LLVMdev] Inline hint for methods defined in-class
On Wed, Jul 8, 2015 at 10:40 PM, Chandler Carruth <chandlerc at google.com> wrote:> On Wed, Jul 8, 2015 at 1:46 PM Hal Finkel <hfinkel at anl.gov> wrote: > >> ----- Original Message ----- >> > From: "Xinliang David Li" <davidxl at google.com> >> > To: "Chandler Carruth" <chandlerc at gmail.com> >> > Cc: cfe-commits at cs.uiuc.edu, "<llvmdev at cs.uiuc.edu> List" < >> llvmdev at cs.uiuc.edu> >> > Sent: Wednesday, July 8, 2015 12:25:18 AM >> > Subject: Re: [LLVMdev] Inline hint for methods defined in-class >> > >> > On Tue, Jul 7, 2015 at 6:06 PM, Chandler Carruth >> > <chandlerc at gmail.com> wrote: >> > > On Tue, Jul 7, 2015 at 4:11 PM Easwaran Raman <eraman at google.com> >> > > wrote: >> > >> >> > >> I'm reviving this thread after a while and CCing cfe-commits as >> > >> suggested by David Blaikie. I've also collected numbers building >> > >> chrome (from chromium, on Linux) with and without this patch as >> > >> suggested by David. I've re-posted the proposed patch and >> > >> performance/size numbers collected at the top to make it easily >> > >> readable for those reading it through cfe-commits. >> > > >> > > >> > > First off, thanks for collecting the numbers and broadening the >> > > distribution. Also, sorry it took me so long to get over to this >> > > thread. >> > > >> > > I want to lay out my stance on this issue at a theoretical and >> > > practical >> > > level first. I'll follow up with thoughts on the numbers as well >> > > after that. >> > > >> > > I think that tying *any* optimizer behavior to the 'inline' keyword >> > > is >> > > fundamentally the wrong direction. >> > >> > Chandler, thanks for sharing your thought -- however I don't think it >> > is wrong, let alone 'fundamentally wrong'. Despite all the analysis >> > that can be done, the inliner is in the end heuristic based. In lack >> > of the profile data, when inlining two calls yield the same static >> > benefit and size cost, it is reasonable for the inliner to think the >> > call to the function with inline hint to yield more high >> > dynamic/runtime benefit -- thus it has a higher static size budget to >> > burn. >> > >> > >We have reasons why we have done this >> > > historically, and we can't just do an immediate about face, but we >> > > should be >> > > actively looking for ways to *reduce* the optimizer's reliance on >> > > this >> > > keyword to convey any meaning whatsoever. >> > >> > yes those additional things will be done, but they are really >> > orthogonal. >> > >> > > >> > > The reason I think that is the correct direction is because, for >> > > better or >> > > worse, the 'inline' keyword in C++ is not about optimization, but >> > > about >> > > linkage. >> > >> > It is about both optimization and linkage. In fact the linkage simply >> > serves as an implementation detail. In C++ standard 7.1.2, paragraph >> > 2 says: >> >> The fact that C++ combines, into one keyword, a change in semantics >> (linkage) and an optimization hint is quite unfortunate. I wish it were >> otherwise. > > > We could work to change it? I specifically proposed adding a way to move > away from this unfortunate design. > > >> However, as it stands, I support this change. The benchmark numbers are >> encouraging, and it replaces an implementation quirk with the underlying >> (unfortunate) language design choice. The implementation quirk is that >> putting the inline keyword on an in-class function definition changes the >> behavior of the optimizer. However, according to the language >> specification, that definition should have implied that keyword. While an >> implementation is certainly free to do arbitrary things with hints, this >> behavior violates the spirit of the language specification. > > > I strongly disagree that this is the spirit of the language specification. > Even if it was historically, I think we should move away from that. The > language shouldn't be trying to do this with a language keyword, and it > shouldn't be coupling semantics to hints. I'm very happy to take this up > with the committee, but I don't see why we shouldn't push Clang in that > direction here when there is no issue of conformance. > > To see how broken this is, let's look at how miserably small the > difference is between the un-hinted and hinted thresholds. We've ended up > shrinking this difference over time in LLVM because increasing the hinted > threshold caused lots of performance regressions and size regressions. > > >> It makes a meaningless use of a standardized keyword meaningful, and >> that's the greater transgression. > > > So here is what I want to do: > > 1) Add a non-semantic attribute that conveys this hint. We could even > convey a much *stronger* hint with this rather than just a tiny hint the > way it is today because it wouldn't end up being forced onto every template > regardless of whether that makes sense. > > 2) Start lobbying to remove the hint from the 'inline' keyword by working > with the people who see regressions from this to use the new annotation to > recover the performance. > > 3) Completely remove the semantic coupling of the optimizer hint and fix > the meaningless use of the standardized keyword at the same time. > > But the more places where we use the inline hint today, the harder #2 will > become. I've already tried once before to remove the hint and couldn't > because of benchmarks that had been tightly tuned and coupled to the > existing (quirky) behavior. I really think that doing this more will make > getting to #3 harder. Making progress toward a cleaner design harder seems > worse than coping with the quirks that have existed in Clang for over 5 > years for another few years. > > >> In addition, it does tend to be the case that in-class function >> definitions are small and suitable for inlining. >> > > But if they are small and suitable for inlining, won't the existing > threshold work just fine? > > >> >> I agree. A 1% performance increase is worth a 4% code-size increase when >> not optimizing for size. >> > > I really don't. Maybe in -O3 or something, but a 4% code-size increase is > a hard regression to swallow. Especially in a single benchmark, and where > many other benchmarks show no benefit at all. This isn't a matter of "most > code gets better, so we need to tolerate the unfortunate variance of size". >Suppose we take out the specially treatment of 'inline' hint completely in inliner which results in 1% loss in clang and size improvement of 4%, would you be ok to take that change?> > I know its not really "fair" to view regressions as more important than > missed opportunities, but the reality is that regressions *are* more > problematic than missed opportunities. > >You need to be more specific on why the regression is problematic (and more so than missed opportunities). Text size increase (shared and much smaller than heap) is usually not a big issue except for tiny devices which is likely to use Os/Oz in the build. For servers, there are are simple rules to compare cpu vs ram resource cost. David> _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150709/912da932/attachment.html>
On 9 Jul 2015, at 08:00, Xinliang David Li <xinliangli at gmail.com> wrote:> > You need to be more specific on why the regression is problematic (and more so than missed opportunities). Text size increase (shared and much smaller than heap) is usually not a big issue except for tiny devices which is likely to use Os/Oz in the build. For servers, there are are simple rules to compare cpu vs ram resource cost.Text size increase also means more TLB and i-cache misses. We’ve observed that single-program benchmarks show this quite poorly, but the aggregate increase in i-cache and TLB pressure across the entire system can degrade performance overall on a typical desktop workload (not so relevant for single-application servers, but very relevant for server VMs). I believe some folks at Apple did a more systematic analysis of this, but I don’t know if their detailed results are public. David
On Wed, Jul 8, 2015 at 10:40 PM, Chandler Carruth <chandlerc at google.com> wrote:> On Wed, Jul 8, 2015 at 1:46 PM Hal Finkel <hfinkel at anl.gov> wrote: >> >> ----- Original Message ----- >> > From: "Xinliang David Li" <davidxl at google.com> >> > To: "Chandler Carruth" <chandlerc at gmail.com> >> > Cc: cfe-commits at cs.uiuc.edu, "<llvmdev at cs.uiuc.edu> List" >> > <llvmdev at cs.uiuc.edu> >> > Sent: Wednesday, July 8, 2015 12:25:18 AM >> > Subject: Re: [LLVMdev] Inline hint for methods defined in-class >> > >> > On Tue, Jul 7, 2015 at 6:06 PM, Chandler Carruth >> > <chandlerc at gmail.com> wrote: >> > > On Tue, Jul 7, 2015 at 4:11 PM Easwaran Raman <eraman at google.com> >> > > wrote: >> > >> >> > >> I'm reviving this thread after a while and CCing cfe-commits as >> > >> suggested by David Blaikie. I've also collected numbers building >> > >> chrome (from chromium, on Linux) with and without this patch as >> > >> suggested by David. I've re-posted the proposed patch and >> > >> performance/size numbers collected at the top to make it easily >> > >> readable for those reading it through cfe-commits. >> > > >> > > >> > > First off, thanks for collecting the numbers and broadening the >> > > distribution. Also, sorry it took me so long to get over to this >> > > thread. >> > > >> > > I want to lay out my stance on this issue at a theoretical and >> > > practical >> > > level first. I'll follow up with thoughts on the numbers as well >> > > after that. >> > > >> > > I think that tying *any* optimizer behavior to the 'inline' keyword >> > > is >> > > fundamentally the wrong direction. >> > >> > Chandler, thanks for sharing your thought -- however I don't think it >> > is wrong, let alone 'fundamentally wrong'. Despite all the analysis >> > that can be done, the inliner is in the end heuristic based. In lack >> > of the profile data, when inlining two calls yield the same static >> > benefit and size cost, it is reasonable for the inliner to think the >> > call to the function with inline hint to yield more high >> > dynamic/runtime benefit -- thus it has a higher static size budget to >> > burn. >> > >> > >We have reasons why we have done this >> > > historically, and we can't just do an immediate about face, but we >> > > should be >> > > actively looking for ways to *reduce* the optimizer's reliance on >> > > this >> > > keyword to convey any meaning whatsoever. >> > >> > yes those additional things will be done, but they are really >> > orthogonal. >> > >> > > >> > > The reason I think that is the correct direction is because, for >> > > better or >> > > worse, the 'inline' keyword in C++ is not about optimization, but >> > > about >> > > linkage. >> > >> > It is about both optimization and linkage. In fact the linkage simply >> > serves as an implementation detail. In C++ standard 7.1.2, paragraph >> > 2 says: >> >> The fact that C++ combines, into one keyword, a change in semantics >> (linkage) and an optimization hint is quite unfortunate. I wish it were >> otherwise. > > > We could work to change it? I specifically proposed adding a way to move > away from this unfortunate design. > >> >> However, as it stands, I support this change. The benchmark numbers are >> encouraging, and it replaces an implementation quirk with the underlying >> (unfortunate) language design choice. The implementation quirk is that >> putting the inline keyword on an in-class function definition changes the >> behavior of the optimizer. However, according to the language specification, >> that definition should have implied that keyword. While an implementation is >> certainly free to do arbitrary things with hints, this behavior violates the >> spirit of the language specification. > > > I strongly disagree that this is the spirit of the language specification. > Even if it was historically, I think we should move away from that. The > language shouldn't be trying to do this with a language keyword, and it > shouldn't be coupling semantics to hints. I'm very happy to take this up > with the committee, but I don't see why we shouldn't push Clang in that > direction here when there is no issue of conformance. > > To see how broken this is, let's look at how miserably small the difference > is between the un-hinted and hinted thresholds. We've ended up shrinking > this difference over time in LLVM because increasing the hinted threshold > caused lots of performance regressions and size regressions. > >> >> It makes a meaningless use of a standardized keyword meaningful, and >> that's the greater transgression. > > > So here is what I want to do: > > 1) Add a non-semantic attribute that conveys this hint. We could even convey > a much *stronger* hint with this rather than just a tiny hint the way it is > today because it wouldn't end up being forced onto every template regardless > of whether that makes sense. > > 2) Start lobbying to remove the hint from the 'inline' keyword by working > with the people who see regressions from this to use the new annotation to > recover the performance. > > 3) Completely remove the semantic coupling of the optimizer hint and fix the > meaningless use of the standardized keyword at the same time. > > But the more places where we use the inline hint today, the harder #2 will > become. I've already tried once before to remove the hint and couldn't > because of benchmarks that had been tightly tuned and coupled to the > existing (quirky) behavior. I really think that doing this more will make > getting to #3 harder. Making progress toward a cleaner design harder seems > worse than coping with the quirks that have existed in Clang for over 5 > years for another few years. > >> >> In addition, it does tend to be the case that in-class function >> definitions are small and suitable for inlining. > > > But if they are small and suitable for inlining, won't the existing > threshold work just fine? > >> >> >> I agree. A 1% performance increase is worth a 4% code-size increase when >> not optimizing for size. > > > I really don't. Maybe in -O3 or something, but a 4% code-size increase is a > hard regression to swallow. Especially in a single benchmark, and where many > other benchmarks show no benefit at all. This isn't a matter of "most code > gets better, so we need to tolerate the unfortunate variance of size".The numbers I presented under Google internal benchmarks is a geomean of 20 benchmarks and many of them show benefit (and some show a performance regression). In terms of size, only clang shows > 1% size increase, so that's the real outlier here. - Easwaran> I know its not really "fair" to view regressions as more important than > missed opportunities, but the reality is that regressions *are* more > problematic than missed opportunities. > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >