Xinliang David Li
2015-Jul-08 05:25 UTC
[LLVMdev] Inline hint for methods defined in-class
On Tue, Jul 7, 2015 at 6:06 PM, Chandler Carruth <chandlerc at gmail.com> wrote:> On Tue, Jul 7, 2015 at 4:11 PM Easwaran Raman <eraman at google.com> wrote: >> >> I'm reviving this thread after a while and CCing cfe-commits as >> suggested by David Blaikie. I've also collected numbers building >> chrome (from chromium, on Linux) with and without this patch as >> suggested by David. I've re-posted the proposed patch and >> performance/size numbers collected at the top to make it easily >> readable for those reading it through cfe-commits. > > > First off, thanks for collecting the numbers and broadening the > distribution. Also, sorry it took me so long to get over to this thread. > > I want to lay out my stance on this issue at a theoretical and practical > level first. I'll follow up with thoughts on the numbers as well after that. > > I think that tying *any* optimizer behavior to the 'inline' keyword is > fundamentally the wrong direction.Chandler, thanks for sharing your thought -- however I don't think it is wrong, let alone 'fundamentally wrong'. Despite all the analysis that can be done, the inliner is in the end heuristic based. In lack of the profile data, when inlining two calls yield the same static benefit and size cost, it is reasonable for the inliner to think the call to the function with inline hint to yield more high dynamic/runtime benefit -- thus it has a higher static size budget to burn.>We have reasons why we have done this > historically, and we can't just do an immediate about face, but we should be > actively looking for ways to *reduce* the optimizer's reliance on this > keyword to convey any meaning whatsoever.yes those additional things will be done, but they are really orthogonal.> > The reason I think that is the correct direction is because, for better or > worse, the 'inline' keyword in C++ is not about optimization, but about > linkage.It is about both optimization and linkage. In fact the linkage simply serves as an implementation detail. In C++ standard 7.1.2, paragraph 2 says: "A function declaration (8.3.5, 9.3, 11.3) with an inline specifier declares an inline function. The inline specifier indicates to the implementation that inline substitution of the function body at the point of call is to be preferred to the usual function call mechanism. An implementation is not required to perform this inline substitution at the point of call; however, even if this inline substitution is omitted, the other rules for inline functions defined by 7.1.2 shall still be respected." Developers see those and rely on those to give compiler the hints. Most importantly, paragraph 3 says: "A function defined within a class definition is an inline function. The inline specifier shall not appear on a block scope function declaration.93 If the inline specifier is used in a friend declaration, that declaration shall be a definition or the function shall have previously been declared inline." Here we can see regardless of how optimizer will honor the hint and to what extent, and based on what analysis, it is basically incorrect to drop the attribute on the floor for in-class function definitions. Eswaran's fix is justified with this reason alone. The side effect of changing inliner behavior is irrelevant.> It has a functional impact and can be both necessary or impossible > to use to meet those functional requirements. This in turn leaves > programmers in a lurch if the functional requirements are ever in tension > with the optimizer requirements.Not sure what you mean. Performance conscious programmers use it all the time.> > We're also working really hard to get more widely deployed cross-module > optimization strategies, in part to free programmers from the requirement > that they put all their performance critical code in header files. That > makes compilation faster, and has lots of benefits to the factoring and > design of the code itself. We shouldn't then create an incentive to keep > things in header files so that they pick up a hint to the optimizer.> > Ultimately, the world will be a better place if we can eventually move code > away from relying on the hint provided by the 'inline' keyword to the > optimizer. >While I would like to see that happen some day, I do think it is an independent matter.> > That doesn't mean that the core concept of hinting to the optimizer that a > particular function is a particularly good candidate for inlining is without > value.yes.>While I think it is a bad practice that we shouldn't encourage in > code (especially portable code)yes -- there are indeed programmers who use this casually without considering performance.> I can see the desire to at least have *some* > attribute which is nothing more or less than a hint to the optimizer to > inline harder[1].yes -- there are programmers who use the attribute consciously.> It would help people work around inliner bugs in the short > term, and even help debug inliner-rooted optimization problems.I think it is a good hint to the compiler even in the longer term. With PGO, we should minimize the reliance on the hint though.>Codebases > with strong portability requirements could still (and probably should) > forbid or tightly control access to this kind of hint. I would want really > strong documentation about how this attribute *completely voids* your > performance warranty (if such a thing exists) as from version to version of > the compiler it may go from a helpful hint to a devastatingly bad hint.Why? If the compiler becomes smarter and smarter, the inline hint will become more and more irrelevant and eventually has no effect -- why would the performance warranty be voided? If the compiler is not yet smart enough, why would the compiler refuse to take the hint and forbid developer provide the hint?> But > I think I could be persuaded to live with such a hint existing. But I'm > *really* uncomfortable with it being tied to something that also impacts > linkage or other semantics of the program.For consistent with standard, we should pass the attribute. Linkage is not affected in anyway.> > [1]: Currently, the only other hint we have available is pretty terrible as > it *also* has semantic effects: the always_inline attribute. > > >> >> The proposed patch will add InlineHint to methods defined inside a class: >> >> --- a/lib/CodeGen/CodeGenFunction.cpp >> +++ b/lib/CodeGen/CodeGenFunction.cpp >> @@ -630,7 +630,7 @@ void CodeGenFunction::StartFunction(GlobalDecl GD, >> if (const FunctionDecl *FD = dyn_cast_or_null<FunctionDecl>(D)) { >> if (!CGM.getCodeGenOpts().NoInline) { >> for (auto RI : FD->redecls()) >> - if (RI->isInlineSpecified()) { >> + if (RI->isInlined()) { >> Fn->addFnAttr(llvm::Attribute::InlineHint); >> break; >> } >> >> Here are the performance and size numbers I've collected: >> >> >> - C++ subset of Spec: No performance effects, < 0.1% size increase >> (all size numbers are text sizes returned by 'size') >> - Clang: 0.9% performance improvement (both -O0 and -O2 on a large .ii >> file) , 4.1% size increase > > > FWIW, this size increase seems *really* bad. I think that kills this > approach even in the short term.Re. size and performance trade-off -- 0.9% performance improvement should greatly win the size cost. Besides among all programs see, only clang sees this size increase with all the others seeing negligible size increase. This is not a short term vs long term situation. It is basically a bug fix that FE drops the attribute. If it exposes inliner heuristic bug, that should be fixed/tuned separately. With the hint correctly passed in, Easwaran will do further tuning including time based analysis.> >> >> - Chrome: no performance improvement, 0.24% size increase >> - Google internal benchmark suite (geomean of ~20 benchmarks): ~1.8% >> performance improvement, no size regression > > > I'm also somewhat worried about the lack of any performance improvements > outside of the Google benchmarks. That somewhat strongly suggests that our > benchmarks are overly coupled to this hint already. The fact that neither > Chrome, Clang, nor SPEC improved is... not at all encouraging.Other than Google benchmarks, we do see Clang improve performance. Besides, current inliner needs to be further tuned in order to get more performance benefit. Passing the hint through is simply an enabler. Also remember that most of SPEC benchmarks are C programs. C++ programs with heavy use of virtual functions may not benefit a lot either. David> > -Chandler
----- Original Message -----> From: "Xinliang David Li" <davidxl at google.com> > To: "Chandler Carruth" <chandlerc at gmail.com> > Cc: cfe-commits at cs.uiuc.edu, "<llvmdev at cs.uiuc.edu> List" <llvmdev at cs.uiuc.edu> > Sent: Wednesday, July 8, 2015 12:25:18 AM > Subject: Re: [LLVMdev] Inline hint for methods defined in-class > > On Tue, Jul 7, 2015 at 6:06 PM, Chandler Carruth > <chandlerc at gmail.com> wrote: > > On Tue, Jul 7, 2015 at 4:11 PM Easwaran Raman <eraman at google.com> > > wrote: > >> > >> I'm reviving this thread after a while and CCing cfe-commits as > >> suggested by David Blaikie. I've also collected numbers building > >> chrome (from chromium, on Linux) with and without this patch as > >> suggested by David. I've re-posted the proposed patch and > >> performance/size numbers collected at the top to make it easily > >> readable for those reading it through cfe-commits. > > > > > > First off, thanks for collecting the numbers and broadening the > > distribution. Also, sorry it took me so long to get over to this > > thread. > > > > I want to lay out my stance on this issue at a theoretical and > > practical > > level first. I'll follow up with thoughts on the numbers as well > > after that. > > > > I think that tying *any* optimizer behavior to the 'inline' keyword > > is > > fundamentally the wrong direction. > > Chandler, thanks for sharing your thought -- however I don't think it > is wrong, let alone 'fundamentally wrong'. Despite all the analysis > that can be done, the inliner is in the end heuristic based. In lack > of the profile data, when inlining two calls yield the same static > benefit and size cost, it is reasonable for the inliner to think the > call to the function with inline hint to yield more high > dynamic/runtime benefit -- thus it has a higher static size budget to > burn. > > >We have reasons why we have done this > > historically, and we can't just do an immediate about face, but we > > should be > > actively looking for ways to *reduce* the optimizer's reliance on > > this > > keyword to convey any meaning whatsoever. > > yes those additional things will be done, but they are really > orthogonal. > > > > > The reason I think that is the correct direction is because, for > > better or > > worse, the 'inline' keyword in C++ is not about optimization, but > > about > > linkage. > > It is about both optimization and linkage. In fact the linkage simply > serves as an implementation detail. In C++ standard 7.1.2, paragraph > 2 says:The fact that C++ combines, into one keyword, a change in semantics (linkage) and an optimization hint is quite unfortunate. I wish it were otherwise. However, as it stands, I support this change. The benchmark numbers are encouraging, and it replaces an implementation quirk with the underlying (unfortunate) language design choice. The implementation quirk is that putting the inline keyword on an in-class function definition changes the behavior of the optimizer. However, according to the language specification, that definition should have implied that keyword. While an implementation is certainly free to do arbitrary things with hints, this behavior violates the spirit of the language specification. It makes a meaningless use of a standardized keyword meaningful, and that's the greater transgression. In addition, it does tend to be the case that in-class function definitions are small and suitable for inlining.> > "A function declaration (8.3.5, 9.3, 11.3) with an inline specifier > declares an inline function. The inline specifier indicates to the > implementation that inline substitution of the function body at the > point of call is to be preferred to the usual function call > mechanism. > An implementation is not required to perform this inline substitution > at the point of call; however, even if this inline substitution is > omitted, the other rules for inline functions defined by 7.1.2 shall > still be respected." > > Developers see those and rely on those to give compiler the hints. > > Most importantly, paragraph 3 says: > > "A function defined within a class definition is an inline function. > The inline specifier shall not appear on a block scope function > declaration.93 If the inline specifier is used in a friend > declaration, that declaration shall be a definition or the function > shall have previously been declared inline." > > Here we can see regardless of how optimizer will honor the hint and > to > what extent, and based on what analysis, > it is basically incorrect to drop the attribute on the floor for > in-class function definitions. Eswaran's fix is justified with this > reason alone. The side effect of changing inliner behavior is > irrelevant. > > > It has a functional impact and can be both necessary or impossible > > to use to meet those functional requirements. This in turn leaves > > programmers in a lurch if the functional requirements are ever in > > tension > > with the optimizer requirements. > > Not sure what you mean. Performance conscious programmers use it all > the time. > > > > > We're also working really hard to get more widely deployed > > cross-module > > optimization strategies, in part to free programmers from the > > requirement > > that they put all their performance critical code in header files. > > That > > makes compilation faster, and has lots of benefits to the factoring > > and > > design of the code itself. We shouldn't then create an incentive to > > keep > > things in header files so that they pick up a hint to the > > optimizer. > > > > > Ultimately, the world will be a better place if we can eventually > > move code > > away from relying on the hint provided by the 'inline' keyword to > > the > > optimizer. > > > > While I would like to see that happen some day, I do think it is an > independent matter. > > > > > That doesn't mean that the core concept of hinting to the optimizer > > that a > > particular function is a particularly good candidate for inlining > > is without > > value. > > yes. > > >While I think it is a bad practice that we shouldn't encourage in > > code (especially portable code) > > yes -- there are indeed programmers who use this casually without > considering performance. > > > I can see the desire to at least have *some* > > attribute which is nothing more or less than a hint to the > > optimizer to > > inline harder[1]. > > yes -- there are programmers who use the attribute consciously. > > > It would help people work around inliner bugs in the short > > term, and even help debug inliner-rooted optimization problems. > > I think it is a good hint to the compiler even in the longer term. > With PGO, we should minimize the reliance on the hint though. > > >Codebases > > with strong portability requirements could still (and probably > > should) > > forbid or tightly control access to this kind of hint. I would want > > really > > strong documentation about how this attribute *completely voids* > > your > > performance warranty (if such a thing exists) as from version to > > version of > > the compiler it may go from a helpful hint to a devastatingly bad > > hint. > > Why? If the compiler becomes smarter and smarter, the inline hint > will > become more and more irrelevant and eventually has no effect -- why > would the performance warranty be voided? If the compiler is not yet > smart enough, why would the compiler refuse to take the hint and > forbid developer provide the hint? > > > But > > I think I could be persuaded to live with such a hint existing. But > > I'm > > *really* uncomfortable with it being tied to something that also > > impacts > > linkage or other semantics of the program. > > For consistent with standard, we should pass the attribute. Linkage > is > not affected in anyway. > > > > > [1]: Currently, the only other hint we have available is pretty > > terrible as > > it *also* has semantic effects: the always_inline attribute. > > > > > >> > >> The proposed patch will add InlineHint to methods defined inside a > >> class: > >> > >> --- a/lib/CodeGen/CodeGenFunction.cpp > >> +++ b/lib/CodeGen/CodeGenFunction.cpp > >> @@ -630,7 +630,7 @@ void CodeGenFunction::StartFunction(GlobalDecl > >> GD, > >> if (const FunctionDecl *FD = dyn_cast_or_null<FunctionDecl>(D)) > >> { > >> if (!CGM.getCodeGenOpts().NoInline) { > >> for (auto RI : FD->redecls()) > >> - if (RI->isInlineSpecified()) { > >> + if (RI->isInlined()) { > >> Fn->addFnAttr(llvm::Attribute::InlineHint); > >> break; > >> } > >> > >> Here are the performance and size numbers I've collected: > >> > >> > >> - C++ subset of Spec: No performance effects, < 0.1% size increase > >> (all size numbers are text sizes returned by 'size') > >> - Clang: 0.9% performance improvement (both -O0 and -O2 on a large > >> .ii > >> file) , 4.1% size increase > > > > > > FWIW, this size increase seems *really* bad. I think that kills > > this > > approach even in the short term. > > Re. size and performance trade-off -- 0.9% performance improvement > should greatly win the size cost. Besides among all programs see, > only > clang sees this size increase with all the others seeing negligible > size increase.I agree. A 1% performance increase is worth a 4% code-size increase when not optimizing for size. -Hal> > This is not a short term vs long term situation. It is basically a > bug > fix that FE drops the attribute. If it exposes inliner heuristic bug, > that should be fixed/tuned separately. With the hint correctly > passed > in, Easwaran will do further tuning including time based analysis. > > > > >> > >> - Chrome: no performance improvement, 0.24% size increase > >> - Google internal benchmark suite (geomean of ~20 benchmarks): > >> ~1.8% > >> performance improvement, no size regression > > > > > > I'm also somewhat worried about the lack of any performance > > improvements > > outside of the Google benchmarks. That somewhat strongly suggests > > that our > > benchmarks are overly coupled to this hint already. The fact that > > neither > > Chrome, Clang, nor SPEC improved is... not at all encouraging. > > Other than Google benchmarks, we do see Clang improve performance. > Besides, current inliner needs to be further tuned in order to get > more performance benefit. Passing the hint through is simply an > enabler. Also remember that most of SPEC benchmarks are C programs. > C++ programs with heavy use of virtual functions may not benefit a > lot > either. > > David > > > > > > -Chandler > _______________________________________________ > cfe-commits mailing list > cfe-commits at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits >-- Hal Finkel Assistant Computational Scientist Leadership Computing Facility Argonne National Laboratory
Xinliang David Li
2015-Jul-08 20:57 UTC
[LLVMdev] Inline hint for methods defined in-class
On Wed, Jul 8, 2015 at 1:43 PM, Hal Finkel <hfinkel at anl.gov> wrote:> ----- Original Message ----- >> From: "Xinliang David Li" <davidxl at google.com> >> To: "Chandler Carruth" <chandlerc at gmail.com> >> Cc: cfe-commits at cs.uiuc.edu, "<llvmdev at cs.uiuc.edu> List" <llvmdev at cs.uiuc.edu> >> Sent: Wednesday, July 8, 2015 12:25:18 AM >> Subject: Re: [LLVMdev] Inline hint for methods defined in-class >> >> On Tue, Jul 7, 2015 at 6:06 PM, Chandler Carruth >> <chandlerc at gmail.com> wrote: >> > On Tue, Jul 7, 2015 at 4:11 PM Easwaran Raman <eraman at google.com> >> > wrote: >> >> >> >> I'm reviving this thread after a while and CCing cfe-commits as >> >> suggested by David Blaikie. I've also collected numbers building >> >> chrome (from chromium, on Linux) with and without this patch as >> >> suggested by David. I've re-posted the proposed patch and >> >> performance/size numbers collected at the top to make it easily >> >> readable for those reading it through cfe-commits. >> > >> > >> > First off, thanks for collecting the numbers and broadening the >> > distribution. Also, sorry it took me so long to get over to this >> > thread. >> > >> > I want to lay out my stance on this issue at a theoretical and >> > practical >> > level first. I'll follow up with thoughts on the numbers as well >> > after that. >> > >> > I think that tying *any* optimizer behavior to the 'inline' keyword >> > is >> > fundamentally the wrong direction. >> >> Chandler, thanks for sharing your thought -- however I don't think it >> is wrong, let alone 'fundamentally wrong'. Despite all the analysis >> that can be done, the inliner is in the end heuristic based. In lack >> of the profile data, when inlining two calls yield the same static >> benefit and size cost, it is reasonable for the inliner to think the >> call to the function with inline hint to yield more high >> dynamic/runtime benefit -- thus it has a higher static size budget to >> burn. >> >> >We have reasons why we have done this >> > historically, and we can't just do an immediate about face, but we >> > should be >> > actively looking for ways to *reduce* the optimizer's reliance on >> > this >> > keyword to convey any meaning whatsoever. >> >> yes those additional things will be done, but they are really >> orthogonal. >> >> > >> > The reason I think that is the correct direction is because, for >> > better or >> > worse, the 'inline' keyword in C++ is not about optimization, but >> > about >> > linkage. >> >> It is about both optimization and linkage. In fact the linkage simply >> serves as an implementation detail. In C++ standard 7.1.2, paragraph >> 2 says: > > The fact that C++ combines, into one keyword, a change in semantics (linkage) and an optimization hint is quite unfortunate. I wish it were otherwise. However, as it stands, I support this change. The benchmark numbers are encouraging, and it replaces an implementation quirk with the underlying (unfortunate) language design choice. The implementation quirk is that putting the inline keyword on an in-class function definition changes the behavior of the optimizer. However, according to the language specification, that definition should have implied that keyword. While an implementation is certainly free to do arbitrary things with hints, this behavior violates the spirit of the language specification. It makes a meaningless use of a standardized keyword meaningful, and that's the greater transgression. In addition, it does tend to be the case that in-class function definitions are small and suitable for inlining.Well said! thanks, David> >> >> "A function declaration (8.3.5, 9.3, 11.3) with an inline specifier >> declares an inline function. The inline specifier indicates to the >> implementation that inline substitution of the function body at the >> point of call is to be preferred to the usual function call >> mechanism. >> An implementation is not required to perform this inline substitution >> at the point of call; however, even if this inline substitution is >> omitted, the other rules for inline functions defined by 7.1.2 shall >> still be respected." >> >> Developers see those and rely on those to give compiler the hints. >> >> Most importantly, paragraph 3 says: >> >> "A function defined within a class definition is an inline function. >> The inline specifier shall not appear on a block scope function >> declaration.93 If the inline specifier is used in a friend >> declaration, that declaration shall be a definition or the function >> shall have previously been declared inline." >> >> Here we can see regardless of how optimizer will honor the hint and >> to >> what extent, and based on what analysis, >> it is basically incorrect to drop the attribute on the floor for >> in-class function definitions. Eswaran's fix is justified with this >> reason alone. The side effect of changing inliner behavior is >> irrelevant. >> >> > It has a functional impact and can be both necessary or impossible >> > to use to meet those functional requirements. This in turn leaves >> > programmers in a lurch if the functional requirements are ever in >> > tension >> > with the optimizer requirements. >> >> Not sure what you mean. Performance conscious programmers use it all >> the time. >> >> > >> > We're also working really hard to get more widely deployed >> > cross-module >> > optimization strategies, in part to free programmers from the >> > requirement >> > that they put all their performance critical code in header files. >> > That >> > makes compilation faster, and has lots of benefits to the factoring >> > and >> > design of the code itself. We shouldn't then create an incentive to >> > keep >> > things in header files so that they pick up a hint to the >> > optimizer. >> >> > >> > Ultimately, the world will be a better place if we can eventually >> > move code >> > away from relying on the hint provided by the 'inline' keyword to >> > the >> > optimizer. >> > >> >> While I would like to see that happen some day, I do think it is an >> independent matter. >> >> > >> > That doesn't mean that the core concept of hinting to the optimizer >> > that a >> > particular function is a particularly good candidate for inlining >> > is without >> > value. >> >> yes. >> >> >While I think it is a bad practice that we shouldn't encourage in >> > code (especially portable code) >> >> yes -- there are indeed programmers who use this casually without >> considering performance. >> >> > I can see the desire to at least have *some* >> > attribute which is nothing more or less than a hint to the >> > optimizer to >> > inline harder[1]. >> >> yes -- there are programmers who use the attribute consciously. >> >> > It would help people work around inliner bugs in the short >> > term, and even help debug inliner-rooted optimization problems. >> >> I think it is a good hint to the compiler even in the longer term. >> With PGO, we should minimize the reliance on the hint though. >> >> >Codebases >> > with strong portability requirements could still (and probably >> > should) >> > forbid or tightly control access to this kind of hint. I would want >> > really >> > strong documentation about how this attribute *completely voids* >> > your >> > performance warranty (if such a thing exists) as from version to >> > version of >> > the compiler it may go from a helpful hint to a devastatingly bad >> > hint. >> >> Why? If the compiler becomes smarter and smarter, the inline hint >> will >> become more and more irrelevant and eventually has no effect -- why >> would the performance warranty be voided? If the compiler is not yet >> smart enough, why would the compiler refuse to take the hint and >> forbid developer provide the hint? >> >> > But >> > I think I could be persuaded to live with such a hint existing. But >> > I'm >> > *really* uncomfortable with it being tied to something that also >> > impacts >> > linkage or other semantics of the program. >> >> For consistent with standard, we should pass the attribute. Linkage >> is >> not affected in anyway. >> >> > >> > [1]: Currently, the only other hint we have available is pretty >> > terrible as >> > it *also* has semantic effects: the always_inline attribute. >> > >> > >> >> >> >> The proposed patch will add InlineHint to methods defined inside a >> >> class: >> >> >> >> --- a/lib/CodeGen/CodeGenFunction.cpp >> >> +++ b/lib/CodeGen/CodeGenFunction.cpp >> >> @@ -630,7 +630,7 @@ void CodeGenFunction::StartFunction(GlobalDecl >> >> GD, >> >> if (const FunctionDecl *FD = dyn_cast_or_null<FunctionDecl>(D)) >> >> { >> >> if (!CGM.getCodeGenOpts().NoInline) { >> >> for (auto RI : FD->redecls()) >> >> - if (RI->isInlineSpecified()) { >> >> + if (RI->isInlined()) { >> >> Fn->addFnAttr(llvm::Attribute::InlineHint); >> >> break; >> >> } >> >> >> >> Here are the performance and size numbers I've collected: >> >> >> >> >> >> - C++ subset of Spec: No performance effects, < 0.1% size increase >> >> (all size numbers are text sizes returned by 'size') >> >> - Clang: 0.9% performance improvement (both -O0 and -O2 on a large >> >> .ii >> >> file) , 4.1% size increase >> > >> > >> > FWIW, this size increase seems *really* bad. I think that kills >> > this >> > approach even in the short term. >> >> Re. size and performance trade-off -- 0.9% performance improvement >> should greatly win the size cost. Besides among all programs see, >> only >> clang sees this size increase with all the others seeing negligible >> size increase. > > I agree. A 1% performance increase is worth a 4% code-size increase when not optimizing for size. > > -Hal > >> >> This is not a short term vs long term situation. It is basically a >> bug >> fix that FE drops the attribute. If it exposes inliner heuristic bug, >> that should be fixed/tuned separately. With the hint correctly >> passed >> in, Easwaran will do further tuning including time based analysis. >> >> > >> >> >> >> - Chrome: no performance improvement, 0.24% size increase >> >> - Google internal benchmark suite (geomean of ~20 benchmarks): >> >> ~1.8% >> >> performance improvement, no size regression >> > >> > >> > I'm also somewhat worried about the lack of any performance >> > improvements >> > outside of the Google benchmarks. That somewhat strongly suggests >> > that our >> > benchmarks are overly coupled to this hint already. The fact that >> > neither >> > Chrome, Clang, nor SPEC improved is... not at all encouraging. >> >> Other than Google benchmarks, we do see Clang improve performance. >> Besides, current inliner needs to be further tuned in order to get >> more performance benefit. Passing the hint through is simply an >> enabler. Also remember that most of SPEC benchmarks are C programs. >> C++ programs with heavy use of virtual functions may not benefit a >> lot >> either. >> >> David >> >> >> > >> > -Chandler >> _______________________________________________ >> cfe-commits mailing list >> cfe-commits at cs.uiuc.edu >> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits >> > > -- > Hal Finkel > Assistant Computational Scientist > Leadership Computing Facility > Argonne National Laboratory
On Tue, Jul 7, 2015 at 10:27 PM Xinliang David Li <davidxl at google.com> wrote:> On Tue, Jul 7, 2015 at 6:06 PM, Chandler Carruth <chandlerc at gmail.com> > wrote: > > On Tue, Jul 7, 2015 at 4:11 PM Easwaran Raman <eraman at google.com> wrote: > >> > >> I'm reviving this thread after a while and CCing cfe-commits as > >> suggested by David Blaikie. I've also collected numbers building > >> chrome (from chromium, on Linux) with and without this patch as > >> suggested by David. I've re-posted the proposed patch and > >> performance/size numbers collected at the top to make it easily > >> readable for those reading it through cfe-commits. > > > > > > First off, thanks for collecting the numbers and broadening the > > distribution. Also, sorry it took me so long to get over to this thread. > > > > I want to lay out my stance on this issue at a theoretical and practical > > level first. I'll follow up with thoughts on the numbers as well after > that. > > > > I think that tying *any* optimizer behavior to the 'inline' keyword is > > fundamentally the wrong direction. > > Chandler, thanks for sharing your thought -- however I don't think it > is wrong, let alone 'fundamentally wrong'. Despite all the analysis > that can be done, the inliner is in the end heuristic based. In lack > of the profile data, when inlining two calls yield the same static > benefit and size cost, it is reasonable for the inliner to think the > call to the function with inline hint to yield more high > dynamic/runtime benefit -- thus it has a higher static size budget to > burn. >This is an argument for having *some* source level hinting construct. While I think such a construct is very risky, I did actually suggest that we add such a hint because I recognize some of the practical necessities for it. My primary argument is against using the 'inline' keyword as the source of the hint, and especially using the happenstance of a method body in a class as the source of the hint.> > >We have reasons why we have done this > > historically, and we can't just do an immediate about face, but we > should be > > actively looking for ways to *reduce* the optimizer's reliance on this > > keyword to convey any meaning whatsoever. > > yes those additional things will be done, but they are really orthogonal. >I'm not talking about additional things, I'm talking about separating the optimization hint from the semantics and linkage changing constructs. That does not seem orthogonal.> > > > > The reason I think that is the correct direction is because, for better > or > > worse, the 'inline' keyword in C++ is not about optimization, but about > > linkage. > > It is about both optimization and linkage. In fact the linkage simply > serves as an implementation detail. In C++ standard 7.1.2, paragraph > 2 says: > > "A function declaration (8.3.5, 9.3, 11.3) with an inline specifier > declares an inline function. The inline specifier indicates to the > implementation that inline substitution of the function body at the > point of call is to be preferred to the usual function call mechanism. > An implementation is not required to perform this inline substitution > at the point of call; however, even if this inline substitution is > omitted, the other rules for inline functions defined by 7.1.2 shall > still be respected." > > Developers see those and rely on those to give compiler the hints. >This is essentially a nonsense paragraph for a standardized specification of a programming language. How hard you optimize code doesn't have any bearing on the conformance and behavior of the program. =/ I think this paragraph is part of the historical context. I think we should change C++ to remove it (and I will propose that change) and I think we should change C++ to support a standardized *non*-semantic hint if folks really want to see that in the form of a C++11-style [[attribute]]. I'm also really, really confident that most developers are not using the wording of the standard as the basis of how they tune the performance of their code. =/> > Most importantly, paragraph 3 says: > > "A function defined within a class definition is an inline function. > The inline specifier shall not appear on a block scope function > declaration.93 If the inline specifier is used in a friend > declaration, that declaration shall be a definition or the function > shall have previously been declared inline." > > Here we can see regardless of how optimizer will honor the hint and to > what extent, and based on what analysis, > it is basically incorrect to drop the attribute on the floor for > in-class function definitions. Eswaran's fix is justified with this > reason alone. The side effect of changing inliner behavior is > irrelevant. >I don't really understand what you're saying. Clang correctly carries all of the *semantics* required for in-class method bodies. We simply don't attach an optimization hint? I don't think this is "incorrect". Nothing in the standard says how hard we should try (and it can't, which is why the standard doesn't make sense to give advice here).> It has a functional impact and can be both necessary or impossible > > to use to meet those functional requirements. This in turn leaves > > programmers in a lurch if the functional requirements are ever in tension > > with the optimizer requirements. > > Not sure what you mean. Performance conscious programmers use it all the > time. >You might *have* to use the inline keyword to get correct linkage even when it causes the optimizer to inline and that *hurts* performance. Similarly, you might *have* to not use the inline keyword to get correct linkage even though you would like to give the optimizer a hint for performance reasons (and are doing LTO, so you can). Essentially, there is no guarantee that the semantic requirements of linkage are correctly aligned with the desired optimizer hint.> > > > > We're also working really hard to get more widely deployed cross-module > > optimization strategies, in part to free programmers from the requirement > > that they put all their performance critical code in header files. That > > makes compilation faster, and has lots of benefits to the factoring and > > design of the code itself. We shouldn't then create an incentive to keep > > things in header files so that they pick up a hint to the optimizer. > > > > > Ultimately, the world will be a better place if we can eventually move > code > > away from relying on the hint provided by the 'inline' keyword to the > > optimizer. > > > > While I would like to see that happen some day, I do think it is an > independent matter. >Adding further optimization hints based around the linkage further entrenches that model. If we want to move in this direction, this patch is a step in the *wrong* direction.> > >Codebases > > with strong portability requirements could still (and probably should) > > forbid or tightly control access to this kind of hint. I would want > really > > strong documentation about how this attribute *completely voids* your > > performance warranty (if such a thing exists) as from version to version > of > > the compiler it may go from a helpful hint to a devastatingly bad hint. > > Why? If the compiler becomes smarter and smarter, the inline hint will > become more and more irrelevant and eventually has no effect -- why > would the performance warranty be voided? If the compiler is not yet > smart enough, why would the compiler refuse to take the hint and > forbid developer provide the hint? >I didn't suggest the compiler would do anything. I suggested that the idea of hinting an optimizer about how to inline code is inherently non-portable (its specific to an optimizer) and thus would likely be less used in code bases with unusually strong portability concerns.> > > But > > I think I could be persuaded to live with such a hint existing. But I'm > > *really* uncomfortable with it being tied to something that also impacts > > linkage or other semantics of the program. > > For consistent with standard, we should pass the attribute. Linkage is > not affected in anyway. >I do not think that there is any standards-consistency argument for one optimization hint over another. We already get the linkage correct for all of these things. The only consideration is the *degree* to which we prefer to actually do inlining in the optimizer. The standard at no point makes guarantees about these degrees.> > > > > [1]: Currently, the only other hint we have available is pretty terrible > as > > it *also* has semantic effects: the always_inline attribute. > > > > > >> > >> The proposed patch will add InlineHint to methods defined inside a > class: > >> > >> --- a/lib/CodeGen/CodeGenFunction.cpp > >> +++ b/lib/CodeGen/CodeGenFunction.cpp > >> @@ -630,7 +630,7 @@ void CodeGenFunction::StartFunction(GlobalDecl GD, > >> if (const FunctionDecl *FD = dyn_cast_or_null<FunctionDecl>(D)) { > >> if (!CGM.getCodeGenOpts().NoInline) { > >> for (auto RI : FD->redecls()) > >> - if (RI->isInlineSpecified()) { > >> + if (RI->isInlined()) { > >> Fn->addFnAttr(llvm::Attribute::InlineHint); > >> break; > >> } > >> > >> Here are the performance and size numbers I've collected: > >> > >> > >> - C++ subset of Spec: No performance effects, < 0.1% size increase > >> (all size numbers are text sizes returned by 'size') > >> - Clang: 0.9% performance improvement (both -O0 and -O2 on a large .ii > >> file) , 4.1% size increase > > > > > > FWIW, this size increase seems *really* bad. I think that kills this > > approach even in the short term. > > Re. size and performance trade-off -- 0.9% performance improvement > should greatly win the size cost. Besides among all programs see, only > clang sees this size increase with all the others seeing negligible > size increase. >But the other programs don't see any performance wins. We can likely get this 1% of runtime performance without paying that high of a size cost. This is a real concern -- there are uses of Clang and LLVM that are very sensitive to size regressions. And I suspect we'll see other applications that also see the size loss. If we could more *selectively* use a dedicated hint to get the performance boost, we'd almost certainly not have to give up this much code size.> > > >> > >> - Chrome: no performance improvement, 0.24% size increase > >> - Google internal benchmark suite (geomean of ~20 benchmarks): ~1.8% > >> performance improvement, no size regression > > > > > > I'm also somewhat worried about the lack of any performance improvements > > outside of the Google benchmarks. That somewhat strongly suggests that > our > > benchmarks are overly coupled to this hint already. The fact that neither > > Chrome, Clang, nor SPEC improved is... not at all encouraging. > > Other than Google benchmarks, we do see Clang improve performance. > Besides, current inliner needs to be further tuned in order to get > more performance benefit. Passing the hint through is simply an > enabler. Also remember that most of SPEC benchmarks are C programs. > C++ programs with heavy use of virtual functions may not benefit a lot > either. >I don't really understand why that changes my point... Are you saying that without this hint, we can't do the subsequent work on the inliner? -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150709/608d05bc/attachment.html>
On Wed, Jul 8, 2015 at 1:46 PM Hal Finkel <hfinkel at anl.gov> wrote:> ----- Original Message ----- > > From: "Xinliang David Li" <davidxl at google.com> > > To: "Chandler Carruth" <chandlerc at gmail.com> > > Cc: cfe-commits at cs.uiuc.edu, "<llvmdev at cs.uiuc.edu> List" < > llvmdev at cs.uiuc.edu> > > Sent: Wednesday, July 8, 2015 12:25:18 AM > > Subject: Re: [LLVMdev] Inline hint for methods defined in-class > > > > On Tue, Jul 7, 2015 at 6:06 PM, Chandler Carruth > > <chandlerc at gmail.com> wrote: > > > On Tue, Jul 7, 2015 at 4:11 PM Easwaran Raman <eraman at google.com> > > > wrote: > > >> > > >> I'm reviving this thread after a while and CCing cfe-commits as > > >> suggested by David Blaikie. I've also collected numbers building > > >> chrome (from chromium, on Linux) with and without this patch as > > >> suggested by David. I've re-posted the proposed patch and > > >> performance/size numbers collected at the top to make it easily > > >> readable for those reading it through cfe-commits. > > > > > > > > > First off, thanks for collecting the numbers and broadening the > > > distribution. Also, sorry it took me so long to get over to this > > > thread. > > > > > > I want to lay out my stance on this issue at a theoretical and > > > practical > > > level first. I'll follow up with thoughts on the numbers as well > > > after that. > > > > > > I think that tying *any* optimizer behavior to the 'inline' keyword > > > is > > > fundamentally the wrong direction. > > > > Chandler, thanks for sharing your thought -- however I don't think it > > is wrong, let alone 'fundamentally wrong'. Despite all the analysis > > that can be done, the inliner is in the end heuristic based. In lack > > of the profile data, when inlining two calls yield the same static > > benefit and size cost, it is reasonable for the inliner to think the > > call to the function with inline hint to yield more high > > dynamic/runtime benefit -- thus it has a higher static size budget to > > burn. > > > > >We have reasons why we have done this > > > historically, and we can't just do an immediate about face, but we > > > should be > > > actively looking for ways to *reduce* the optimizer's reliance on > > > this > > > keyword to convey any meaning whatsoever. > > > > yes those additional things will be done, but they are really > > orthogonal. > > > > > > > > The reason I think that is the correct direction is because, for > > > better or > > > worse, the 'inline' keyword in C++ is not about optimization, but > > > about > > > linkage. > > > > It is about both optimization and linkage. In fact the linkage simply > > serves as an implementation detail. In C++ standard 7.1.2, paragraph > > 2 says: > > The fact that C++ combines, into one keyword, a change in semantics > (linkage) and an optimization hint is quite unfortunate. I wish it were > otherwise.We could work to change it? I specifically proposed adding a way to move away from this unfortunate design.> However, as it stands, I support this change. The benchmark numbers are > encouraging, and it replaces an implementation quirk with the underlying > (unfortunate) language design choice. The implementation quirk is that > putting the inline keyword on an in-class function definition changes the > behavior of the optimizer. However, according to the language > specification, that definition should have implied that keyword. While an > implementation is certainly free to do arbitrary things with hints, this > behavior violates the spirit of the language specification.I strongly disagree that this is the spirit of the language specification. Even if it was historically, I think we should move away from that. The language shouldn't be trying to do this with a language keyword, and it shouldn't be coupling semantics to hints. I'm very happy to take this up with the committee, but I don't see why we shouldn't push Clang in that direction here when there is no issue of conformance. To see how broken this is, let's look at how miserably small the difference is between the un-hinted and hinted thresholds. We've ended up shrinking this difference over time in LLVM because increasing the hinted threshold caused lots of performance regressions and size regressions.> It makes a meaningless use of a standardized keyword meaningful, and > that's the greater transgression.So here is what I want to do: 1) Add a non-semantic attribute that conveys this hint. We could even convey a much *stronger* hint with this rather than just a tiny hint the way it is today because it wouldn't end up being forced onto every template regardless of whether that makes sense. 2) Start lobbying to remove the hint from the 'inline' keyword by working with the people who see regressions from this to use the new annotation to recover the performance. 3) Completely remove the semantic coupling of the optimizer hint and fix the meaningless use of the standardized keyword at the same time. But the more places where we use the inline hint today, the harder #2 will become. I've already tried once before to remove the hint and couldn't because of benchmarks that had been tightly tuned and coupled to the existing (quirky) behavior. I really think that doing this more will make getting to #3 harder. Making progress toward a cleaner design harder seems worse than coping with the quirks that have existed in Clang for over 5 years for another few years.> In addition, it does tend to be the case that in-class function > definitions are small and suitable for inlining. >But if they are small and suitable for inlining, won't the existing threshold work just fine?> > I agree. A 1% performance increase is worth a 4% code-size increase when > not optimizing for size. >I really don't. Maybe in -O3 or something, but a 4% code-size increase is a hard regression to swallow. Especially in a single benchmark, and where many other benchmarks show no benefit at all. This isn't a matter of "most code gets better, so we need to tolerate the unfortunate variance of size". I know its not really "fair" to view regressions as more important than missed opportunities, but the reality is that regressions *are* more problematic than missed opportunities. -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150709/13b35053/attachment.html>
Xinliang David Li
2015-Jul-09 06:49 UTC
[LLVMdev] Inline hint for methods defined in-class
On Wed, Jul 8, 2015 at 10:29 PM, Chandler Carruth <chandlerc at gmail.com> wrote:> On Tue, Jul 7, 2015 at 10:27 PM Xinliang David Li <davidxl at google.com> > wrote: >> >> On Tue, Jul 7, 2015 at 6:06 PM, Chandler Carruth <chandlerc at gmail.com> >> wrote: >> > On Tue, Jul 7, 2015 at 4:11 PM Easwaran Raman <eraman at google.com> wrote: >> >> >> >> I'm reviving this thread after a while and CCing cfe-commits as >> >> suggested by David Blaikie. I've also collected numbers building >> >> chrome (from chromium, on Linux) with and without this patch as >> >> suggested by David. I've re-posted the proposed patch and >> >> performance/size numbers collected at the top to make it easily >> >> readable for those reading it through cfe-commits. >> > >> > >> > First off, thanks for collecting the numbers and broadening the >> > distribution. Also, sorry it took me so long to get over to this thread. >> > >> > I want to lay out my stance on this issue at a theoretical and practical >> > level first. I'll follow up with thoughts on the numbers as well after >> > that. >> > >> > I think that tying *any* optimizer behavior to the 'inline' keyword is >> > fundamentally the wrong direction. >> >> Chandler, thanks for sharing your thought -- however I don't think it >> is wrong, let alone 'fundamentally wrong'. Despite all the analysis >> that can be done, the inliner is in the end heuristic based. In lack >> of the profile data, when inlining two calls yield the same static >> benefit and size cost, it is reasonable for the inliner to think the >> call to the function with inline hint to yield more high >> dynamic/runtime benefit -- thus it has a higher static size budget to >> burn. > > > This is an argument for having *some* source level hinting construct. While > I think such a construct is very risky, I did actually suggest that we add > such a hint because I recognize some of the practical necessities for it. > > My primary argument is against using the 'inline' keyword as the source of > the hint, and especially using the happenstance of a method body in a class > as the source of the hint.Not sure whether I agree with you on the ideal state or not, but we are talking about that an implicit hint that exists today, not something that may or may not exist in the future.> >> >> >> >We have reasons why we have done this >> > historically, and we can't just do an immediate about face, but we >> > should be >> > actively looking for ways to *reduce* the optimizer's reliance on this >> > keyword to convey any meaning whatsoever. >> >> yes those additional things will be done, but they are really orthogonal. > > > I'm not talking about additional things, I'm talking about separating the > optimization hint from the semantics and linkage changing constructs. That > does not seem orthogonal. > > >> >> >> > >> > The reason I think that is the correct direction is because, for better >> > or >> > worse, the 'inline' keyword in C++ is not about optimization, but about >> > linkage. >> >> It is about both optimization and linkage. In fact the linkage simply >> serves as an implementation detail. In C++ standard 7.1.2, paragraph >> 2 says: >> >> "A function declaration (8.3.5, 9.3, 11.3) with an inline specifier >> declares an inline function. The inline specifier indicates to the >> implementation that inline substitution of the function body at the >> point of call is to be preferred to the usual function call mechanism. >> An implementation is not required to perform this inline substitution >> at the point of call; however, even if this inline substitution is >> omitted, the other rules for inline functions defined by 7.1.2 shall >> still be respected." >> >> Developers see those and rely on those to give compiler the hints. > > > This is essentially a nonsense paragraph for a standardized specification of > a programming language. How hard you optimize code doesn't have any bearing > on the conformance and behavior of the program. =/ > > I think this paragraph is part of the historical context. I think we should > change C++ to remove it (and I will propose that change) and I think we > should change C++ to support a standardized *non*-semantic hint if folks > really want to see that in the form of a C++11-style [[attribute]].Let's not turn this into a standards committee discussion.> > I'm also really, really confident that most developers are not using the > wording of the standard as the basis of how they tune the performance of > their code. =/I think you may be wrong if you are talking about developer who tune performance. They may not use the words in standards exactly, but what standard says is well known.> >> >> >> Most importantly, paragraph 3 says: >> >> "A function defined within a class definition is an inline function. >> The inline specifier shall not appear on a block scope function >> declaration.93 If the inline specifier is used in a friend >> declaration, that declaration shall be a definition or the function >> shall have previously been declared inline." >> >> Here we can see regardless of how optimizer will honor the hint and to >> what extent, and based on what analysis, >> it is basically incorrect to drop the attribute on the floor for >> in-class function definitions. Eswaran's fix is justified with this >> reason alone. The side effect of changing inliner behavior is >> irrelevant. > > > I don't really understand what you're saying. Clang correctly carries all of > the *semantics* required for in-class method bodies. We simply don't attach > an optimization hint? I don't think this is "incorrect". Nothing in the > standard says how hard we should try (and it can't, which is why the > standard doesn't make sense to give advice here).It is at least inconsistent.> > >> > It has a functional impact and can be both necessary or impossible >> > to use to meet those functional requirements. This in turn leaves >> > programmers in a lurch if the functional requirements are ever in >> > tension >> > with the optimizer requirements. >> >> Not sure what you mean. Performance conscious programmers use it all the >> time. > > > You might *have* to use the inline keyword to get correct linkage even when > it causes the optimizer to inline and that *hurts* performance.Why would a programmer use the inline keyword just to get the linkage? The vague linkage itself does not bring you any more benefit other than increased compile/link time?> > Similarly, you might *have* to not use the inline keyword to get correct > linkage even though you would like to give the optimizer a hint for > performance reasons (and are doing LTO, so you can).In what situation the linkage due to 'inline' is bad and you want to avoid it badly? Compile time reasons?> > Essentially, there is no guarantee that the semantic requirements of linkage > are correctly aligned with the desired optimizer hint.It might be more flexible to un-tie 'inline' keyword with semantics related linkage (e.g, by introducing a keyword to explicitly change linkage), but that discussion belongs to standard's committee.> >> >> >> > >> > We're also working really hard to get more widely deployed cross-module >> > optimization strategies, in part to free programmers from the >> > requirement >> > that they put all their performance critical code in header files. That >> > makes compilation faster, and has lots of benefits to the factoring and >> > design of the code itself. We shouldn't then create an incentive to keep >> > things in header files so that they pick up a hint to the optimizer. >> >> > >> > Ultimately, the world will be a better place if we can eventually move >> > code >> > away from relying on the hint provided by the 'inline' keyword to the >> > optimizer. >> > >> >> While I would like to see that happen some day, I do think it is an >> independent matter. > > > Adding further optimization hints based around the linkage further > entrenches that model. If we want to move in this direction, this patch is a > step in the *wrong* direction. > >> >> >> >Codebases >> > with strong portability requirements could still (and probably should) >> > forbid or tightly control access to this kind of hint. I would want >> > really >> > strong documentation about how this attribute *completely voids* your >> > performance warranty (if such a thing exists) as from version to version >> > of >> > the compiler it may go from a helpful hint to a devastatingly bad hint. >> >> Why? If the compiler becomes smarter and smarter, the inline hint will >> become more and more irrelevant and eventually has no effect -- why >> would the performance warranty be voided? If the compiler is not yet >> smart enough, why would the compiler refuse to take the hint and >> forbid developer provide the hint? > > > I didn't suggest the compiler would do anything. I suggested that the idea > of hinting an optimizer about how to inline code is inherently non-portable > (its specific to an optimizer) and thus would likely be less used in code > bases with unusually strong portability concerns.By strong portability concerns, you really mean strong performance concerns? If user has performance concerns on the code, the code is performance critical, and they will be more inclined to give hints, no? You think they will drop the performance on the floor because they worry the hints may be ignored in the future? Seems unlikely.> >> >> >> > But >> > I think I could be persuaded to live with such a hint existing. But I'm >> > *really* uncomfortable with it being tied to something that also impacts >> > linkage or other semantics of the program. >> >> For consistent with standard, we should pass the attribute. Linkage is >> not affected in anyway. > > > I do not think that there is any standards-consistency argument for one > optimization hint over another. We already get the linkage correct for all > of these things. The only consideration is the *degree* to which we prefer > to actually do inlining in the optimizer. The standard at no point makes > guarantees about these degrees. > >> >> >> > >> > [1]: Currently, the only other hint we have available is pretty terrible >> > as >> > it *also* has semantic effects: the always_inline attribute. >> > >> > >> >> >> >> The proposed patch will add InlineHint to methods defined inside a >> >> class: >> >> >> >> --- a/lib/CodeGen/CodeGenFunction.cpp >> >> +++ b/lib/CodeGen/CodeGenFunction.cpp >> >> @@ -630,7 +630,7 @@ void CodeGenFunction::StartFunction(GlobalDecl GD, >> >> if (const FunctionDecl *FD = dyn_cast_or_null<FunctionDecl>(D)) { >> >> if (!CGM.getCodeGenOpts().NoInline) { >> >> for (auto RI : FD->redecls()) >> >> - if (RI->isInlineSpecified()) { >> >> + if (RI->isInlined()) { >> >> Fn->addFnAttr(llvm::Attribute::InlineHint); >> >> break; >> >> } >> >> >> >> Here are the performance and size numbers I've collected: >> >> >> >> >> >> - C++ subset of Spec: No performance effects, < 0.1% size increase >> >> (all size numbers are text sizes returned by 'size') >> >> - Clang: 0.9% performance improvement (both -O0 and -O2 on a large .ii >> >> file) , 4.1% size increase >> > >> > >> > FWIW, this size increase seems *really* bad. I think that kills this >> > approach even in the short term. >> >> Re. size and performance trade-off -- 0.9% performance improvement >> should greatly win the size cost. Besides among all programs see, only >> clang sees this size increase with all the others seeing negligible >> size increase. > > > But the other programs don't see any performance wins.What do you mean? Other than clang, many other programs in our internal benchmarks win in performance.> > We can likely get this 1% of runtime performance without paying that high of > a size cost. This is a real concern -- there are uses of Clang and LLVM that > are very sensitive to size regressions. And I suspect we'll see other > applications that also see the size loss. > > If we could more *selectively* use a dedicated hint to get the performance > boost, we'd almost certainly not have to give up this much code size.Clang seems to be an outlier here.> >> >> > >> >> >> >> - Chrome: no performance improvement, 0.24% size increase >> >> - Google internal benchmark suite (geomean of ~20 benchmarks): ~1.8% >> >> performance improvement, no size regression >> > >> > >> > I'm also somewhat worried about the lack of any performance improvements >> > outside of the Google benchmarks. That somewhat strongly suggests that >> > our >> > benchmarks are overly coupled to this hint already. The fact that >> > neither >> > Chrome, Clang, nor SPEC improved is... not at all encouraging. >> >> Other than Google benchmarks, we do see Clang improve performance. >> Besides, current inliner needs to be further tuned in order to get >> more performance benefit. Passing the hint through is simply an >> enabler. Also remember that most of SPEC benchmarks are C programs. >> C++ programs with heavy use of virtual functions may not benefit a lot >> either. > > > I don't really understand why that changes my point... Are you saying that > without this hint, we can't do the subsequent work on the inliner?No -- with the hint passed in, it is more flexible for us to do more fine grained tuning selectively. David
----- Original Message -----> From: "Chandler Carruth" <chandlerc at gmail.com> > To: "Xinliang David Li" <davidxl at google.com>, "Chandler Carruth" <chandlerc at gmail.com> > Cc: cfe-commits at cs.uiuc.edu, "<llvmdev at cs.uiuc.edu> List" <llvmdev at cs.uiuc.edu> > Sent: Thursday, July 9, 2015 12:29:09 AM > Subject: Re: [LLVMdev] Inline hint for methods defined in-class > > On Tue, Jul 7, 2015 at 10:27 PM Xinliang David Li < > davidxl at google.com > wrote: > > > On Tue, Jul 7, 2015 at 6:06 PM, Chandler Carruth < > chandlerc at gmail.com > wrote: > > On Tue, Jul 7, 2015 at 4:11 PM Easwaran Raman < eraman at google.com > > > wrote: > >> > >> I'm reviving this thread after a while and CCing cfe-commits as > >> suggested by David Blaikie. I've also collected numbers building > >> chrome (from chromium, on Linux) with and without this patch as > >> suggested by David. I've re-posted the proposed patch and > >> performance/size numbers collected at the top to make it easily > >> readable for those reading it through cfe-commits. > > > > > > First off, thanks for collecting the numbers and broadening the > > distribution. Also, sorry it took me so long to get over to this > > thread. > > > > I want to lay out my stance on this issue at a theoretical and > > practical > > level first. I'll follow up with thoughts on the numbers as well > > after that. > > > > I think that tying *any* optimizer behavior to the 'inline' keyword > > is > > fundamentally the wrong direction. > > Chandler, thanks for sharing your thought -- however I don't think it > is wrong, let alone 'fundamentally wrong'. Despite all the analysis > that can be done, the inliner is in the end heuristic based. In lack > of the profile data, when inlining two calls yield the same static > benefit and size cost, it is reasonable for the inliner to think the > call to the function with inline hint to yield more high > dynamic/runtime benefit -- thus it has a higher static size budget to > burn. > > > > This is an argument for having *some* source level hinting construct. > While I think such a construct is very risky, I did actually suggest > that we add such a hint because I recognize some of the practical > necessities for it. > > > My primary argument is against using the 'inline' keyword as the > source of the hint, and especially using the happenstance of a > method body in a class as the source of the hint. > > > > > >We have reasons why we have done this > > historically, and we can't just do an immediate about face, but we > > should be > > actively looking for ways to *reduce* the optimizer's reliance on > > this > > keyword to convey any meaning whatsoever. > > yes those additional things will be done, but they are really > orthogonal. > > > > I'm not talking about additional things, I'm talking about separating > the optimization hint from the semantics and linkage changing > constructs. That does not seem orthogonal. > > > > > > > > The reason I think that is the correct direction is because, for > > better or > > worse, the 'inline' keyword in C++ is not about optimization, but > > about > > linkage. > > It is about both optimization and linkage. In fact the linkage simply > serves as an implementation detail. In C++ standard 7.1.2, paragraph > 2 says: > > "A function declaration (8.3.5, 9.3, 11.3) with an inline specifier > declares an inline function. The inline specifier indicates to the > implementation that inline substitution of the function body at the > point of call is to be preferred to the usual function call > mechanism. > An implementation is not required to perform this inline substitution > at the point of call; however, even if this inline substitution is > omitted, the other rules for inline functions defined by 7.1.2 shall > still be respected." > > Developers see those and rely on those to give compiler the hints. > > > > This is essentially a nonsense paragraph for a standardized > specification of a programming language. How hard you optimize code > doesn't have any bearing on the conformance and behavior of the > program. =/ > > > I think this paragraph is part of the historical context. I think we > should change C++ to remove it (and I will propose that change) and > I think we should change C++ to support a standardized > *non*-semantic hint if folks really want to see that in the form of > a C++11-style [[attribute]]. >I don't understand exactly what you propose to change? I doubt you'll be able to convince the committee to make the 'inline' keyword not imply the optimization hint. I also doubt that you'll be able to remove the linkage-semantics attached to the 'inline' keyword, because that would be a massively-breaking change. The only thing you might be able to do is to remove the fact that in-class definitions imply the optimization hint part of 'inline'. This is practical, in a sense, but I highly suspect you'll run into exactly this problem because every vendor will run benchmarks and assert that making that change will, in general, degrade performance (for exactly the same reasons it would generally help performance with LLVM). This is not a random result, it is because...> > I'm also really, really confident that most developers are not using > the wording of the standard as the basis of how they tune the > performance of their code. =/ >I'm really confident you're wrong. Almost all, even moderately-experienced, C++ developers with whom I've worked explicitly make the decision on what code to put in the class definition and what not to put in the class definition on the understanding that putting the function definition in the class is a hint/instruction to the compiler to inline it. Most don't really understand about the linkage changes in detail, and frankly, that's just a work-around to make it possible for this scheme to work. But they certainly do know that putting functions in the class definition should make the compiler inline them. And the other issue underlying this is that the language design decision does not seem random either. It makes a lot of sense, putting small function definitions in the class definitions tends to enhance readability, but putting large ones in the class definition tends to harm it. Small functions that do simple things, which are the kinds of functions for which readability is enhanced by having them in the class definition, tend to also be good inlining candidates. Thus, it is not even clear that it is helpful to the average programmer to change this rule. Please do feel free to bring up this issue on the EWG mailing list, I'll be happy to be wrong about this (because, as I've said, this current arrangement is highly suboptimal), but in the mean time, I disagree with you. How about we get this change in controlled by a feature flag so that we can all do further experiments, and encourage our users to do so, more easily? -Hal> > > Most importantly, paragraph 3 says: > > "A function defined within a class definition is an inline function. > The inline specifier shall not appear on a block scope function > declaration.93 If the inline specifier is used in a friend > declaration, that declaration shall be a definition or the function > shall have previously been declared inline." > > Here we can see regardless of how optimizer will honor the hint and > to > what extent, and based on what analysis, > it is basically incorrect to drop the attribute on the floor for > in-class function definitions. Eswaran's fix is justified with this > reason alone. The side effect of changing inliner behavior is > irrelevant. > > > > I don't really understand what you're saying. Clang correctly carries > all of the *semantics* required for in-class method bodies. We > simply don't attach an optimization hint? I don't think this is > "incorrect". Nothing in the standard says how hard we should try > (and it can't, which is why the standard doesn't make sense to give > advice here). > > > > > > > It has a functional impact and can be both necessary or impossible > > to use to meet those functional requirements. This in turn leaves > > programmers in a lurch if the functional requirements are ever in > > tension > > with the optimizer requirements. > > Not sure what you mean. Performance conscious programmers use it all > the time. > > > > You might *have* to use the inline keyword to get correct linkage > even when it causes the optimizer to inline and that *hurts* > performance. > > > Similarly, you might *have* to not use the inline keyword to get > correct linkage even though you would like to give the optimizer a > hint for performance reasons (and are doing LTO, so you can). > > > Essentially, there is no guarantee that the semantic requirements of > linkage are correctly aligned with the desired optimizer hint. > > > > > > > We're also working really hard to get more widely deployed > > cross-module > > optimization strategies, in part to free programmers from the > > requirement > > that they put all their performance critical code in header files. > > That > > makes compilation faster, and has lots of benefits to the factoring > > and > > design of the code itself. We shouldn't then create an incentive to > > keep > > things in header files so that they pick up a hint to the > > optimizer. > > > > > Ultimately, the world will be a better place if we can eventually > > move code > > away from relying on the hint provided by the 'inline' keyword to > > the > > optimizer. > > > > While I would like to see that happen some day, I do think it is an > independent matter. > > > > Adding further optimization hints based around the linkage further > entrenches that model. If we want to move in this direction, this > patch is a step in the *wrong* direction. > > > > >Codebases > > with strong portability requirements could still (and probably > > should) > > forbid or tightly control access to this kind of hint. I would want > > really > > strong documentation about how this attribute *completely voids* > > your > > performance warranty (if such a thing exists) as from version to > > version of > > the compiler it may go from a helpful hint to a devastatingly bad > > hint. > > Why? If the compiler becomes smarter and smarter, the inline hint > will > become more and more irrelevant and eventually has no effect -- why > would the performance warranty be voided? If the compiler is not yet > smart enough, why would the compiler refuse to take the hint and > forbid developer provide the hint? > > > > I didn't suggest the compiler would do anything. I suggested that the > idea of hinting an optimizer about how to inline code is inherently > non-portable (its specific to an optimizer) and thus would likely be > less used in code bases with unusually strong portability concerns. > > > > > > But > > I think I could be persuaded to live with such a hint existing. But > > I'm > > *really* uncomfortable with it being tied to something that also > > impacts > > linkage or other semantics of the program. > > For consistent with standard, we should pass the attribute. Linkage > is > not affected in anyway. > > > > I do not think that there is any standards-consistency argument for > one optimization hint over another. We already get the linkage > correct for all of these things. The only consideration is the > *degree* to which we prefer to actually do inlining in the > optimizer. The standard at no point makes guarantees about these > degrees. > > > > > > > [1]: Currently, the only other hint we have available is pretty > > terrible as > > it *also* has semantic effects: the always_inline attribute. > > > > > >> > >> The proposed patch will add InlineHint to methods defined inside a > >> class: > >> > >> --- a/lib/CodeGen/CodeGenFunction.cpp > >> +++ b/lib/CodeGen/CodeGenFunction.cpp > >> @@ -630,7 +630,7 @@ void CodeGenFunction::StartFunction(GlobalDecl > >> GD, > >> if (const FunctionDecl *FD = dyn_cast_or_null<FunctionDecl>(D)) { > >> if (!CGM.getCodeGenOpts().NoInline) { > >> for (auto RI : FD->redecls()) > >> - if (RI->isInlineSpecified()) { > >> + if (RI->isInlined()) { > >> Fn->addFnAttr(llvm::Attribute::InlineHint); > >> break; > >> } > >> > >> Here are the performance and size numbers I've collected: > >> > >> > >> - C++ subset of Spec: No performance effects, < 0.1% size increase > >> (all size numbers are text sizes returned by 'size') > >> - Clang: 0.9% performance improvement (both -O0 and -O2 on a large > >> .ii > >> file) , 4.1% size increase > > > > > > FWIW, this size increase seems *really* bad. I think that kills > > this > > approach even in the short term. > > Re. size and performance trade-off -- 0.9% performance improvement > should greatly win the size cost. Besides among all programs see, > only > clang sees this size increase with all the others seeing negligible > size increase. > > > > But the other programs don't see any performance wins. > > > We can likely get this 1% of runtime performance without paying that > high of a size cost. This is a real concern -- there are uses of > Clang and LLVM that are very sensitive to size regressions. And I > suspect we'll see other applications that also see the size loss. > > > If we could more *selectively* use a dedicated hint to get the > performance boost, we'd almost certainly not have to give up this > much code size. > > > > > >> > >> - Chrome: no performance improvement, 0.24% size increase > >> - Google internal benchmark suite (geomean of ~20 benchmarks): > >> ~1.8% > >> performance improvement, no size regression > > > > > > I'm also somewhat worried about the lack of any performance > > improvements > > outside of the Google benchmarks. That somewhat strongly suggests > > that our > > benchmarks are overly coupled to this hint already. The fact that > > neither > > Chrome, Clang, nor SPEC improved is... not at all encouraging. > > Other than Google benchmarks, we do see Clang improve performance. > Besides, current inliner needs to be further tuned in order to get > more performance benefit. Passing the hint through is simply an > enabler. Also remember that most of SPEC benchmarks are C programs. > C++ programs with heavy use of virtual functions may not benefit a > lot > either. > > > > I don't really understand why that changes my point... Are you saying > that without this hint, we can't do the subsequent work on the > inliner? > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >-- Hal Finkel Assistant Computational Scientist Leadership Computing Facility Argonne National Laboratory
On Tue, Jul 7, 2015 at 10:25 PM, Xinliang David Li <davidxl at google.com> wrote:> On Tue, Jul 7, 2015 at 6:06 PM, Chandler Carruth <chandlerc at gmail.com> > wrote: > > On Tue, Jul 7, 2015 at 4:11 PM Easwaran Raman <eraman at google.com> wrote: > >> > >> I'm reviving this thread after a while and CCing cfe-commits as > >> suggested by David Blaikie. I've also collected numbers building > >> chrome (from chromium, on Linux) with and without this patch as > >> suggested by David. I've re-posted the proposed patch and > >> performance/size numbers collected at the top to make it easily > >> readable for those reading it through cfe-commits. > > > > > > First off, thanks for collecting the numbers and broadening the > > distribution. Also, sorry it took me so long to get over to this thread. > > > > I want to lay out my stance on this issue at a theoretical and practical > > level first. I'll follow up with thoughts on the numbers as well after > that. > > > > I think that tying *any* optimizer behavior to the 'inline' keyword is > > fundamentally the wrong direction. > > Chandler, thanks for sharing your thought -- however I don't think it > is wrong, let alone 'fundamentally wrong'. Despite all the analysis > that can be done, the inliner is in the end heuristic based. In lack > of the profile data, when inlining two calls yield the same static > benefit and size cost, it is reasonable for the inliner to think the > call to the function with inline hint to yield more high > dynamic/runtime benefit -- thus it has a higher static size budget to > burn. > > >We have reasons why we have done this > > historically, and we can't just do an immediate about face, but we > should be > > actively looking for ways to *reduce* the optimizer's reliance on this > > keyword to convey any meaning whatsoever. > > yes those additional things will be done, but they are really orthogonal. > > > > > The reason I think that is the correct direction is because, for better > or > > worse, the 'inline' keyword in C++ is not about optimization, but about > > linkage. > > It is about both optimization and linkage. In fact the linkage simply > serves as an implementation detail. In C++ standard 7.1.2, paragraph > 2 says: > > "A function declaration (8.3.5, 9.3, 11.3) with an inline specifier > declares an inline function. The inline specifier indicates to the > implementation that inline substitution of the function body at the > point of call is to be preferred to the usual function call mechanism. > An implementation is not required to perform this inline substitution > at the point of call; however, even if this inline substitution is > omitted, the other rules for inline functions defined by 7.1.2 shall > still be respected." >This describes Clang's current behavior, not the behavior you propose. Note that there is a difference between an "inline specifier" (meaning that the inline keyword appeared in the declaration, see the grammar description in paragraph 1 of 7.1.2, and the introductory text in 7.1) and a function being an "inline function" (meaning that multiple definitions are permitted in different translation units, see 3.2/4 and /6). The above wording clearly says that the inline *specifier* is a hint that we should inline the function, and...> Developers see those and rely on those to give compiler the hints. > > Most importantly, paragraph 3 says: > > "A function defined within a class definition is an inline function. >... the same is *not* true for a function definition that appears within a class definition. That is merely an inline function (that is, it can be defined in multiple translation units).> The inline specifier shall not appear on a block scope function > declaration.93 If the inline specifier is used in a friend > declaration, that declaration shall be a definition or the function > shall have previously been declared inline." > > Here we can see regardless of how optimizer will honor the hint and to > what extent, and based on what analysis, > it is basically incorrect to drop the attribute on the floor for > in-class function definitions. Eswaran's fix is justified with this > reason alone. The side effect of changing inliner behavior is > irrelevant. > > > It has a functional impact and can be both necessary or impossible > > to use to meet those functional requirements. This in turn leaves > > programmers in a lurch if the functional requirements are ever in tension > > with the optimizer requirements. > > Not sure what you mean. Performance conscious programmers use it all the > time. > > > > > We're also working really hard to get more widely deployed cross-module > > optimization strategies, in part to free programmers from the requirement > > that they put all their performance critical code in header files. That > > makes compilation faster, and has lots of benefits to the factoring and > > design of the code itself. We shouldn't then create an incentive to keep > > things in header files so that they pick up a hint to the optimizer. > > > > > Ultimately, the world will be a better place if we can eventually move > code > > away from relying on the hint provided by the 'inline' keyword to the > > optimizer. > > > > While I would like to see that happen some day, I do think it is an > independent matter. > > > > > That doesn't mean that the core concept of hinting to the optimizer that > a > > particular function is a particularly good candidate for inlining is > without > > value. > > yes. > > >While I think it is a bad practice that we shouldn't encourage in > > code (especially portable code) > > yes -- there are indeed programmers who use this casually without > considering performance. > > > I can see the desire to at least have *some* > > attribute which is nothing more or less than a hint to the optimizer to > > inline harder[1]. > > yes -- there are programmers who use the attribute consciously. > > > It would help people work around inliner bugs in the short > > term, and even help debug inliner-rooted optimization problems. > > I think it is a good hint to the compiler even in the longer term. > With PGO, we should minimize the reliance on the hint though. > > >Codebases > > with strong portability requirements could still (and probably should) > > forbid or tightly control access to this kind of hint. I would want > really > > strong documentation about how this attribute *completely voids* your > > performance warranty (if such a thing exists) as from version to version > of > > the compiler it may go from a helpful hint to a devastatingly bad hint. > > Why? If the compiler becomes smarter and smarter, the inline hint will > become more and more irrelevant and eventually has no effect -- why > would the performance warranty be voided? If the compiler is not yet > smart enough, why would the compiler refuse to take the hint and > forbid developer provide the hint? > > > But > > I think I could be persuaded to live with such a hint existing. But I'm > > *really* uncomfortable with it being tied to something that also impacts > > linkage or other semantics of the program. > > For consistent with standard, we should pass the attribute. Linkage is > not affected in anyway. > > > > > [1]: Currently, the only other hint we have available is pretty terrible > as > > it *also* has semantic effects: the always_inline attribute. > > > > > >> > >> The proposed patch will add InlineHint to methods defined inside a > class: > >> > >> --- a/lib/CodeGen/CodeGenFunction.cpp > >> +++ b/lib/CodeGen/CodeGenFunction.cpp > >> @@ -630,7 +630,7 @@ void CodeGenFunction::StartFunction(GlobalDecl GD, > >> if (const FunctionDecl *FD = dyn_cast_or_null<FunctionDecl>(D)) { > >> if (!CGM.getCodeGenOpts().NoInline) { > >> for (auto RI : FD->redecls()) > >> - if (RI->isInlineSpecified()) { > >> + if (RI->isInlined()) { > >> Fn->addFnAttr(llvm::Attribute::InlineHint); > >> break; > >> } > >> > >> Here are the performance and size numbers I've collected: > >> > >> > >> - C++ subset of Spec: No performance effects, < 0.1% size increase > >> (all size numbers are text sizes returned by 'size') > >> - Clang: 0.9% performance improvement (both -O0 and -O2 on a large .ii > >> file) , 4.1% size increase > > > > > > FWIW, this size increase seems *really* bad. I think that kills this > > approach even in the short term. > > Re. size and performance trade-off -- 0.9% performance improvement > should greatly win the size cost. Besides among all programs see, only > clang sees this size increase with all the others seeing negligible > size increase. > > This is not a short term vs long term situation. It is basically a bug > fix that FE drops the attribute. If it exposes inliner heuristic bug, > that should be fixed/tuned separately. With the hint correctly passed > in, Easwaran will do further tuning including time based analysis. > > > > >> > >> - Chrome: no performance improvement, 0.24% size increase > >> - Google internal benchmark suite (geomean of ~20 benchmarks): ~1.8% > >> performance improvement, no size regression > > > > > > I'm also somewhat worried about the lack of any performance improvements > > outside of the Google benchmarks. That somewhat strongly suggests that > our > > benchmarks are overly coupled to this hint already. The fact that neither > > Chrome, Clang, nor SPEC improved is... not at all encouraging. > > Other than Google benchmarks, we do see Clang improve performance. > Besides, current inliner needs to be further tuned in order to get > more performance benefit. Passing the hint through is simply an > enabler. Also remember that most of SPEC benchmarks are C programs. > C++ programs with heavy use of virtual functions may not benefit a lot > either. > > David > > > > > > -Chandler > _______________________________________________ > cfe-commits mailing list > cfe-commits at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150709/13e1107e/attachment.html>
Xinliang David Li
2015-Jul-09 21:19 UTC
[LLVMdev] Inline hint for methods defined in-class
Not sure if this is the de facto understanding of developers (or other compilers), what you are saying is that there is difference between two types of C++ inline functions: 1) inline specified; 2) in-class definition. I am fine with that interpretation, but clang at least not tell backend that 2) is an inline function. Using linkonce linkage may be too broad for that purpose. thanks, David On Thu, Jul 9, 2015 at 1:40 PM, Richard Smith <richard at metafoo.co.uk> wrote:> On Tue, Jul 7, 2015 at 10:25 PM, Xinliang David Li <davidxl at google.com> > wrote: >> >> On Tue, Jul 7, 2015 at 6:06 PM, Chandler Carruth <chandlerc at gmail.com> >> wrote: >> > On Tue, Jul 7, 2015 at 4:11 PM Easwaran Raman <eraman at google.com> wrote: >> >> >> >> I'm reviving this thread after a while and CCing cfe-commits as >> >> suggested by David Blaikie. I've also collected numbers building >> >> chrome (from chromium, on Linux) with and without this patch as >> >> suggested by David. I've re-posted the proposed patch and >> >> performance/size numbers collected at the top to make it easily >> >> readable for those reading it through cfe-commits. >> > >> > >> > First off, thanks for collecting the numbers and broadening the >> > distribution. Also, sorry it took me so long to get over to this thread. >> > >> > I want to lay out my stance on this issue at a theoretical and practical >> > level first. I'll follow up with thoughts on the numbers as well after >> > that. >> > >> > I think that tying *any* optimizer behavior to the 'inline' keyword is >> > fundamentally the wrong direction. >> >> Chandler, thanks for sharing your thought -- however I don't think it >> is wrong, let alone 'fundamentally wrong'. Despite all the analysis >> that can be done, the inliner is in the end heuristic based. In lack >> of the profile data, when inlining two calls yield the same static >> benefit and size cost, it is reasonable for the inliner to think the >> call to the function with inline hint to yield more high >> dynamic/runtime benefit -- thus it has a higher static size budget to >> burn. >> >> >We have reasons why we have done this >> > historically, and we can't just do an immediate about face, but we >> > should be >> > actively looking for ways to *reduce* the optimizer's reliance on this >> > keyword to convey any meaning whatsoever. >> >> yes those additional things will be done, but they are really orthogonal. >> >> > >> > The reason I think that is the correct direction is because, for better >> > or >> > worse, the 'inline' keyword in C++ is not about optimization, but about >> > linkage. >> >> It is about both optimization and linkage. In fact the linkage simply >> serves as an implementation detail. In C++ standard 7.1.2, paragraph >> 2 says: >> >> "A function declaration (8.3.5, 9.3, 11.3) with an inline specifier >> declares an inline function. The inline specifier indicates to the >> implementation that inline substitution of the function body at the >> point of call is to be preferred to the usual function call mechanism. >> An implementation is not required to perform this inline substitution >> at the point of call; however, even if this inline substitution is >> omitted, the other rules for inline functions defined by 7.1.2 shall >> still be respected." > > > This describes Clang's current behavior, not the behavior you propose. > > Note that there is a difference between an "inline specifier" (meaning that > the inline keyword appeared in the declaration, see the grammar description > in paragraph 1 of 7.1.2, and the introductory text in 7.1) and a function > being an "inline function" (meaning that multiple definitions are permitted > in different translation units, see 3.2/4 and /6). > > The above wording clearly says that the inline *specifier* is a hint that we > should inline the function, and... > >> >> Developers see those and rely on those to give compiler the hints. >> >> Most importantly, paragraph 3 says: >> >> "A function defined within a class definition is an inline function. > > > ... the same is *not* true for a function definition that appears within a > class definition. That is merely an inline function (that is, it can be > defined in multiple translation units). > >> >> The inline specifier shall not appear on a block scope function >> declaration.93 If the inline specifier is used in a friend >> declaration, that declaration shall be a definition or the function >> shall have previously been declared inline." >> >> Here we can see regardless of how optimizer will honor the hint and to >> what extent, and based on what analysis, >> it is basically incorrect to drop the attribute on the floor for >> in-class function definitions. Eswaran's fix is justified with this >> reason alone. The side effect of changing inliner behavior is >> irrelevant. >> >> > It has a functional impact and can be both necessary or impossible >> > to use to meet those functional requirements. This in turn leaves >> > programmers in a lurch if the functional requirements are ever in >> > tension >> > with the optimizer requirements. >> >> Not sure what you mean. Performance conscious programmers use it all the >> time. >> >> > >> > We're also working really hard to get more widely deployed cross-module >> > optimization strategies, in part to free programmers from the >> > requirement >> > that they put all their performance critical code in header files. That >> > makes compilation faster, and has lots of benefits to the factoring and >> > design of the code itself. We shouldn't then create an incentive to keep >> > things in header files so that they pick up a hint to the optimizer. >> >> > >> > Ultimately, the world will be a better place if we can eventually move >> > code >> > away from relying on the hint provided by the 'inline' keyword to the >> > optimizer. >> > >> >> While I would like to see that happen some day, I do think it is an >> independent matter. >> >> > >> > That doesn't mean that the core concept of hinting to the optimizer that >> > a >> > particular function is a particularly good candidate for inlining is >> > without >> > value. >> >> yes. >> >> >While I think it is a bad practice that we shouldn't encourage in >> > code (especially portable code) >> >> yes -- there are indeed programmers who use this casually without >> considering performance. >> >> > I can see the desire to at least have *some* >> > attribute which is nothing more or less than a hint to the optimizer to >> > inline harder[1]. >> >> yes -- there are programmers who use the attribute consciously. >> >> > It would help people work around inliner bugs in the short >> > term, and even help debug inliner-rooted optimization problems. >> >> I think it is a good hint to the compiler even in the longer term. >> With PGO, we should minimize the reliance on the hint though. >> >> >Codebases >> > with strong portability requirements could still (and probably should) >> > forbid or tightly control access to this kind of hint. I would want >> > really >> > strong documentation about how this attribute *completely voids* your >> > performance warranty (if such a thing exists) as from version to version >> > of >> > the compiler it may go from a helpful hint to a devastatingly bad hint. >> >> Why? If the compiler becomes smarter and smarter, the inline hint will >> become more and more irrelevant and eventually has no effect -- why >> would the performance warranty be voided? If the compiler is not yet >> smart enough, why would the compiler refuse to take the hint and >> forbid developer provide the hint? >> >> > But >> > I think I could be persuaded to live with such a hint existing. But I'm >> > *really* uncomfortable with it being tied to something that also impacts >> > linkage or other semantics of the program. >> >> For consistent with standard, we should pass the attribute. Linkage is >> not affected in anyway. >> >> > >> > [1]: Currently, the only other hint we have available is pretty terrible >> > as >> > it *also* has semantic effects: the always_inline attribute. >> > >> > >> >> >> >> The proposed patch will add InlineHint to methods defined inside a >> >> class: >> >> >> >> --- a/lib/CodeGen/CodeGenFunction.cpp >> >> +++ b/lib/CodeGen/CodeGenFunction.cpp >> >> @@ -630,7 +630,7 @@ void CodeGenFunction::StartFunction(GlobalDecl GD, >> >> if (const FunctionDecl *FD = dyn_cast_or_null<FunctionDecl>(D)) { >> >> if (!CGM.getCodeGenOpts().NoInline) { >> >> for (auto RI : FD->redecls()) >> >> - if (RI->isInlineSpecified()) { >> >> + if (RI->isInlined()) { >> >> Fn->addFnAttr(llvm::Attribute::InlineHint); >> >> break; >> >> } >> >> >> >> Here are the performance and size numbers I've collected: >> >> >> >> >> >> - C++ subset of Spec: No performance effects, < 0.1% size increase >> >> (all size numbers are text sizes returned by 'size') >> >> - Clang: 0.9% performance improvement (both -O0 and -O2 on a large .ii >> >> file) , 4.1% size increase >> > >> > >> > FWIW, this size increase seems *really* bad. I think that kills this >> > approach even in the short term. >> >> Re. size and performance trade-off -- 0.9% performance improvement >> should greatly win the size cost. Besides among all programs see, only >> clang sees this size increase with all the others seeing negligible >> size increase. >> >> This is not a short term vs long term situation. It is basically a bug >> fix that FE drops the attribute. If it exposes inliner heuristic bug, >> that should be fixed/tuned separately. With the hint correctly passed >> in, Easwaran will do further tuning including time based analysis. >> >> > >> >> >> >> - Chrome: no performance improvement, 0.24% size increase >> >> - Google internal benchmark suite (geomean of ~20 benchmarks): ~1.8% >> >> performance improvement, no size regression >> > >> > >> > I'm also somewhat worried about the lack of any performance improvements >> > outside of the Google benchmarks. That somewhat strongly suggests that >> > our >> > benchmarks are overly coupled to this hint already. The fact that >> > neither >> > Chrome, Clang, nor SPEC improved is... not at all encouraging. >> >> Other than Google benchmarks, we do see Clang improve performance. >> Besides, current inliner needs to be further tuned in order to get >> more performance benefit. Passing the hint through is simply an >> enabler. Also remember that most of SPEC benchmarks are C programs. >> C++ programs with heavy use of virtual functions may not benefit a lot >> either. >> >> David >> >> >> > >> > -Chandler >> _______________________________________________ >> cfe-commits mailing list >> cfe-commits at cs.uiuc.edu >> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits > >
----- Original Message -----> From: "Richard Smith" <richard at metafoo.co.uk> > To: "Xinliang David Li" <davidxl at google.com> > Cc: "cfe commits" <cfe-commits at cs.uiuc.edu>, "<llvmdev at cs.uiuc.edu> List" <llvmdev at cs.uiuc.edu> > Sent: Thursday, July 9, 2015 3:40:54 PM > Subject: Re: [LLVMdev] Inline hint for methods defined in-class > > On Tue, Jul 7, 2015 at 10:25 PM, Xinliang David Li < > davidxl at google.com > wrote: > > > On Tue, Jul 7, 2015 at 6:06 PM, Chandler Carruth < > chandlerc at gmail.com > wrote: > > On Tue, Jul 7, 2015 at 4:11 PM Easwaran Raman < eraman at google.com > > > wrote: > >> > >> I'm reviving this thread after a while and CCing cfe-commits as > >> suggested by David Blaikie. I've also collected numbers building > >> chrome (from chromium, on Linux) with and without this patch as > >> suggested by David. I've re-posted the proposed patch and > >> performance/size numbers collected at the top to make it easily > >> readable for those reading it through cfe-commits. > > > > > > First off, thanks for collecting the numbers and broadening the > > distribution. Also, sorry it took me so long to get over to this > > thread. > > > > I want to lay out my stance on this issue at a theoretical and > > practical > > level first. I'll follow up with thoughts on the numbers as well > > after that. > > > > I think that tying *any* optimizer behavior to the 'inline' keyword > > is > > fundamentally the wrong direction. > > Chandler, thanks for sharing your thought -- however I don't think it > is wrong, let alone 'fundamentally wrong'. Despite all the analysis > that can be done, the inliner is in the end heuristic based. In lack > of the profile data, when inlining two calls yield the same static > benefit and size cost, it is reasonable for the inliner to think the > call to the function with inline hint to yield more high > dynamic/runtime benefit -- thus it has a higher static size budget to > burn. > > >We have reasons why we have done this > > historically, and we can't just do an immediate about face, but we > > should be > > actively looking for ways to *reduce* the optimizer's reliance on > > this > > keyword to convey any meaning whatsoever. > > yes those additional things will be done, but they are really > orthogonal. > > > > > The reason I think that is the correct direction is because, for > > better or > > worse, the 'inline' keyword in C++ is not about optimization, but > > about > > linkage. > > It is about both optimization and linkage. In fact the linkage simply > serves as an implementation detail. In C++ standard 7.1.2, paragraph > 2 says: > > "A function declaration (8.3.5, 9.3, 11.3) with an inline specifier > declares an inline function. The inline specifier indicates to the > implementation that inline substitution of the function body at the > point of call is to be preferred to the usual function call > mechanism. > An implementation is not required to perform this inline substitution > at the point of call; however, even if this inline substitution is > omitted, the other rules for inline functions defined by 7.1.2 shall > still be respected." > > > > This describes Clang's current behavior, not the behavior you > propose. > > > Note that there is a difference between an "inline specifier" > (meaning that the inline keyword appeared in the declaration, see > the grammar description in paragraph 1 of 7.1.2, and the > introductory text in 7.1) and a function being an "inline function" > (meaning that multiple definitions are permitted in different > translation units, see 3.2/4 and /6). > > > The above wording clearly says that the inline *specifier* is a hint > that we should inline the function, and...Is it really that clear? 1.3p3 says, "Terms that are used only in a small portion of this International Standard are defined where they are used and italicized where they are defined.". 7.1.2p2 says, "A function declaration (8.3.5, 9.3, 11.3) with an inline specifier declares an inline function." and "inline function" is italicized there. Thus, "inline function", as a term, is *defined* to be a function declared with an inline specifier. 7.1.2p3 then also says that a "function defined within a class definition is an inline function." I would read this to mean, "as if it had an inline specifier", because that is how the term is defined in the paragraph above it. And, frankly, I don't think the intent of the drafting was to create a bifurcated system. If we're to read "inline function" as something other than, "a function intended to be inlined", it seems the standard would have used a different term to indicate the linkage effect from the hinting. -Hal> > Developers see those and rely on those to give compiler the hints. > > Most importantly, paragraph 3 says: > > "A function defined within a class definition is an inline function. > > > > ... the same is *not* true for a function definition that appears > within a class definition. That is merely an inline function (that > is, it can be defined in multiple translation units). > > > The inline specifier shall not appear on a block scope function > declaration.93 If the inline specifier is used in a friend > declaration, that declaration shall be a definition or the function > shall have previously been declared inline." > > Here we can see regardless of how optimizer will honor the hint and > to > what extent, and based on what analysis, > it is basically incorrect to drop the attribute on the floor for > in-class function definitions. Eswaran's fix is justified with this > reason alone. The side effect of changing inliner behavior is > irrelevant. > > > It has a functional impact and can be both necessary or impossible > > to use to meet those functional requirements. This in turn leaves > > programmers in a lurch if the functional requirements are ever in > > tension > > with the optimizer requirements. > > Not sure what you mean. Performance conscious programmers use it all > the time. > > > > > We're also working really hard to get more widely deployed > > cross-module > > optimization strategies, in part to free programmers from the > > requirement > > that they put all their performance critical code in header files. > > That > > makes compilation faster, and has lots of benefits to the factoring > > and > > design of the code itself. We shouldn't then create an incentive to > > keep > > things in header files so that they pick up a hint to the > > optimizer. > > > > > Ultimately, the world will be a better place if we can eventually > > move code > > away from relying on the hint provided by the 'inline' keyword to > > the > > optimizer. > > > > While I would like to see that happen some day, I do think it is an > independent matter. > > > > > That doesn't mean that the core concept of hinting to the optimizer > > that a > > particular function is a particularly good candidate for inlining > > is without > > value. > > yes. > > >While I think it is a bad practice that we shouldn't encourage in > > code (especially portable code) > > yes -- there are indeed programmers who use this casually without > considering performance. > > > I can see the desire to at least have *some* > > attribute which is nothing more or less than a hint to the > > optimizer to > > inline harder[1]. > > yes -- there are programmers who use the attribute consciously. > > > It would help people work around inliner bugs in the short > > term, and even help debug inliner-rooted optimization problems. > > I think it is a good hint to the compiler even in the longer term. > With PGO, we should minimize the reliance on the hint though. > > >Codebases > > with strong portability requirements could still (and probably > > should) > > forbid or tightly control access to this kind of hint. I would want > > really > > strong documentation about how this attribute *completely voids* > > your > > performance warranty (if such a thing exists) as from version to > > version of > > the compiler it may go from a helpful hint to a devastatingly bad > > hint. > > Why? If the compiler becomes smarter and smarter, the inline hint > will > become more and more irrelevant and eventually has no effect -- why > would the performance warranty be voided? If the compiler is not yet > smart enough, why would the compiler refuse to take the hint and > forbid developer provide the hint? > > > But > > I think I could be persuaded to live with such a hint existing. But > > I'm > > *really* uncomfortable with it being tied to something that also > > impacts > > linkage or other semantics of the program. > > For consistent with standard, we should pass the attribute. Linkage > is > not affected in anyway. > > > > > [1]: Currently, the only other hint we have available is pretty > > terrible as > > it *also* has semantic effects: the always_inline attribute. > > > > > >> > >> The proposed patch will add InlineHint to methods defined inside a > >> class: > >> > >> --- a/lib/CodeGen/CodeGenFunction.cpp > >> +++ b/lib/CodeGen/CodeGenFunction.cpp > >> @@ -630,7 +630,7 @@ void CodeGenFunction::StartFunction(GlobalDecl > >> GD, > >> if (const FunctionDecl *FD = dyn_cast_or_null<FunctionDecl>(D)) { > >> if (!CGM.getCodeGenOpts().NoInline) { > >> for (auto RI : FD->redecls()) > >> - if (RI->isInlineSpecified()) { > >> + if (RI->isInlined()) { > >> Fn->addFnAttr(llvm::Attribute::InlineHint); > >> break; > >> } > >> > >> Here are the performance and size numbers I've collected: > >> > >> > >> - C++ subset of Spec: No performance effects, < 0.1% size increase > >> (all size numbers are text sizes returned by 'size') > >> - Clang: 0.9% performance improvement (both -O0 and -O2 on a large > >> .ii > >> file) , 4.1% size increase > > > > > > FWIW, this size increase seems *really* bad. I think that kills > > this > > approach even in the short term. > > Re. size and performance trade-off -- 0.9% performance improvement > should greatly win the size cost. Besides among all programs see, > only > clang sees this size increase with all the others seeing negligible > size increase. > > This is not a short term vs long term situation. It is basically a > bug > fix that FE drops the attribute. If it exposes inliner heuristic bug, > that should be fixed/tuned separately. With the hint correctly passed > in, Easwaran will do further tuning including time based analysis. > > > > >> > >> - Chrome: no performance improvement, 0.24% size increase > >> - Google internal benchmark suite (geomean of ~20 benchmarks): > >> ~1.8% > >> performance improvement, no size regression > > > > > > I'm also somewhat worried about the lack of any performance > > improvements > > outside of the Google benchmarks. That somewhat strongly suggests > > that our > > benchmarks are overly coupled to this hint already. The fact that > > neither > > Chrome, Clang, nor SPEC improved is... not at all encouraging. > > Other than Google benchmarks, we do see Clang improve performance. > Besides, current inliner needs to be further tuned in order to get > more performance benefit. Passing the hint through is simply an > enabler. Also remember that most of SPEC benchmarks are C programs. > C++ programs with heavy use of virtual functions may not benefit a > lot > either. > > David > > > > > > -Chandler > _______________________________________________ > cfe-commits mailing list > cfe-commits at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits > > > _______________________________________________ > cfe-commits mailing list > cfe-commits at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits >-- Hal Finkel Assistant Computational Scientist Leadership Computing Facility Argonne National Laboratory