Teresa Johnson
2015-Jun-08 18:46 UTC
[LLVMdev] Removing AvailableExternal values in GlobalDCE (was Re: RFC: ThinLTO Impementation Plan)
On Mon, Jun 8, 2015 at 11:33 AM, Reid Kleckner <rnk at google.com> wrote:> The clang patch lgtm, and I had some comments on the LLVM patch. Duncan, do > you want to say more there? > > --- include/llvm/Transforms/IPO/PassManagerBuilder.h (revision 237590) > +++ include/llvm/Transforms/IPO/PassManagerBuilder.h (working copy) > @@ -121,6 +121,7 @@ class PassManagerBuilder { > bool VerifyInput; > bool VerifyOutput; > bool MergeFunctions; > + bool LTO; > > While in the context of clang it makes sense that "LTO" means "generate an > object file appropriate for later LTO", but in the context of LLVM, it might > be taken to mean "we are doing LTO optimizations now". I think we probably > want to name the boolean something more specific (PreLTO? CompileForLTO?). > Later if we customize further we can split the codepath as is done for > populateLTOPassManager.Yeah, I was concerned that the name might be confusing in this context, so if we opt to go with a single llvm option to mean -flto I like the idea of something like CompileForLTO (or maybe PrepareForLTO?).> > + if (!LTO) { > + // Remove avail extern fns and globals definitions if we aren't > + // doing an -flto compilation. For LTO we will preserve these > > LLVM doesn't have an -flto flag, so I'd probably keep this generic to > "compiling an object file for later LTO".Ok> > + // so they are eligible for inlining at link-time. Note if they > + // are unreferenced they will be removed by GlobalDCE below, so > + // this only impacts referenced available externally globals. > + // Eventually they will be suppressed during codegen, but > eliminating > + // here is a compile-time savings. > > More than the compile time savings, it actually increases the power of > GlobalDCE and can reduce object file size, so I'd mention that instead.Ok, right in your case that is the benefit. Will update the comment.> > + MPM.add(createElimAvailExternPass()); > + } > > + // Drop initializers of available externally global variables. > + for (Module::global_iterator I = M.global_begin(), E = M.global_end(); > + I != E; ++I) { > + if (!I->hasAvailableExternallyLinkage()) > + continue; > + if (I->hasInitializer()) { > + Constant *Init = I->getInitializer(); > + I->setInitializer(nullptr); > + if (isSafeToDestroyConstant(Init)) > + Init->destroyConstant(); > + } > + I->removeDeadConstantUsers(); > + NumVariables++; > + } > > Do you need to set the linkage of global variables to external? I noticed > that Function::deleteBody() does this but setInitializer(nullptr) does not. > Ditto for aliases.Hmm, I am not sure - I basically am doing the same things GlobalDCE does (aside from erasing them from the function/variable/alias list). I guess it didn't matter in the GlobalDCE case since they were being completely erased rather than leaving a decl behind. Will make that change and retest. Thanks! Teresa> > On Mon, Jun 8, 2015 at 9:03 AM, Teresa Johnson <tejohnson at google.com> wrote: >> >> Talked to Eric Fri and he suggested that this might be the first of >> several places where we want behavior of LTO compiles to diverge from >> normal -O2 compiles. So for now I have implemented this such that we >> pass down -flto to the -cc1 job, and that gets propagated as a code >> gen option and into the PassManagerBuilder. I have left the current >> logic translating -flto to the -emit-llvm-bc option, since IMO that is >> cleaner for controlling the output type. Let me know what you think of >> saving the overall LTO setting in CodeGenOpts/PassManagerBuilder, or >> if is it better to record just the individual behaviors we want (i.e. >> change the new field in CodeGenOpts/PassManagerBuilder from "LTO" to >> "ElimAvailExtern" or something like that). >> >> Patches attached. I modified an existing available external clang test >> to check for the new O2 (LTO vs not) behaviors. >> >> Thanks, >> Teresa >> >> On Thu, Jun 4, 2015 at 9:20 PM, Eric Christopher <echristo at gmail.com> >> wrote: >> > >> > >> > On Thu, Jun 4, 2015 at 8:17 PM Teresa Johnson <tejohnson at google.com> >> > wrote: >> >> >> >> On Thu, Jun 4, 2015 at 5:33 PM, Reid Kleckner <rnk at google.com> wrote: >> >> > On Thu, Jun 4, 2015 at 5:17 PM, Teresa Johnson <tejohnson at google.com> >> >> > wrote: >> >> >> >> >> >> Agreed. Although I assume you mean invoke the new pass under a >> >> >> ThinLTO-only option so that avail extern are not dropped in the >> >> >> compile pass before the LTO link? >> >> > >> >> > >> >> > No, this pass would actually be an improvement to the standard -O2 >> >> > pipeline. >> >> > The special case is the regular LTO pass pipeline, which wants to run >> >> > GlobalDCE but doesn't want to drop available_externally function >> >> > definitions >> >> > until after linker-stage inlining. >> >> >> >> Ok. It looks to me like the LTO compile step with -O2 -flto -c uses >> >> this same -O2 optimization pipeline as without LTO. Clang communicates >> >> the fact that we are doing an LTO compile to llvm via the >> >> -emit-llvm-bc flag, which just tells it to emit bitcode and exit >> >> before codegen. So I would either need to key off of that or setup a >> >> new flag to indicate to llvm that we are doing an LTO -c compile. Or >> >> is there some other way that I am missing? >> >> >> >> Incidentally, we'll also want to skip this new pass and keep any >> >> referenced avail extern functions in the ThinLTO -c compile step for >> >> the same reasons (and there are no imported functions yet at that >> >> point). >> >> >> > >> > Ultimately for any planned LTO build we're going to want to do a >> > different >> > pass pipeline, it's probably worth considering what should be done >> > before >> > and during LTO. >> > >> > -eric >> > >> >> >> >> Teresa >> >> >> >> > >> >> > Consider this test case: >> >> > >> >> > declare void @blah() >> >> > define i32 @main() { >> >> > call void @foo() >> >> > ret i32 0 >> >> > } >> >> > define available_externally void @foo() noinline { >> >> > call void @bar() >> >> > ret void >> >> > } >> >> > define linkonce_odr void @bar() noinline { >> >> > call void @blah() >> >> > ret void >> >> > } >> >> > >> >> > If I run opt -O2 on this and feed it to llc, it actually generates >> >> > code >> >> > for >> >> > bar, even though there are no actual references to bar in the final >> >> > code: >> >> > >> >> > main: # @main >> >> > pushq %rax >> >> > callq foo >> >> > xorl %eax, %eax >> >> > popq %rdx >> >> > retq >> >> > >> >> > bar: # @bar >> >> > jmp blah # TAILCALL >> >> > >> >> > This corner case happens to come up organically with dllimported >> >> > classes, >> >> > which is why I happened to think of it. :) I'm happy with a flag to >> >> > globaldce for LTO and the original patch, honestly. >> >> >> >> >> >> >> >> -- >> >> Teresa Johnson | Software Engineer | tejohnson at google.com | >> >> 408-460-2413 >> >> _______________________________________________ >> >> LLVM Developers mailing list >> >> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >> >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >> >> >> >> -- >> Teresa Johnson | Software Engineer | tejohnson at google.com | 408-460-2413 > >-- Teresa Johnson | Software Engineer | tejohnson at google.com | 408-460-2413
Reid Kleckner
2015-Jun-08 18:52 UTC
[LLVMdev] Removing AvailableExternal values in GlobalDCE (was Re: RFC: ThinLTO Impementation Plan)
On Mon, Jun 8, 2015 at 11:46 AM, Teresa Johnson <tejohnson at google.com> wrote:> On Mon, Jun 8, 2015 at 11:33 AM, Reid Kleckner <rnk at google.com> wrote: > > The clang patch lgtm, and I had some comments on the LLVM patch. Duncan, > do > > you want to say more there? > > > > --- include/llvm/Transforms/IPO/PassManagerBuilder.h (revision 237590) > > +++ include/llvm/Transforms/IPO/PassManagerBuilder.h (working copy) > > @@ -121,6 +121,7 @@ class PassManagerBuilder { > > bool VerifyInput; > > bool VerifyOutput; > > bool MergeFunctions; > > + bool LTO; > > > > While in the context of clang it makes sense that "LTO" means "generate > an > > object file appropriate for later LTO", but in the context of LLVM, it > might > > be taken to mean "we are doing LTO optimizations now". I think we > probably > > want to name the boolean something more specific (PreLTO? > CompileForLTO?). > > Later if we customize further we can split the codepath as is done for > > populateLTOPassManager. > > Yeah, I was concerned that the name might be confusing in this > context, so if we opt to go with a single llvm option to mean -flto I > like the idea of something like CompileForLTO (or maybe > PrepareForLTO?).I like PrepareForLTO!> > Do you need to set the linkage of global variables to external? I noticed > > that Function::deleteBody() does this but setInitializer(nullptr) does > not. > > Ditto for aliases. > > Hmm, I am not sure - I basically am doing the same things GlobalDCE > does (aside from erasing them from the function/variable/alias list). > I guess it didn't matter in the GlobalDCE case since they were being > completely erased rather than leaving a decl behind. Will make that > change and retest. >Sounds good. -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150608/da7e8ef8/attachment.html>
Eric Christopher
2015-Jun-08 20:08 UTC
[LLVMdev] Removing AvailableExternal values in GlobalDCE (was Re: RFC: ThinLTO Impementation Plan)
On Mon, Jun 8, 2015 at 11:46 AM Teresa Johnson <tejohnson at google.com> wrote:> On Mon, Jun 8, 2015 at 11:33 AM, Reid Kleckner <rnk at google.com> wrote: > > The clang patch lgtm, and I had some comments on the LLVM patch. Duncan, > do > > you want to say more there? > > > > --- include/llvm/Transforms/IPO/PassManagerBuilder.h (revision 237590) > > +++ include/llvm/Transforms/IPO/PassManagerBuilder.h (working copy) > > @@ -121,6 +121,7 @@ class PassManagerBuilder { > > bool VerifyInput; > > bool VerifyOutput; > > bool MergeFunctions; > > + bool LTO; > > > > While in the context of clang it makes sense that "LTO" means "generate > an > > object file appropriate for later LTO", but in the context of LLVM, it > might > > be taken to mean "we are doing LTO optimizations now". I think we > probably > > want to name the boolean something more specific (PreLTO? > CompileForLTO?). > > Later if we customize further we can split the codepath as is done for > > populateLTOPassManager. > > Yeah, I was concerned that the name might be confusing in this > context, so if we opt to go with a single llvm option to mean -flto I > like the idea of something like CompileForLTO (or maybe > PrepareForLTO?). > >I'd rather not have it be an llvm option at all. Just construct a different set of passes... This would also solve the problem of needing multiple sets of options to be passed to the builder. It'd be a bit of a change (i.e. having clang do the pass setup), but I think it'd be worth it to have clang be able to explicitly say which passes it wants. It's also a major change and would probably need to be discussed more widely so for now I guess this is sorta ok.> > More than the compile time savings, it actually increases the power of > > GlobalDCE and can reduce object file size, so I'd mention that instead. > > Ok, right in your case that is the benefit. Will update the comment. > >In yours too, just by keeping the amount of IR smaller :) -eric> > > > + MPM.add(createElimAvailExternPass()); > > + } > > > > + // Drop initializers of available externally global variables. > > + for (Module::global_iterator I = M.global_begin(), E = M.global_end(); > > + I != E; ++I) { > > + if (!I->hasAvailableExternallyLinkage()) > > + continue; > > + if (I->hasInitializer()) { > > + Constant *Init = I->getInitializer(); > > + I->setInitializer(nullptr); > > + if (isSafeToDestroyConstant(Init)) > > + Init->destroyConstant(); > > + } > > + I->removeDeadConstantUsers(); > > + NumVariables++; > > + } > > > > Do you need to set the linkage of global variables to external? I noticed > > that Function::deleteBody() does this but setInitializer(nullptr) does > not. > > Ditto for aliases. > > Hmm, I am not sure - I basically am doing the same things GlobalDCE > does (aside from erasing them from the function/variable/alias list). > I guess it didn't matter in the GlobalDCE case since they were being > completely erased rather than leaving a decl behind. Will make that > change and retest. > > Thanks! > Teresa > > > > > On Mon, Jun 8, 2015 at 9:03 AM, Teresa Johnson <tejohnson at google.com> > wrote: > >> > >> Talked to Eric Fri and he suggested that this might be the first of > >> several places where we want behavior of LTO compiles to diverge from > >> normal -O2 compiles. So for now I have implemented this such that we > >> pass down -flto to the -cc1 job, and that gets propagated as a code > >> gen option and into the PassManagerBuilder. I have left the current > >> logic translating -flto to the -emit-llvm-bc option, since IMO that is > >> cleaner for controlling the output type. Let me know what you think of > >> saving the overall LTO setting in CodeGenOpts/PassManagerBuilder, or > >> if is it better to record just the individual behaviors we want (i.e. > >> change the new field in CodeGenOpts/PassManagerBuilder from "LTO" to > >> "ElimAvailExtern" or something like that). > >> > >> Patches attached. I modified an existing available external clang test > >> to check for the new O2 (LTO vs not) behaviors. > >> > >> Thanks, > >> Teresa > >> > >> On Thu, Jun 4, 2015 at 9:20 PM, Eric Christopher <echristo at gmail.com> > >> wrote: > >> > > >> > > >> > On Thu, Jun 4, 2015 at 8:17 PM Teresa Johnson <tejohnson at google.com> > >> > wrote: > >> >> > >> >> On Thu, Jun 4, 2015 at 5:33 PM, Reid Kleckner <rnk at google.com> > wrote: > >> >> > On Thu, Jun 4, 2015 at 5:17 PM, Teresa Johnson < > tejohnson at google.com> > >> >> > wrote: > >> >> >> > >> >> >> Agreed. Although I assume you mean invoke the new pass under a > >> >> >> ThinLTO-only option so that avail extern are not dropped in the > >> >> >> compile pass before the LTO link? > >> >> > > >> >> > > >> >> > No, this pass would actually be an improvement to the standard -O2 > >> >> > pipeline. > >> >> > The special case is the regular LTO pass pipeline, which wants to > run > >> >> > GlobalDCE but doesn't want to drop available_externally function > >> >> > definitions > >> >> > until after linker-stage inlining. > >> >> > >> >> Ok. It looks to me like the LTO compile step with -O2 -flto -c uses > >> >> this same -O2 optimization pipeline as without LTO. Clang > communicates > >> >> the fact that we are doing an LTO compile to llvm via the > >> >> -emit-llvm-bc flag, which just tells it to emit bitcode and exit > >> >> before codegen. So I would either need to key off of that or setup a > >> >> new flag to indicate to llvm that we are doing an LTO -c compile. Or > >> >> is there some other way that I am missing? > >> >> > >> >> Incidentally, we'll also want to skip this new pass and keep any > >> >> referenced avail extern functions in the ThinLTO -c compile step for > >> >> the same reasons (and there are no imported functions yet at that > >> >> point). > >> >> > >> > > >> > Ultimately for any planned LTO build we're going to want to do a > >> > different > >> > pass pipeline, it's probably worth considering what should be done > >> > before > >> > and during LTO. > >> > > >> > -eric > >> > > >> >> > >> >> Teresa > >> >> > >> >> > > >> >> > Consider this test case: > >> >> > > >> >> > declare void @blah() > >> >> > define i32 @main() { > >> >> > call void @foo() > >> >> > ret i32 0 > >> >> > } > >> >> > define available_externally void @foo() noinline { > >> >> > call void @bar() > >> >> > ret void > >> >> > } > >> >> > define linkonce_odr void @bar() noinline { > >> >> > call void @blah() > >> >> > ret void > >> >> > } > >> >> > > >> >> > If I run opt -O2 on this and feed it to llc, it actually generates > >> >> > code > >> >> > for > >> >> > bar, even though there are no actual references to bar in the final > >> >> > code: > >> >> > > >> >> > main: # @main > >> >> > pushq %rax > >> >> > callq foo > >> >> > xorl %eax, %eax > >> >> > popq %rdx > >> >> > retq > >> >> > > >> >> > bar: # @bar > >> >> > jmp blah # TAILCALL > >> >> > > >> >> > This corner case happens to come up organically with dllimported > >> >> > classes, > >> >> > which is why I happened to think of it. :) I'm happy with a flag to > >> >> > globaldce for LTO and the original patch, honestly. > >> >> > >> >> > >> >> > >> >> -- > >> >> Teresa Johnson | Software Engineer | tejohnson at google.com | > >> >> 408-460-2413 > >> >> _______________________________________________ > >> >> LLVM Developers mailing list > >> >> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > >> >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > >> > >> > >> > >> -- > >> Teresa Johnson | Software Engineer | tejohnson at google.com | > 408-460-2413 > > > > > > > > -- > Teresa Johnson | Software Engineer | tejohnson at google.com | 408-460-2413 >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150608/2c055b9a/attachment.html>
David Chisnall
2015-Jun-08 20:18 UTC
[LLVMdev] Removing AvailableExternal values in GlobalDCE (was Re: RFC: ThinLTO Impementation Plan)
On 8 Jun 2015, at 21:08, Eric Christopher <echristo at gmail.com> wrote:> > I'd rather not have it be an llvm option at all. Just construct a different set of passes... > > This would also solve the problem of needing multiple sets of options to be passed to the builder. It'd be a bit of a change (i.e. having clang do the pass setup), but I think it'd be worth it to have clang be able to explicitly say which passes it wants.There’s some value in making it possible for clang and opt to run the same set of passes. It’s also nice to be able to provide libraries that can inject passes at specific points in the optimisation sequence (this was one of the design goals of PassManagerBuilder) without needing to modify every front end that might use them. David
Reid Kleckner
2015-Jun-08 20:33 UTC
[LLVMdev] Removing AvailableExternal values in GlobalDCE (was Re: RFC: ThinLTO Impementation Plan)
On Mon, Jun 8, 2015 at 1:08 PM, Eric Christopher <echristo at gmail.com> wrote:> > I'd rather not have it be an llvm option at all. Just construct a > different set of passes... > > This would also solve the problem of needing multiple sets of options to > be passed to the builder. It'd be a bit of a change (i.e. having clang do > the pass setup), but I think it'd be worth it to have clang be able to > explicitly say which passes it wants. > > It's also a major change and would probably need to be discussed more > widely so for now I guess this is sorta ok. >I think for now the bool on PassManagerBuilder is a good way to make progress. Eventually, we may want to customize the LTO compilation phase optimization more fully (use lower inlining threshold, defer vectorization to link time), but we can cross that bridge when we get there. -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150608/06b4f6a8/attachment.html>