Teresa Johnson
2015-Jun-08 16:03 UTC
[LLVMdev] Removing AvailableExternal values in GlobalDCE (was Re: RFC: ThinLTO Impementation Plan)
Talked to Eric Fri and he suggested that this might be the first of several places where we want behavior of LTO compiles to diverge from normal -O2 compiles. So for now I have implemented this such that we pass down -flto to the -cc1 job, and that gets propagated as a code gen option and into the PassManagerBuilder. I have left the current logic translating -flto to the -emit-llvm-bc option, since IMO that is cleaner for controlling the output type. Let me know what you think of saving the overall LTO setting in CodeGenOpts/PassManagerBuilder, or if is it better to record just the individual behaviors we want (i.e. change the new field in CodeGenOpts/PassManagerBuilder from "LTO" to "ElimAvailExtern" or something like that). Patches attached. I modified an existing available external clang test to check for the new O2 (LTO vs not) behaviors. Thanks, Teresa On Thu, Jun 4, 2015 at 9:20 PM, Eric Christopher <echristo at gmail.com> wrote:> > > On Thu, Jun 4, 2015 at 8:17 PM Teresa Johnson <tejohnson at google.com> wrote: >> >> On Thu, Jun 4, 2015 at 5:33 PM, Reid Kleckner <rnk at google.com> wrote: >> > On Thu, Jun 4, 2015 at 5:17 PM, Teresa Johnson <tejohnson at google.com> >> > wrote: >> >> >> >> Agreed. Although I assume you mean invoke the new pass under a >> >> ThinLTO-only option so that avail extern are not dropped in the >> >> compile pass before the LTO link? >> > >> > >> > No, this pass would actually be an improvement to the standard -O2 >> > pipeline. >> > The special case is the regular LTO pass pipeline, which wants to run >> > GlobalDCE but doesn't want to drop available_externally function >> > definitions >> > until after linker-stage inlining. >> >> Ok. It looks to me like the LTO compile step with -O2 -flto -c uses >> this same -O2 optimization pipeline as without LTO. Clang communicates >> the fact that we are doing an LTO compile to llvm via the >> -emit-llvm-bc flag, which just tells it to emit bitcode and exit >> before codegen. So I would either need to key off of that or setup a >> new flag to indicate to llvm that we are doing an LTO -c compile. Or >> is there some other way that I am missing? >> >> Incidentally, we'll also want to skip this new pass and keep any >> referenced avail extern functions in the ThinLTO -c compile step for >> the same reasons (and there are no imported functions yet at that >> point). >> > > Ultimately for any planned LTO build we're going to want to do a different > pass pipeline, it's probably worth considering what should be done before > and during LTO. > > -eric > >> >> Teresa >> >> > >> > Consider this test case: >> > >> > declare void @blah() >> > define i32 @main() { >> > call void @foo() >> > ret i32 0 >> > } >> > define available_externally void @foo() noinline { >> > call void @bar() >> > ret void >> > } >> > define linkonce_odr void @bar() noinline { >> > call void @blah() >> > ret void >> > } >> > >> > If I run opt -O2 on this and feed it to llc, it actually generates code >> > for >> > bar, even though there are no actual references to bar in the final >> > code: >> > >> > main: # @main >> > pushq %rax >> > callq foo >> > xorl %eax, %eax >> > popq %rdx >> > retq >> > >> > bar: # @bar >> > jmp blah # TAILCALL >> > >> > This corner case happens to come up organically with dllimported >> > classes, >> > which is why I happened to think of it. :) I'm happy with a flag to >> > globaldce for LTO and the original patch, honestly. >> >> >> >> -- >> Teresa Johnson | Software Engineer | tejohnson at google.com | 408-460-2413 >> _______________________________________________ >> LLVM Developers mailing list >> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev-- Teresa Johnson | Software Engineer | tejohnson at google.com | 408-460-2413 -------------- next part -------------- Index: include/llvm/InitializePasses.h ==================================================================--- include/llvm/InitializePasses.h (revision 237590) +++ include/llvm/InitializePasses.h (working copy) @@ -130,6 +130,7 @@ void initializeSanitizerCoverageModulePass(PassReg void initializeDataFlowSanitizerPass(PassRegistry&); void initializeScalarizerPass(PassRegistry&); void initializeEarlyCSELegacyPassPass(PassRegistry &); +void initializeElimAvailExternPass(PassRegistry&); void initializeExpandISelPseudosPass(PassRegistry&); void initializeFunctionAttrsPass(PassRegistry&); void initializeGCMachineCodeAnalysisPass(PassRegistry&); Index: include/llvm/Transforms/IPO.h ==================================================================--- include/llvm/Transforms/IPO.h (revision 237590) +++ include/llvm/Transforms/IPO.h (working copy) @@ -71,6 +71,12 @@ ModulePass *createGlobalOptimizerPass(); ModulePass *createGlobalDCEPass(); //===----------------------------------------------------------------------===// +/// createElimAvailExternPass - This transform is designed to eliminate +/// available external globals (functions or global variables) +/// +ModulePass *createElimAvailExternPass(); + +//===----------------------------------------------------------------------===// /// createGVExtractionPass - If deleteFn is true, this pass deletes /// the specified global values. Otherwise, it deletes as much of the module as /// possible, except for the global values specified. Index: include/llvm/Transforms/IPO/PassManagerBuilder.h ==================================================================--- include/llvm/Transforms/IPO/PassManagerBuilder.h (revision 237590) +++ include/llvm/Transforms/IPO/PassManagerBuilder.h (working copy) @@ -121,6 +121,7 @@ class PassManagerBuilder { bool VerifyInput; bool VerifyOutput; bool MergeFunctions; + bool LTO; private: /// ExtensionList - This is list of all of the extensions that are registered. Index: lib/Transforms/IPO/CMakeLists.txt ==================================================================--- lib/Transforms/IPO/CMakeLists.txt (revision 237590) +++ lib/Transforms/IPO/CMakeLists.txt (working copy) @@ -3,6 +3,7 @@ add_llvm_library(LLVMipo BarrierNoopPass.cpp ConstantMerge.cpp DeadArgumentElimination.cpp + ElimAvailExtern.cpp ExtractGV.cpp FunctionAttrs.cpp GlobalDCE.cpp Index: lib/Transforms/IPO/ElimAvailExtern.cpp ==================================================================--- lib/Transforms/IPO/ElimAvailExtern.cpp (revision 0) +++ lib/Transforms/IPO/ElimAvailExtern.cpp (working copy) @@ -0,0 +1,90 @@ +//===-- ElimAvailExtern.cpp - DCE unreachable internal functions ----------------===// +// +// The LLVM Compiler Infrastructure +// +// This file is distributed under the University of Illinois Open Source +// License. See LICENSE.TXT for details. +// +//===----------------------------------------------------------------------===// +// +// This transform is designed to eliminate available external global +// definitions from the program, turning them into declarations. +// +//===----------------------------------------------------------------------===// + +#include "llvm/Transforms/IPO.h" +#include "llvm/ADT/Statistic.h" +#include "llvm/IR/Constants.h" +#include "llvm/IR/Instructions.h" +#include "llvm/IR/Module.h" +#include "llvm/Transforms/Utils/CtorUtils.h" +#include "llvm/Transforms/Utils/GlobalStatus.h" +#include "llvm/Pass.h" +using namespace llvm; + +#define DEBUG_TYPE "elimavailextern" + +STATISTIC(NumAliases , "Number of global aliases removed"); +STATISTIC(NumFunctions, "Number of functions removed"); +STATISTIC(NumVariables, "Number of global variables removed"); + +namespace { + struct ElimAvailExtern : public ModulePass { + static char ID; // Pass identification, replacement for typeid + ElimAvailExtern() : ModulePass(ID) { + initializeElimAvailExternPass(*PassRegistry::getPassRegistry()); + } + + // run - Do the ElimAvailExtern pass on the specified module, optionally + // updating the specified callgraph to reflect the changes. + // + bool runOnModule(Module &M) override; + }; +} + +char ElimAvailExtern::ID = 0; +INITIALIZE_PASS(ElimAvailExtern, "elimavailextern", + "Eliminate Available External Globals", false, false) + +ModulePass *llvm::createElimAvailExternPass() { return new ElimAvailExtern(); } + +bool ElimAvailExtern::runOnModule(Module &M) { + bool Changed = false; + + // Drop initializers of available externally global variables. + for (Module::global_iterator I = M.global_begin(), E = M.global_end(); + I != E; ++I) { + if (!I->hasAvailableExternallyLinkage()) + continue; + if (I->hasInitializer()) { + Constant *Init = I->getInitializer(); + I->setInitializer(nullptr); + if (isSafeToDestroyConstant(Init)) + Init->destroyConstant(); + } + I->removeDeadConstantUsers(); + NumVariables++; + } + + // Drop the bodies of available externally functions. + for (Module::iterator I = M.begin(), E = M.end(); I != E; ++I) { + if (!I->hasAvailableExternallyLinkage()) + continue; + if (!I->isDeclaration()) + I->deleteBody(); + I->removeDeadConstantUsers(); + NumFunctions++; + } + + // Drop targets of available externally aliases. + for (Module::alias_iterator I = M.alias_begin(), E = M.alias_end(); I != E; + ++I) { + if (!I->hasAvailableExternallyLinkage()) + continue; + I->setAliasee(nullptr); + I->removeDeadConstantUsers(); + NumAliases++; + } + + return Changed; +} Index: lib/Transforms/IPO/PassManagerBuilder.cpp ==================================================================--- lib/Transforms/IPO/PassManagerBuilder.cpp (revision 237590) +++ lib/Transforms/IPO/PassManagerBuilder.cpp (working copy) @@ -106,6 +106,7 @@ PassManagerBuilder::PassManagerBuilder() { VerifyInput = false; VerifyOutput = false; MergeFunctions = false; + LTO = false; } PassManagerBuilder::~PassManagerBuilder() { @@ -407,6 +408,16 @@ void PassManagerBuilder::populateModulePassManager // GlobalOpt already deletes dead functions and globals, at -O2 try a // late pass of GlobalDCE. It is capable of deleting dead cycles. if (OptLevel > 1) { + if (!LTO) { + // Remove avail extern fns and globals definitions if we aren't + // doing an -flto compilation. For LTO we will preserve these + // so they are eligible for inlining at link-time. Note if they + // are unreferenced they will be removed by GlobalDCE below, so + // this only impacts referenced available externally globals. + // Eventually they will be suppressed during codegen, but eliminating + // here is a compile-time savings. + MPM.add(createElimAvailExternPass()); + } MPM.add(createGlobalDCEPass()); // Remove dead fns and globals. MPM.add(createConstantMergePass()); // Merge dup global constants } -------------- next part -------------- Index: include/clang/Driver/Options.td ==================================================================--- include/clang/Driver/Options.td (revision 237590) +++ include/clang/Driver/Options.td (working copy) @@ -636,7 +636,7 @@ def flat__namespace : Flag<["-"], "flat_namespace" def flax_vector_conversions : Flag<["-"], "flax-vector-conversions">, Group<f_Group>; def flimited_precision_EQ : Joined<["-"], "flimited-precision=">, Group<f_Group>; def flto_EQ : Joined<["-"], "flto=">, Group<clang_ignored_gcc_optimization_f_Group>; -def flto : Flag<["-"], "flto">, Group<f_Group>; +def flto : Flag<["-"], "flto">, Flags<[CC1Option]>, Group<f_Group>; def fno_lto : Flag<["-"], "fno-lto">, Group<f_Group>; def fmacro_backtrace_limit_EQ : Joined<["-"], "fmacro-backtrace-limit=">, Group<f_Group>, Flags<[DriverOption, CoreOption]>; Index: include/clang/Frontend/CodeGenOptions.def ==================================================================--- include/clang/Frontend/CodeGenOptions.def (revision 237590) +++ include/clang/Frontend/CodeGenOptions.def (working copy) @@ -67,6 +67,7 @@ CODEGENOPT(InstrumentFunctions , 1, 0) ///< Set wh CODEGENOPT(InstrumentForProfiling , 1, 0) ///< Set when -pg is enabled. CODEGENOPT(LessPreciseFPMAD , 1, 0) ///< Enable less precise MAD instructions to ///< be generated. +CODEGENOPT(LTO , 1, 0) ///< Set when -flto is enabled. CODEGENOPT(MergeAllConstants , 1, 1) ///< Merge identical constants. CODEGENOPT(MergeFunctions , 1, 0) ///< Set when -fmerge-functions is enabled. CODEGENOPT(MSVolatile , 1, 0) ///< Set when /volatile:ms is enabled. Index: lib/CodeGen/BackendUtil.cpp ==================================================================--- lib/CodeGen/BackendUtil.cpp (revision 237590) +++ lib/CodeGen/BackendUtil.cpp (working copy) @@ -287,6 +287,7 @@ void EmitAssemblyHelper::CreatePasses() { PMBuilder.DisableUnitAtATime = !CodeGenOpts.UnitAtATime; PMBuilder.DisableUnrollLoops = !CodeGenOpts.UnrollLoops; PMBuilder.MergeFunctions = CodeGenOpts.MergeFunctions; + PMBuilder.LTO = CodeGenOpts.LTO; PMBuilder.RerollLoops = CodeGenOpts.RerollLoops; PMBuilder.addExtension(PassManagerBuilder::EP_EarlyAsPossible, Index: lib/Driver/Tools.cpp ==================================================================--- lib/Driver/Tools.cpp (revision 237590) +++ lib/Driver/Tools.cpp (working copy) @@ -2676,6 +2676,10 @@ void Clang::ConstructJob(Compilation &C, const Job assert((isa<CompileJobAction>(JA) || isa<BackendJobAction>(JA)) && "Invalid action for clang tool."); + if (JA.getType() == types::TY_LTO_IR || + JA.getType() == types::TY_LTO_BC) { + CmdArgs.push_back("-flto"); + } if (JA.getType() == types::TY_Nothing) { CmdArgs.push_back("-fsyntax-only"); } else if (JA.getType() == types::TY_LLVM_IR || Index: lib/Frontend/CompilerInvocation.cpp ==================================================================--- lib/Frontend/CompilerInvocation.cpp (revision 237590) +++ lib/Frontend/CompilerInvocation.cpp (working copy) @@ -489,6 +489,8 @@ static bool ParseCodeGenArgs(CodeGenOptions &Opts, Opts.MergeFunctions = Args.hasArg(OPT_fmerge_functions); + Opts.LTO = Args.hasArg(OPT_flto); + Opts.MSVolatile = Args.hasArg(OPT_fms_volatile); Opts.VectorizeBB = Args.hasArg(OPT_vectorize_slp_aggressive); Index: test/CodeGen/available-externally-suppress.c ==================================================================--- test/CodeGen/available-externally-suppress.c (revision 237590) +++ test/CodeGen/available-externally-suppress.c (working copy) @@ -1,6 +1,10 @@ // RUN: %clang_cc1 -emit-llvm -o - -triple x86_64-apple-darwin10 %s | FileCheck %s +// RUN: %clang_cc1 -O2 -fno-inline -emit-llvm -o - -triple x86_64-apple-darwin10 %s | FileCheck %s +// RUN: %clang_cc1 -flto -O2 -fno-inline -emit-llvm -o - -triple x86_64-apple-darwin10 %s | FileCheck %s -check-prefix=LTO // Ensure that we don't emit available_externally functions at -O0. +// Also should not emit them at -O2, unless -flto is present in which case +// we should preserve them for link-time inlining decisions. int x; inline void f0(int y) { x = y; } @@ -7,6 +11,8 @@ inline void f0(int y) { x = y; } // CHECK-LABEL: define void @test() // CHECK: declare void @f0(i32) +// LTO-LABEL: define void @test() +// LTO: define available_externally void @f0 void test() { f0(17); } @@ -19,9 +25,13 @@ inline int __attribute__((always_inline)) f1(int x } // CHECK: @test1 +// LTO: @test1 int test1(int x) { // CHECK: br i1 // CHECK-NOT: call {{.*}} @f1 // CHECK: ret i32 + // LTO: br i1 + // LTO-NOT: call {{.*}} @f1 + // LTO: ret i32 return f1(x); }
Reid Kleckner
2015-Jun-08 18:33 UTC
[LLVMdev] Removing AvailableExternal values in GlobalDCE (was Re: RFC: ThinLTO Impementation Plan)
The clang patch lgtm, and I had some comments on the LLVM patch. Duncan, do you want to say more there? --- include/llvm/Transforms/IPO/PassManagerBuilder.h (revision 237590) +++ include/llvm/Transforms/IPO/PassManagerBuilder.h (working copy) @@ -121,6 +121,7 @@ class PassManagerBuilder { bool VerifyInput; bool VerifyOutput; bool MergeFunctions; + bool LTO; While in the context of clang it makes sense that "LTO" means "generate an object file appropriate for later LTO", but in the context of LLVM, it might be taken to mean "we are doing LTO optimizations now". I think we probably want to name the boolean something more specific (PreLTO? CompileForLTO?). Later if we customize further we can split the codepath as is done for populateLTOPassManager. + if (!LTO) { + // Remove avail extern fns and globals definitions if we aren't + // doing an -flto compilation. For LTO we will preserve these LLVM doesn't have an -flto flag, so I'd probably keep this generic to "compiling an object file for later LTO". + // so they are eligible for inlining at link-time. Note if they + // are unreferenced they will be removed by GlobalDCE below, so + // this only impacts referenced available externally globals. + // Eventually they will be suppressed during codegen, but eliminating + // here is a compile-time savings. More than the compile time savings, it actually increases the power of GlobalDCE and can reduce object file size, so I'd mention that instead. + MPM.add(createElimAvailExternPass()); + } + // Drop initializers of available externally global variables. + for (Module::global_iterator I = M.global_begin(), E = M.global_end(); + I != E; ++I) { + if (!I->hasAvailableExternallyLinkage()) + continue; + if (I->hasInitializer()) { + Constant *Init = I->getInitializer(); + I->setInitializer(nullptr); + if (isSafeToDestroyConstant(Init)) + Init->destroyConstant(); + } + I->removeDeadConstantUsers(); + NumVariables++; + } Do you need to set the linkage of global variables to external? I noticed that Function::deleteBody() does this but setInitializer(nullptr) does not. Ditto for aliases. On Mon, Jun 8, 2015 at 9:03 AM, Teresa Johnson <tejohnson at google.com> wrote:> Talked to Eric Fri and he suggested that this might be the first of > several places where we want behavior of LTO compiles to diverge from > normal -O2 compiles. So for now I have implemented this such that we > pass down -flto to the -cc1 job, and that gets propagated as a code > gen option and into the PassManagerBuilder. I have left the current > logic translating -flto to the -emit-llvm-bc option, since IMO that is > cleaner for controlling the output type. Let me know what you think of > saving the overall LTO setting in CodeGenOpts/PassManagerBuilder, or > if is it better to record just the individual behaviors we want (i.e. > change the new field in CodeGenOpts/PassManagerBuilder from "LTO" to > "ElimAvailExtern" or something like that). > > Patches attached. I modified an existing available external clang test > to check for the new O2 (LTO vs not) behaviors. > > Thanks, > Teresa > > On Thu, Jun 4, 2015 at 9:20 PM, Eric Christopher <echristo at gmail.com> > wrote: > > > > > > On Thu, Jun 4, 2015 at 8:17 PM Teresa Johnson <tejohnson at google.com> > wrote: > >> > >> On Thu, Jun 4, 2015 at 5:33 PM, Reid Kleckner <rnk at google.com> wrote: > >> > On Thu, Jun 4, 2015 at 5:17 PM, Teresa Johnson <tejohnson at google.com> > >> > wrote: > >> >> > >> >> Agreed. Although I assume you mean invoke the new pass under a > >> >> ThinLTO-only option so that avail extern are not dropped in the > >> >> compile pass before the LTO link? > >> > > >> > > >> > No, this pass would actually be an improvement to the standard -O2 > >> > pipeline. > >> > The special case is the regular LTO pass pipeline, which wants to run > >> > GlobalDCE but doesn't want to drop available_externally function > >> > definitions > >> > until after linker-stage inlining. > >> > >> Ok. It looks to me like the LTO compile step with -O2 -flto -c uses > >> this same -O2 optimization pipeline as without LTO. Clang communicates > >> the fact that we are doing an LTO compile to llvm via the > >> -emit-llvm-bc flag, which just tells it to emit bitcode and exit > >> before codegen. So I would either need to key off of that or setup a > >> new flag to indicate to llvm that we are doing an LTO -c compile. Or > >> is there some other way that I am missing? > >> > >> Incidentally, we'll also want to skip this new pass and keep any > >> referenced avail extern functions in the ThinLTO -c compile step for > >> the same reasons (and there are no imported functions yet at that > >> point). > >> > > > > Ultimately for any planned LTO build we're going to want to do a > different > > pass pipeline, it's probably worth considering what should be done before > > and during LTO. > > > > -eric > > > >> > >> Teresa > >> > >> > > >> > Consider this test case: > >> > > >> > declare void @blah() > >> > define i32 @main() { > >> > call void @foo() > >> > ret i32 0 > >> > } > >> > define available_externally void @foo() noinline { > >> > call void @bar() > >> > ret void > >> > } > >> > define linkonce_odr void @bar() noinline { > >> > call void @blah() > >> > ret void > >> > } > >> > > >> > If I run opt -O2 on this and feed it to llc, it actually generates > code > >> > for > >> > bar, even though there are no actual references to bar in the final > >> > code: > >> > > >> > main: # @main > >> > pushq %rax > >> > callq foo > >> > xorl %eax, %eax > >> > popq %rdx > >> > retq > >> > > >> > bar: # @bar > >> > jmp blah # TAILCALL > >> > > >> > This corner case happens to come up organically with dllimported > >> > classes, > >> > which is why I happened to think of it. :) I'm happy with a flag to > >> > globaldce for LTO and the original patch, honestly. > >> > >> > >> > >> -- > >> Teresa Johnson | Software Engineer | tejohnson at google.com | > 408-460-2413 > >> _______________________________________________ > >> LLVM Developers mailing list > >> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > > > > -- > Teresa Johnson | Software Engineer | tejohnson at google.com | 408-460-2413 >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150608/d79b7699/attachment.html>
Teresa Johnson
2015-Jun-08 18:46 UTC
[LLVMdev] Removing AvailableExternal values in GlobalDCE (was Re: RFC: ThinLTO Impementation Plan)
On Mon, Jun 8, 2015 at 11:33 AM, Reid Kleckner <rnk at google.com> wrote:> The clang patch lgtm, and I had some comments on the LLVM patch. Duncan, do > you want to say more there? > > --- include/llvm/Transforms/IPO/PassManagerBuilder.h (revision 237590) > +++ include/llvm/Transforms/IPO/PassManagerBuilder.h (working copy) > @@ -121,6 +121,7 @@ class PassManagerBuilder { > bool VerifyInput; > bool VerifyOutput; > bool MergeFunctions; > + bool LTO; > > While in the context of clang it makes sense that "LTO" means "generate an > object file appropriate for later LTO", but in the context of LLVM, it might > be taken to mean "we are doing LTO optimizations now". I think we probably > want to name the boolean something more specific (PreLTO? CompileForLTO?). > Later if we customize further we can split the codepath as is done for > populateLTOPassManager.Yeah, I was concerned that the name might be confusing in this context, so if we opt to go with a single llvm option to mean -flto I like the idea of something like CompileForLTO (or maybe PrepareForLTO?).> > + if (!LTO) { > + // Remove avail extern fns and globals definitions if we aren't > + // doing an -flto compilation. For LTO we will preserve these > > LLVM doesn't have an -flto flag, so I'd probably keep this generic to > "compiling an object file for later LTO".Ok> > + // so they are eligible for inlining at link-time. Note if they > + // are unreferenced they will be removed by GlobalDCE below, so > + // this only impacts referenced available externally globals. > + // Eventually they will be suppressed during codegen, but > eliminating > + // here is a compile-time savings. > > More than the compile time savings, it actually increases the power of > GlobalDCE and can reduce object file size, so I'd mention that instead.Ok, right in your case that is the benefit. Will update the comment.> > + MPM.add(createElimAvailExternPass()); > + } > > + // Drop initializers of available externally global variables. > + for (Module::global_iterator I = M.global_begin(), E = M.global_end(); > + I != E; ++I) { > + if (!I->hasAvailableExternallyLinkage()) > + continue; > + if (I->hasInitializer()) { > + Constant *Init = I->getInitializer(); > + I->setInitializer(nullptr); > + if (isSafeToDestroyConstant(Init)) > + Init->destroyConstant(); > + } > + I->removeDeadConstantUsers(); > + NumVariables++; > + } > > Do you need to set the linkage of global variables to external? I noticed > that Function::deleteBody() does this but setInitializer(nullptr) does not. > Ditto for aliases.Hmm, I am not sure - I basically am doing the same things GlobalDCE does (aside from erasing them from the function/variable/alias list). I guess it didn't matter in the GlobalDCE case since they were being completely erased rather than leaving a decl behind. Will make that change and retest. Thanks! Teresa> > On Mon, Jun 8, 2015 at 9:03 AM, Teresa Johnson <tejohnson at google.com> wrote: >> >> Talked to Eric Fri and he suggested that this might be the first of >> several places where we want behavior of LTO compiles to diverge from >> normal -O2 compiles. So for now I have implemented this such that we >> pass down -flto to the -cc1 job, and that gets propagated as a code >> gen option and into the PassManagerBuilder. I have left the current >> logic translating -flto to the -emit-llvm-bc option, since IMO that is >> cleaner for controlling the output type. Let me know what you think of >> saving the overall LTO setting in CodeGenOpts/PassManagerBuilder, or >> if is it better to record just the individual behaviors we want (i.e. >> change the new field in CodeGenOpts/PassManagerBuilder from "LTO" to >> "ElimAvailExtern" or something like that). >> >> Patches attached. I modified an existing available external clang test >> to check for the new O2 (LTO vs not) behaviors. >> >> Thanks, >> Teresa >> >> On Thu, Jun 4, 2015 at 9:20 PM, Eric Christopher <echristo at gmail.com> >> wrote: >> > >> > >> > On Thu, Jun 4, 2015 at 8:17 PM Teresa Johnson <tejohnson at google.com> >> > wrote: >> >> >> >> On Thu, Jun 4, 2015 at 5:33 PM, Reid Kleckner <rnk at google.com> wrote: >> >> > On Thu, Jun 4, 2015 at 5:17 PM, Teresa Johnson <tejohnson at google.com> >> >> > wrote: >> >> >> >> >> >> Agreed. Although I assume you mean invoke the new pass under a >> >> >> ThinLTO-only option so that avail extern are not dropped in the >> >> >> compile pass before the LTO link? >> >> > >> >> > >> >> > No, this pass would actually be an improvement to the standard -O2 >> >> > pipeline. >> >> > The special case is the regular LTO pass pipeline, which wants to run >> >> > GlobalDCE but doesn't want to drop available_externally function >> >> > definitions >> >> > until after linker-stage inlining. >> >> >> >> Ok. It looks to me like the LTO compile step with -O2 -flto -c uses >> >> this same -O2 optimization pipeline as without LTO. Clang communicates >> >> the fact that we are doing an LTO compile to llvm via the >> >> -emit-llvm-bc flag, which just tells it to emit bitcode and exit >> >> before codegen. So I would either need to key off of that or setup a >> >> new flag to indicate to llvm that we are doing an LTO -c compile. Or >> >> is there some other way that I am missing? >> >> >> >> Incidentally, we'll also want to skip this new pass and keep any >> >> referenced avail extern functions in the ThinLTO -c compile step for >> >> the same reasons (and there are no imported functions yet at that >> >> point). >> >> >> > >> > Ultimately for any planned LTO build we're going to want to do a >> > different >> > pass pipeline, it's probably worth considering what should be done >> > before >> > and during LTO. >> > >> > -eric >> > >> >> >> >> Teresa >> >> >> >> > >> >> > Consider this test case: >> >> > >> >> > declare void @blah() >> >> > define i32 @main() { >> >> > call void @foo() >> >> > ret i32 0 >> >> > } >> >> > define available_externally void @foo() noinline { >> >> > call void @bar() >> >> > ret void >> >> > } >> >> > define linkonce_odr void @bar() noinline { >> >> > call void @blah() >> >> > ret void >> >> > } >> >> > >> >> > If I run opt -O2 on this and feed it to llc, it actually generates >> >> > code >> >> > for >> >> > bar, even though there are no actual references to bar in the final >> >> > code: >> >> > >> >> > main: # @main >> >> > pushq %rax >> >> > callq foo >> >> > xorl %eax, %eax >> >> > popq %rdx >> >> > retq >> >> > >> >> > bar: # @bar >> >> > jmp blah # TAILCALL >> >> > >> >> > This corner case happens to come up organically with dllimported >> >> > classes, >> >> > which is why I happened to think of it. :) I'm happy with a flag to >> >> > globaldce for LTO and the original patch, honestly. >> >> >> >> >> >> >> >> -- >> >> Teresa Johnson | Software Engineer | tejohnson at google.com | >> >> 408-460-2413 >> >> _______________________________________________ >> >> LLVM Developers mailing list >> >> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >> >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >> >> >> >> -- >> Teresa Johnson | Software Engineer | tejohnson at google.com | 408-460-2413 > >-- Teresa Johnson | Software Engineer | tejohnson at google.com | 408-460-2413