Here is some timing information from running the Zig standard library tests:

$ ./zig test ../std/index.zig --enable-timing-info
Name                   Start     End  Duration  Percent
Initialize            0.0000  0.0010    0.0010   0.0001
Semantic Analysis     0.0010  0.9968    0.9958   0.1192
Code Generation       0.9968  1.4000    0.4032   0.0483
LLVM Emit Output      1.4000  8.1759    6.7760   0.8112
Build Dependencies    8.1759  8.3341    0.1581   0.0189
LLVM Link             8.3341  8.3530    0.0189   0.0023
Total                 0.0000  8.3530    8.3530   1.0000

81% of the time was spent waiting for LLVM to turn a Module into an object file. This is with optimizations off, FastISel, no module verification, etc.

How can I speed this up? Any tips or things to look into?

Here's the function that 81% of the time is spent inside:

    bool ZigLLVMTargetMachineEmitToFile(LLVMTargetMachineRef targ_machine_ref, LLVMModuleRef module_ref,
            const char *filename, ZigLLVM_EmitOutputType output_type, char **error_message,
            bool is_debug, bool is_small)
    {
        std::error_code EC;
        raw_fd_ostream dest(filename, EC, sys::fs::F_None);
        if (EC) {
            *error_message = strdup((const char *)StringRef(EC.message()).bytes_begin());
            return true;
        }
        TargetMachine* target_machine = reinterpret_cast<TargetMachine*>(targ_machine_ref);
        target_machine->setO0WantsFastISel(true);

        Module* module = unwrap(module_ref);

        PassManagerBuilder *PMBuilder = new(std::nothrow) PassManagerBuilder();
        if (PMBuilder == nullptr) {
            *error_message = strdup("memory allocation failure");
            return true;
        }
        PMBuilder->OptLevel = target_machine->getOptLevel();
        PMBuilder->SizeLevel = is_small ? 2 : 0;

        PMBuilder->DisableTailCalls = is_debug;
        PMBuilder->DisableUnitAtATime = is_debug;
        PMBuilder->DisableUnrollLoops = is_debug;
        PMBuilder->SLPVectorize = !is_debug;
        PMBuilder->LoopVectorize = !is_debug;
        PMBuilder->RerollLoops = !is_debug;
        // Leaving NewGVN as default (off) because when on it caused issue #673
        //PMBuilder->NewGVN = !is_debug;
        PMBuilder->DisableGVNLoadPRE = is_debug;
        PMBuilder->VerifyInput = assertions_on;
        PMBuilder->VerifyOutput = assertions_on;
        PMBuilder->MergeFunctions = !is_debug;
        PMBuilder->PrepareForLTO = false;
        PMBuilder->PrepareForThinLTO = false;
        PMBuilder->PerformThinLTO = false;

        TargetLibraryInfoImpl tlii(Triple(module->getTargetTriple()));
        PMBuilder->LibraryInfo = &tlii;

        if (is_debug) {
            PMBuilder->Inliner = createAlwaysInlinerLegacyPass(false);
        } else {
            target_machine->adjustPassManager(*PMBuilder);

            PMBuilder->addExtension(PassManagerBuilder::EP_EarlyAsPossible, addDiscriminatorsPass);
            PMBuilder->Inliner = createFunctionInliningPass(PMBuilder->OptLevel, PMBuilder->SizeLevel, false);
        }

        addCoroutinePassesToExtensionPoints(*PMBuilder);

        // Set up the per-function pass manager.
        legacy::FunctionPassManager FPM = legacy::FunctionPassManager(module);
        auto tliwp = new(std::nothrow) TargetLibraryInfoWrapperPass(tlii);
        FPM.add(tliwp);
        FPM.add(createTargetTransformInfoWrapperPass(target_machine->getTargetIRAnalysis()));
        if (assertions_on) {
            FPM.add(createVerifierPass());
        }
        PMBuilder->populateFunctionPassManager(FPM);

        // Set up the per-module pass manager.
        legacy::PassManager MPM;
        MPM.add(createTargetTransformInfoWrapperPass(target_machine->getTargetIRAnalysis()));
        PMBuilder->populateModulePassManager(MPM);

        // Set output pass.
        TargetMachine::CodeGenFileType ft;
        if (output_type != ZigLLVM_EmitLLVMIr) {
            switch (output_type) {
                case ZigLLVM_EmitAssembly:
                    ft = TargetMachine::CGFT_AssemblyFile;
                    break;
                case ZigLLVM_EmitBinary:
                    ft = TargetMachine::CGFT_ObjectFile;
                    break;
                default:
                    abort();
            }

            if (target_machine->addPassesToEmitFile(MPM, dest, ft)) {
                *error_message = strdup("TargetMachine can't emit a file of this type");
                return true;
            }
        }

        // run per function optimization passes
        FPM.doInitialization();
        for (Function &F : *module)
            if (!F.isDeclaration())
                FPM.run(F);
        FPM.doFinalization();

        MPM.run(*module);

        if (output_type == ZigLLVM_EmitLLVMIr) {
            if (LLVMPrintModuleToFile(module_ref, filename, error_message)) {
                return true;
            }
        }

        return false;
    }
On 9/11/2018 5:48 PM, Andrew Kelley via llvm-dev wrote:
> Here is some timing information from running the Zig standard library
> tests:
>
> $ ./zig test ../std/index.zig --enable-timing-info
> Name                   Start     End  Duration  Percent
> Initialize            0.0000  0.0010    0.0010   0.0001
> Semantic Analysis     0.0010  0.9968    0.9958   0.1192
> Code Generation       0.9968  1.4000    0.4032   0.0483
> LLVM Emit Output      1.4000  8.1759    6.7760   0.8112
> Build Dependencies    8.1759  8.3341    0.1581   0.0189
> LLVM Link             8.3341  8.3530    0.0189   0.0023
> Total                 0.0000  8.3530    8.3530   1.0000
>
> 81% of the time was spent waiting for LLVM to turn a Module into an
> object file. This is with optimizations off, FastISel, no module
> verification, etc.
>
> How can I speed this up? Any tips or things to look into?

First step is probably setting TimePassesIsEnabled to true and looking at the output. It's hard to say where the time is going without any numbers.

-Eli

--
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
Thanks, that was a really helpful suggestion. If you're curious, here are some of the high-cost areas:

===-------------------------------------------------------------------------==
                              DWARF Emission
===-------------------------------------------------------------------------==
  Total Execution Time: 2.0117 seconds (2.0185 wall clock)

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
   0.7977 ( 64.4%)   0.4517 ( 58.4%)   1.2494 ( 62.1%)   1.2557 ( 62.2%)  Debug Info Emission
   0.4383 ( 35.4%)   0.3221 ( 41.6%)   0.7603 ( 37.8%)   0.7608 ( 37.7%)  DWARF Exception Writer
   0.0019 (  0.2%)   0.0000 (  0.0%)   0.0019 (  0.1%)   0.0019 (  0.1%)  DWARF Debug Writer
   1.2379 (100.0%)   0.7738 (100.0%)   2.0117 (100.0%)   2.0185 (100.0%)  Total

===-------------------------------------------------------------------------==
                      ... Pass execution timing report ...
===-------------------------------------------------------------------------==
  Total Execution Time: 10.3340 seconds (10.3289 wall clock)

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
   2.4747 ( 31.5%)   1.6998 ( 68.5%)   4.1744 ( 40.4%)   4.1745 ( 40.4%)  X86 Assembly Printer
   2.3159 ( 29.5%)   0.0103 (  0.4%)   2.3262 ( 22.5%)   2.3262 ( 22.5%)  Coroutine Splitting
   1.4348 ( 18.3%)   0.4092 ( 16.5%)   1.8440 ( 17.8%)   1.8428 ( 17.8%)  X86 DAG->DAG Instruction Selection
   0.1875 (  2.4%)   0.0384 (  1.5%)   0.2259 (  2.2%)   0.2252 (  2.2%)  Fast Register Allocator

I'm curious about the X86 Assembly Printer. What is it doing? The module in question has some inline assembly in it, so I would think that LLVM would parse it and then render it as machine code. At what point does it get printed?

As for Coroutine Splitting: is this about as fast as it will get? If not, then what's the point of LLVM having coroutines? Frontends would be better off implementing coroutines on top of structs, like Rust does.
On Tue, Sep 11, 2018 at 9:01 PM Friedman, Eli <efriedma at codeaurora.org> wrote:

> On 9/11/2018 5:48 PM, Andrew Kelley via llvm-dev wrote:
>
> Here is some timing information from running the Zig standard library
> tests:
>
> $ ./zig test ../std/index.zig --enable-timing-info
> Name                   Start     End  Duration  Percent
> Initialize            0.0000  0.0010    0.0010   0.0001
> Semantic Analysis     0.0010  0.9968    0.9958   0.1192
> Code Generation       0.9968  1.4000    0.4032   0.0483
> LLVM Emit Output      1.4000  8.1759    6.7760   0.8112
> Build Dependencies    8.1759  8.3341    0.1581   0.0189
> LLVM Link             8.3341  8.3530    0.0189   0.0023
> Total                 0.0000  8.3530    8.3530   1.0000
>
> 81% of the time was spent waiting for LLVM to turn a Module into an object
> file. This is with optimizations off, FastISel, no module verification, etc.
>
> How can I speed this up? Any tips or things to look into?
>
>
> First step is probably setting TimePassesIsEnabled to true and looking at
> the output. It's hard to say where the time is going without any numbers.
>
> -Eli
>
> --
> Employee of Qualcomm Innovation Center, Inc.
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project