Thanks, that was a really helpful suggestion. If you're curious, here are some of the high-cost areas:

===-------------------------------------------------------------------------===
                               DWARF Emission
===-------------------------------------------------------------------------===
  Total Execution Time: 2.0117 seconds (2.0185 wall clock)

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
   0.7977 ( 64.4%)   0.4517 ( 58.4%)   1.2494 ( 62.1%)   1.2557 ( 62.2%)  Debug Info Emission
   0.4383 ( 35.4%)   0.3221 ( 41.6%)   0.7603 ( 37.8%)   0.7608 ( 37.7%)  DWARF Exception Writer
   0.0019 (  0.2%)   0.0000 (  0.0%)   0.0019 (  0.1%)   0.0019 (  0.1%)  DWARF Debug Writer
   1.2379 (100.0%)   0.7738 (100.0%)   2.0117 (100.0%)   2.0185 (100.0%)  Total

===-------------------------------------------------------------------------===
                      ... Pass execution timing report ...
===-------------------------------------------------------------------------===
  Total Execution Time: 10.3340 seconds (10.3289 wall clock)

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
   2.4747 ( 31.5%)   1.6998 ( 68.5%)   4.1744 ( 40.4%)   4.1745 ( 40.4%)  X86 Assembly Printer
   2.3159 ( 29.5%)   0.0103 (  0.4%)   2.3262 ( 22.5%)   2.3262 ( 22.5%)  Coroutine Splitting
   1.4348 ( 18.3%)   0.4092 ( 16.5%)   1.8440 ( 17.8%)   1.8428 ( 17.8%)  X86 DAG->DAG Instruction Selection
   0.1875 (  2.4%)   0.0384 (  1.5%)   0.2259 (  2.2%)   0.2252 (  2.2%)  Fast Register Allocator

I'm curious about the X86 Assembly Printer. What is it doing? The module in question has some inline assembly in it. So I would think that LLVM would parse it and then render it as machine code - at what point does it get printed?

As for Coroutine Splitting - is this about as fast as it will get? If not then what's the point of LLVM having coroutines? Frontends would be better off implementing coroutines on top of structs, like Rust does.
On Tue, Sep 11, 2018 at 9:01 PM Friedman, Eli <efriedma at codeaurora.org> wrote:

> On 9/11/2018 5:48 PM, Andrew Kelley via llvm-dev wrote:
>
> Here is some timing information from running the Zig standard library
> tests:
>
> $ ./zig test ../std/index.zig --enable-timing-info
>                 Name       Start         End    Duration     Percent
>           Initialize      0.0000      0.0010      0.0010      0.0001
>    Semantic Analysis      0.0010      0.9968      0.9958      0.1192
>      Code Generation      0.9968      1.4000      0.4032      0.0483
>     LLVM Emit Output      1.4000      8.1759      6.7760      0.8112
>   Build Dependencies      8.1759      8.3341      0.1581      0.0189
>            LLVM Link      8.3341      8.3530      0.0189      0.0023
>                Total      0.0000      8.3530      8.3530      1.0000
>
> 81% of the time was spent waiting for LLVM to turn a Module into an object
> file. This is with optimizations off, FastISel, no module verification, etc.
>
> How can I speed this up? Any tips or things to look into?
>
>
> First step is probably setting TimePassesIsEnabled to true and looking at
> the output. It's hard to say where the time is going without any numbers.
>
> -Eli
>
> --
> Employee of Qualcomm Innovation Center, Inc.
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
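For anyone comparing reports like the pass timing one above, here is a minimal sketch (not part of LLVM or Zig) that ranks the listed passes by wall time and shows how much of the total the listed rows leave unaccounted for. The rows and the 10.3289-second wall clock total are copied from this thread; the names `REPORT`, `TOTAL_WALL`, `ROW` and `parse_rows` are made up for illustration, and the regular expression only assumes the four "seconds (percent%)" columns followed by a pass name, as they appear here.

```python
import re

# Rows copied from the pass execution timing report quoted in this thread.
REPORT = """\
   2.4747 ( 31.5%)   1.6998 ( 68.5%)   4.1744 ( 40.4%)   4.1745 ( 40.4%)  X86 Assembly Printer
   2.3159 ( 29.5%)   0.0103 (  0.4%)   2.3262 ( 22.5%)   2.3262 ( 22.5%)  Coroutine Splitting
   1.4348 ( 18.3%)   0.4092 ( 16.5%)   1.8440 ( 17.8%)   1.8428 ( 17.8%)  X86 DAG->DAG Instruction Selection
   0.1875 (  2.4%)   0.0384 (  1.5%)   0.2259 (  2.2%)   0.2252 (  2.2%)  Fast Register Allocator
"""
TOTAL_WALL = 10.3289  # "wall clock" figure from the report header

# Four "seconds (percent%)" columns, then the pass name.
ROW = re.compile(r"^\s*((?:\d+\.\d+\s+\(\s*\d+\.\d+%\)\s+){4})(.+?)\s*$")


def parse_rows(text):
    """Return (pass name, wall seconds) for every line that looks like a report row."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if not m:
            continue  # skip headers, rules and blank lines
        # Seconds are the numbers directly followed by "(", in column order.
        seconds = [float(s) for s in re.findall(r"(\d+\.\d+)\s+\(", m.group(1))]
        rows.append((m.group(2), seconds[3]))  # wall time is the fourth column
    return rows


rows = parse_rows(REPORT)
listed = sum(wall for _, wall in rows)
unlisted = TOTAL_WALL - listed  # time spent in passes not shown in the thread

for name, wall in sorted(rows, key=lambda r: r[1], reverse=True):
    print(f"{wall:8.4f}s  {wall / TOTAL_WALL:6.1%}  {name}")
print(f"{unlisted:8.4f}s  {unlisted / TOTAL_WALL:6.1%}  (passes not listed in the thread)")
```

Running the same script over reports from two builds (for example with and without debug info, or for a different target) would give a quick side-by-side of where the time moves.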
The X86 assembly printer is badly named. I think it's a leftover from before LLVM had an integrated assembler. It's where the assembly would have been printed. Now it is where MachineInstrs are converted to MCInsts and either printed or turned into binary.

~Craig

On Tue, Sep 11, 2018 at 6:49 PM Andrew Kelley via llvm-dev <llvm-dev at lists.llvm.org> wrote:

> I'm curious about the X86 Assembly Printer. What is it doing? The module
> in question has some inline assembly in it. So I would think that LLVM
> would parse it and then render it as machine code - at what point does it
> get printed?
It might be interesting to know whether the percentages are basically the same for non-x86 targets.

On Tue, Sep 11, 2018 at 6:54 PM, Craig Topper via llvm-dev <llvm-dev at lists.llvm.org> wrote:

> The X86 assembly printer is badly named. I think it's a leftover from
> before LLVM had an integrated assembler. It's where the assembly would have
> been printed. Now it is where MachineInstrs are converted to MCInsts and
> either printed or turned into binary.
>
> ~Craig