Diana Picus via llvm-dev
2017-May-22 09:42 UTC
[llvm-dev] Buildbots timing out on full builds
Nope, no sanitizers.

On 22 May 2017 at 11:38, Daniel Sanders <daniel_l_sanders at apple.com> wrote:
> Is that with -fsanitize=memory too?
>
> I'm currently building ToT with r303258 reverted. Once that's done I'll commit the revert and start investigating fixes.
>
>> On 22 May 2017, at 10:22, Diana Picus <diana.picus at linaro.org> wrote:
>>
>> Hi Daniel,
>>
>> I did your experiment on a TK1 machine (same as the bots) and for r303258 I get:
>> real 18m28.882s
>> user 35m37.091s
>> sys 0m44.726s
>>
>> and for r303259:
>> real 50m52.048s
>> user 88m25.473s
>> sys 0m46.548s
>>
>> If I can help investigate, please let me know, otherwise we can just
>> try your fixes and see how they affect compilation time.
>>
>> Thanks,
>> Diana
>>
>> On 22 May 2017 at 10:49, Daniel Sanders <daniel_l_sanders at apple.com> wrote:
>>> r303341 is the re-commit of the r303259 which tripled the number of rules
>>> that can be imported into GlobalISel from SelectionDAG. A compile time
>>> regression is to be expected but when I looked into it I found it was ~25s
>>> on my machine for the whole incremental build rather than the ~12mins you
>>> are seeing. I'll take another look.
>>>
>>> I'm aware of a couple easy improvements we could make to the way the
>>> importer works. I was leaving them until we change it over to a state
>>> machine but the most obvious is to group rules by their top-level gMIR
>>> instruction. This would reduce the cost of the std::sort that handles the
>>> rule priorities in generating the source file and will also make it simpler
>>> for the compiler to compile it.
>>>
>>> On 21 May 2017, at 11:16, Vitaly Buka <vitalybuka at google.com> wrote:
>>>
>>> It must be r303341, I commented on corresponding llvm-commits thread.
>>>
>>> On Fri, May 19, 2017 at 7:34 AM, Diana Picus via llvm-dev
>>> <llvm-dev at lists.llvm.org> wrote:
>>>>
>>>> Ok, thanks. I'll try to do a bisect next week to see if I can find it.
>>>>
>>>> Cheers,
>>>> Diana
>>>>
>>>> On 19 May 2017 at 16:29, Daniel Sanders <daniel_l_sanders at apple.com> wrote:
>>>>>
>>>>>> On 19 May 2017, at 14:54, Daniel Sanders via llvm-dev
>>>>>> <llvm-dev at lists.llvm.org> wrote:
>>>>>>
>>>>>> r303259 will have increased compile-time since it tripled the number of
>>>>>> importable SelectionDAG rules but a quick measurement building the
>>>>>> affected file:
>>>>>> ninja lib/Target/<Target>/CMakeFiles/LLVM<Target>CodeGen.dir/<Target>InstructionSelector.cpp.o
>>>>>> for both ARM and AArch64 didn't show a significant increase. I'll check
>>>>>> whether it made a difference to linking.
>>>>>
>>>>> I don't think it's r303259. Starting with a fully built r303259, then
>>>>> updating to r303258 and running 'ninja' gives me:
>>>>> real 2m28.273s
>>>>> user 13m23.171s
>>>>> sys 0m47.725s
>>>>> then updating to r303259 and running 'ninja' again gives me:
>>>>> real 2m19.052s
>>>>> user 13m38.802s
>>>>> sys 0m44.551s
>>>>>
>>>>>> sanitizer-x86_64-linux-fast also timed out after one of my commits this
>>>>>> morning.
>>>>>>
>>>>>>> On 19 May 2017, at 14:14, Diana Picus <diana.picus at linaro.org> wrote:
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> We've noticed that recently some of our bots (mostly
>>>>>>> clang-cmake-armv7-a15 and clang-cmake-thumbv7-a15) started timing out
>>>>>>> whenever someone commits a change to TableGen:
>>>>>>>
>>>>>>> r303418:
>>>>>>> http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15/builds/7268
>>>>>>> r303346:
>>>>>>> http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15/builds/7242
>>>>>>> r303341:
>>>>>>> http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15/builds/7239
>>>>>>> r303259:
>>>>>>> http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15/builds/7198
>>>>>>>
>>>>>>> TableGen changes before that (I checked about 3-4 of them) don't have
>>>>>>> this problem:
>>>>>>> r303253:
>>>>>>> http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15/builds/7197
>>>>>>>
>>>>>>> That one in particular actually finishes the whole build in 635s,
>>>>>>> which is only a bit over 50% of the timeout limit (1200s). So, between
>>>>>>> r303253 and now, something happened that made full builds
>>>>>>> significantly slower. Does anyone have any idea what that might have
>>>>>>> been? Also, has anyone noticed this on other bots?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Diana
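
As an illustration of the "group rules by their top-level gMIR instruction" idea quoted above, here is a minimal sketch. It is not the TableGen emitter's code: the `Rule` struct, its fields, and `groupAndSort` are hypothetical names made up for this example. The point is only that bucketing first means each sort runs over one opcode's rules rather than over the whole rule list.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for an imported rule; not the emitter's real type.
struct Rule {
  std::string RootOpcode; // top-level gMIR instruction the rule matches
  int Priority;           // higher-priority rules must be tried first
  std::string Name;
};

// Bucket the rules by root opcode, then sort each bucket by descending
// priority. Each sort only sees the rules that share one root opcode.
std::map<std::string, std::vector<Rule>>
groupAndSort(const std::vector<Rule> &Rules) {
  std::map<std::string, std::vector<Rule>> Buckets;
  for (const Rule &R : Rules)
    Buckets[R.RootOpcode].push_back(R);
  for (auto &Entry : Buckets)
    std::stable_sort(
        Entry.second.begin(), Entry.second.end(),
        [](const Rule &A, const Rule &B) { return A.Priority > B.Priority; });
  return Buckets;
}
```

A generated selector could then switch on the root opcode and only emit (and compile) the rules in the matching bucket, which is presumably what "simpler for the compiler to compile" refers to.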
Daniel Sanders via llvm-dev
2017-May-23 17:50 UTC
[llvm-dev] Buildbots timing out on full builds
Could you give https://reviews.llvm.org/differential/diff/99949/ a try? It brings back the reverted commit and fixes two significant compile-time issues. Assuming it works for you too, I'll finish off the patches and post them individually.

The first one removes the single-use lambdas in the generated code. These turn out to be _really_ expensive. Replacing them with equivalent gotos saves 11 million allocations (~57%) during the course of compiling AArch64InstructionSelector.cpp.o. The cumulative number of bytes allocated also drops by ~4GB (~36%).

The second one is to split up the functions by the number of operands in the top-level instruction. This constrains the scale of the task the register allocator needs to deal with in X86InstructionSelection.cpp.o.

> On 22 May 2017, at 10:42, Diana Picus <diana.picus at linaro.org> wrote:
>
> Nope, no sanitizers.
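
For readers who have not seen the generated code, the following sketch shows the general shape of the first change: a rule wrapped in a single-use lambda versus the same control flow written with gotos. It is a guess at the pattern, not an excerpt from the patch; `Inst`, `selectWithLambdas`, `selectWithGotos`, and the opcode numbers are all hypothetical.

```cpp
#include <cassert>

// Hypothetical miniature instruction; the real generated code works on
// MachineInstr and is far larger.
struct Inst {
  int Opcode;
  int NumOperands;
};

// Style 1: each rule is a lambda that is defined and called exactly once.
// Every lambda is a distinct closure type the compiler has to create.
int selectWithLambdas(const Inst &I) {
  bool Rule1Matched = [&]() -> bool {
    if (I.Opcode != 1)
      return false;
    if (I.NumOperands != 3)
      return false;
    return true;
  }();
  if (Rule1Matched)
    return 100;

  bool Rule2Matched = [&]() -> bool { return I.Opcode == 2; }();
  if (Rule2Matched)
    return 200;

  return -1; // no rule matched
}

// Style 2: the same checks, with a failed match jumping to the next rule.
int selectWithGotos(const Inst &I) {
  if (I.Opcode != 1)
    goto Rule1Failed;
  if (I.NumOperands != 3)
    goto Rule1Failed;
  return 100;

Rule1Failed:
  if (I.Opcode != 2)
    goto Rule2Failed;
  return 200;

Rule2Failed:
  return -1; // no rule matched
}
```

Both functions return the same result for any input; the difference is only in how much work the compiler does to build them, which is where the allocation figures quoted above were measured.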
David Blaikie via llvm-dev
2017-May-24 16:31 UTC
[llvm-dev] Buildbots timing out on full builds
On Tue, May 23, 2017 at 10:51 AM Daniel Sanders via llvm-dev <llvm-dev at lists.llvm.org> wrote:

> Could you give https://reviews.llvm.org/differential/diff/99949/ a try?
> It brings back the reverted commit and fixes two significant compile-time
> issues. Assuming it works for you too, I'll finish off the patches and post
> them individually.
>
> The first one removes the single-use lambdas in the generated code. These
> turn out to be _really_ expensive. Replacing them with equivalent gotos
> saves 11 million allocations (~57%) during the course of compiling
> AArch64InstructionSelector.cpp.o. The cumulative number of bytes allocated
> also drops by ~4GB (~36%).

(this is outside my wheelhouse, so just as an aside): Could you explain further what aspect of the change it was that saved allocations? Lambdas themselves don't allocate memory (std::function of a stateful lambda may allocate memory - but I didn't see any std::function in your change, though I might've missed it), so I'm guessing it's something else/some other aspect of the code in/outside the lambdas and where it moved that changed the allocation pattern?

> The second one is to split up the functions by the number of operands in
> the top-level instruction. This constrains the scale of the task the
> register allocator needs to deal with in X86InstructionSelection.cpp.o.
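
The run-time distinction David draws (a lambda by itself does not allocate, while a std::function holding a stateful lambda may) can be checked with a small sketch like the one below. The counting `operator new` and the two helper functions are hypothetical names for this example only. Note that it measures heap allocations made by the running program, whereas the figures in the quoted message were reported "during the course of compiling" the file.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <functional>
#include <new>

// Count every call to the replaceable global operator new.
static std::size_t AllocationCount = 0;

void *operator new(std::size_t Size) {
  ++AllocationCount;
  if (void *P = std::malloc(Size ? Size : 1))
    return P;
  throw std::bad_alloc();
}
void operator delete(void *P) noexcept { std::free(P); }
void operator delete(void *P, std::size_t) noexcept { std::free(P); }

// A lambda with a 128-byte capture, called directly: the closure object
// lives on the stack, so no heap allocation is involved.
std::size_t allocationsForDirectLambda() {
  std::array<char, 128> Big{};
  std::size_t Before = AllocationCount;
  auto Lambda = [Big] { return static_cast<int>(Big.size()); };
  volatile int Result = Lambda();
  (void)Result;
  return AllocationCount - Before;
}

// The same lambda stored in a std::function: the capture is larger than the
// small-object buffers of the common standard libraries, so this typically
// allocates once. The standard does not require it, and an optimizer is
// allowed to elide the allocation.
std::size_t allocationsForStdFunction() {
  std::array<char, 128> Big{};
  std::size_t Before = AllocationCount;
  std::function<int()> Fn = [Big] { return static_cast<int>(Big.size()); };
  volatile int Result = Fn();
  (void)Result;
  return AllocationCount - Before;
}
```

Calling `allocationsForDirectLambda()` returns 0; `allocationsForStdFunction()` is implementation-dependent, which is why only the first is a safe thing to assert on.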