Alp Toker
2014-Jan-04 18:57 UTC
[LLVMdev] buildbot failure in LLVM on clang-native-arm-cortex-a9
On 04/01/2014 15:19, llvm.buildmaster at lab.llvm.org wrote:
> The Buildbot has detected a new failure on builder clang-native-arm-cortex-a9 while building cfe.
> Full details are available at:
> http://lab.llvm.org:8011/builders/clang-native-arm-cortex-a9/builds/14552
>
> Buildbot URL: http://lab.llvm.org:8011/
>
> Buildslave for this Build: as-bldslv1
>
> Build Reason: scheduler
> Build Source Stamp: [branch trunk] 198489
> Blamelist: alp
>
> BUILD FAILED: failed compile
> The bug is not reproducible, so it is likely a hardware or OS problem.
> make[5]: *** [/home/buildslave/slave_as-bldslv1/clang-native-arm-cortex-a9/llvm/tools/clang/lib/ASTMatchers/Dynamic/Release+Asserts/Registry.o] Error 1
> make[5]: Leaving directory `/home/buildslave/slave_as-bldslv1/clang-native-arm-cortex-a9/llvm/tools/clang/lib/ASTMatchers/Dynamic'
> make[4]: *** [Dynamic/.makeall] Error 2
> make[4]: Leaving directory `/home/buildslave/slave_as-bldslv1/clang-native-arm-cortex-a9/llvm/tools/clang/lib/ASTMatchers'
> make[3]: *** [ASTMatchers/.makeall] Error 2

Would it be possible to skip sending mail on hardware/OS/out-of-disk messages?

I imagine this is just a matter of checking the process exit code from the build system: 0 for success, 1 for build failure that sends notifications, everything else is an admin problem.

If the script in use has no code owner, I'll appreciate a pointer to what's sending the mails and I'll see if someone can look into it and submit a patch.

We should be more proactive and disable noisy build servers until a technical solution is available rather than the other way round, given how they drown out real problems.

Thanks
Alp.

> sincerely,
> -The Buildbot

-- 
http://www.nuanti.com
the browser experts
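The exit-code rule proposed in this message could be sketched roughly as follows. This is only an illustration of the idea as stated (0, 1, everything else); the names `Outcome` and `classify_exit_code` are hypothetical and are not taken from the buildbot or zorg code. Note that the quoted log shows make reporting `Error 1` and `Error 2` for a failure the compiler itself labelled "not reproducible", so exit codes alone may not separate the two cases.

```python
from enum import Enum


class Outcome(Enum):
    """Hypothetical categories for a finished build step."""
    SUCCESS = "success"              # nothing to report
    BUILD_FAILURE = "build_failure"  # notify the blamelist
    ADMIN_PROBLEM = "admin_problem"  # notify the bot admins only


def classify_exit_code(code: int) -> Outcome:
    """Map a build-system exit code to an outcome, per the proposal:
    0 is success, 1 is a build failure, anything else is an admin problem."""
    if code == 0:
        return Outcome.SUCCESS
    if code == 1:
        return Outcome.BUILD_FAILURE
    return Outcome.ADMIN_PROBLEM
```

Under this rule only `Outcome.BUILD_FAILURE` would trigger mail to committers.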
Renato Golin
2014-Jan-04 19:21 UTC
[LLVMdev] buildbot failure in LLVM on clang-native-arm-cortex-a9
On 4 January 2014 18:57, Alp Toker <alp at nuanti.com> wrote:
> Would it be possible to skip sending mail on hardware/OS/out-of-disk
> messages?
>
> I imagine this is just a matter of checking the process exit code from the
> build system: 0 for success, 1 for build failure that sends notifications,
> everything else is an admin problem.

No, exit codes don't tell the whole story. One would have to grep for specific messages like "disk full" or "not reproducible".

> If the script in use has no code owner, I'll appreciate a pointer to what's
> sending the mails and I'll see if someone can look into it and submit a
> patch.

I have no idea where this code is, or who is responsible.

> We should be more proactive and disable noisy build servers until a
> technical solution is available rather than the other way round, given how
> they drown out real problems.

It's not that simple. The ARM boards we have been using are all development boards, built with the quality you'd expect from evaluation hardware. The only production hardware you can find with an ARM chip inside are mobile phones, tablets and the Samsung Chromebook (which we use at Linaro), but they are not fit for being servers by a long shot. The only server-grade ARM hardware, Calxeda, went bankrupt last month. :(

Unfortunately, those bots are our only solution for now, and we'll have to keep them running the best we can. We must fix the problem (grep on errors, and all the other things we discussed last week), not turn off the only buildbots we have.

cheers,
--renato
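A minimal sketch of the "grep on errors" approach described above, assuming the build log is available as a single string. The two patterns are the examples named in this message; the function name `looks_like_infra_failure` is hypothetical and does not come from the buildbot configuration.

```python
import re

# The two example messages mentioned in the thread. A real list would
# need to be collected from actual bot logs.
INFRA_PATTERNS = [
    re.compile(r"disk full", re.IGNORECASE),
    re.compile(r"not reproducible", re.IGNORECASE),
]


def looks_like_infra_failure(log_text: str) -> bool:
    """Return True if the log contains any known infrastructure message."""
    return any(pattern.search(log_text) for pattern in INFRA_PATTERNS)
```

Applied to the log quoted at the top of the thread, the "not reproducible" pattern would match the line reporting a likely hardware or OS problem.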
Alp Toker
2014-Jan-05 00:03 UTC
[LLVMdev] buildbot failure in LLVM on clang-native-arm-cortex-a9
On 04/01/2014 19:21, Renato Golin wrote:
> On 4 January 2014 18:57, Alp Toker <alp at nuanti.com <mailto:alp at nuanti.com>> wrote:
>
>     Would it be possible to skip sending mail on
>     hardware/OS/out-of-disk messages?
>
>     I imagine this is just a matter of checking the process exit code
>     from the build system: 0 for success, 1 for build failure that
>     sends notifications, everything else is an admin problem.
>
> No, exit codes don't tell the whole story. One would have to grep for
> specific messages like "disk full" or "not reproducible".
>
>     If the script in use has no code owner, I'll appreciate a pointer
>     to what's sending the mails and I'll see if someone can look into
>     it and submit a patch.
>
> I have no idea where this code is, or who is responsible.

So, I did some digging: zorg/buildbot/commands/StandardizedTest.py has logic that converts logs and status reports into actionable test results.

>     We should be more proactive and disable noisy build servers until
>     a technical solution is available rather than the other way round,
>     given how they drown out real problems.
>
> It's not that simple. The ARM boards we have been using are all
> development boards, built with the quality you'd expect from
> evaluation hardware. The only production hardware you can find with an
> ARM chip inside are mobile phones, tablets and the Samsung Chromebook
> (which we use at Linaro), but they are not fit for being servers by a
> long shot. The only server-grade ARM hardware, Calxeda, went bankrupt
> last month. :(
>
> Unfortunately, those bots are our only solution for now, and we'll
> have to keep them running the best we can. We must fix the problem
> (grep on errors, and all the other things we discussed last week), not
> turn off the only buildbots we have.

I didn't realise these bots were the last line of defence for ARM support! In that case let's keep them in commission and focus on the grep fix you suggest.

Agree that stderr is a more practical informant than exit codes. The most spammy patterns are predictable and relate to SVN outage, network failures, out-of-disk-space and non-deterministic results presumably related to the hardware flakiness you described. Those should only be sent to the device admins and maybe the module owner, never to individual committers, for whom they are not actionable.

Think we have a handle on this now but a "pong, XXX owns this module" would be appreciated from anyone in the know.

Alp.

> cheers,
> --renato

-- 
http://www.nuanti.com
the browser experts
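The routing rule suggested in this message might look something like the sketch below. Everything here is an assumption made for illustration: the pattern table, the category names and the functions `infra_category` and `choose_recipients` are hypothetical, and none of it reflects what StandardizedTest.py or the mail notifier actually does. The regular expressions are guesses at typical stderr text for the four categories mentioned and would need checking against real logs.

```python
import re

# Hypothetical pattern table for the four noisy categories named above.
INFRA_PATTERNS = {
    "svn_outage": re.compile(r"^svn: E\d+", re.MULTILINE),
    "network": re.compile(r"connection (timed out|refused)", re.IGNORECASE),
    "disk": re.compile(r"no space left on device|disk full", re.IGNORECASE),
    "flaky": re.compile(r"not reproducible", re.IGNORECASE),
}


def infra_category(log_text):
    """Return the first matching infrastructure category, or None."""
    for name, pattern in INFRA_PATTERNS.items():
        if pattern.search(log_text):
            return name
    return None


def choose_recipients(log_text, blamelist, admins, module_owner=None):
    """Pick who gets the failure mail.

    Ordinary failures go to the blamelist. Infrastructure failures go to
    the device admins and, if given, the module owner, never to committers.
    """
    if infra_category(log_text) is None:
        return list(blamelist)
    recipients = list(admins)
    if module_owner and module_owner not in recipients:
        recipients.append(module_owner)
    return recipients
```

With the log from the original failure mail, the blamelist would be skipped and only the admin addresses returned.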