Daniel Sanders via llvm-dev
2015-Aug-27 12:58 UTC
[llvm-dev] buildbot failure in LLVM on clang-native-arm-cortex-a9
Hi,

I agree with the principle but 2 days feels a bit short to me since, accounting for time zone differences, it's closer to 1 working day. For example, an email sent at 9am PDT arrives at 5pm BST and (assuming normal working hours) might be read at 9am BST (1am PDT). Daylight savings can also make a difference since timezones that use it don't agree on when it's in effect. The owner taking a single day off is easily sufficient to go past the 2 day limit.

However, the main comment I wanted to make is that it would be useful to be able to tell whether the buildmaster has picked up changes or not. I understand that many changes are automatically applied without a buildmaster restart but at the moment it can be difficult to tell when this happens.

> -----Original Message-----
> From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf Of Renato Golin via llvm-dev
> Sent: 26 August 2015 18:07
> To: Philip Reames
> Cc: LLVM Dev; llvm.buildmaster at lab.llvm.org; Tobias Grosser
> Subject: Re: [llvm-dev] buildbot failure in LLVM on clang-native-arm-cortex-a9
>
> On 26 August 2015 at 18:03, Philip Reames <listmail at philipreames.com> wrote:
> > 2 days seems fine to me. I don't care what the specific threshold is as long as there is one. :)
>
> Agreed.
>
> cheers,
> --renato
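The time zone arithmetic in the example above can be checked with a short sketch. It assumes fixed offsets (PDT = UTC-7, BST = UTC+1), a 9am to 5pm working day, and no weekends or holidays; the helper name `first_read` and the 48-hour window are illustrative choices, not anything the buildbot infrastructure defines.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets for the two zones in the example (assumption: both are on
# summer time, as they were in late August 2015).
PDT = timezone(timedelta(hours=-7), "PDT")
BST = timezone(timedelta(hours=1), "BST")


def first_read(sent, owner_tz, start_hour=9, end_hour=17):
    """Return the earliest moment inside the owner's working hours at or
    after `sent`. Hypothetical helper; ignores weekends and holidays."""
    local = sent.astimezone(owner_tz)
    if local.hour < start_hour:
        # Arrived before the working day starts: read at start of day.
        return local.replace(hour=start_hour, minute=0, second=0, microsecond=0)
    if local.hour >= end_hour:
        # Arrived at or after the end of the working day: read next morning.
        next_day = local + timedelta(days=1)
        return next_day.replace(hour=start_hour, minute=0, second=0, microsecond=0)
    return local


sent = datetime(2015, 8, 26, 9, 0, tzinfo=PDT)   # 9am PDT
arrived = sent.astimezone(BST)                   # 5pm BST, same day
read_at = first_read(sent, BST)                  # 9am BST, next day
elapsed_before_read = read_at - sent             # part of the window already gone
remaining = timedelta(days=2) - elapsed_before_read

print("arrived:", arrived.isoformat())
print("read at:", read_at.isoformat(), "=", read_at.astimezone(PDT).isoformat())
print("hours elapsed before first read:", elapsed_before_read / timedelta(hours=1))
print("hours left in a 2-day window:", remaining / timedelta(hours=1))
```

Under these assumptions a third of the 2-day window has passed before the owner first sees the message, and any day off comes out of what is left.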
Renato Golin via llvm-dev
2015-Aug-27 14:10 UTC
[llvm-dev] buildbot failure in LLVM on clang-native-arm-cortex-a9
On 27 August 2015 at 13:58, Daniel Sanders <Daniel.Sanders at imgtec.com> wrote:
> I agree with the principle but 2 days feels a bit short to me since, accounting for time zone differences, it's closer to 1 working day. For example, an email sent at 9am PDT arrives at 5pm BST and (assuming normal working hours) might be read at 9am BST (1am PDT). Daylight savings can also make a difference since timezones that use it don't agree on when it's in effect. The owner taking a single day off is easily sufficient to go past the 2 day limit.

Yeah, that's my feeling, too. But as Philip said, the specifics should be discussed on a proper RFC thread; at least we all agree on some threshold being defined.

> However, the main comment I wanted to make is that it would be useful to be able to tell whether the buildmaster has picked up changes or not. I understand that many changes are automatically applied without a buildmaster restart but at the moment it can be difficult to tell when this happens.

If the bot owner changes the master and restarts the slave, the old master will show the slave as offline and the new one as online. As long as it stops sending emails, that's most of the problem dealt with. But that leaves a trail of unfinished builds and bloats the master by collecting commits for a build that will never happen. We may have to change the logic to stop collecting commits for offline bots.

cheers,
--renato
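The idea of not collecting commits for offline bots could look roughly like the sketch below. This is plain Python and does not use the Buildbot API; the `Builder` class, the `dispatch_commit` function, the second builder name and the revision strings are all hypothetical and only illustrate the proposed change in logic.

```python
from dataclasses import dataclass, field


@dataclass
class Builder:
    """Hypothetical stand-in for a builder as the master sees it."""
    name: str
    online: bool = True
    pending: list = field(default_factory=list)


def dispatch_commit(builders, revision):
    """Queue `revision` only on builders that are online.

    Returns the names of the builders that were skipped, so the master
    does not accumulate commits for a build that will never happen.
    """
    skipped = []
    for builder in builders:
        if builder.online:
            builder.pending.append(revision)
        else:
            skipped.append(builder.name)
    return skipped


builders = [
    Builder("clang-native-arm-cortex-a9", online=False),  # moved to another master
    Builder("some-online-builder"),                       # hypothetical name
]

skipped_names = []
for rev in ["rev-1", "rev-2", "rev-3"]:                   # hypothetical revisions
    skipped_names = dispatch_commit(builders, rev)

print("skipped:", skipped_names)
print({b.name: b.pending for b in builders})
```

Whether skipped commits should be dropped or replayed when the bot comes back is a policy decision this sketch leaves open.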
Galina Kistanova via llvm-dev
2015-Aug-27 22:25 UTC
[llvm-dev] buildbot failure in LLVM on clang-native-arm-cortex-a9
Hello everyone,

Thanks for the discussion. There are 2 threads here:

1. Why the builders on Panda boards are failing in a way that makes the bots look flaky.

First of all, the failures are consistent and do not relate to tests. They are compilations of certain files that stall for longer than 20 minutes. The problem looks valid to me. I'm researching the cause. So far it looks like it takes more than 1GB of memory to compile some units. In particular, I see this with ASTContext.cpp and ASTMatchersInternal.cpp. There may be more such files; I'll look into it.

Anyway, this issue has nothing to do with how exactly the build gets orchestrated, i.e. cmake + ninja would demonstrate the same stall as the currently used autoconf + make. I'm still researching and will report the exact findings as soon as I have finalized them.

A big part of the false sense of flakiness comes from the fact that we do incremental builds. Some problems remain hidden for a long time and often get exposed by some random event or commit, triggering "I'm so annoyed" discussions. I will re-evaluate the need for incremental builds and will try clean builds to see how many commits would get batched together. If that is still reasonable, I'll switch the Cortex-A9 bots to clean builds. For now, I am taking these bots down.

2. What to do with bots which are "noisy".

First of all, I'm still in the middle of reading the thread. :) In general, I'm with Renato on this. It should not be easy to shut down an annoying bot just because it is not obvious at the moment why it is unhappy. Bugging the owner is fine. I spend quite some time watching the quality of the bots and communicating with the owners. If you are an owner, you know this.
Thanks

Galina
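One way to reproduce the kind of stall described above outside the buildbot is to run a single compile command under a time limit and record the peak memory of child processes. The sketch below uses only the Python standard library on a Unix-like system; the function name `run_with_limits` is hypothetical, the 20-minute default simply mirrors the figure mentioned in the message, and the command shown is a trivial placeholder, not the actual compile line used on the bots.

```python
import resource
import subprocess
import sys
import time


def run_with_limits(cmd, timeout_s=20 * 60):
    """Run `cmd`, killing it after `timeout_s` seconds, and report wall time
    and peak resident set size of child processes.

    Note: RUSAGE_CHILDREN reports the high-water mark over all children
    this process has waited for, not just the last one. On Linux
    ru_maxrss is in kilobytes; other systems may use different units.
    """
    start = time.monotonic()
    try:
        proc = subprocess.run(cmd, timeout=timeout_s)
        timed_out = False
        returncode = proc.returncode
    except subprocess.TimeoutExpired:
        # subprocess.run has already killed and reaped the child here.
        timed_out = True
        returncode = None
    elapsed = time.monotonic() - start
    peak_rss = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    return {
        "returncode": returncode,
        "timed_out": timed_out,
        "elapsed_s": elapsed,
        "peak_rss": peak_rss,
    }


# Placeholder command; substitute the real compile command for the file
# under investigation.
result = run_with_limits([sys.executable, "-c", "pass"], timeout_s=60)
print(result)
```

Running each suspect file through such a wrapper on the board itself would show whether the stall correlates with memory use approaching the available RAM.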