Folks,

I know it's reasonably valuable to have the buildbot IRC bot publishing results, but the channel is flooded with its messages, and the more bots we put up, the worse it will get.

I think we still need the NOC warnings, but not over IRC. The Buildbot NOC page is horrible and useless, since it doesn't know the difference between "it's red and I know it" and "it's broken".

For that reason, I have built my own NOC page:

http://people.linaro.org/~renato.golin/llvm/arm-bots/

But that machine is too slow to cope with all bots. We may need a project to build such a system on a larger scale.

However, for now, I think not printing the green results in IRC would go a long way toward cleaning the channel up.

Any thoughts?

cheers,
--renato
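To make the proposal concrete, here is a minimal sketch of the kind of filter it implies: stay quiet on green, and stay quiet on red that is already known. This is not Buildbot's API. The function name, the result strings, and the `acknowledged` flag are all hypothetical, and "freshly red" is assumed to mean that the previous build was not red.

```python
from typing import Optional

GREEN = "green"
RED = "red"


def should_announce(current: str,
                    previous: Optional[str],
                    acknowledged: bool = False) -> bool:
    """Decide whether a finished build deserves a line on IRC.

    current      -- result of the build that just finished
    previous     -- result of the build before it (None if there was none)
    acknowledged -- hypothetical flag: the owner has marked the builder
                    as "it's red and I know it"
    """
    if current == GREEN:
        # Green results are never printed.
        return False
    if acknowledged:
        # Known breakage: nothing new to tell the channel.
        return False
    # Only a fresh failure is reported; a builder that was already red
    # stays quiet.
    return previous != RED
```

With this rule a builder that stays red for twenty builds in a row produces one message instead of twenty, and a healthy builder produces none.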
On Tue, May 19, 2015 at 7:15 AM, Renato Golin <renato.golin at linaro.org> wrote:
> I think we still need the NOC warnings,

NOC?

> but not over IRC. The Buildbot
> NOC page is horrible and useless, since it doesn't know the difference
> between "it's red and I know it" from "it's broken".

What distinction are you drawing there? The difference between freshly red and previously red?

> For that reason, I have built my own NOC page:
>
> http://people.linaro.org/~renato.golin/llvm/arm-bots/

What does this do differently from the main buildbot page (other than only showing ARM bots)? Is it something we could do to the buildbot page (remove always-red builders, recategorize flaky/problematic builders so at least they're off in the "experimental" section, etc.)?
Yes, I also find the amount of bot spam in #llvm basically intolerable. It makes it difficult to see actual people talking. At first, I just put all the bots on /ignore. Now I have an xchat script to move the botspam to another tab (tabify-004.pl). I'd recommend just moving the bots to #llvm-bots, which would fix the problem for everyone. Those who are committing changes can join that channel, too, and others don't care.

While we're on this subject, I also find the official buildbot page (lab.llvm.org:8011) almost unusable, since so many columns are either always red, or else are so flaky that they basically alternate at random between passing and failing. So, at a glance, it's impossible to tell whether the current state of the tree is good. (I certainly haven't memorized which ones are "supposed" to be red, and which are not. Maybe others have.) Having flaky and always-failing builds show up on the buildbot pages, and notifying IRC, really has negative utility: not only does it provide no useful information, it also obscures the actual important failures and causes people to spend time investigating non-problems.

Someone gave me the hint to use the http://bb.pgr.jp/ buildbot page instead, which was a great recommendation -- that page shows problems much more clearly. But it's unfortunate that there *needs* to be a separate "sane builders only" buildmaster.

E.g. (and not to pick on this particular bot, this is just one example of many): http://lab.llvm.org:8011/builders/clang-native-arm-cortex-a9/builds/27655 -- passed, while the previous build failed. But the change wasn't caused by the commit; it's just arbitrary.

Or, yesterday, on #llvm: "Anyone want to give me a clue as to why this bot failed? http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/18017" -- answer: because it's randomly broken. That wasted the questioner's time trying to investigate the failure.
If all the flaky or always-broken builder configurations got hidden from the main pages of buildbot, and stopped sending emails/IRC notifications to anyone but their "owner", that would be a substantial improvement.
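One rough way to decide which builders count as "flaky" or "always broken" is to look only at their recent pass/fail history. The sketch below is a crude heuristic under stated assumptions, not anything the buildmaster does today: the function name, the labels, and the flip threshold are invented for illustration, and it cannot tell a real regression from a random failure.

```python
from typing import Sequence


def classify_builder(history: Sequence[bool], flaky_flips: int = 3) -> str:
    """Label a builder from its recent results (True = pass, oldest first).

    flaky_flips is a hypothetical threshold: the number of pass/fail
    changes in the window at which the builder is treated as flaky.
    """
    if not history:
        return "unknown"
    if not any(history):
        # Never passed in the window: one of the "always red" columns.
        return "always-red"
    # Count how often the result changed from one build to the next.
    flips = sum(1 for a, b in zip(history, history[1:]) if a != b)
    if flips >= flaky_flips:
        return "flaky"
    return "stable"


def show_on_main_page(history: Sequence[bool]) -> bool:
    """Only builders labelled stable stay on the main page in this sketch."""
    return classify_builder(history) == "stable"
```

Builders labelled "flaky" or "always-red" would then be hidden from the main page and would notify only their owner.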
+1 for hiding flaky bots. I routinely see some bot randomly failing after an unrelated commit. sanitizer-x86_64-linux may be the worst one. This wastes time and hides real problems.

2015-05-19 20:40 GMT+03:00 James Y Knight <jyknight at google.com>:
> If all the flaky or always-broken builder configurations got hidden from
> the main pages of buildbot, and stopped sending emails/IRC notifications to
> anyone but their "owner", that would be a substantial improvement.
On 19 May 2015 at 18:39, David Blaikie <dblaikie at gmail.com> wrote:
> NOC?

Sorry, NOC is "network operations centre": the room with big screens showing the status of a data centre, where operators sit and fix problems, always watching the screens on the wall in case they go red.

> What distinction are you drawing there? The difference between freshly red
> and previously red?

Basically, yes.

> What does this do differently from the main buildbot page? (other than only
> show arm bots?) Is it something we could do to the buildbot page (remove
> always-red builders, recategorize flaky/problematic builders so at least
> they're off in the "experimental" section, etc)?

Separating ARM from the rest is the most important thing to me, but classifying the bots and only showing the information I want is also important. James Knight has summarised well the problems I have with the current buildbot page.

cheers,
--renato
When we built green dragon we tried to be really accountable for this sort of cruft, with a goal of 99% useful notifications, or nothing.

On green dragon we curate both which builds notify IRC and which builds show up on the main page. Anything that fails for reasons unrelated to the commit is not allowed to do either. We use the phased build approach to make sure we notify only once per failure. Builds that are red for more than a week are disabled: if we can’t fix it in a week, it’s not worth building anymore. Because of that, libcxx builds, LLDB and performance builds do not notify, and some are disabled.

When we email the blamelist, I am CCed on every email, and those emails are not filtered from my inbox. If the blamelist is long, the bot emails only me, and I track down who broke it.

Of course green dragon only runs a small proportion of the total builds.

If you can’t look at the build page and know that everything that is red is a real problem, we have a real problem. Even within builds, if most of the steps are marked as failures, you don’t know what went wrong.

> On May 19, 2015, at 10:40 AM, James Y Knight <jyknight at google.com> wrote:
>
> While we're on this subject, I also find the official buildbot page (lab.llvm.org:8011) almost unusable, since so many columns are either always red, or else are so flaky that they basically randomly alternate between passing and failing. So, at a glance, it's impossible to tell whether the current state of the tree is good.
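The green dragon rules described above (notify once per failure, mail only the owner when the blamelist is long, disable after a week of red) can be sketched roughly as follows. This is an illustration under assumptions, not the actual green dragon configuration: the names and the `MAX_BLAMELIST` cut-off are hypothetical, and "once per failure" is approximated here by reacting only to a green-to-red transition rather than by phased builds.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional

MAX_RED = timedelta(days=7)   # from the message: red for over a week => disabled
MAX_BLAMELIST = 5             # hypothetical cut-off for a "long" blamelist


@dataclass
class BuildResult:
    passed: bool
    blamelist: List[str]


def recipients(result: BuildResult, previous_passed: bool, owner: str) -> List[str]:
    """Return who should be mailed about this build, if anyone."""
    if result.passed or not previous_passed:
        # Mail only on the green -> red transition, so each failure
        # is reported once.
        return []
    if len(result.blamelist) > MAX_BLAMELIST:
        # Too many suspects: the owner triages instead of mailing everyone.
        return [owner]
    # Otherwise mail the blamelist and always CC the owner.
    return sorted(set(result.blamelist) | {owner})


def should_disable(red_since: Optional[datetime], now: datetime) -> bool:
    """A builder that has been red for longer than MAX_RED gets disabled."""
    return red_since is not None and now - red_since > MAX_RED
```

Keeping the owner on every mail means someone is always accountable for a red build, even when the blamelist is too long to be useful.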
On Tue, May 19, 2015 at 10:40 AM, James Y Knight <jyknight at google.com> wrote:
> Or, yesterday, on #llvm: "Anyone want to give me a clue as to why this bot
> failed?
> http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/18017" --
> answer: because it's randomly broken. Wasted the questioner's time trying
> to investigate the failure.

Whenever you get crappy fail-mail, please forward it to llvm-dev, cc'ing the bot owner, and request that the issue be addressed or the bot be removed. Yeah, I know it's not an ideal process, but it's something to keep issues visible and pushed on.

But, yes, having some more formal process to deal with this sort of thing would be nice. I can imagine some process along the lines of "bots start in experimental and need a track record of low flake/false-positive results for some period of time before being promoted out of experimental so they can send mail to blame lists and IRC, etc.", coupled with some mechanism for demoting a buildbot back into experimental if it starts behaving poorly.

- David
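The promote/demote process imagined above could look something like the sketch below. Everything in it is hypothetical: the thresholds, the field names, the use of a build count in place of "some period of time", and the assumption that someone (or something) can tell after the fact whether a failure was a false positive.

```python
from dataclasses import dataclass

PROMOTE_AFTER = 50   # hypothetical: clean builds in a row needed for promotion
DEMOTE_AFTER = 3     # hypothetical: false positives tolerated once promoted


@dataclass
class Builder:
    name: str
    experimental: bool = True   # new bots start in experimental
    clean_streak: int = 0
    false_positives: int = 0


def record_build(builder: Builder, false_positive: bool,
                 promote_after: int = PROMOTE_AFTER,
                 demote_after: int = DEMOTE_AFTER) -> None:
    """Update a builder's track record after one build."""
    if false_positive:
        builder.clean_streak = 0
        builder.false_positives += 1
        if not builder.experimental and builder.false_positives >= demote_after:
            # Behaving poorly: back to experimental, no more blame mail.
            builder.experimental = True
            builder.false_positives = 0
    else:
        builder.clean_streak += 1
        if builder.experimental and builder.clean_streak >= promote_after:
            # Earned its track record: may notify blame lists and IRC.
            builder.experimental = False
            builder.false_positives = 0


def may_notify(builder: Builder) -> bool:
    """Only promoted builders send mail to blame lists and IRC."""
    return not builder.experimental
```

A demoted bot has to rebuild its clean streak from zero before it can notify anyone again, which gives its owner an incentive to fix the flakiness rather than wait it out.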
On Tue, May 19, 2015 at 10:15 AM, Renato Golin <renato.golin at linaro.org> wrote:
> Folks,
>
> I know it's a reasonably valuable thing to have the buildbot IRC bot
> publishing results, but the channel is kind of flooded with the
> messages, and the more bots we put up, the worse it will be.

I agree. It's very hard to keep track of real conversations sometimes. I would prefer to have a known central web page where bots publish their status instead of spamming a conversation channel. Spamming the channel serves no useful purpose.

Diego.
On 05/19/2015 10:40 AM, James Y Knight wrote:
> If all the flaky or always-broken builder configurations got hidden
> from the main pages of buildbot, and stopped sending emails/IRC
> notifications to anyone but their "owner", that would be a substantial
> improvement.

+1, this is a really good summary of issues.

I'm in full support of any and all efforts to reduce noise here. I've gotten to the point where I only watch a small handful of bots. Anything other than that, I pretty much ignore unless someone emails me directly or replies to a commit. I'm not quite to the point of marking buildbot emails as spam, but I'm definitely not paying them much attention either.

One particular irritant is getting emails 12-24 hours later about someone else's breakage that has *already been fixed*. The long-cycling bots are really irritating in that respect.