On Jul 15, 2009, at 4:48 PM PDT, Daniel Dunbar wrote:

> That depends on what you call a false positive. The public buildbot
> regularly fails because of failing Frontend tests, and I have had
> continuous failures of some DejaGNU tests for a long time on some
> builders. It's not a false positive per se, but one starts to ignore
> the failures because they aren't unexpected.

Yes. Probably the only way this will work better is if we get the
testsuite to 0 failures, everywhere, conditionalizing as necessary to
get rid of expected failures. Then regressions will be more visible.
I doubt that will happen unless we freeze the tree for a while and get
everybody to fix bugs, or disable tests, instead of doing new stuff
(at least, that was the case for gcc).

> - Daniel
>
> On Wed, Jul 15, 2009 at 4:10 PM, Bill Wendling <isanbard at gmail.com> wrote:
>> On Wed, Jul 15, 2009 at 3:43 PM, Eli Friedman <eli.friedman at gmail.com> wrote:
>>> On Wed, Jul 15, 2009 at 3:01 PM, Bill Wendling <isanbard at gmail.com> wrote:
>>>> The core problem, in my opinion, is that people *don't* pay attention
>>>> to the build bot failure messages that come along.
>>>
>>> That's largely because of the number of false positives.
>>>
>> There have been fewer and fewer of these in recent times.
>>
>> -bw

_______________________________________________
LLVM Developers mailing list
LLVMdev at cs.uiuc.edu    http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
2009/7/15 Dale Johannesen <dalej at apple.com>
>
> Yes. Probably the only way this will work better is if we get the
> testsuite to 0 failures, everywhere, conditionalizing as necessary to
> get rid of expected failures. Then regressions will be more visible.
> I doubt that will happen unless we freeze the tree for a while and get
> everybody to fix bugs, or disable tests, instead of doing new stuff
> (at least, that was the case for gcc).

This is exactly what we're supposed to do for releases, and in theory,
all of the time.

We've been having a lot of churn lately. This is a good thing overall,
since it means there are lots of contributions going into the project.
What's different this time is that we have a lot of large-scale,
sweeping changes that touch a lot of code. In the past we've generally
serialized this sort of thing between contributors, or broken changes
up to be extremely incremental. The reason this is happening less now
is that we, as developers, are growing more ambitious with our fixes to
LLVM's systematic problems, and doing so on a tighter schedule. Once
again, this is a good thing.

There are two issues with buildbots. First, we need more buildbots on
more platforms. For example, there are no Darwin buildbots, so if I
commit a change that breaks Darwin I won't get immediate notice about
it, nor a log of the failure. We could even consider making a buildbot
a prerequisite to being a release-blocking platform. The other issue is
that we need some level of quality control on buildbots.
We can accomplish this either by publishing a few buildbot guidelines
(e.g., don't install llvm-gcc on your buildbot machine, because it will
cause false positives as llvm and llvm-gcc get out of step) or by
enhancing the buildbot system to let us mark problems as expected. We
already have part of that by XFAILing tests.

Even so, better buildbots will improve visibility into how the tree is
progressing on a commit-by-commit basis, but they do nothing to prevent
breakage in the first place. I suspect most of our grief will go away
as some of the current major changes finish. If not, we'll have to come
up with a better way to handle so many large changes, maybe something
like a "schedule of merges" so that committers don't step all over each
other. I think GCC does something like this already?

We've deferred imposing structure like that until we discover that we
need it, and I'm not convinced we're quite there yet, but perhaps it's
time to start thinking about it.

Nick
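For reference, an XFAIL marker lives directly in the regression test
file, so the DejaGNU harness reports the test as an expected failure
rather than a FAIL. A sketch of what such a test looks like (the RUN
command and function body here are purely illustrative, not taken from
any real test):

```llvm
; Known to fail everywhere ("*"); the harness reports XFAIL instead of
; FAIL, so buildbots stay green until someone fixes the underlying bug.
; A target triple (e.g. "XFAIL: darwin") can be used to scope the
; expected failure to specific platforms.
; RUN: llvm-as < %s | opt -instcombine | llvm-dis | grep {ret i32 0}
; XFAIL: *
define i32 @test() {
  ret i32 1
}
```

This is a non-executable test-file fragment; the point is just that
expected failures are annotated per-test, which is what lets the
buildbot distinguish "known broken" from "newly broken".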
On Jul 15, 2009, at 7:52 PM, Nick Lewycky wrote:

> There are two issues with buildbots. First, we need more buildbots
> on more platforms. For example, there are no Darwin buildbots, so if
> I commit a change that breaks Darwin I won't get immediate notice
> about it, nor a log of the failure.

This isn't 100% true. :-) We have a series of build bots at Apple
building in various ways.
Failures are sent to the mailing list, but they are not very meaningful
to non-Apple employees, because they don't have access to the machines
and log files. We monitor them very closely, so we will pester people
about any breakages. :-) Normally, a breakage on our build bots will
also break on the Google ones. It's not always the case, but it happens
most of the time.

Things get really out of hand (and I tend to lose my temper and write
hotly worded emails) when things obviously break, the build bots send
out emails about the breakages, but people ignore them, and the build
stays broken for half a day or more. This morning, I got to the office
and couldn't build TOT; it was that bad.

> We could even consider making a buildbot a prerequisite to being a
> release-blocking platform. The other issue is that we need some level
> of quality control on buildbots. We can accomplish this either by
> publishing a few buildbot guidelines (e.g., don't install llvm-gcc on
> your buildbot machine, because it will cause false positives as llvm
> and llvm-gcc get out of step) or by enhancing the buildbot system to
> let us mark problems as expected. We already have part of that by
> XFAILing tests.

I think that a policy guideline for build bots would be a very Good
Thing(tm). I'm a novice at creating the buildbot configuration file,
but Daniel and I can probably summarize how the build bots are run at
Apple, which would be a good first step towards this.

> Even so, better buildbots will improve visibility into how the tree
> is progressing on a commit-by-commit basis, but they do nothing to
> prevent breakage in the first place. I suspect most of our grief
> will go away as some of the current major changes finish. If not,
> we'll have to come up with a better way to handle so many large
> changes, maybe something like a "schedule of merges" so that
> committers don't step all over each other. I think GCC does
> something like this already?
> We've deferred imposing structure like that until we discover that
> we need it, and I'm not convinced we're quite there yet, but perhaps
> it's time to start thinking about it.

I don't think we need to impose a constrictive structure on people. We
just need to foster good programming practices. The GCC people require
that patches be run through the GCC testsuite with no regressions. That
testsuite is *huge* and doesn't run cleanly for us. But our modest
regression testsuite is a good first step. For major changes, running
some subset of the llvm-test directory is appropriate. There are other
things too, of course...

-bw
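As a concrete starting point for the buildbot write-up discussed above,
a single-builder master.cfg fragment might look roughly like the sketch
below. This is a configuration fragment, not a runnable script; it
assumes the classic 0.7-era buildbot API, and the SVN URL, slave name,
builder name, and flags are illustrative guesses, not a description of
any actual Apple or Google setup:

```python
# Sketch of a master.cfg fragment for one LLVM builder (buildbot 0.7-era
# API assumed; 'c' is the BuildmasterConfig dict that master.cfg defines).
from buildbot.process import factory
from buildbot.steps.source import SVN
from buildbot.steps.shell import Configure, Compile, Test

f = factory.BuildFactory()
# Check out (or incrementally update) LLVM trunk.
f.addStep(SVN(svnurl='http://llvm.org/svn/llvm-project/llvm/trunk',
              mode='update'))
f.addStep(Configure(command=['./configure', '--enable-optimized']))
f.addStep(Compile(command=['make', '-j4']))
# 'make check' runs the DejaGNU regression suite. Per the guideline
# discussed earlier in the thread, this slave deliberately has no
# llvm-gcc installed, so llvm/llvm-gcc version skew can't cause
# false-positive failures.
f.addStep(Test(command=['make', 'check']))

c['builders'].append({'name': 'llvm-x86_64-linux',
                      'slavename': 'example-slave',
                      'builddir': 'llvm-build',
                      'factory': f})
```

The useful property for the guidelines discussion is that each step is
explicit, so a policy like "no llvm-gcc on regression-only slaves" is
auditable just by reading the factory definition.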
On Wed, Jul 15, 2009 at 7:52 PM, Nick Lewycky <nlewycky at google.com> wrote:

> We've been having a lot of churn lately. This is a good thing overall,
> since it means there are lots of contributions going into the project.
> What's different this time is that we have a lot of large-scale,
> sweeping changes that touch a lot of code. In the past we've generally
> serialized this sort of thing between contributors, or broken changes
> up to be extremely incremental. The reason this is happening less now
> is that we, as developers, are growing more ambitious with our fixes
> to LLVM's systematic problems, and doing so on a tighter schedule.
> Once again, this is a good thing.

+1

> There are two issues with buildbots. First, we need more buildbots on
> more platforms. For example, there are no Darwin buildbots, so if I
> commit a change that breaks Darwin I won't get immediate notice about
> it, nor a log of the failure.

I plan to solve this this week by serving a Darwin buildbot off my home
machine.
I also hope to add an MSVC CMake-based (and very slow) buildbot
relatively soon. That will bring me up to serving a total of 4
buildslaves out of my house, so if anyone else wants to contribute,
please step up. However, as Bill notes, we have lots of internal bots,
and it's fair that Apple people have to maintain them (even if the
breakage is due to an external commit).

> We could even consider making a buildbot a prerequisite to being a
> release-blocking platform. The other issue is that we need some level
> of quality control on buildbots.

I'm not really sure what this means. The llvm-gcc problem I regard as a
bug in the LLVM test suite.

> We can accomplish this either by publishing a few buildbot guidelines
> (e.g., don't install llvm-gcc on your buildbot machine, because it
> will cause false positives as llvm and llvm-gcc get out of step) or
> by enhancing the buildbot system to let us mark problems as expected.
> We already have part of that by XFAILing tests.

What actual enhancements would we need?

> Even so, better buildbots will improve visibility into how the tree
> is progressing on a commit-by-commit basis, but they do nothing to
> prevent breakage in the first place. I suspect most of our grief will
> go away as some of the current major changes finish.

I agree.

- Daniel