On Jul 15, 2009, at 7:52 PM, Nick Lewycky wrote:
> 2009/7/15 Dale Johannesen <dalej at apple.com>
>
> On Jul 15, 2009, at 4:48 PM PDT, Daniel Dunbar wrote:
>
> > That depends on what you call a false positive. The public buildbot
> > regularly fails because of failing Frontend tests, and I have had
> > continuous failures of some DejaGNU tests for a long time on some
> > builders. It's not a false positive per se, but one starts to ignore
> > the failures because they aren't unexpected.
>
> Yes.  Probably the only way this will work better is if we get the
> testsuite to 0 failures, everywhere, conditionalizing as necessary to
> get rid of expected failures.  Then regressions will be more visible.
> I doubt that will happen unless we freeze the tree for a while and get
> everybody to fix bugs, or disable tests, instead of doing new stuff
> (at least, that was the case for gcc).
>
> This is exactly what we're supposed to do for releases, and in  
> theory, all of the time.
>
> We've been having a lot of churn lately. This is a good thing  
> overall, since it means there's lots of contributions going into the  
> project. What's different about this is that we have a lot of large- 
> scale, sweeping changes that touch a lot of code. In the past we've  
> generally serialized this sort of thing between contributors, or  
> broken changes up to be extremely incremental. The reason this is  
> happening less now is that we, as developers, are growing more  
> ambitious with our fixes to LLVM systematic problems, and doing so  
> on a tighter schedule. Once again, this is a good thing.
>
> There are two issues with buildbots. Firstly, we need more buildbots  
> on more platforms. For example, there are no Darwin buildbots, so if  
> I commit a change that breaks Darwin I won't get immediate notice  
> about it, nor a log of the failure.

This isn't 100% true. :-) We have a series of build bots at Apple  
building in various ways. Failures are sent to the mailing list, but  
they are not very meaningful to non-Apple employees because they don't  
have access to the machines and log files. We monitor them very  
closely, so we will pester people about any breakages. :-) Normally, a  
breakage on our build bots will also break on the Google ones. It's  
not always the case, but it happens most of the time.

Things get really out of hand (and I tend to lose my temper and write  
hotly worded emails) when things obviously break, and the build bots  
send out emails about these breakages, but people ignore them, and the  
build is broken for half a day or more. This morning, I got to the  
office and couldn't build TOT; it was that bad.

> We could even consider having a buildbot a prerequisite to being a  
> release-blocking platform. The other is that we need some level of  
> quality control on buildbots. We can accomplish this both by  
> publishing a few buildbot guidelines (e.g., don't install llvm-gcc on  
> your buildbot machine because it will cause false-positives as llvm  
> and llvm-gcc get out of step) and by enhancing the buildbot system  
> to let us mark problems as expected. We already have part of that by  
> XFAILing tests.
>
I think that a policy guideline for build bots would be a very Good  
Thing(tm). I'm a novice at creating the build bot configuration file, but  
Daniel and I can probably summarize how the build bots are run at  
Apple, which would be a good first step towards this.

> Even so, better buildbots will improve visibility into how the tree  
> is progressing on a commit-by-commit basis, but it does nothing to  
> prevent breakage in the first place. I suspect most of our grief  
> will go away as some of the current major changes finish. If not,  
> we'll have to come up with a better way to handle so many large  
> changes, maybe something like a "schedule of merges" so that  
> committers don't step all over each other. I think GCC does  
> something like this already?
>
> We've deferred imposing structure like that until we discover that  
> we need it, and I'm not convinced we're quite there yet, but perhaps
> it's time to start thinking about it.
>
I don't think we need to impose a constrictive structure on people. We  
just need to foster good programming practices. The GCC people require  
that patches be run through the GCC testsuite with no regressions.  
That testsuite is *huge* and doesn't run cleanly for us. But our  
modest regression testsuite is a good first step. For major changes,  
running some subset of the llvm-test directory is appropriate. There  
are other things too, of course...
-bw