flight 15401 xen-unstable real [real]
http://www.chiark.greenend.org.uk/~xensrcts/logs/15401/
Regressions :-(
Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-qemut-win  7 windows-install          fail REGR. vs. 15179
Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-xl-sedf      5 xen-boot                     fail   like 15179
 test-amd64-amd64-xl-sedf-pin  5 xen-boot                     fail   like 15179
Tests which did not succeed, but are not blocking:
 build-armhf                   4 xen-build                    fail   never pass
 test-amd64-amd64-xl-pcipt-intel  9 guest-start                 fail never pass
 test-amd64-i386-xend-winxpsp3 16 leak-check/check             fail  never pass
 test-amd64-i386-win          16 leak-check/check             fail   never pass
 test-amd64-i386-qemut-win-vcpus1 16 leak-check/check           fail never pass
 test-amd64-i386-xl-win-vcpus1 13 guest-stop                   fail  never pass
 test-amd64-amd64-xl-qemut-win7-amd64 13 guest-stop             fail never pass
 test-amd64-amd64-xl-qemuu-winxpsp3 13 guest-stop               fail never pass
 test-amd64-amd64-xl-win7-amd64 13 guest-stop                   fail never pass
 test-amd64-i386-win-vcpus1   16 leak-check/check             fail   never pass
 test-amd64-amd64-win         16 leak-check/check             fail   never pass
 test-amd64-amd64-xl-winxpsp3 13 guest-stop                   fail   never pass
 test-amd64-i386-xl-qemut-win7-amd64 13 guest-stop              fail never pass
 test-amd64-amd64-xl-qemut-winxpsp3 13 guest-stop               fail never pass
 test-amd64-i386-xl-win7-amd64 13 guest-stop                   fail  never pass
 test-amd64-amd64-qemut-win   16 leak-check/check             fail   never pass
 test-amd64-i386-xend-qemut-winxpsp3 16 leak-check/check        fail never pass
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1 13 guest-stop         fail never pass
 test-amd64-i386-xl-winxpsp3-vcpus1 13 guest-stop               fail never pass
 test-amd64-i386-xl-qemut-win-vcpus1 13 guest-stop              fail never pass
 test-amd64-i386-qemut-win    16 leak-check/check             fail   never pass
 test-amd64-amd64-xl-win      13 guest-stop                   fail   never pass
 test-amd64-amd64-xl-qemuu-win7-amd64 13 guest-stop             fail never pass
version targeted for testing:
 xen                  d1bf3b21f783
baseline version:
 xen                  5af4f2ab06f3
------------------------------------------------------------
People who touched revisions under test:
  Dan Magenheimer <dan.magenheimer@oracle.com>
  Daniel De Graaf <dgdegra@tycho.nsa.gov>
  Dario Faggioli <dario.faggioli@citrix.com>
  David Vrabel <david.vrabel@citrix.com>
  Dongxiao Xu <dongxiao.xu@intel.com>
  Ian Campbell <ian.campbell@citrix.com>
  Ian Jackson <ian.jackson@eu.citrix.com>
  Jan Beulich <jbeulich@suse.com>
  Jim Fehlig <jfehlig@suse.com>
  Jun Nakajima <jun.nakajima@intel.com>
  Keir Fraser <keir@xen.org>
  Matt Wilson <msw@amazon.com>
  Razvan Cojocaru <rzvncj@gmail.com>
  Roger Pau Monné <roger.pau@citrix.com>
  Samuel Thibault <samuel.thibault@ens-lyon.org>
  Stefano Stabellini <stefano.stabellini@eu.citrix.com>
  Tim Deegan <tim@xen.org>
  Tomasz Wroblewski <tomasz.wroblewski@citrix.com>
  Wei Huang <huangwei@gmail.com>
  Xiantao Zhang <xiantao.zhang@intel.com>
------------------------------------------------------------
jobs:
 build-amd64                                                  pass    
 build-armhf                                                  fail    
 build-i386                                                   pass    
 build-amd64-oldkern                                          pass    
 build-i386-oldkern                                           pass    
 build-amd64-pvops                                            pass    
 build-i386-pvops                                             pass    
 test-amd64-amd64-xl                                          pass    
 test-amd64-i386-xl                                           pass    
 test-amd64-i386-rhel6hvm-amd                                 pass    
 test-amd64-i386-qemut-rhel6hvm-amd                           pass    
 test-amd64-i386-qemuu-rhel6hvm-amd                           pass    
 test-amd64-amd64-xl-qemut-win7-amd64                         fail    
 test-amd64-i386-xl-qemut-win7-amd64                          fail    
 test-amd64-amd64-xl-qemuu-win7-amd64                         fail    
 test-amd64-amd64-xl-win7-amd64                               fail    
 test-amd64-i386-xl-win7-amd64                                fail    
 test-amd64-i386-xl-credit2                                   pass    
 test-amd64-amd64-xl-pcipt-intel                              fail    
 test-amd64-i386-rhel6hvm-intel                               pass    
 test-amd64-i386-qemut-rhel6hvm-intel                         pass    
 test-amd64-i386-qemuu-rhel6hvm-intel                         pass    
 test-amd64-i386-xl-multivcpu                                 pass    
 test-amd64-amd64-pair                                        pass    
 test-amd64-i386-pair                                         pass    
 test-amd64-amd64-xl-sedf-pin                                 fail    
 test-amd64-amd64-pv                                          pass    
 test-amd64-i386-pv                                           pass    
 test-amd64-amd64-xl-sedf                                     fail    
 test-amd64-i386-win-vcpus1                                   fail    
 test-amd64-i386-qemut-win-vcpus1                             fail    
 test-amd64-i386-xl-qemut-win-vcpus1                          fail    
 test-amd64-i386-xl-win-vcpus1                                fail    
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1                     fail    
 test-amd64-i386-xl-winxpsp3-vcpus1                           fail    
 test-amd64-amd64-win                                         fail    
 test-amd64-i386-win                                          fail    
 test-amd64-amd64-qemut-win                                   fail    
 test-amd64-i386-qemut-win                                    fail    
 test-amd64-amd64-xl-qemut-win                                fail    
 test-amd64-amd64-xl-win                                      fail    
 test-amd64-i386-xend-qemut-winxpsp3                          fail    
 test-amd64-amd64-xl-qemut-winxpsp3                           fail    
 test-amd64-amd64-xl-qemuu-winxpsp3                           fail    
 test-amd64-i386-xend-winxpsp3                                fail    
 test-amd64-amd64-xl-winxpsp3                                 fail    
------------------------------------------------------------
sg-report-flight on woking.cam.xci-test.com
logs: /home/xc_osstest/logs
images: /home/xc_osstest/images
Logs, config files, etc. are available at
    http://www.chiark.greenend.org.uk/~xensrcts/logs
Test harness code can be found at
    http://xenbits.xensource.com/gitweb?p=osstest.git;a=summary
Not pushing.
(No revision log; it would be 1095 lines long.)
xen.org writes ("[xen-unstable test] 15401: regressions - FAIL"):
> flight 15401 xen-unstable real [real]
> http://www.chiark.greenend.org.uk/~xensrcts/logs/15401/
> 
> Regressions :-(
> 
> Tests which did not succeed and are blocking,
> including tests which could not be run:
>  test-amd64-amd64-xl-qemut-win  7 windows-install     fail REGR. vs. 15179
With some handholding, I managed to get the bisector to work on this.
It found that the original "good" version is unreliable: it built Xen
5af4f2ab06f3 and in two recent tests on the same host, of the same
build, it failed once and passed once.
Under the circumstances it's not clear that the current staging is any
worse than non-staging.  I think we should push the revision reported
in this test (which was otherwise OK according to the tester) to
non-staging, with a manual "hg push".
> version targeted for testing:
>  xen                  d1bf3b21f783
I'm not sure how to do this with hg and have to go catch a train so I
don't have time to look it up, but presumably there's some rune of the
form "hg push -r d1bf3b21f783 ssh://xenbits/xen-unstable.hg"
Thanks,
Ian.
On Fri, 2013-02-01 at 11:44 +0000, Ian Jackson wrote:
> xen.org writes ("[xen-unstable test] 15401: regressions - FAIL"):
> > flight 15401 xen-unstable real [real]
> > http://www.chiark.greenend.org.uk/~xensrcts/logs/15401/
> >
> > Regressions :-(
> >
> > Tests which did not succeed and are blocking,
> > including tests which could not be run:
> >  test-amd64-amd64-xl-qemut-win  7 windows-install     fail REGR. vs. 15179
>
> With some handholding, I managed to get the bisector to work on this.
> It found that the original "good" version is unreliable: it built Xen
> 5af4f2ab06f3 and in two recent tests on the same host, of the same
> build, it failed once and passed once.
Hrm, did it make any progress over the w/e.
> Under the circumstances it's not clear that the current staging is any
> worse than non-staging. I think we should push the revision reported
> in this test (which was otherwise OK according to the tester) to
> non-staging, with a manual "hg push".
This sounds like a good idea.
> > version targeted for testing:
> >  xen                  d1bf3b21f783
>
> I'm not sure how to do this with hg and have to go catch a train so I
> don't have time to look it up, but presumably there's some rune of the
> form "hg push -r d1bf3b21f783 ssh://xenbits/xen-unstable.hg"
That looks right to me... Shall I? (not sure I'm in the necessary group
on xenbits, but I could try ;-))
Ian.
>>> On 04.02.13 at 12:06, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> On Fri, 2013-02-01 at 11:44 +0000, Ian Jackson wrote:
>> xen.org writes ("[xen-unstable test] 15401: regressions - FAIL"):
>> > flight 15401 xen-unstable real [real]
>> > http://www.chiark.greenend.org.uk/~xensrcts/logs/15401/
>> >
>> > Regressions :-(
>> >
>> > Tests which did not succeed and are blocking,
>> > including tests which could not be run:
>> >  test-amd64-amd64-xl-qemut-win  7 windows-install     fail REGR. vs. 15179
>>
>> With some handholding, I managed to get the bisector to work on this.
>> It found that the original "good" version is unreliable: it built Xen
>> 5af4f2ab06f3 and in two recent tests on the same host, of the same
>> build, it failed once and passed once.
>
> Hrm, did it make any progress over the w/e.
With the failure now being consistent rather than intermittent,
we almost definitely have a state worse than before.
>> Under the circumstances it's not clear that the current staging is any
>> worse than non-staging. I think we should push the revision reported
>> in this test (which was otherwise OK according to the tester) to
>> non-staging, with a manual "hg push".
>
> This sounds like a good idea.
Wouldn't that set us up for the same problem again when the next
testing round fails here again?
Unless Olaf's testing with partial reverts shows otherwise, I'd be up
for reverting all non-trivial x86 HVM RTC patches I had applied
recently (where "trivial" to me would be "use RTC_* names instead of
literal numbers" and "use cached original value in RTC_REG_B writing
code", albeit the latter may not revert cleanly on its own). Should
they turn out not to be the culprit, they could always be re-applied
later.
Jan
On Mon, 2013-02-04 at 11:17 +0000, Jan Beulich wrote:
> >>> On 04.02.13 at 12:06, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> > On Fri, 2013-02-01 at 11:44 +0000, Ian Jackson wrote:
> >> xen.org writes ("[xen-unstable test] 15401: regressions - FAIL"):
> >> > flight 15401 xen-unstable real [real]
> >> > http://www.chiark.greenend.org.uk/~xensrcts/logs/15401/
> >> >
> >> > Regressions :-(
> >> >
> >> > Tests which did not succeed and are blocking,
> >> > including tests which could not be run:
> >> >  test-amd64-amd64-xl-qemut-win  7 windows-install     fail REGR. vs. 15179
> >>
> >> With some handholding, I managed to get the bisector to work on this.
> >> It found that the original "good" version is unreliable: it built Xen
> >> 5af4f2ab06f3 and in two recent tests on the same host, of the same
> >> build, it failed once and passed once.
> >
> > Hrm, did it make any progress over the w/e.
>
> With the failure now being consistent rather than intermittent,
> we almost definitely have a state worse than before.
>
> >> Under the circumstances it's not clear that the current staging is any
> >> worse than non-staging. I think we should push the revision reported
> >> in this test (which was otherwise OK according to the tester) to
> >> non-staging, with a manual "hg push".
> >
> > This sounds like a good idea.
>
> Wouldn't that set us up for the same problem again when the next
> testing round fails here again?
Yes, that's true.
>
> Unless Olaf's testing with partial reverts shows otherwise, I'd be up
> for reverting all non-trivial x86 HVM RTC patches I had applied
> recently (where "trivial" to me would be "use RTC_* names instead of
> literal numbers" and "use cached original value in RTC_REG_B writing
> code", albeit the latter may not revert cleanly on its own). Should
> they turn out not to be the culprit, they could always be re-applied
> later.
We should certainly see what Olaf's test shows.
I'd also be interested in what (if anything) the bisector has
discovered. But then yes, if we think these might be the culprit then
reverting would be sensible.
Ian.
Ian Campbell writes ("Re: [Xen-devel] [xen-unstable test] 15401: regressions - FAIL"):
> On Fri, 2013-02-01 at 11:44 +0000, Ian Jackson wrote:
> > > Tests which did not succeed and are blocking,
> > > including tests which could not be run:
> > >  test-amd64-amd64-xl-qemut-win  7 windows-install     fail REGR. vs. 15179
> >
> > With some handholding, I managed to get the bisector to work on this.
> > It found that the original "good" version is unreliable: it built Xen
> > 5af4f2ab06f3 and in two recent tests on the same host, of the same
> > build, it failed once and passed once.
>
> Hrm, did it make any progress over the w/e.
It stops when it finds that the failure, or the previous pass, isn't
reproducible.
Ian.
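For readers unfamiliar with why the bisector gave up, the stopping rule described above can be sketched roughly as follows. This is an illustration only, not osstest code: the function name `endpoints_reproducible`, the `run_test` callback and the `repeats` parameter are all hypothetical.

```python
from typing import Callable, Tuple


def endpoints_reproducible(
    run_test: Callable[[str], bool],
    good: str,
    bad: str,
    repeats: int = 2,
) -> Tuple[bool, str]:
    """Re-run the test on both bisection endpoints before bisecting.

    run_test(rev) is a hypothetical callback returning True on pass.
    Bisection only makes sense if `good` always passes and `bad`
    always fails; otherwise stop and report which endpoint is flaky.
    """
    for rev, expected in ((good, True), (bad, False)):
        results = [run_test(rev) for _ in range(repeats)]
        if any(r != expected for r in results):
            want = "pass" if expected else "fail"
            return False, f"{rev}: expected {want}, got {results}"
    return True, "endpoints reproducible"
```

With the situation reported in this thread (5af4f2ab06f3 failing once and passing once on the same host and build), such a check would stop at the "good" endpoint rather than continue bisecting.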
Ian Campbell writes ("Re: [Xen-devel] [xen-unstable test] 15401: regressions - FAIL"):
> On Mon, 2013-02-04 at 11:17 +0000, Jan Beulich wrote:
> > >>> On 04.02.13 at 12:06, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> > > On Fri, 2013-02-01 at 11:44 +0000, Ian Jackson wrote:
> > >> Under the circumstances it's not clear that the current staging is any
> > >> worse than non-staging.  I think we should push the revision reported
> > >> in this test (which was otherwise OK according to the tester) to
> > >> non-staging, with a manual "hg push".
> > >
> > > This sounds like a good idea.
> >
> > Wouldn't that set us up for the same problem again when the next
> > testing round fails here again?
>
> Yes, that's true.
No.  Because the problem is essentially a fluke pass, not a fluke
fail.
Ian.
On Mon, 2013-02-04 at 14:22 +0000, Ian Jackson wrote:
> Ian Campbell writes ("Re: [Xen-devel] [xen-unstable test] 15401: regressions - FAIL"):
> > On Mon, 2013-02-04 at 11:17 +0000, Jan Beulich wrote:
> > > >>> On 04.02.13 at 12:06, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> > > > On Fri, 2013-02-01 at 11:44 +0000, Ian Jackson wrote:
> > > >> Under the circumstances it's not clear that the current staging is any
> > > >> worse than non-staging. I think we should push the revision reported
> > > >> in this test (which was otherwise OK according to the tester) to
> > > >> non-staging, with a manual "hg push".
> > > >
> > > > This sounds like a good idea.
> > >
> > > Wouldn't that set us up for the same problem again when the next
> > > testing round fails here again?
> >
> > Yes, that's true.
>
> No. Because the problem is essentially a fluke pass, not a fluke
> fail.
So you are also proposing to flip something in osstest to erase the
fluke pass from its memory? Otherwise won't it see all future fails as
regressions, rather than never pass, due to the fluke pass?
Ian.
On Mon, 2013-02-04 at 14:19 +0000, Ian Jackson wrote:
> Ian Campbell writes ("Re: [Xen-devel] [xen-unstable test] 15401: regressions - FAIL"):
> > On Fri, 2013-02-01 at 11:44 +0000, Ian Jackson wrote:
> > > > Tests which did not succeed and are blocking,
> > > > including tests which could not be run:
> > > >  test-amd64-amd64-xl-qemut-win  7 windows-install     fail REGR. vs. 15179
> > >
> > > With some handholding, I managed to get the bisector to work on this.
> > > It found that the original "good" version is unreliable: it built Xen
> > > 5af4f2ab06f3 and in two recent tests on the same host, of the same
> > > build, it failed once and passed once.
> >
> > Hrm, did it make any progress over the w/e.
>
> It stops when it finds that the failure, or the previous pass, isn't
> reproducible.
5af4f2ab06f3 is before 26461:78e91e9e4d61 thru 26456:1e9a8e155002 which
are Jan's RTC changes which rather rules them out I think.
Ian.
Ian Campbell writes ("Re: [Xen-devel] [xen-unstable test] 15401: regressions - FAIL"):
> On Mon, 2013-02-04 at 14:22 +0000, Ian Jackson wrote:
> > No.  Because the problem is essentially a fluke pass, not a fluke
> > fail.
>
> So you are also proposing to flip something in osstest to erase the
> fluke pass from its memory? Otherwise won't it see all future fails as
> regressions, rather than never pass, due to the fluke pass?
No, right now I'm proposing to do a manual push of (as I proposed)
d1bf3b21f783.  The effect of that is that future pushes will be
regarded as regressions iff their test results are worse than
d1bf3b21f783's.
That is, the test system uses whatever revision non-staging has as the
baseline revision, not whatever it most recently tested.  (And if
non-staging hasn't had a test at all, it will test it.)
Ian.
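The baseline rule described above can be illustrated with a small sketch. It is a deliberate simplification and not how osstest is actually implemented: the function `classify`, the result dictionaries and the verdict strings are hypothetical, and the real reports distinguish further cases (for example "never pass" and "like").

```python
from typing import Dict


def classify(
    flight_results: Dict[str, str],
    baseline_results: Dict[str, str],
) -> Dict[str, str]:
    """Compare a flight's per-test results with the baseline revision's.

    The baseline is whatever revision non-staging currently holds, not
    the most recently tested one. A failure only counts as a regression
    if the same test passed on that baseline.
    """
    verdicts = {}
    for test, outcome in flight_results.items():
        if outcome == "pass":
            verdicts[test] = "pass"
        elif baseline_results.get(test) == "pass":
            verdicts[test] = "regression"
        else:
            verdicts[test] = "not a regression"
    return verdicts
```

Under this model, once d1bf3b21f783 (where windows-install failed) becomes the baseline, a later failure of the same test is no longer compared against the fluke pass.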
On Mon, 2013-02-04 at 14:30 +0000, Ian Jackson wrote:
> the test system uses whatever revision non-staging has as the
> baseline revision, not whatever it most recently tested.
Ah, I was expecting it went further back in history (i.e. did this test
*ever* pass).
Ian.
>>> On 04.02.13 at 15:22, Ian Jackson <Ian.Jackson@eu.citrix.com> wrote:
> Ian Campbell writes ("Re: [Xen-devel] [xen-unstable test] 15401: regressions - FAIL"):
>> On Mon, 2013-02-04 at 11:17 +0000, Jan Beulich wrote:
>> > >>> On 04.02.13 at 12:06, Ian Campbell <Ian.Campbell@citrix.com> wrote:
>> > > On Fri, 2013-02-01 at 11:44 +0000, Ian Jackson wrote:
>> > >> Under the circumstances it's not clear that the current staging is any
>> > >> worse than non-staging. I think we should push the revision reported
>> > >> in this test (which was otherwise OK according to the tester) to
>> > >> non-staging, with a manual "hg push".
>> > >
>> > > This sounds like a good idea.
>> >
>> > Wouldn't that set us up for the same problem again when the next
>> > testing round fails here again?
>>
>> Yes, that's true.
>
> No. Because the problem is essentially a fluke pass, not a fluke
> fail.
I'm not sure - previously, iirc, we had inconsistent successes and
failures of this test (and I think another one or two). Now we
appear to have run into a consistent failure state, so something
must have changed.
Luckily there is an indication from Olaf that rather than reverting,
applying the remaining pieces of the broken up RTC emulation
changes (which I didn't post formally yet, mainly in the hope to
get a push first, considering that these bits were what originally
caused regressions when applied as a single monolithic change -
and with a bug fixed only after I split things apart - late in the
4.2 cycle) unbreaks what he reported broken.
I could certainly post that patch right away, but I'd like to give
it a little more time to see whether Olaf can confirm his initial
findings, and because with that I'm less certain that the test
failure really is to be attributed to the RTC emulation changes
at all.
Jan
On Mon, 2013-02-04 at 14:39 +0000, Jan Beulich wrote:
> >>> On 04.02.13 at 15:22, Ian Jackson <Ian.Jackson@eu.citrix.com> wrote:
> > Ian Campbell writes ("Re: [Xen-devel] [xen-unstable test] 15401: regressions - FAIL"):
> >> On Mon, 2013-02-04 at 11:17 +0000, Jan Beulich wrote:
> >> > >>> On 04.02.13 at 12:06, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> >> > > On Fri, 2013-02-01 at 11:44 +0000, Ian Jackson wrote:
> >> > >> Under the circumstances it's not clear that the current staging is any
> >> > >> worse than non-staging. I think we should push the revision reported
> >> > >> in this test (which was otherwise OK according to the tester) to
> >> > >> non-staging, with a manual "hg push".
> >> > >
> >> > > This sounds like a good idea.
> >> >
> >> > Wouldn't that set us up for the same problem again when the next
> >> > testing round fails here again?
> >>
> >> Yes, that's true.
> >
> > No. Because the problem is essentially a fluke pass, not a fluke
> > fail.
>
> I'm not sure - previously, iirc, we had inconsistent successes and
> failures of this test (and I think another one or two). Now we
> appear to have run into a consistent failure state, so something
> must have changed.
>
> Luckily there is an indication from Olaf that rather than reverting,
> applying the remaining pieces of the broken up RTC emulation
> changes (which I didn't post formally yet, mainly in the hope to
> get a push first, considering that these bits were what originally
> caused regressions when applied as a single monolithic change -
> and with a bug fixed only after I split things apart - late in the
> 4.2 cycle) unbreaks what he reported broken.
>
> I could certainly post that patch right away, but I'd like to give
> it a little more time to see whether Olaf can confirm his initial
> findings, and because with that I'm less certain that the test
> failure really is to be attributed to the RTC emulation changes
> at all.
Based on <1359987978.7743.56.camel@zakaz.uk.xensource.com> I don't think
the RTC changes are to blame, since Ian says the baseline was
5af4f2ab06f3 which is before then.
Ian.
>>> On 04.02.13 at 15:44, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> On Mon, 2013-02-04 at 14:39 +0000, Jan Beulich wrote:
>> >>> On 04.02.13 at 15:22, Ian Jackson <Ian.Jackson@eu.citrix.com> wrote:
>> > Ian Campbell writes ("Re: [Xen-devel] [xen-unstable test] 15401: regressions - FAIL"):
>> >> On Mon, 2013-02-04 at 11:17 +0000, Jan Beulich wrote:
>> >> > >>> On 04.02.13 at 12:06, Ian Campbell <Ian.Campbell@citrix.com> wrote:
>> >> > > On Fri, 2013-02-01 at 11:44 +0000, Ian Jackson wrote:
>> >> > >> Under the circumstances it's not clear that the current staging is any
>> >> > >> worse than non-staging. I think we should push the revision reported
>> >> > >> in this test (which was otherwise OK according to the tester) to
>> >> > >> non-staging, with a manual "hg push".
>> >> > >
>> >> > > This sounds like a good idea.
>> >> >
>> >> > Wouldn't that set us up for the same problem again when the next
>> >> > testing round fails here again?
>> >>
>> >> Yes, that's true.
>> >
>> > No. Because the problem is essentially a fluke pass, not a fluke
>> > fail.
>>
>> I'm not sure - previously, iirc, we had inconsistent successes and
>> failures of this test (and I think another one or two). Now we
>> appear to have run into a consistent failure state, so something
>> must have changed.
>>
>> Luckily there is an indication from Olaf that rather than reverting,
>> applying the remaining pieces of the broken up RTC emulation
>> changes (which I didn't post formally yet, mainly in the hope to
>> get a push first, considering that these bits were what originally
>> caused regressions when applied as a single monolithic change -
>> and with a bug fixed only after I split things apart - late in the
>> 4.2 cycle) unbreaks what he reported broken.
>>
>> I could certainly post that patch right away, but I'd like to give
>> it a little more time to see whether Olaf can confirm his initial
>> findings, and because with that I'm less certain that the test
>> failure really is to be attributed to the RTC emulation changes
>> at all.
>
> Based on <1359987978.7743.56.camel@zakaz.uk.xensource.com> I don't think
> the RTC changes are to blame, since Ian says the baseline was
> 5af4f2ab06f3 which is before then.
Okay - I'm certainly not opposed to a manual push.
Jan
Jan Beulich writes ("Re: [Xen-devel] [xen-unstable test] 15401: regressions - FAIL"):
> Okay - I'm certainly not opposed to a manual push.
Done.  Specifically, I have pushed d1bf3b21f783 to non-staging
xen-unstable.  The remaining backlog in staging will probably clear
soon too.
Ian.