flight 15401 xen-unstable real [real]
http://www.chiark.greenend.org.uk/~xensrcts/logs/15401/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-qemut-win             7 windows-install   fail REGR. vs. 15179

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-xl-sedf                  5 xen-boot          fail like 15179
 test-amd64-amd64-xl-sedf-pin              5 xen-boot          fail like 15179

Tests which did not succeed, but are not blocking:
 build-armhf                               4 xen-build         fail never pass
 test-amd64-amd64-xl-pcipt-intel           9 guest-start       fail never pass
 test-amd64-i386-xend-winxpsp3            16 leak-check/check  fail never pass
 test-amd64-i386-win                      16 leak-check/check  fail never pass
 test-amd64-i386-qemut-win-vcpus1         16 leak-check/check  fail never pass
 test-amd64-i386-xl-win-vcpus1            13 guest-stop        fail never pass
 test-amd64-amd64-xl-qemut-win7-amd64     13 guest-stop        fail never pass
 test-amd64-amd64-xl-qemuu-winxpsp3       13 guest-stop        fail never pass
 test-amd64-amd64-xl-win7-amd64           13 guest-stop        fail never pass
 test-amd64-i386-win-vcpus1               16 leak-check/check  fail never pass
 test-amd64-amd64-win                     16 leak-check/check  fail never pass
 test-amd64-amd64-xl-winxpsp3             13 guest-stop        fail never pass
 test-amd64-i386-xl-qemut-win7-amd64      13 guest-stop        fail never pass
 test-amd64-amd64-xl-qemut-winxpsp3       13 guest-stop        fail never pass
 test-amd64-i386-xl-win7-amd64            13 guest-stop        fail never pass
 test-amd64-amd64-qemut-win               16 leak-check/check  fail never pass
 test-amd64-i386-xend-qemut-winxpsp3      16 leak-check/check  fail never pass
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1 13 guest-stop        fail never pass
 test-amd64-i386-xl-winxpsp3-vcpus1       13 guest-stop        fail never pass
 test-amd64-i386-xl-qemut-win-vcpus1      13 guest-stop        fail never pass
 test-amd64-i386-qemut-win                16 leak-check/check  fail never pass
 test-amd64-amd64-xl-win                  13 guest-stop        fail never pass
 test-amd64-amd64-xl-qemuu-win7-amd64     13 guest-stop        fail never pass

version targeted for testing:
 xen                  d1bf3b21f783
baseline version:
 xen                  5af4f2ab06f3

------------------------------------------------------------
People who touched revisions under test:
  Dan Magenheimer <dan.magenheimer@oracle.com>
  Daniel De Graaf <dgdegra@tycho.nsa.gov>
  Dario Faggioli <dario.faggioli@citrix.com>
  David Vrabel <david.vrabel@citrix.com>
  Dongxiao Xu <dongxiao.xu@intel.com>
  Ian Campbell <ian.campbell@citrix.com>
  Ian Jackson <ian.jackson@eu.citrix.com>
  Jan Beulich <jbeulich@suse.com>
  Jim Fehlig <jfehlig@suse.com>
  Jun Nakajima <jun.nakajima@intel.com>
  Keir Fraser <keir@xen.org>
  Matt Wilson <msw@amazon.com>
  Razvan Cojocaru <rzvncj@gmail.com>
  Roger Pau Monné <roger.pau@citrix.com>
  Samuel Thibault <samuel.thibault@ens-lyon.org>
  Stefano Stabellini <stefano.stabellini@eu.citrix.com>
  Tim Deegan <tim@xen.org>
  Tomasz Wroblewski <tomasz.wroblewski@citrix.com>
  Wei Huang <huangwei@gmail.com>
  Xiantao Zhang <xiantao.zhang@intel.com>
------------------------------------------------------------

jobs:
 build-amd64                                pass
 build-armhf                                fail
 build-i386                                 pass
 build-amd64-oldkern                        pass
 build-i386-oldkern                         pass
 build-amd64-pvops                          pass
 build-i386-pvops                           pass
 test-amd64-amd64-xl                        pass
 test-amd64-i386-xl                         pass
 test-amd64-i386-rhel6hvm-amd               pass
 test-amd64-i386-qemut-rhel6hvm-amd         pass
 test-amd64-i386-qemuu-rhel6hvm-amd         pass
 test-amd64-amd64-xl-qemut-win7-amd64       fail
 test-amd64-i386-xl-qemut-win7-amd64        fail
 test-amd64-amd64-xl-qemuu-win7-amd64       fail
 test-amd64-amd64-xl-win7-amd64             fail
 test-amd64-i386-xl-win7-amd64              fail
 test-amd64-i386-xl-credit2                 pass
 test-amd64-amd64-xl-pcipt-intel            fail
 test-amd64-i386-rhel6hvm-intel             pass
 test-amd64-i386-qemut-rhel6hvm-intel       pass
 test-amd64-i386-qemuu-rhel6hvm-intel       pass
 test-amd64-i386-xl-multivcpu               pass
 test-amd64-amd64-pair                      pass
 test-amd64-i386-pair                       pass
 test-amd64-amd64-xl-sedf-pin               fail
 test-amd64-amd64-pv                        pass
 test-amd64-i386-pv                         pass
 test-amd64-amd64-xl-sedf                   fail
 test-amd64-i386-win-vcpus1                 fail
 test-amd64-i386-qemut-win-vcpus1           fail
 test-amd64-i386-xl-qemut-win-vcpus1        fail
 test-amd64-i386-xl-win-vcpus1              fail
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1   fail
 test-amd64-i386-xl-winxpsp3-vcpus1         fail
 test-amd64-amd64-win                       fail
 test-amd64-i386-win                        fail
 test-amd64-amd64-qemut-win                 fail
 test-amd64-i386-qemut-win                  fail
 test-amd64-amd64-xl-qemut-win              fail
 test-amd64-amd64-xl-win                    fail
 test-amd64-i386-xend-qemut-winxpsp3        fail
 test-amd64-amd64-xl-qemut-winxpsp3         fail
 test-amd64-amd64-xl-qemuu-winxpsp3         fail
 test-amd64-i386-xend-winxpsp3              fail
 test-amd64-amd64-xl-winxpsp3               fail

------------------------------------------------------------
sg-report-flight on woking.cam.xci-test.com
logs: /home/xc_osstest/logs
images: /home/xc_osstest/images

Logs, config files, etc. are available at
    http://www.chiark.greenend.org.uk/~xensrcts/logs

Test harness code can be found at
    http://xenbits.xensource.com/gitweb?p=osstest.git;a=summary

Not pushing.

(No revision log; it would be 1095 lines long.)
xen.org writes ("[xen-unstable test] 15401: regressions - FAIL"):
> flight 15401 xen-unstable real [real]
> http://www.chiark.greenend.org.uk/~xensrcts/logs/15401/
>
> Regressions :-(
>
> Tests which did not succeed and are blocking,
> including tests which could not be run:
>  test-amd64-amd64-xl-qemut-win 7 windows-install fail REGR. vs. 15179

With some handholding, I managed to get the bisector to work on this.
It found that the original "good" version is unreliable: it built Xen
5af4f2ab06f3 and in two recent tests on the same host, of the same
build, it failed once and passed once.

Under the circumstances it's not clear that the current staging is any
worse than non-staging. I think we should push the revision reported
in this test (which was otherwise OK according to the tester) to
non-staging, with a manual "hg push".

> version targeted for testing:
>  xen                  d1bf3b21f783

I'm not sure how to do this with hg and have to go catch a train so I
don't have time to look it up, but presumably there's some rune of the
form "hg push -r d1bf3b21f783 ssh://xenbits/xen-unstable.hg"

Thanks,
Ian.
On Fri, 2013-02-01 at 11:44 +0000, Ian Jackson wrote:
> xen.org writes ("[xen-unstable test] 15401: regressions - FAIL"):
> > flight 15401 xen-unstable real [real]
> > http://www.chiark.greenend.org.uk/~xensrcts/logs/15401/
> >
> > Regressions :-(
> >
> > Tests which did not succeed and are blocking,
> > including tests which could not be run:
> >  test-amd64-amd64-xl-qemut-win 7 windows-install fail REGR. vs. 15179
>
> With some handholding, I managed to get the bisector to work on this.
> It found that the original "good" version is unreliable: it built Xen
> 5af4f2ab06f3 and in two recent tests on the same host, of the same
> build, it failed once and passed once.

Hrm, did it make any progress over the w/e.

> Under the circumstances it's not clear that the current staging is any
> worse than non-staging. I think we should push the revision reported
> in this test (which was otherwise OK according to the tester) to
> non-staging, with a manual "hg push".

This sounds like a good idea.

> > version targeted for testing:
> >  xen                  d1bf3b21f783
>
> I'm not sure how to do this with hg and have to go catch a train so I
> don't have time to look it up, but presumably there's some rune of the
> form "hg push -r d1bf3b21f783 ssh://xenbits/xen-unstable.hg"

That looks right to me... Shall I? (not sure I'm in the necessary group
on xenbits, but I could try ;-))

Ian.
>>> On 04.02.13 at 12:06, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> On Fri, 2013-02-01 at 11:44 +0000, Ian Jackson wrote:
>> xen.org writes ("[xen-unstable test] 15401: regressions - FAIL"):
>> > flight 15401 xen-unstable real [real]
>> > http://www.chiark.greenend.org.uk/~xensrcts/logs/15401/
>> >
>> > Regressions :-(
>> >
>> > Tests which did not succeed and are blocking,
>> > including tests which could not be run:
>> >  test-amd64-amd64-xl-qemut-win 7 windows-install fail REGR. vs. 15179
>>
>> With some handholding, I managed to get the bisector to work on this.
>> It found that the original "good" version is unreliable: it built Xen
>> 5af4f2ab06f3 and in two recent tests on the same host, of the same
>> build, it failed once and passed once.
>
> Hrm, did it make any progress over the w/e.

With the failure now being consistent rather than intermittent,
we almost definitely have a state worse than before.

>> Under the circumstances it's not clear that the current staging is any
>> worse than non-staging. I think we should push the revision reported
>> in this test (which was otherwise OK according to the tester) to
>> non-staging, with a manual "hg push".
>
> This sounds like a good idea.

Wouldn't that set us up for the same problem again when the next
testing round fails here again?

Unless Olaf's testing with partial reverts shows otherwise, I'd be up
for reverting all non-trivial x86 HVM RTC patches I had applied
recently (where "trivial" to me would be "use RTC_* names instead of
literal numbers" and "use cached original value in RTC_REG_B writing
code", albeit the latter may not revert cleanly on its own). Should
they turn out not to be the culprit, they could always be re-applied
later.

Jan
On Mon, 2013-02-04 at 11:17 +0000, Jan Beulich wrote:
> >>> On 04.02.13 at 12:06, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> > On Fri, 2013-02-01 at 11:44 +0000, Ian Jackson wrote:
> >> xen.org writes ("[xen-unstable test] 15401: regressions - FAIL"):
> >> > flight 15401 xen-unstable real [real]
> >> > http://www.chiark.greenend.org.uk/~xensrcts/logs/15401/
> >> >
> >> > Regressions :-(
> >> >
> >> > Tests which did not succeed and are blocking,
> >> > including tests which could not be run:
> >> >  test-amd64-amd64-xl-qemut-win 7 windows-install fail REGR. vs. 15179
> >>
> >> With some handholding, I managed to get the bisector to work on this.
> >> It found that the original "good" version is unreliable: it built Xen
> >> 5af4f2ab06f3 and in two recent tests on the same host, of the same
> >> build, it failed once and passed once.
> >
> > Hrm, did it make any progress over the w/e.
>
> With the failure now being consistent rather than intermittent,
> we almost definitely have a state worse than before.
>
> >> Under the circumstances it's not clear that the current staging is any
> >> worse than non-staging. I think we should push the revision reported
> >> in this test (which was otherwise OK according to the tester) to
> >> non-staging, with a manual "hg push".
> >
> > This sounds like a good idea.
>
> Wouldn't that set us up for the same problem again when the next
> testing round fails here again?

Yes, that's true.

> Unless Olaf's testing with partial reverts shows otherwise, I'd be up
> for reverting all non-trivial x86 HVM RTC patches I had applied
> recently (where "trivial" to me would be "use RTC_* names instead of
> literal numbers" and "use cached original value in RTC_REG_B writing
> code", albeit the latter may not revert cleanly on its own). Should
> they turn out not to be the culprit, they could always be re-applied
> later.

We should certainly see what Olaf's test shows.

I'd also be interested in what (if anything) the bisector has
discovered.

But then yes, if we think these might be the culprit then reverting
would be sensible.

Ian.
Ian Campbell writes ("Re: [Xen-devel] [xen-unstable test] 15401: regressions - FAIL"):
> On Fri, 2013-02-01 at 11:44 +0000, Ian Jackson wrote:
> > > Tests which did not succeed and are blocking,
> > > including tests which could not be run:
> > >  test-amd64-amd64-xl-qemut-win 7 windows-install fail REGR. vs. 15179
> >
> > With some handholding, I managed to get the bisector to work on this.
> > It found that the original "good" version is unreliable: it built Xen
> > 5af4f2ab06f3 and in two recent tests on the same host, of the same
> > build, it failed once and passed once.
>
> Hrm, did it make any progress over the w/e.

It stops when it finds that the failure, or the previous pass, isn't
reproducible.

Ian.
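[Editorial illustration] The stopping rule described above can be sketched roughly as follows. This is not osstest's bisector code; `endpoints_reproducible`, `scripted`, and the `repeats` parameter are hypothetical names invented for this sketch. It assumes a test is re-run a fixed number of times at each endpoint and that bisection only makes sense if the "good" revision always passes and the "bad" revision always fails.

```python
from typing import Callable, Dict, List


def endpoints_reproducible(run_test: Callable[[str], bool],
                           good_rev: str,
                           bad_rev: str,
                           repeats: int = 2) -> bool:
    """Return True only if good_rev passes and bad_rev fails on every repeat.

    run_test(rev) is a hypothetical callable returning True for a pass.
    """
    good_results = [run_test(good_rev) for _ in range(repeats)]
    bad_results = [run_test(bad_rev) for _ in range(repeats)]
    # A single failure at the "good" end, or a single pass at the "bad" end,
    # means the endpoints are unreliable and bisecting would be meaningless.
    return all(good_results) and not any(bad_results)


def scripted(outcomes: Dict[str, List[bool]]) -> Callable[[str], bool]:
    """Build a fake run_test that replays fixed outcomes per revision."""
    iterators = {rev: iter(results) for rev, results in outcomes.items()}
    return lambda rev: next(iterators[rev])


# The situation reported in this thread: the same build of 5af4f2ab06f3
# failed once and passed once, so the "good" endpoint is not reproducible.
flaky = scripted({"5af4f2ab06f3": [False, True],
                  "d1bf3b21f783": [False, False]})
print(endpoints_reproducible(flaky, "5af4f2ab06f3", "d1bf3b21f783"))
```

With a flaky "good" endpoint like this one, the check reports that the endpoints are not reproducible, which corresponds to the bisector giving up rather than naming a culprit changeset.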
Ian Campbell writes ("Re: [Xen-devel] [xen-unstable test] 15401: regressions - FAIL"):
> On Mon, 2013-02-04 at 11:17 +0000, Jan Beulich wrote:
> > >>> On 04.02.13 at 12:06, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> > > On Fri, 2013-02-01 at 11:44 +0000, Ian Jackson wrote:
> > >> Under the circumstances it's not clear that the current staging is any
> > >> worse than non-staging. I think we should push the revision reported
> > >> in this test (which was otherwise OK according to the tester) to
> > >> non-staging, with a manual "hg push".
> > >
> > > This sounds like a good idea.
> >
> > Wouldn't that set us up for the same problem again when the next
> > testing round fails here again?
>
> Yes, that's true.

No. Because the problem is essentially a fluke pass, not a fluke
fail.

Ian.
On Mon, 2013-02-04 at 14:22 +0000, Ian Jackson wrote:
> Ian Campbell writes ("Re: [Xen-devel] [xen-unstable test] 15401: regressions - FAIL"):
> > On Mon, 2013-02-04 at 11:17 +0000, Jan Beulich wrote:
> > > >>> On 04.02.13 at 12:06, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> > > > On Fri, 2013-02-01 at 11:44 +0000, Ian Jackson wrote:
> > > >> Under the circumstances it's not clear that the current staging is any
> > > >> worse than non-staging. I think we should push the revision reported
> > > >> in this test (which was otherwise OK according to the tester) to
> > > >> non-staging, with a manual "hg push".
> > > >
> > > > This sounds like a good idea.
> > >
> > > Wouldn't that set us up for the same problem again when the next
> > > testing round fails here again?
> >
> > Yes, that's true.
>
> No. Because the problem is essentially a fluke pass, not a fluke
> fail.

So you are also proposing to flip something in osstest to erase the
fluke pass from its memory? Otherwise won't it see all future fails as
regressions, rather than never pass, due to the fluke pass?

Ian.
On Mon, 2013-02-04 at 14:19 +0000, Ian Jackson wrote:
> Ian Campbell writes ("Re: [Xen-devel] [xen-unstable test] 15401: regressions - FAIL"):
> > On Fri, 2013-02-01 at 11:44 +0000, Ian Jackson wrote:
> > > > Tests which did not succeed and are blocking,
> > > > including tests which could not be run:
> > > >  test-amd64-amd64-xl-qemut-win 7 windows-install fail REGR. vs. 15179
> > >
> > > With some handholding, I managed to get the bisector to work on this.
> > > It found that the original "good" version is unreliable: it built Xen
> > > 5af4f2ab06f3 and in two recent tests on the same host, of the same
> > > build, it failed once and passed once.
> >
> > Hrm, did it make any progress over the w/e.
>
> It stops when it finds that the failure, or the previous pass, isn't
> reproducible.

5af4f2ab06f3 is before 26461:78e91e9e4d61 thru 26456:1e9a8e155002 which
are Jan's RTC changes which rather rules them out I think.

Ian.
Ian Campbell writes ("Re: [Xen-devel] [xen-unstable test] 15401: regressions - FAIL"):
> On Mon, 2013-02-04 at 14:22 +0000, Ian Jackson wrote:
> > No. Because the problem is essentially a fluke pass, not a fluke
> > fail.
>
> So you are also proposing to flip something in osstest to erase the
> fluke pass from its memory? Otherwise won't it see all future fails as
> regressions, rather than never pass, due to the fluke pass?

No, right now I'm proposing to do a manual push of (as I proposed)
d1bf3b21f783.

The effect of that is that future pushes will be regarded as
regressions iff their test results are worse than d1bf3b21f783's.

That is, the test system uses whatever revision non-staging has as the
baseline revision, not whatever it most recently tested. (And if
non-staging hasn't had a test at all, it will test it.)

Ian.
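[Editorial illustration] A minimal sketch of the baseline rule just described, assuming each flight's results are reduced to a pass/fail flag per test. This is not osstest code; `blocking_regressions` is a hypothetical helper, and the real system also distinguishes states such as "fail like" and "never pass" that are not modelled here. Tests absent from the baseline are treated as not previously passing, which is an assumption of this sketch.

```python
from typing import Dict, List


def blocking_regressions(baseline: Dict[str, bool],
                         candidate: Dict[str, bool]) -> List[str]:
    """Return tests that passed in the baseline flight but fail in the candidate.

    Both arguments map a test name to True (pass) or False (fail). The
    baseline is the flight for whatever revision non-staging currently holds.
    """
    return sorted(name for name, passed in candidate.items()
                  if not passed and baseline.get(name, False))


test_name = "test-amd64-amd64-xl-qemut-win"

# Baseline 5af4f2ab06f3 recorded a (fluke) pass, so a failure blocks the push.
print(blocking_regressions({test_name: True}, {test_name: False}))

# After a manual push of d1bf3b21f783, that revision's results become the
# baseline; it failed this test, so a later failure is no longer a regression.
print(blocking_regressions({test_name: False}, {test_name: False}))
```

The point of the second call is that changing what non-staging contains changes the comparison, without any need to erase the earlier pass from the test history.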
On Mon, 2013-02-04 at 14:30 +0000, Ian Jackson wrote:
> the test system uses whatever revision non-staging has as the
> baseline revision, not whatever it most recently tested.

Ah, I was expecting it went further back in history (i.e. did this test
*ever* pass).

Ian.
>>> On 04.02.13 at 15:22, Ian Jackson <Ian.Jackson@eu.citrix.com> wrote:
> Ian Campbell writes ("Re: [Xen-devel] [xen-unstable test] 15401: regressions -
> FAIL"):
>> On Mon, 2013-02-04 at 11:17 +0000, Jan Beulich wrote:
>> > >>> On 04.02.13 at 12:06, Ian Campbell <Ian.Campbell@citrix.com> wrote:
>> > > On Fri, 2013-02-01 at 11:44 +0000, Ian Jackson wrote:
>> > >> Under the circumstances it's not clear that the current staging is any
>> > >> worse than non-staging. I think we should push the revision reported
>> > >> in this test (which was otherwise OK according to the tester) to
>> > >> non-staging, with a manual "hg push".
>> > >
>> > > This sounds like a good idea.
>> >
>> > Wouldn't that set us up for the same problem again when the next
>> > testing round fails here again?
>>
>> Yes, that's true.
>
> No. Because the problem is essentially a fluke pass, not a fluke
> fail.

I'm not sure - previously, iirc, we had inconsistent successes and
failures of this test (and I think another one or two). Now we
appear to have run into a consistent failure state, so something
must have changed.

Luckily there is an indication from Olaf that rather than reverting,
applying the remaining pieces of the broken up RTC emulation
changes (which I didn't post formally yet, mainly in the hope to
get a push first, considering that these bits were what originally
caused regressions when applied as a single monolithic change -
and with a bug fixed only after I split things apart - late in the
4.2 cycle) unbreaks what he reported broken.

I could certainly post that patch right away, but I'd like to give
it a little more time to see whether Olaf can confirm his initial
findings, and because with that I'm less certain that the test
failure really is to be attributed to the RTC emulation changes
at all.

Jan
On Mon, 2013-02-04 at 14:39 +0000, Jan Beulich wrote:
> >>> On 04.02.13 at 15:22, Ian Jackson <Ian.Jackson@eu.citrix.com> wrote:
> > Ian Campbell writes ("Re: [Xen-devel] [xen-unstable test] 15401: regressions -
> > FAIL"):
> >> On Mon, 2013-02-04 at 11:17 +0000, Jan Beulich wrote:
> >> > >>> On 04.02.13 at 12:06, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> >> > > On Fri, 2013-02-01 at 11:44 +0000, Ian Jackson wrote:
> >> > >> Under the circumstances it's not clear that the current staging is any
> >> > >> worse than non-staging. I think we should push the revision reported
> >> > >> in this test (which was otherwise OK according to the tester) to
> >> > >> non-staging, with a manual "hg push".
> >> > >
> >> > > This sounds like a good idea.
> >> >
> >> > Wouldn't that set us up for the same problem again when the next
> >> > testing round fails here again?
> >>
> >> Yes, that's true.
> >
> > No. Because the problem is essentially a fluke pass, not a fluke
> > fail.
>
> I'm not sure - previously, iirc, we had inconsistent successes and
> failures of this test (and I think another one or two). Now we
> appear to have run into a consistent failure state, so something
> must have changed.
>
> Luckily there is an indication from Olaf that rather than reverting,
> applying the remaining pieces of the broken up RTC emulation
> changes (which I didn't post formally yet, mainly in the hope to
> get a push first, considering that these bits were what originally
> caused regressions when applied as a single monolithic change -
> and with a bug fixed only after I split things apart - late in the
> 4.2 cycle) unbreaks what he reported broken.
>
> I could certainly post that patch right away, but I'd like to give
> it a little more time to see whether Olaf can confirm his initial
> findings, and because with that I'm less certain that the test
> failure really is to be attributed to the RTC emulation changes
> at all.

Based on <1359987978.7743.56.camel@zakaz.uk.xensource.com> I don't think
the RTC changes are to blame, since Ian says the baseline was
5af4f2ab06f3 which is before then.

Ian.
>>> On 04.02.13 at 15:44, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> On Mon, 2013-02-04 at 14:39 +0000, Jan Beulich wrote:
>> >>> On 04.02.13 at 15:22, Ian Jackson <Ian.Jackson@eu.citrix.com> wrote:
>> > Ian Campbell writes ("Re: [Xen-devel] [xen-unstable test] 15401: regressions -
>> > FAIL"):
>> >> On Mon, 2013-02-04 at 11:17 +0000, Jan Beulich wrote:
>> >> > >>> On 04.02.13 at 12:06, Ian Campbell <Ian.Campbell@citrix.com> wrote:
>> >> > > On Fri, 2013-02-01 at 11:44 +0000, Ian Jackson wrote:
>> >> > >> Under the circumstances it's not clear that the current staging is any
>> >> > >> worse than non-staging. I think we should push the revision reported
>> >> > >> in this test (which was otherwise OK according to the tester) to
>> >> > >> non-staging, with a manual "hg push".
>> >> > >
>> >> > > This sounds like a good idea.
>> >> >
>> >> > Wouldn't that set us up for the same problem again when the next
>> >> > testing round fails here again?
>> >>
>> >> Yes, that's true.
>> >
>> > No. Because the problem is essentially a fluke pass, not a fluke
>> > fail.
>>
>> I'm not sure - previously, iirc, we had inconsistent successes and
>> failures of this test (and I think another one or two). Now we
>> appear to have run into a consistent failure state, so something
>> must have changed.
>>
>> Luckily there is an indication from Olaf that rather than reverting,
>> applying the remaining pieces of the broken up RTC emulation
>> changes (which I didn't post formally yet, mainly in the hope to
>> get a push first, considering that these bits were what originally
>> caused regressions when applied as a single monolithic change -
>> and with a bug fixed only after I split things apart - late in the
>> 4.2 cycle) unbreaks what he reported broken.
>>
>> I could certainly post that patch right away, but I'd like to give
>> it a little more time to see whether Olaf can confirm his initial
>> findings, and because with that I'm less certain that the test
>> failure really is to be attributed to the RTC emulation changes
>> at all.
>
> Based on <1359987978.7743.56.camel@zakaz.uk.xensource.com> I don't think
> the RTC changes are to blame, since Ian says the baseline was
> 5af4f2ab06f3 which is before then.

Okay - I'm certainly not opposed to a manual push.

Jan
Jan Beulich writes ("Re: [Xen-devel] [xen-unstable test] 15401: regressions - FAIL"):
> Okay - I'm certainly not opposed to a manual push.

Done. Specifically, I have pushed d1bf3b21f783 to non-staging
xen-unstable. The remaining backlog in staging will probably clear
soon too.

Ian.