flight 17402 xen-unstable real [real]
http://www.chiark.greenend.org.uk/~xensrcts/logs/17402/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-xl-win7-amd64 10 guest-saverestore.2 fail REGR. vs. 17399
 build-amd64-oldkern 4 xen-build fail REGR. vs. 17399
 build-i386-oldkern 4 xen-build fail REGR. vs. 17399

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-pcipt-intel 9 guest-start fail never pass
 test-amd64-amd64-xl-qemut-win7-amd64 13 guest-stop fail never pass
 test-amd64-amd64-xl-win7-amd64 13 guest-stop fail never pass
 test-amd64-amd64-xl-qemuu-winxpsp3 13 guest-stop fail never pass
 test-amd64-amd64-xl-qemuu-win7-amd64 13 guest-stop fail never pass
 test-amd64-i386-xend-winxpsp3 16 leak-check/check fail never pass
 test-amd64-amd64-xl-winxpsp3 13 guest-stop fail never pass
 test-amd64-amd64-xl-qemut-winxpsp3 13 guest-stop fail never pass
 test-amd64-i386-xl-qemut-win7-amd64 13 guest-stop fail never pass
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1 13 guest-stop fail never pass
 test-amd64-i386-xl-winxpsp3-vcpus1 13 guest-stop fail never pass
 test-amd64-i386-xend-qemut-winxpsp3 16 leak-check/check fail never pass

version targeted for testing:
 xen 6890cebc6a987d0e896f5d23a8de11a3934101cf
baseline version:
 xen 72af01bf6f7489e54ad59270222a29d3e8c501d1

------------------------------------------------------------
People who touched revisions under test:
 "Zhang, Xiantao" <xiantao.zhang@intel.com>
 Andrew Cooper <andrew.cooper3@citrix.com>
 Dr. Greg Wettstein <greg@enjellic.com>
 Fabio Fantoni <fabio.fantoni@m2r.biz>
 George Dunlap <george.dunlap@eu.citrix.com>
 Ian Jackson <ian.jackson@eu.citrix.com>
 Jan Beulich <jbeulich@suse.com>
 Malcolm Crossley <malcolm.crossley@citrix.com>
 Marcus Granado <marcus.granado@citrix.com>
 Wei Liu <wei.liu2@citrix.com>
------------------------------------------------------------

jobs:
 build-amd64                               pass
 build-armhf                               pass
 build-i386                                pass
 build-amd64-oldkern                       fail
 build-i386-oldkern                        fail
 build-amd64-pvops                         pass
 build-i386-pvops                          pass
 test-amd64-amd64-xl                       pass
 test-amd64-i386-xl                        pass
 test-amd64-i386-rhel6hvm-amd              pass
 test-amd64-i386-qemut-rhel6hvm-amd        pass
 test-amd64-i386-qemuu-rhel6hvm-amd        pass
 test-amd64-amd64-xl-qemut-win7-amd64      fail
 test-amd64-i386-xl-qemut-win7-amd64       fail
 test-amd64-amd64-xl-qemuu-win7-amd64      fail
 test-amd64-amd64-xl-win7-amd64            fail
 test-amd64-i386-xl-win7-amd64             fail
 test-amd64-i386-xl-credit2                pass
 test-amd64-amd64-xl-pcipt-intel           fail
 test-amd64-i386-rhel6hvm-intel            pass
 test-amd64-i386-qemut-rhel6hvm-intel      pass
 test-amd64-i386-qemuu-rhel6hvm-intel      pass
 test-amd64-i386-xl-multivcpu              pass
 test-amd64-amd64-pair                     pass
 test-amd64-i386-pair                      pass
 test-amd64-amd64-xl-sedf-pin              pass
 test-amd64-amd64-pv                       pass
 test-amd64-i386-pv                        pass
 test-amd64-amd64-xl-sedf                  pass
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1  fail
 test-amd64-i386-xl-winxpsp3-vcpus1        fail
 test-amd64-i386-xend-qemut-winxpsp3       fail
 test-amd64-amd64-xl-qemut-winxpsp3        fail
 test-amd64-amd64-xl-qemuu-winxpsp3        fail
 test-amd64-i386-xend-winxpsp3             fail
 test-amd64-amd64-xl-winxpsp3              fail

------------------------------------------------------------
sg-report-flight on woking.cam.xci-test.com
logs: /home/xc_osstest/logs
images: /home/xc_osstest/images

Logs, config files, etc. are available at
    http://www.chiark.greenend.org.uk/~xensrcts/logs

Test harness code can be found at
    http://xenbits.xensource.com/gitweb?p=osstest.git;a=summary

Not pushing.
------------------------------------------------------------
commit 6890cebc6a987d0e896f5d23a8de11a3934101cf
Author: Malcolm Crossley <malcolm.crossley@citrix.com>
Date:   Mon Mar 25 14:31:27 2013 +0100

    VT-d: deal with 5500/5520/X58 errata

    http://www.intel.com/content/www/us/en/chipsets/5520-and-5500-chipset-ioh-specification-update.html

    Stepping B-3 has two errata (#47 and #53) related to Interrupt
    remapping, to which the workaround is for the BIOS to completely disable
    interrupt remapping. These errata are fixed in stepping C-2.

    Unfortunately this chipset stepping is very common and many BIOSes are
    not disabling interrupt remapping on this stepping. We can detect this
    in Xen and prevent Xen from using the problematic interrupt remapping
    feature.

    The Intel 5500/5520/X58 chipset does not support VT-d Extended Interrupt
    Mode (EIM). This means the iommu_supports_eim() check always fails and
    so x2apic mode cannot be enabled in Xen before this quirk disables the
    interrupt remapping feature.

    Signed-off-by: Malcolm Crossley <malcolm.crossley@citrix.com>
    Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

    Gate the function call to check the quirk on interrupt remapping being
    requested to get enabled, and upon failure disable the IOMMU to be in
    line with what the changes for XSA-36 (plus follow-ups) did.

    Signed-off-by: Jan Beulich <jbeulich@suse.com>
    Acked-by: "Zhang, Xiantao" <xiantao.zhang@intel.com>

commit fae0372140befb88d890a30704a8ec058c902af8
Author: Jan Beulich <jbeulich@suse.com>
Date:   Mon Mar 25 14:28:31 2013 +0100

    IOMMU: properly check whether interrupt remapping is enabled

    ... rather than the IOMMU as a whole.

    That in turn required making sure iommu_intremap gets properly cleared
    when the respective initialization fails (or isn't being done at all).

    Along with making sure interrupt remapping doesn't get inconsistently
    enabled on some IOMMUs and not on others in the VT-d code, this in turn
    allowed quite a bit of cleanup on the VT-d side (if desired, that
    cleanup could of course be broken out into a separate patch).

    Signed-off-by: Jan Beulich <jbeulich@suse.com>
    Acked-by: "Zhang, Xiantao" <xiantao.zhang@intel.com>

commit 85bae8b3406b234f3074617771072623525a3576
Author: Wei Liu <wei.liu2@citrix.com>
Date:   Tue Mar 19 17:45:49 2013 +0000

    xenconsoled: use array index to keep track of pollfd

    If we use pointers to reference elements inside the array, it is
    possible that we get a wild pointer after realloc(3) copies the array
    and returns a new pointer.

    Keeping track of element indexes inside the array can solve this
    problem.

    Signed-off-by: Wei Liu <wei.liu2@citrix.com>
    Cc: Andrew Cooper <andrew.cooper3@citrix.com>
    Reviewed-by: Marcus Granado <marcus.granado@citrix.com>
    Tested-by: Marcus Granado <marcus.granado@citrix.com>
    Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>

commit f325cdf4118a3209d4c193b9490bd5bc8f2150fb
Author: George Dunlap <george.dunlap@eu.citrix.com>
Date:   Mon Mar 11 13:57:47 2013 +0000

    libxl: Streamline vnc argument generation code

    Makes the following changes to the vnc generation code:
    * Simplifies and comments it, making it easier to read and grok
    * Throws an error if duplicate values of display are set, rather than
      the current very un-intuitive behavior.

    Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
    Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>

commit 6cffb2b469a55032a2900ccb8776c0082f346758
Author: Dr. Greg Wettstein <greg@wind.enjellic.com>
Date:   Tue Mar 19 07:26:33 2013 +0000

    tools: Retry blktap2 tapdisk message on interrupt.

    Re-start blktap2 IPC select call on interrupt.

    We hunted this miserable bug for a long time.

    The teardown of a blktap2 tapdisk instance is being carried out
    inconsistently up to and including the 4.2.1 release. The problem
    appears to be a classic 'Heisenbug' which disappears if a single
    function call is added to the tapdisk shutdown path. It is likely this
    bug has been in existence for the life of the blktap2 code.

    Control messages to manipulate a tapdisk instance are sent over a UNIX
    domain socket. A select call is used on both the read and write paths
    to wait on I/O and to set a timeout for the transmission and reception
    of the control plane messages.

    The existing code fails receipt or transmission of the control message
    on any type of error return from the select call. The xl control
    process receives an interrupt while waiting in the select call which in
    turn causes an error return with SIGINT as the return code. This
    prematurely terminates the teardown of the tapdisk instance leaving it
    in various states of shutdown.

    Since multiple messages are needed to implement a full teardown the
    tapdisk instance can be left in various states ranging from fully
    connected to only the minor being left allocated.

    The fix is straightforward. Check the return code from the select call
    and re-try read or write of the control message if errno is set to
    EINTR. The problem manifests itself in the read path but there appears
    to be little reason to not add the fix to the write path as well. Both
    paths appear to be cut-and-paste copies of each other.

    Signed-off-by: Dr. Greg Wettstein <greg@enjellic.com>
    Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>

commit 753d16c1d0d5e194546de1a9f67034d3e6576844
Author: fantonifabio@tiscali.it <fantonifabio@tiscali.it>
Date:   Mon Mar 18 10:59:53 2013 +0000

    tools/firmware: Fix ovmf build with gcc version different from 4.4

    Signed-off-by: Fabio Fantoni <fabio.fantoni@m2r.biz>

(qemu changes not included)
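
For readers unfamiliar with the pattern, the retry described in the blktap2 commit message above can be sketched roughly as follows. This is only an illustration and not the actual tapdisk control code: the helper name wait_readable_retry and its timeout parameter are made up for this example. It relies on the POSIX behaviour that an interrupted select() returns -1 with errno set to EINTR.

```c
#include <errno.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Hypothetical helper (not taken from blktap2): wait until fd becomes
 * readable, restarting select() whenever a signal interrupts it.
 * Returns 1 if fd is readable, 0 on timeout, -1 on any other error.
 */
static int wait_readable_retry(int fd, int timeout_sec)
{
    for (;;) {
        fd_set readfds;
        struct timeval tv;
        int ret;

        /* select() may modify both arguments, so rebuild them each pass. */
        FD_ZERO(&readfds);
        FD_SET(fd, &readfds);
        tv.tv_sec = timeout_sec;
        tv.tv_usec = 0;

        ret = select(fd + 1, &readfds, NULL, NULL, &tv);
        if (ret == -1 && errno == EINTR)
            continue;   /* interrupted by a signal: retry, do not fail */
        if (ret == -1)
            return -1;  /* genuine error */
        return ret > 0 ? 1 : 0;
    }
}
```

Note that this sketch restarts with the full timeout after every interruption, so the total wait can exceed timeout_sec; whether that matters for the tapdisk control path is not something the commit message says.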
Jan Beulich
2013-Mar-26 07:59 UTC
ongoing platform timer wraps (was: [xen-unstable test] 17402: regressions - FAIL)
>>> On 25.03.13 at 22:59, xen.org <ian.jackson@eu.citrix.com> wrote:
> flight 17402 xen-unstable real [real]
> http://www.chiark.greenend.org.uk/~xensrcts/logs/17402/
>
> Regressions :-(
>
> Tests which did not succeed and are blocking,
> including tests which could not be run:
>  test-amd64-i386-xl-win7-amd64 10 guest-saverestore.2 fail REGR. vs. 17399

The log here (on itch-mite, as before) has another platform timer wrap
instance, followed by the NMI watchdog kicking in (again in guest
context, i.e. again an almost certainly false positive caused by the
corruption of time stamps).

You never responded either way to the debugging patch* I had posted for
this, but I continue to think we need to put in something at least
temporarily (i.e. until we understand what's going on here).

Unless this is a hardware glitch (or the wrap detection somehow produces
a false positive, though I currently can't see how that would happen), I
actually suspect two problems - the occurrence of the wrap, and
something with the recovery.

Jan

* http://lists.xen.org/archives/html/xen-devel/2013-03/msg01072.html