Plan for a 4.2 release:
http://lists.xen.org/archives/html/xen-devel/2012-03/msg00793.html

The time line is as follows:

19 March  -- TODO list locked down
2 April   -- Feature Freeze
30 July   -- First release candidate    << DONE
Weekly    -- RCN+1 until release        << WE ARE HERE

We released RC1 last week. Many of the known issues are now fixed in
Mercurial and I expect rc2 will follow once a test push has happened
(an infrastructure failure prevented this from happening over the
weekend).

The updated TODO list follows.

hypervisor, blockers:

    * None

tools, blockers:

    * libxl stable API -- we would like 4.2 to define a stable API
      which downstreams can start to rely on not changing. Aspects of
      this are:

        * Interfaces which may need to be async:

            * libxl_device_pci_add (and remove). (Ian C, DONE)

    * xl compatibility with xm:

        * No known issues

    * [CHECK] More formally deprecate xm/xend. Manpage patches already
      in tree. Needs release noting and communication around -rc1 to
      remind people to test xl.

    * calling hotplug scripts from xl (Linux and NetBSD) (Roger Pau
      Monné, DONE)

    * Block script support (Ian C, DONE)

    * [CHECK] Confirm that migration from Xen 4.1 -> 4.2 works.

    * [BUG] libxl__devices_destroy has a race against
      plugging/unplugging devices to the domain which can result in
      over- or under-flowing the aodevs array (Roger Pau Monné, Ian
      Jackson, DONE)

    * Bump library SONAMES as necessary.
      <20502.39440.969619.824976@mariner.uk.xensource.com>

hypervisor, nice to have:

    * vMCE save/restore changes, to simplify migration 4.2->4.3 with
      new vMCE in 4.3. (Jinsong Liu, Jan Beulich)

tools, nice to have:

    * xl compatibility with xm:

        * None

    * libxl stable API

        * libxl_wait_for_free_memory/libxl_wait_for_memory_target.
          Interface needs an overhaul, related to
          locking/serialization over domain create. IanJ to add note
          about this interface being substandard but otherwise defer
          to 4.3. (DONE)

    * xl.cfg(5) documentation patch for qemu-upstream
      videoram/videomem support:
      http://lists.xen.org/archives/html/xen-devel/2012-05/msg00250.html
      qemu-upstream doesn't support specifying videomem size for the
      HVM guest cirrus/stdvga (but this works with
      qemu-xen-traditional). (Pasi Kärkkäinen)

    * [BUG] long stop during the guest boot process with qcow image,
      reported by Intel:
      http://bugzilla.xen.org/bugzilla/show_bug.cgi?id=1821

    * [BUG] vcpu-set doesn't take effect on guest, reported by Intel:
      http://bugzilla.xen.org/bugzilla/show_bug.cgi?id=1822

    * Load blktap driver from xencommons initscript if available,
      thread at: <db614e92faf743e20b3f.1337096977@kodo2>. To be fixed
      more properly in 4.3. (Patch posted, discussion, plan to take
      simple xencommons patch for 4.2 and revisit for 4.3. Ping sent)

    * [BUG] xl allows same PCI device to be assigned to multiple
      guests. http://bugzilla.xen.org/bugzilla/show_bug.cgi?id=1826
      (<E4558C0C96688748837EB1B05BEED75A0FD5574A@SHSMSX102.ccr.corp.intel.com>)

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
On Mon, Aug 6, 2012 at 9:58 AM, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> hypervisor, nice to have:

* [BUG(?)] Under certain conditions, the p2m_pod_sweep code will stop
halfway through searching, causing a guest to crash even if there was
zeroed memory available. This is NOT a regression from 4.1, and is a
very rare case, so probably shouldn't be a blocker. (In fact, I'd be
open to the idea that it should wait until after the release to get
more testing.)

I can take this one.

> tools, nice to have:

* [BUG(?)] If domain 0 attempts to access a guest's memory before it
is finished being built, and it is being built in PoD mode, this may
cause the guest to crash. Again, this is NOT a regression from 4.1.
Furthermore, it's only been reported (AIUI) by a customer of OpenSuSE;
so it shouldn't be a blocker. (Again, I'd be open to the idea that it
should wait until after the release to get more testing.)

Jan, since you've got access to the system that reproduces it, do you
want to take this one? I think it should just be a matter of moving
xc_domain_set_target() just before the while() loop in the domain
builder (but after xc_domain_populate_physmap_exact, I think), and
changing the loop to never allocate real memory in PoD mode. I can do
it, but it will be longer before we can get it tested.

-George
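The reordering George proposes can be illustrated with a toy model. This is only a sketch and not Xen or libxc code: the ToyGuest class, its methods, and the build() helper are all hypothetical names made up for the illustration. It also assumes that the crash comes from a PoD entry being touched while nothing is yet available to back it, which is a reading of the report and not something the thread confirms.

```python
class GuestCrash(Exception):
    """Raised when a PoD entry is touched and nothing can back it."""


class ToyGuest:
    """Hypothetical stand-in for a guest under construction (not Xen code)."""

    def __init__(self):
        self.pod_entries = 0  # pages marked populate-on-demand, not yet backed
        self.pod_cache = 0    # pages available to back PoD entries
        self.populated = 0    # pages already backed by memory

    def mark_pod(self, pages):
        """Builder loop step: mark pages as PoD without allocating memory."""
        self.pod_entries += pages

    def set_target(self, pages):
        """Assumed behaviour: setting the target makes the cache available."""
        self.pod_cache = max(0, pages - self.populated)

    def touch(self):
        """Model an early access (for example from dom0) to one PoD page."""
        if self.pod_entries == 0:
            return
        if self.pod_cache == 0:
            raise GuestCrash("PoD entry touched with an empty cache")
        self.pod_cache -= 1
        self.pod_entries -= 1
        self.populated += 1


def build(guest, total_pages, target_pages, chunk, target_first,
          touch_during_build):
    """Toy builder; target_first selects the proposed ordering."""
    if target_first:
        guest.set_target(target_pages)   # proposed: before the loop
    remaining = total_pages
    while remaining > 0:
        n = min(chunk, remaining)
        guest.mark_pod(n)                # never allocates real memory
        remaining -= n
        if touch_during_build:
            guest.touch()                # early access while still building
    if not target_first:
        guest.set_target(target_pages)   # old ordering: after the loop


def crashes(target_first):
    """Return True if an early access crashes the toy guest."""
    try:
        build(ToyGuest(), 1024, 256, 128, target_first, True)
    except GuestCrash:
        return True
    return False
```

In this model crashes(False) reports a crash and crashes(True) does not, purely because the target is in place before anything can touch a PoD page. Whether the real builder behaves the same way is exactly what would need testing on the system that reproduces the problem.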
On Mon, Aug 6, 2012 at 5:17 PM, George Dunlap <George.Dunlap@eu.citrix.com> wrote:
> * [BUG(?)] If domain 0 attempts to access a guest's memory before it
> is finished being built, and it is being built in PoD mode, this may
> cause the guest to crash. Again, this is NOT a regression from 4.1.
> Furthermore, it's only been reported (AIUI) by a customer of OpenSuSE;

Sorry, SuSE, not OpenSuSE; fingers on auto-pilot...
>>> On 06.08.12 at 18:17, George Dunlap <George.Dunlap@eu.citrix.com> wrote:
> On Mon, Aug 6, 2012 at 9:58 AM, Ian Campbell <Ian.Campbell@citrix.com> wrote:
>> hypervisor, nice to have:
>
> * [BUG(?)] Under certain conditions, the p2m_pod_sweep code will stop
> halfway through searching, causing a guest to crash even if there was
> zeroed memory available. This is NOT a regression from 4.1, and is a
> very rare case, so probably shouldn't be a blocker. (In fact, I'd be
> open to the idea that it should wait until after the release to get
> more testing.)
>
> I can take this one.
>
>> tools, nice to have:
>
> * [BUG(?)] If domain 0 attempts to access a guest's memory before it
> is finished being built, and it is being built in PoD mode, this may
> cause the guest to crash. Again, this is NOT a regression from 4.1.
> Furthermore, it's only been reported (AIUI) by a customer of OpenSuSE;

It's a SLES customer really, and hence is quite a bit more important
to us than if it was an openSUSE one. But we'd have to backport an
eventual fix anyway (as this was reported against 4.0.x), so as long
as a fix becomes available we'd be fine with backporting it no matter
whether it makes it into 4.2 (though of course, given that all
versions of the PoD code are affected, getting this fixed in 4.0.4
and 4.1.3 would seem desirable).

> so it shouldn't be a blocker. (Again, I'd be open to the idea that it
> should wait until after the release to get more testing.)
>
> Jan, since you've got access to the system that reproduces it, do you
> want to take this one? I think it should just be a matter of moving
> xc_domain_set_target() just before the while() loop in the domain
> builder (but after xc_domain_populate_physmap_exact, I think), and
> changing the loop to never allocate real memory in PoD mode. I can do
> it, but it will be longer before we can get it tested.

If that's really expected to work, then yes, I can have them test
such a change (as indicated, on 4.0.x) and post the patch once
validated. But it wasn't really clear to me whether the non-PoD
allocation for at least the low 2Mb wasn't on purpose (as that's
where BIOS image and hvmloader will end up sitting).

Jan
>>> On 06.08.12 at 10:58, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> hypervisor, nice to have:

- fix S3 regression(s?) reported by Ben Guthro
- address PoD problems with early host side accesses to guest
  address space (draft patch for 4.0.x exists, needs to be ported
  over to -unstable, which I'll expect to get to today)
- fix high change rate to CMOS RTC periodic interrupt causing
  guest wall clock time to lag (possible fix outlined, needs to be
  put in patch form and thoroughly reviewed/tested for unwanted
  side effects)

(some of these may need considering to be put under blockers)

Jan
On Mon, Aug 13, 2012 at 3:05 AM, Jan Beulich <JBeulich@suse.com> wrote:
> - fix S3 regression(s?) reported by Ben Guthro

I am continuing to try to root-cause this, but it is slow going, as I
keep getting pulled off to look at other things. I have not found a
smoking gun yet.

Is this exclusive to my setup? Does S3 work elsewhere with 4.2?
It seems to fail 100% of the time on 100% of x86 machines I have tried.
>>> On 13.08.12 at 19:26, Ben Guthro <ben@guthro.net> wrote:
> Is this exclusive to my setup? Does S3 work elsewhere with 4.2?
> It seems to fail 100% of the time on 100% of x86 machines I have tried.

Don't know. The systems I have tried S3 on have problems
even with native Linux, so there's not much point playing
with Xen on them.

Ian(J), I don't suppose this is part of the regression tests?

Gabriel, Yongjie - ISTR this is part of your regular VMX testing,
and the most recent report doesn't mention any problem.

Jan
On Mon, 2012-08-13 at 08:05 +0100, Jan Beulich wrote:
> - address PoD problems with early host side accesses to guest
>   address space (draft patch for 4.0.x exists, needs to be ported
>   over to -unstable, which I'll expect to get to today)

Is this the same as one of the two existing PoD entries? Expecting
not, I'll include it separately today (I'm going to post the update
very shortly) and we can reconcile any duplication for next week.

Hypervisor:
    * [BUG(?)] Under certain conditions, the p2m_pod_sweep code will
      stop halfway through searching, causing a guest to crash even if
      there was zeroed memory available. This is NOT a regression
      from 4.1, and is a very rare case, so probably shouldn't be a
      blocker. (In fact, I'd be open to the idea that it should wait
      until after the release to get more testing.)
      (George Dunlap)

Tools:
    * [BUG(?)] If domain 0 attempts to access a guest's memory before
      it is finished being built, and it is being built in PoD mode,
      this may cause the guest to crash. Again, this is NOT a
      regression from 4.1. Furthermore, it's only been reported
      (AIUI) by a customer of SuSE; so it shouldn't be a blocker.
      (Again, I'd be open to the idea that it should wait until after
      the release to get more testing.)
      (George Dunlap / Jan Beulich)

> (some of these may need considering to be put under blockers)

I'll make them all nice to have for now, let me know if you decide
some of them should be blockers.

Ian.
> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Tuesday, August 14, 2012 3:38 PM
> To: Ian Jackson; Ben Guthro; Wu, GabrielX; Ren, Yongjie
> Cc: Ian Campbell; xen-devel; Keir Fraser
> Subject: Re: [Xen-devel] 4.2 TODO / Release Plan
>
> >>> On 13.08.12 at 19:26, Ben Guthro <ben@guthro.net> wrote:
> > Is this exclusive to my setup? Does S3 work elsewhere with 4.2?
> > It seems to fail 100% of the time on 100% of x86 machines I have tried.
>
> Don't know. The systems I have tried S3 on have problems
> even with native Linux, so there's not much point playing
> with Xen on them.
>
> Ian(J), I don't suppose this is part of the regression tests?
>
> Gabriel, Yongjie - ISTR this is part of your regular VMX testing,
> and the most recent report doesn't mention any problem.
>

I'll double check the S3 issue, and give an update later.
>>> On 14.08.12 at 11:02, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> On Mon, 2012-08-13 at 08:05 +0100, Jan Beulich wrote:
>> - address PoD problems with early host side accesses to guest
>>   address space (draft patch for 4.0.x exists, needs to be ported
>>   over to -unstable, which I'll expect to get to today)
>
> Is this the same as one of the two existing PoD entries? Expecting not
> I'll include it separately today (I'm going to post the update very
> shortly) and we can reconcile any duplication for next week.
>
> Hypervisor:
>     * [BUG(?)] Under certain conditions, the p2m_pod_sweep code will
>       stop halfway through searching, causing a guest to crash even if
>       there was zeroed memory available. This is NOT a regression
>       from 4.1, and is a very rare case, so probably shouldn't be a
>       blocker. (In fact, I'd be open to the idea that it should wait
>       until after the release to get more testing.)
>       (George Dunlap)
>
> Tools:
>     * [BUG(?)] If domain 0 attempts to access a guest's memory before
>       it is finished being built, and it is being built in PoD mode,
>       this may cause the guest to crash. Again, this is NOT a
>       regression from 4.1. Furthermore, it's only been reported
>       (AIUI) by a customer of SuSE; so it shouldn't be a blocker.
>       (Again, I'd be open to the idea that it should wait until after
>       the release to get more testing.)
>       (George Dunlap / Jan Beulich)

It's the same as this second entry, albeit the fix is not limited to
the tools. Patch posted a few minutes ago.

Jan
On Tue, 2012-08-14 at 10:34 +0100, Jan Beulich wrote:
> >>> On 14.08.12 at 11:02, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> > On Mon, 2012-08-13 at 08:05 +0100, Jan Beulich wrote:
> >> - address PoD problems with early host side accesses to guest
> >>   address space (draft patch for 4.0.x exists, needs to be ported
> >>   over to -unstable, which I'll expect to get to today)
> >
> > Is this the same as one of the two existing PoD entries? Expecting not
> > I'll include it separately today (I'm going to post the update very
> > shortly) and we can reconcile any duplication for next week.
> >
> > Hypervisor:
> >     * [BUG(?)] Under certain conditions, the p2m_pod_sweep code will
> >       stop halfway through searching, causing a guest to crash even if
> >       there was zeroed memory available. This is NOT a regression
> >       from 4.1, and is a very rare case, so probably shouldn't be a
> >       blocker. (In fact, I'd be open to the idea that it should wait
> >       until after the release to get more testing.)
> >       (George Dunlap)
> >
> > Tools:
> >     * [BUG(?)] If domain 0 attempts to access a guest's memory before
> >       it is finished being built, and it is being built in PoD mode,
> >       this may cause the guest to crash. Again, this is NOT a
> >       regression from 4.1. Furthermore, it's only been reported
> >       (AIUI) by a customer of SuSE; so it shouldn't be a blocker.
> >       (Again, I'd be open to the idea that it should wait until after
> >       the release to get more testing.)
> >       (George Dunlap / Jan Beulich)
>
> It's the same as this second entry, albeit the fix is not limited to
> the tools. Patch posted a few minutes ago.

Thanks, I collapsed both entries into

    * address PoD problems with early host side accesses to guest
      address space (Jan Beulich, patch posted)

although I expect it will be DONE by the time I repost next week...

Ian.
Jan Beulich writes ("Re: [Xen-devel] 4.2 TODO / Release Plan"):
> >>> On 13.08.12 at 19:26, Ben Guthro <ben@guthro.net> wrote:
> > Is this exclusive to my setup? Does S3 work elsewhere with 4.2?
> > It seems to fail 100% of the time on 100% of x86 machines I have tried.
>
> Don't know. The systems I have tried S3 on have problems
> even with native Linux, so there's not much point playing
> with Xen on them.
>
> Ian(J), I don't suppose this is part of the regression tests?

No, I'm afraid not.

Ian.
> -----Original Message-----
> From: Ren, Yongjie
> Sent: Tuesday, August 14, 2012 5:03 PM
> To: Jan Beulich; Ian Jackson; Ben Guthro; Wu, GabrielX
> Cc: Ian Campbell; xen-devel; Keir Fraser
> Subject: RE: [Xen-devel] 4.2 TODO / Release Plan
>
> > >>> On 13.08.12 at 19:26, Ben Guthro <ben@guthro.net> wrote:
> > > Is this exclusive to my setup? Does S3 work elsewhere with 4.2?
> > > It seems to fail 100% of the time on 100% of x86 machines I have tried.
> >
> > Don't know. The systems I have tried S3 on have problems
> > even with native Linux, so there's not much point playing
> > with Xen on them.
> >
> > Ian(J), I don't suppose this is part of the regression tests?
> >
> > Gabriel, Yongjie - ISTR this is part of your regular VMX testing,
> > and the most recent report doesn't mention any problem.
> >
> I'll double check the S3 issue, and give update later.
>

I can also reproduce the S3 issue on my hardware. It can sleep to
memory (S3) but can't resume after pressing the power button.

It's not a recent regression. It has existed for a long time. We
filed a bug for Dom0 S3 more than 1.5 years ago.
http://bugzilla.xen.org/bugzilla/show_bug.cgi?id=1707
(There are some comments in this bug.)

There are 2 reasons why we didn't list this bug in our recent report.
1. Our report is based on our automatic testing, but dom0 S3 is not
   included. It's difficult to make such a case automatic.
2. We really missed the bug in our recent reports. Sorry. We'll add
   it to the old issue list for tracking.