George Dunlap
2013-May-30 16:16 UTC
Xen 4.3 development update -- RC3 will be an actual candidate, please test
Things are looking in pretty good shape -- at the moment there is only
one worrying bug on our bug tracker for which we don't have a plan.
We're scheduled to do RC3 next Tuesday for the test day on Wednesday.
If that test day goes well, we may actually end up releasing RC3.

So if you've been waiting until the release is more stable to test,
now is the time!

Also, if you're a developer and have outstanding patches that you
think need to be in the release, please push to try to make sure they
are committed by end-of-day Friday (as defined by the committer who
needs to check your patch), so we can get them through the test system
in time for the RC on Tuesday.

This information will be mirrored on the Xen 4.3 Roadmap wiki page:
 http://wiki.xen.org/wiki/Xen_Roadmap/4.3

The key goals we're focusing on now, in order, are as follows:
 1. Have a bug-free 4.3 release
 2. Have an awesome 4.3 release
 3. Have a 4.3 release that happens on schedule (ready by June 15th)

The most important thing in making a case is to answer the question,
"If there are bugs in this patch, will they be discovered before the
June 17th release?" The second most important thing is to consider the
cost/benefit analysis of bugs that are found: what is the risk of
introducing a bug which will delay the release, vs the benefit it will
have in making the release better?

= Timeline

We are planning on a 9-month release cycle. Based on that, below are
our estimated dates:
* Feature freeze: 25 March 2013
* Code freezing point: 15 April 2013
* First RC: 6 May 2013 <== WE ARE HERE
* Release: 17 June 2013

The RCs and release will of course depend on stability and bugs, and
will therefore be fairly unpredictable. Each new feature will be
considered on a case-by-case basis.

The June 17th release is both an estimate and a goal. At this point,
Xen 4.3 can be released whenever it's actually ready. In fact, the
sooner we release, the sooner we can open the tree up for new
development and get on to 4.4 -- so keep fixing those bugs!

Last updated: 30 May 2013

== Completed =

* Default to QEMU upstream (partial)
 - pci pass-thru (external)
 - enable dirtybit tracking during migration (external)
 - xl cd-{insert,eject} (external)

* openvswitch toolstack integration
  To label "tech-preview" unless we get good testing (>10 individuals)

* NUMA scheduler affinity

* Install into /usr/local by default

* Allow config file to specify multiple usb devices for HVM domains

* Persistent grants for blk (external)
 - Linux
 - qemu

* Allow XSM to override IS_PRIV checks in the hypervisor

* vTPM updates

* Scalability: 16TiB of RAM

* CPUID-based idle (don't rely on ACPI info f/ dom0)

* Serial console improvements
 - EHCI debug port

== Bugs resolved since last update =

* Windows 2003 fails to install in Xen-unstable tip (RTC issues)
  resolution: fixed

* XSA-46 regression in PV pass-through?
  resolution: fixed

* qemu-traditional: build on glibc 2.17
  resolution: fixed

* acpi-related xenstore entries not propagated on migrate
  resolution: for 4.4

* mac address changes on reboot if not specified in config file
  resolution: for 4.4

* xendomains bug
  resolution: fixed

* pv shutdown race
  resolution: not a Xen bug (still being tracked)

* libxl cpuid features for sse4* don't match linux features
  resolution: fixed

* qxl not actually working
  resolution: disabled for now, fix for 4.4

== Open bugs =

* Migration w/ qemu-upstream causes stuck clock
 > http://osdir.com/ml/general/2013-05/msg30029.html
  status: Root cause not yet found
  priority: high

* perl 5.18 fails to compile qemu-traditional docs?
 > http://www.gossamer-threads.com/lists/xen/devel/284141
  status: discussion in progress
  priority: minor

* Scan through qemu-upstream changesets looking for important fixes
  (particularly for stubdoms) for qemu-traditional
 - cpu hotplug
  owner: Anthony

* qemu-upstream MMIO hole issue
 > http://lists.xen.org/archives/html/xen-devel/2013-03/msg00559.html
 > "You can reproduce it when a VM has a big pci hole size (such as
   512MB), e.g. create a VM with a virtual device which has a 512MB
   pci BAR."
  priority: high
  status: Plan to make MMIO hole same size as qemu-trad for release,
  then fix it properly for 4.4

* qemu-upstream not freeing pirq
 > http://www.gossamer-threads.com/lists/xen/devel/281498
  priority: high
  status: patches posted

* Revert Jan's debugging patch (commit bd9be94)
  owner: Jan Beulich
  status: Few instances collected; removal late in release cycle

* Update 4.3-rc to 4.3 in README; add tag bragging about 4.3
  owner: George
  status: queued up

* xl does not handle migrate interruption gracefully
 > If you start a localhost migrate, and press "Ctrl-C" in the middle,
 > you get two hung domains
  status: Probably not for 4.3

* libxl / xl does not handle failure of remote qemu gracefully
 > Easiest way to reproduce:
 > - set "vncunused=0" and do a local migrate
 > - The "remote" qemu will fail because the vnc port is in use
 > The failure isn't the problem, but everything being stuck afterwards is
  status: Probably not for 4.3
George Dunlap
2013-May-30 16:28 UTC
Re: Xen 4.3 development update -- RC3 will be an actual candidate, please test
On Thu, May 30, 2013 at 5:16 PM, George Dunlap
<George.Dunlap@eu.citrix.com> wrote:
> == Open bugs =
>
> * Migration w/ qemu-upstream causes stuck clock
>  > http://osdir.com/ml/general/2013-05/msg30029.html
>  status: Root cause not yet found
>  priority: high

This is the only bug I'm tracking right now that we might consider a
blocker.

On the other hand, I've been unable to reproduce the problem on my
systems so far, so it may be a more isolated instance.

> * xl does not handle migrate interruption gracefully
>  > If you start a localhost migrate, and press "Ctrl-C" in the middle,
>  > you get two hung domains
>  status: Probably not for 4.3
>
> * libxl / xl does not handle failure of remote qemu gracefully
>  > Easiest way to reproduce:
>  > - set "vncunused=0" and do a local migrate
>  > - The "remote" qemu will fail because the vnc port is in use
>  > The failure isn't the problem, but everything being stuck afterwards is
>  status: Probably not for 4.3

I think these will probably need more work than we're really willing
to do right now given the (lack of) severity -- any thoughts?

 -George
George Dunlap
2013-May-30 16:30 UTC
Re: Xen 4.3 development update -- RC3 will be an actual candidate, please test
On Thu, May 30, 2013 at 5:16 PM, George Dunlap
<George.Dunlap@eu.citrix.com> wrote:
> * Scan through qemu-upstream changesets looking for important fixes
> (particularly for stubdoms) for qemu-traditional
>  - cpu hotplug
>  owner: Anthony

Anthony, how is this going?

 -George
George Dunlap
2013-May-30 16:31 UTC
Re: Xen 4.3 development update -- RC3 will be an actual candidate, please test
On Thu, May 30, 2013 at 5:16 PM, George Dunlap
<George.Dunlap@eu.citrix.com> wrote:
> * Revert Jan's debugging patch (commit bd9be94)
>  owner: Jan Beulich
>  status: Few instances collected; removal late in release cycle

Do you want to remove this before RC3, or after?

 -George
Stefano Stabellini
2013-May-30 16:35 UTC
Re: Xen 4.3 development update -- RC3 will be an actual candidate, please test
On Thu, 30 May 2013, George Dunlap wrote:
> * qemu-upstream MMIO hole issue
>  > http://lists.xen.org/archives/html/xen-devel/2013-03/msg00559.html
>  > "You can reproduce it when a VM has a big pci hole size (such as
>    512MB), e.g. create a VM with a virtual device which has a 512MB
>    pci BAR."
>  priority: high
>  status: Plan to make MMIO hole same size as qemu-trad for release,
>  then fix it properly for 4.4

Patch sent to qemu-devel; I'll commit a backport to qemu-xen in a
matter of days.
The patch won't fix the problem entirely, but it is going to put
qemu-xen on par with qemu-xen-traditional.

> * qemu-upstream not freeing pirq
>  > http://www.gossamer-threads.com/lists/xen/devel/281498
>  priority: high
>  status: patches posted

RFC patch sent. I don't think this bug should be considered a blocker.
Given that this issue has been present since Xen 4.2 and earlier, it
might not even be appropriate to try to fix it at this stage of the
release cycle.
Jan Beulich
2013-May-30 16:39 UTC
Re: Xen 4.3 development update -- RC3 will be an actual candidate, please test
>>> George Dunlap <George.Dunlap@eu.citrix.com> 05/30/13 6:31 PM >>>
>On Thu, May 30, 2013 at 5:16 PM, George Dunlap <George.Dunlap@eu.citrix.com> wrote:
>> * Revert Jan's debugging patch (commit bd9be94)
>> owner: Jan Beulich
>> status: Few instances collected; removal late in release cycle
>
>Do you want to remove this before RC3, or after?

I shall remove this right away (read: tomorrow) - it caught a single
instance over the many weeks it was in, and I suspect that the added
printing is hiding the problem, so if anything we'd need another, more
quiescent debugging patch (which I don't think we want before the
release, nor could I promise I would get to produce it in time).

Jan
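As an illustration of what a "more quiescent" debugging aid could look like (this is only a sketch, not what Jan proposed and not Xen code), events can be recorded into a small in-memory ring buffer and read out afterwards, instead of being printed at the point where they happen. All names below (`trace_note`, `trace_dump`, `TRACE_SLOTS`) are hypothetical, and the sketch ignores locking, so it assumes a single writer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TRACE_SLOTS 8u  /* hypothetical buffer size */

/* One recorded event; the fields are placeholders for whatever is of interest. */
struct trace_rec {
    uint32_t event;
    uint32_t arg;
};

static struct trace_rec trace_buf[TRACE_SLOTS];
static size_t trace_count;  /* total number of records ever written */

/* Record an event without printing: just a store and an increment,
 * overwriting the oldest slot once the buffer is full. */
void trace_note(uint32_t event, uint32_t arg)
{
    struct trace_rec *r = &trace_buf[trace_count % TRACE_SLOTS];

    r->event = event;
    r->arg = arg;
    trace_count++;
}

/* Copy up to max of the most recent records into out, oldest first.
 * Returns the number of records copied. */
size_t trace_dump(struct trace_rec *out, size_t max)
{
    size_t have = trace_count < TRACE_SLOTS ? trace_count : TRACE_SLOTS;
    size_t n = have < max ? have : max;
    size_t start = trace_count - n;

    assert(out != NULL);

    for (size_t i = 0; i < n; i++)
        out[i] = trace_buf[(start + i) % TRACE_SLOTS];

    return n;
}
```

The trade-off is that the hot path is far cheaper than printing to a console, so it is less likely to perturb timing, but the records are only useful if something later calls `trace_dump` (for example from an error path) before they are overwritten.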
Ian Campbell
2013-May-30 16:59 UTC
Re: Xen 4.3 development update -- RC3 will be an actual candidate, please test
On Thu, 2013-05-30 at 17:28 +0100, George Dunlap wrote:
> > * xl does not handle migrate interruption gracefully
> >  > If you start a localhost migrate, and press "Ctrl-C" in the middle,
> >  > you get two hung domains
> >  status: Probably not for 4.3
> >
> > * libxl / xl does not handle failure of remote qemu gracefully
> >  > Easiest way to reproduce:
> >  > - set "vncunused=0" and do a local migrate
> >  > - The "remote" qemu will fail because the vnc port is in use
> >  > The failure isn't the problem, but everything being stuck afterwards is
> >  status: Probably not for 4.3
>
> I think these will probably need more work than we're really willing
> to do right now given the (lack of) severity -- any thoughts?

FWIW I agree.

Ian.
Andrew Cooper
2013-May-30 17:33 UTC
Re: Xen 4.3 development update -- RC3 will be an actual candidate, please test
On 30/05/2013 17:39, Jan Beulich wrote:
>>>> George Dunlap <George.Dunlap@eu.citrix.com> 05/30/13 6:31 PM >>>
>> On Thu, May 30, 2013 at 5:16 PM, George Dunlap <George.Dunlap@eu.citrix.com> wrote:
>>> * Revert Jan's debugging patch (commit bd9be94)
>>> owner: Jan Beulich
>>> status: Few instances collected; removal late in release cycle
>> Do you want to remove this before RC3, or after?
> I shall remove this right away (read: tomorrow) - it caught a single instance over
> the many weeks it was in, and I'm suspecting that the added printing is hiding the
> problem, so if anything we'd need another more quiescent debugging patch (which
> I don't think we want before the release, nor could I promise I would get to produce
> it in time).
>
> Jan

FWIW,
http://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=2ae8b9173fb2388af6514c730d620ed5f450bc34
was also debugging and should probably come back out. I am not certain
that it actually caught any issues, but we did the fixing up of legacy
PIC interrupts which I suspect might have been the cause of this.

~Andrew
Alex Bligh
2013-May-30 19:35 UTC
Re: Xen 4.3 development update -- RC3 will be an actual candidate, please test
George,

--On 30 May 2013 17:28:20 +0100 George Dunlap
<George.Dunlap@eu.citrix.com> wrote:

>> * Migration w/ qemu-upstream causes stuck clock
>> > http://osdir.com/ml/general/2013-05/msg30029.html
>> status: Root cause not yet found
>> priority: high
>
> This is the only bug I'm tracking right now that we might consider a
> blocker.
>
> On the other hand, I've been unable to reproduce the problem on my
> systems so far, so it may be a more isolated instance.

We can reproduce this with your script on more than one system with
local migration. Diana's going to try changing the guest kernel (I
think that's what you wanted) tomorrow.

Happy to volunteer any other assistance we can give you in reproducing
this (this is easy for me to say as I suspect it's Diana who's going
to be doing the actual work ...)

-- 
Alex Bligh
Jan Beulich
2013-May-31 06:13 UTC
Re: Xen 4.3 development update -- RC3 will be an actual candidate, please test
>>> On 30.05.13 at 19:33, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> FWIW,
>
> http://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=2ae8b9173fb2388af6514c730d620ed5f450bc34
>
> was also debugging and should probably come back out. I am not certain
> that it actually caught any issues, but we did the fixing up of legacy
> PIC interrupts which I suspect might have been the cause of this.

Indeed, though I would just frame the block by #ifndef NDEBUG.
Keir - do you have any preference here?

Jan
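For readers unfamiliar with the idiom, "framing the block by #ifndef NDEBUG" means keeping the diagnostic code in the tree but compiling it only into debug builds. The following is a minimal stand-alone sketch of that pattern; it is not the code from the commit under discussion, and `note_irq`, `unexpected_irq_count` and `NR_IRQS` are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdio.h>

#define NR_IRQS 16u  /* hypothetical limit, for illustration only */

static unsigned int unexpected_irqs;

/* Account for one interrupt. Returns 1 if a diagnostic line was
 * printed, 0 otherwise (always 0 when built with -DNDEBUG). */
int note_irq(unsigned int irq, int expected)
{
    int printed = 0;

    /* assert() is itself compiled out when NDEBUG is defined. */
    assert(irq < NR_IRQS);

    /* The accounting stays in every build. */
    if (!expected)
        unexpected_irqs++;

#ifndef NDEBUG
    /* Debug-only block: present in debug builds, absent from release builds. */
    if (!expected) {
        fprintf(stderr, "debug: unexpected IRQ %u (%u so far)\n",
                irq, unexpected_irqs);
        printed = 1;
    }
#else
    (void)irq;  /* silence unused-parameter warnings in release builds */
#endif

    return printed;
}

unsigned int unexpected_irq_count(void)
{
    return unexpected_irqs;
}
```

Compiling the same file with and without `-DNDEBUG` shows the difference: the counter behaves identically in both builds, while the message is only emitted in the debug one. The alternative raised in this thread is to remove the block entirely rather than keep it behind the guard.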
Jan Beulich
2013-May-31 06:41 UTC
Re: Xen 4.3 development update -- RC3 will be an actual candidate, please test
>>> On 30.05.13 at 18:31, George Dunlap <George.Dunlap@eu.citrix.com> wrote:
> On Thu, May 30, 2013 at 5:16 PM, George Dunlap
> <George.Dunlap@eu.citrix.com> wrote:
>> * Revert Jan's debugging patch (commit bd9be94)
>> owner: Jan Beulich
>> status: Few instances collected; removal late in release cycle
>
> Do you want to remove this before RC3, or after?

Revert pushed.

Jan
Andrew Cooper
2013-May-31 10:42 UTC
Re: Xen 4.3 development update -- RC3 will be an actual candidate, please test
On 31/05/13 07:13, Jan Beulich wrote:
>>>> On 30.05.13 at 19:33, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>> FWIW,
>>
>> http://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=2ae8b9173fb2388af6514c730d620ed5f450bc34
>>
>> was also debugging and should probably come back out. I am not certain
>> that it actually caught any issues, but we did the fixing up of legacy
>> PIC interrupts which I suspect might have been the cause of this.
> Indeed, though I would just frame the block by #ifndef NDEBUG.
> Keir - do you have any preference here?
>
> Jan
>

Having used similar debugging structures for this recent XSA-36
related issue, the result from this change is not actually too useful.
I would certainly suggest complete removal over having it in an NDEBUG
section.

~Andrew