Xu, Jiajun
2010-Apr-06 08:47 UTC
[Xen-devel] Xen-4.0.0 RC9 Test Report. Xen: #21087 & Dom0: #4ebd13...
Hi All,

Here is the test report for Xen-4.0.0-rc9 (Xen C/S 21087). Our testing contains old bug verification and regression testing. There is no new bug exposed in this testing. The vcpu pin issue is fixed. VT-d/SR-IOV and TXT basic function can work on our platform.

We use Pv_ops (xen/master, 2.6.31.12) as Dom0 in our testing.

Status Summary
===========================================================================
Feature                 Result
------------------------------------------------------
VT-x                    PASS
VT-x (EPT+VPID)         PASS
VT-x (Real Mode)        PASS
RAS                     Buggy
VT-d                    PASS
SR-IOV                  Buggy
TXT                     PASS
PowerMgmt               PASS
Other                   PASS

Xen Info
===========================================================================
xen-changeset:  21087:e1af8db30335
pvops git:      commit 4ebd1335d19117929e939878a2e38b6c92a0ffb7
ioemu git:      commit f1d909f0f854194f5a40d850886d1413fb8b63c2

Old Open Bugs (3)
===========================================================================
1. System hang after many times cpu online/offline
   http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1596
2. [RAS] CPUs are not in the correct NUMA node after hot-add memory
   http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1573
3. [SR-IOV] Qemu report pci_msix_writel error while assigning VF to guest
   http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1575

Fixed Bugs (2)
===========================================================================
1. vcpu pin can't work on PAE host
   http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1595
2. CPU may panic when running cpu online/offline for more than 100 times
   http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1594

Best Regards,
Jiajun

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Keir Fraser
2010-Apr-06 09:44 UTC
Re: [Xen-devel] Xen-4.0.0 RC9 Test Report. Xen: #21087 & Dom0: #4ebd13...
On 06/04/2010 09:47, "Xu, Jiajun" <jiajun.xu@intel.com> wrote:

> Old Open Bugs (3)
> ===========================================================================
> 1. System hang after many times cpu online/offline
> http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1596

I have a potential fix for this which I hope Yunhong Jiang will comment on soon. However it is probably too late for 4.0.0 now anyway, which we expect to release tomorrow.

Thanks for all your testing efforts! And, of course, more broadly, Intel team's bug fixing.

 -- Keir
Jiang, Yunhong
2010-Apr-06 10:16 UTC
RE: [Xen-devel] Xen-4.0.0 RC9 Test Report. Xen: #21087 & Dom0: #4ebd13...
>-----Original Message-----
>From: Keir Fraser [mailto:keir.fraser@eu.citrix.com]
>Sent: Tuesday, April 06, 2010 5:45 PM
>To: Xu, Jiajun; xen-devel@lists.xensource.com
>Cc: Jiang, Yunhong
>Subject: Re: [Xen-devel] Xen-4.0.0 RC9 Test Report. Xen: #21087 & Dom0:
>#4ebd13...
>
>On 06/04/2010 09:47, "Xu, Jiajun" <jiajun.xu@intel.com> wrote:
>
>> Old Open Bugs (3)
>> ===========================================================================
>> 1. System hang after many times cpu online/offline
>> http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1596
>
>I have a potential fix for this which I hope Yunhong Jiang will comment on
>soon. However it is probably too late for 4.0.0 now anyway, which we expect
>to release tomorrow.

Keir, thanks very much for your patch. I should not have taken leave last Friday and this Monday :(

In fact, although your patch fixed the issue in my mail, there is still another bug on the PM side for CPU online/offline, which will cause a panic sometimes, so anyway, CPU online/offline can't pass our stress test in Xen 4.0.

I'm testing the patch. It seems to have passed at least loop count 500, o*line all APs, leaving only the BSP online.

A potential issue in the patch is that the following change may trigger the assert in __sync_lazy_execstate(), which assumes current is idle_vcpu; however, at this time we can't guarantee this. A check for the current vcpu is needed.

+    /*
+     * If we are running the idle vcpu, sync last non-idle vcpu's state
+     * before changing cpu_online_map. If we are running non-idle vcpu,
+     * we will synchronously sync the state in context_switch() later.
+     */
+    __sync_lazy_execstate();

Again, thanks very much for your help on this.

--jyh

>
>Thanks for all your testing efforts! And, of course, more broadly, Intel
>team's bug fixing.
>
> -- Keir
Keir Fraser
2010-Apr-06 11:55 UTC
Re: [Xen-devel] Xen-4.0.0 RC9 Test Report. Xen: #21087 & Dom0: #4ebd13...
On 06/04/2010 11:16, "Jiang, Yunhong" <yunhong.jiang@intel.com> wrote:

> Keir, thanks very much for your patch. I should not have taken leave last
> Friday and this Monday :( In fact, although your patch fixed the issue in my
> mail, there is still another bug on the PM side for CPU online/offline, which
> will cause a panic sometimes, so anyway, CPU online/offline can't pass our
> stress test in Xen 4.0.

Okay, well, seems we will get there for 4.0.1 instead. Thanks!

> I'm testing the patch. It seems to have passed at least loop count 500,
> o*line all APs, leaving only the BSP online.
> A potential issue in the patch is that the following change may trigger the
> assert in __sync_lazy_execstate(), which assumes current is idle_vcpu;
> however, at this time we can't guarantee this. A check for the current vcpu
> is needed.

Oh yes, obviously I didn't test my own patch. ;-) Well I will make that change and apply the patch to xen-unstable. We can wait to backport to 4.0-testing until we have a complete set of patches which pass all your tests.

> Again, thanks very much for your help on this.

Thank *you*!

 -- Keir
Keir Fraser
2010-Apr-07 07:04 UTC
Re: [Xen-devel] Xen-4.0.0 RC9 Test Report. Xen: #21087 & Dom0: #4ebd13...
On 06/04/2010 11:16, "Jiang, Yunhong" <yunhong.jiang@intel.com> wrote:

> Keir, thanks very much for your patch. I should not have taken leave last
> Friday and this Monday :( In fact, although your patch fixed the issue in my
> mail, there is still another bug on the PM side for CPU online/offline, which
> will cause a panic sometimes, so anyway, CPU online/offline can't pass our
> stress test in Xen 4.0.
> I'm testing the patch. It seems to have passed at least loop count 500,
> o*line all APs, leaving only the BSP online.
> A potential issue in the patch is that the following change may trigger the
> assert in __sync_lazy_execstate(), which assumes current is idle_vcpu;
> however, at this time we can't guarantee this. A check for the current vcpu
> is needed.

I looked at the code again, and are you sure about this? As in, have you seen the assertion trigger? The check that current is the idle_vcpu is only made 'if(switch_required)', and that can only be the case if we are running the idle_vcpu! So I think my patch is good as it is, would you agree?

 -- Keir
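
[Editor's illustration] The reasoning in the message above can be shown with a small toy model. This is only a sketch, not the Xen source: struct pcpu, do_state_switch(), context_switch() and sync_lazy_state() are hypothetical stand-ins, and the model assumes the scheme the thread describes, where register state is switched immediately when a non-idle vcpu is scheduled and deferred only when the idle vcpu is scheduled. Under that assumption, "loaded state differs from current" can only be true while the idle vcpu is current, so the assertion guarded by switch_required cannot fire.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of lazy state switching on one physical CPU.
 * All names here are illustrative, not taken from the Xen tree. */
struct vcpu {
    int  id;
    bool is_idle;
};

struct pcpu {
    struct vcpu *current;    /* vcpu the scheduler considers running   */
    struct vcpu *curr_vcpu;  /* vcpu whose register state is loaded    */
    struct vcpu *idle;       /* this CPU's idle vcpu                   */
};

/* Stand-in for the real save/restore: load the state of 'current'. */
static void do_state_switch(struct pcpu *c)
{
    c->curr_vcpu = c->current;
}

/* Schedule 'next'. Entering the idle vcpu is lazy: the previous
 * vcpu's state stays loaded. Entering a non-idle vcpu is eager. */
static void context_switch(struct pcpu *c, struct vcpu *next)
{
    c->current = next;
    if (!next->is_idle)
        do_state_switch(c);
}

/* Flush any deferred switch. Returns true if one was performed.
 * The assert is reached only when loaded state != current, which in
 * this model happens only while the idle vcpu is current. */
static bool sync_lazy_state(struct pcpu *c)
{
    bool switch_required = (c->curr_vcpu != c->current);

    if (switch_required) {
        assert(c->current == c->idle);
        do_state_switch(c);
    }
    return switch_required;
}
```

Calling sync_lazy_state() while a non-idle vcpu is running is a no-op in this model, which matches the conclusion reached later in the thread that an extra check on the current vcpu is redundant.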
Jiang, Yunhong
2010-Apr-07 07:24 UTC
RE: [Xen-devel] Xen-4.0.0 RC9 Test Report. Xen: #21087 & Dom0: #4ebd13...
>-----Original Message-----
>From: Keir Fraser [mailto:keir.fraser@eu.citrix.com]
>Sent: Wednesday, April 07, 2010 3:04 PM
>To: Jiang, Yunhong; Xu, Jiajun; xen-devel@lists.xensource.com
>Subject: Re: [Xen-devel] Xen-4.0.0 RC9 Test Report. Xen: #21087 & Dom0:
>#4ebd13...
>
>On 06/04/2010 11:16, "Jiang, Yunhong" <yunhong.jiang@intel.com> wrote:
>
>> Keir, thanks very much for your patch. I should not have taken leave last
>> Friday and this Monday :( In fact, although your patch fixed the issue in my
>> mail, there is still another bug on the PM side for CPU online/offline, which
>> will cause a panic sometimes, so anyway, CPU online/offline can't pass our
>> stress test in Xen 4.0.
>> I'm testing the patch. It seems to have passed at least loop count 500,
>> o*line all APs, leaving only the BSP online.
>> A potential issue in the patch is that the following change may trigger the
>> assert in __sync_lazy_execstate(), which assumes current is idle_vcpu;
>> however, at this time we can't guarantee this. A check for the current vcpu
>> is needed.
>
>I looked at the code again, and are you sure about this? As in, have you
>seen the assertion trigger? The check that current is the idle_vcpu is only
>made 'if(switch_required)', and that can only be the case if we are running
>the idle_vcpu! So I think my patch is good as it is, would you agree?

Aha, yes, you are right, the patch is correct.

I tested your patch in my first round (I added the _redundant_ check in the second round :$ ) and didn't trigger the assertion; the first round ran for about 900 rounds before triggering another bug. So, yes, it's a false alarm.

--jyh

>
> -- Keir
>
Keir Fraser
2010-Apr-07 07:44 UTC
Re: [Xen-devel] Xen-4.0.0 RC9 Test Report. Xen: #21087 & Dom0: #4ebd13...
On 07/04/2010 08:24, "Jiang, Yunhong" <yunhong.jiang@intel.com> wrote:

>> I looked at the code again, and are you sure about this? As in, have you
>> seen the assertion trigger? The check that current is the idle_vcpu is only
>> made 'if(switch_required)', and that can only be the case if we are running
>> the idle_vcpu! So I think my patch is good as it is, would you agree?
>
> Aha, yes, you are right, the patch is correct.
> I tested your patch in my first round (I added the _redundant_ check in the
> second round :$ ) and didn't trigger the assertion; the first round ran for
> about 900 rounds before triggering another bug. So, yes, it's a false alarm.

I applied the patch as xen-unstable:21109. It actually includes a further change, to add an extra BUG()-check to cpu_exit_clear(). I think it should work fine.

 Thanks,
 Keir
Jiang, Yunhong
2010-Apr-07 08:20 UTC
RE: [Xen-devel] Xen-4.0.0 RC9 Test Report. Xen: #21087 & Dom0: #4ebd13...
>-----Original Message-----
>From: Keir Fraser [mailto:keir.fraser@eu.citrix.com]
>Sent: Wednesday, April 07, 2010 3:45 PM
>To: Jiang, Yunhong; Xu, Jiajun; xen-devel@lists.xensource.com
>Subject: Re: [Xen-devel] Xen-4.0.0 RC9 Test Report. Xen: #21087 & Dom0:
>#4ebd13...
>
>On 07/04/2010 08:24, "Jiang, Yunhong" <yunhong.jiang@intel.com> wrote:
>
>>> I looked at the code again, and are you sure about this? As in, have you
>>> seen the assertion trigger? The check that current is the idle_vcpu is only
>>> made 'if(switch_required)', and that can only be the case if we are running
>>> the idle_vcpu! So I think my patch is good as it is, would you agree?
>>
>> Aha, yes, you are right, the patch is correct.
>> I tested your patch in my first round (I added the _redundant_ check in the
>> second round :$ ) and didn't trigger the assertion; the first round ran for
>> about 900 rounds before triggering another bug. So, yes, it's a false alarm.
>
>I applied the patch as xen-unstable:21109. It actually includes a further
>change, to add an extra BUG()-check to cpu_exit_clear(). I think it should
>work fine.

Thanks very much. I will test it later.

--jyh

>
> Thanks,
> Keir
>