Folks,

The second release candidate for Xen 3.4.0 is available at http://xenbits.xensource.com/xen-unstable.hg, tagged as '3.4.0-rc2'.

Please test!

 -- Keir

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Dan Magenheimer
2009-Apr-16 17:09 UTC
RE: [Xen-devel] Second release candidate for Xen 3.4.0
FYI, I can still reproduce the "blocked for more than 480 seconds" problem I reported yesterday. After running >2 hours of load, the 2.6.29 guest spews out a number of call traces and freezes. Each is prefixed with:

INFO: task xxx:nnn blocked for more than 480 seconds.

and most traces have getnstimeofday or io_schedule at the top. (full dump attached)

Xen: 64-bit, c/s 19553
Guest: 4vcpu PV EL5u2 32-bit, with 2.6.29 kernel
Load: continuous runs of make -j80 (after clean) on linux-2.6.28
PCPU: quad-core dual-thread
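A minimal sketch of one way to tally which function sits at the top of each hung-task trace in a saved guest console log. The frame format (`symbol+0xoff/0xsize` following a `Call Trace:` line), the function name `top_frames`, and the sample text are assumptions for illustration; they are not taken from the attached dump.

```python
import re
from collections import Counter

# Banner quoted in the report: "INFO: task xxx:nnn blocked for more than 480 seconds."
HUNG_TASK = re.compile(r"INFO: task .+:\d+ blocked for more than \d+ seconds")
# Assumed shape of a stack frame: symbol+0xoffset/0xsize
FRAME = re.compile(r"(?P<sym>[A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+")

# Made-up sample, NOT the attached dump.
SAMPLE = """\
INFO: task make:1234 blocked for more than 480 seconds.
Call Trace:
 [<c0401234>] io_schedule+0x2c/0x40
 [<c0405678>] sync_page+0x30/0x40
INFO: task cc1:99 blocked for more than 480 seconds.
Call Trace:
 [<c0409999>] getnstimeofday+0x33/0xa0
"""

def top_frames(log_text):
    """Count the first frame after 'Call Trace' for each hung-task banner."""
    counts = Counter()
    state = "idle"
    for line in log_text.splitlines():
        if HUNG_TASK.search(line):
            state = "banner"          # a new report starts here
        elif state == "banner" and "Call Trace" in line:
            state = "trace"           # the next frame line is the top of the stack
        elif state == "trace":
            match = FRAME.search(line)
            if match:
                counts[match.group("sym")] += 1
                state = "idle"        # ignore deeper frames of this trace
    return counts

if __name__ == "__main__":
    for symbol, n in top_frames(SAMPLE).most_common():
        print(symbol, n)
```

Run against a real console log, a tally like this would show whether getnstimeofday or io_schedule dominates, which may help separate a time-related problem from an I/O stall.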
On 16/04/2009 18:09, "Dan Magenheimer" <dan.magenheimer@oracle.com> wrote:

> FYI, I can still reproduce the "blocked for more than 480 seconds"
> problem I reported yesterday. After running >2 hours of load,
> the 2.6.29 guest spews out a number of Call Trace's and freezes.
> Each is prefixed with:

Hmm, could be the kernel I suppose. Or perhaps there's a time issue lurking.

 -- Keir
On 17/04/2009 08:55, "Keir Fraser" <keir.fraser@eu.citrix.com> wrote:

> On 16/04/2009 18:09, "Dan Magenheimer" <dan.magenheimer@oracle.com> wrote:
>
>> FYI, I can still reproduce the "blocked for more than 480 seconds"
>> problem I reported yesterday. After running >2 hours of load,
>> the 2.6.29 guest spews out a number of Call Trace's and freezes.
>> Each is prefixed with:
>
> Hmm could be the kernel I suppose. Or perhaps there's a time issue lurking.

And if the latter, the cpuidle stuff would still be the most likely culprit in my opinion. Did you repro problems with cpuidle=off?

 -- Keir
>From: Keir Fraser
>Sent: April 17, 2009 16:06
>
>And if the latter, the cpuidle stuff would still be most likely culprit in
>my opinion. Did you repro problems with cpuidle=off?

I think Dan mentioned 'cpuidle=off' in his previous post, but of course it is worth confirming the result with this option again:

> -----Original Message-----
> From: Dan Magenheimer
> Sent: Wednesday, April 15, 2009 8:59 AM
> To: Dan Magenheimer; Keir Fraser; Xen-Devel (E-mail); Tian, Kevin
> Subject: RE: [Xen-devel] Time goes backwards in dom0 in xen-unstable
>
> Hmmm... after only a few minutes with cpuidle=off,
> my test domPV froze up after printing a number of
> call traces starting with:
>
> INFO: task xxx:nnn blocked for more than 480 seconds.
>
> At the top of all of the traces is either
> getnstimeofday+51 or io_schedule+44.
>
> (Note that this PV domain is a 2.6.29 kernel... don't
> know if the messages are the same on an older kernel.)

Thanks,
Kevin
Hi Keir,

I got Xen-unstable 3.4 rc3pre from http://xenbits.xensource.com/xen-unstable.hg and built it, but I can't create an HVM guest. I get this error:

[2009-04-17 20:32:55 4861] ERROR (XendDomainInfo:2011) VM RedHat restarting too fast (Elapsed time: 0.231917 seconds). Refusing to restart to avoid loops.
[2009-04-17 20:32:55 4861] ERROR (XendDomain:1183) domain_unpause
Traceback (most recent call last):
  File "usr/lib/python2.5/site-packages/xen/xend/XendDomain.py", line 1172, in domain_unpause
    raise XendInvalidDomain(str(domid))
XendInvalidDomain: <Fault 3: 'RedHat'>

This is my first time using xen-unstable. Before this, Xen 3.3.1 worked well for me with the same config file and image file.

I didn't find any useful information on Google. I hope you can give me a hint!

Thanks so much

--
Best regards!
William zhou
E-mail: weizhou.sir@gmail.com
On 17/04/2009 13:47, "wei zhou" <weizhou.sir@gmail.com> wrote:

> Hi Keir :
>
> I got the Xen-unstable 3.4 rc3pre from
> http://xenbits.xensource.com/xen-unstable.hg
> I built the new one of xen .
> But i can't create the GuestOS of HVM .
> I got the Error info:

You could try reverting c/s 19546 (hg export 19546 | patch -Rp1) and see if that helps after rebuild/reinstall xend. But for that patch to have an effect, xend must have decided to restart the guest and I don't know why it would want to do that.

 -- Keir
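Since the open question is why xend decided to restart the guest at all, it may help to pull every refused restart out of the xend log and then read the entries just before each one. The following is a rough sketch only: the line layout is inferred from the two log lines quoted in the report, and the name `fast_restarts` is made up for this example.

```python
import re

# Layout inferred from the quoted log lines:
# "[date time pid] LEVEL (Module:line) message"
XEND_LINE = re.compile(
    r"\[(?P<when>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (?P<pid>\d+)\] "
    r"(?P<level>[A-Z]+) \((?P<module>\w+):(?P<lineno>\d+)\) (?P<msg>.*)"
)
# Message text as it appears in the report.
TOO_FAST = re.compile(
    r"VM (?P<name>\S+) restarting too fast "
    r"\(Elapsed time: (?P<secs>[\d.]+) seconds\)"
)

def fast_restarts(log_text):
    """Return (timestamp, VM name, elapsed seconds) for each refused restart."""
    hits = []
    for line in log_text.splitlines():
        entry = XEND_LINE.match(line.strip())
        if not entry or entry.group("level") != "ERROR":
            continue
        match = TOO_FAST.search(entry.group("msg"))
        if match:
            hits.append(
                (entry.group("when"), match.group("name"), float(match.group("secs")))
            )
    return hits
```

An elapsed time well under a second, as in the report, suggests the domain went away almost immediately after creation, so the log entries just before the first hit are the ones worth reading.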
That patch has been added in Xen-unstable-3.4rc3

2009/4/17 Keir Fraser <keir.fraser@eu.citrix.com>:

> You could try reverting c/s 19546 (hg export 19546 | patch -Rp1) and see if
> that helps after rebuild/reinstall xend. But for that patch to have an
> effect, xend must have decided to restart the guest and I don't know why it
> would want to do that.
>
> -- Keir

--
Best regards!
William zhou
E-mail: weizhou.sir@gmail.com
Dan Magenheimer
2009-Apr-17 15:33 UTC
RE: [Xen-devel] Second release candidate for Xen 3.4.0
Last night's run ran for over 15 hours before the same "blocked for more than 480 seconds" occurred. This time the tmem patch was running so I/O was greatly reduced, which might account for the change in behavior (or it might be completely random).

Interestingly, the domain isn't completely frozen. It is still doing some things but is mostly non-responsive. I was able to do a ctrl-Z on the console and get the normal shell response, but then no prompt. I am also able to see stuff by sending it sysrq's using xm.

I'll give cpuidle=off a spin this weekend but...

> Hmm could be the kernel I suppose.

Yes, this article would lead me to believe so:

http://lwn.net/Articles/326490/

I'll also try to reproduce on 2.6.18. If I can't, I'd chalk it up as a kernel problem.

Dan
Dan Magenheimer
2009-Apr-20 15:58 UTC
RE: [Xen-devel] Second release candidate for Xen 3.4.0
Over the weekend, I tried cpuidle=off and it didn't make any difference.

I didn't have a chance to fall back to a 2.6.18 test run but did start up another 2.6.29 run, which ran for over 24 hours before my test script failed with the following and a stack dump:

"BUG: soft lockup - CPU#3 stuck after 4099s!"

The guest didn't freeze or crash though.