andrewpitman@comcast.net
2011-Jul-15 14:30 UTC
[Xen-users] Xen 4.1.1 crash when manipulating cpupools.
Hi all!

I've been trying to make use of the credit2 scheduler for my guests running real-time (audio) applications, and to facilitate this I'm using cpupools to separate Domain-0 and less time critical guests which run fine under the regular credit scheduler, and the others which need to use credit2. However, I seem to be able to reliably crash the hypervisor when I try to move virtual cpus (hyperthreads) or domains between cpupools. Sometimes it even crashes when I simply try to set the credit2 scheduler weight for a domain. All domains are fully virtualized and running CentOS 5. Dom-0 is running Fedora 13 with pvops 2.6.32.40 kernel.

The panic message displays a register dump and stack trace, as well as:
(XEN) Panic on CPU 0:
(XEN) Xen BUG at sched_credit.c:991

Hardware is a Dell R810 server with one 10-core Xeon E7, 128 GB RAM and the guests reside on NAS storage.

If I simply use Pool-0 for everything and set the default scheduler to credit2 in the Xen boot string everything runs fine, but this workaround is not ideal since it removes some flexibility for me.

Any ideas where I should look to determine how to resolve this?

Thanks,
Andy

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
Pasi Kärkkäinen
2011-Jul-21 12:58 UTC
[Xen-devel] Re: [Xen-users] Xen 4.1.1 crash when manipulating cpupools.
On Fri, Jul 15, 2011 at 02:30:09PM +0000, andrewpitman@comcast.net wrote:
> Any ideas where I should look to determine how to resolve this?

(Added xen-devel to CC).

Can you please post the exact steps to reproduce the hypervisor crash?

-- Pasi

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Juergen Gross
2011-Jul-21 13:15 UTC
Re: [Xen-devel] Re: [Xen-users] Xen 4.1.1 crash when manipulating cpupools.
On 07/21/11 14:58, Pasi Kärkkäinen wrote:
> On Fri, Jul 15, 2011 at 02:30:09PM +0000, andrewpitman@comcast.net wrote:
>> Any ideas where I should look to determine how to resolve this?

You could try to make credit2 cpupool-ready ;-)

I don't think credit2 is supporting cpupools up to now (at least not any other cpupool than Pool-0). George?

I wanted to address this topic on the upcoming Xen Hackathon here in Munich, as George will be here, too, to answer my questions... Hope you can wait until then... As I'm very busy now, I won't have much time to look into this earlier.

Juergen

-- 
Juergen Gross                 Principal Developer Operating Systems
PDG ES&S SWE OS6              Telephone: +49 (0) 89 3222 2967
Fujitsu Technology Solutions  e-mail: juergen.gross@ts.fujitsu.com
Domagkstr. 28                 Internet: ts.fujitsu.com
D-80807 Muenchen              Company details: ts.fujitsu.com/imprint.html
George Dunlap
2011-Jul-21 13:47 UTC
Re: [Xen-devel] Re: [Xen-users] Xen 4.1.1 crash when manipulating cpupools.
On Thu, 2011-07-21 at 14:15 +0100, Juergen Gross wrote:
> I don't think credit2 is supporting cpupools up to now (at least not any other
> cpupool than Pool-0). George?

I think that there were some unfortunate corner cases wrt credit2 + cpupools that I didn't get worked out. In any case, it's certainly not being tested regularly, so it would be no surprise if something broke.

I'll take a look at it in the next few [working] days in hope that there's a relatively simple fix.

-George
andrewpitman@comcast.net
2011-Aug-02 19:44 UTC
Re: [Xen-devel] Re: [Xen-users] Xen 4.1.1 crash when manipulating cpupools.
George,

Thanks, that would be great! It would definitely be useful to have the credit2 scheduler fully support cpupools. One other thing I did notice was that when I try to weight Domain-0 it crashes the hypervisor as well (example: "xm sched-credit2 -d Domain-0 -w 512").

Thanks,
Andy
George Dunlap
2011-Aug-03 18:09 UTC
Re: [Xen-devel] Re: [Xen-users] Xen 4.1.1 crash when manipulating cpupools.
andrewpitman@comcast.net
2011-Nov-08 16:14 UTC
Re: [Xen-devel] Re: [Xen-users] Xen 4.1.1 crash when manipulating cpupools.
George,

Do you know if this was addressed in 4.1.2?

Thanks,
Andy

----- Original Message -----
From: "George Dunlap" <george.dunlap@eu.citrix.com>
To: andrewpitman@comcast.net
Cc: xen-devel@lists.xensource.com, xen-users@lists.xensource.com, "Pasi Kärkkäinen" <pasik@iki.fi>, "Juergen Gross" <juergen.gross@ts.fujitsu.com>
Sent: Wednesday, August 3, 2011 2:09:08 PM
Subject: Re: [Xen-devel] Re: [Xen-users] Xen 4.1.1 crash when manipulating cpupools.

Yes, I'm aware of the crash when setting weight. I've had a quick look, and it's not obvious what the problem is, and I haven't had a chance to look deeper.

-George
Juergen Gross
2011-Nov-09 09:55 UTC
Re: [Xen-devel] Re: [Xen-users] Xen 4.1.1 crash when manipulating cpupools.
andrewpitman@comcast.net
2011-Nov-11 21:18 UTC
Re: [Xen-devel] Re: [Xen-users] Xen 4.1.1 crash when manipulating cpupools.
Juergen,

It doesn't look like this has been fixed. I managed to get my 4.1.2 server to crash when setting the weight of Domain-0 (running in Pool-0 using credit2) and again when moving some cpus into a new cpupool which was set up to use the credit2 scheduler.

Andy

----- Original Message -----
From: "Juergen Gross" <juergen.gross@ts.fujitsu.com>
To: andrewpitman@comcast.net
Cc: "George Dunlap" <george.dunlap@eu.citrix.com>, xen-devel@lists.xensource.com, xen-users@lists.xensource.com
Sent: Wednesday, November 9, 2011 4:55:41 AM
Subject: Re: [Xen-devel] Re: [Xen-users] Xen 4.1.1 crash when manipulating cpupools.

Andy,

the last problem with credit2 and cpupools I'm aware of was fixed with cs 23156 in xen 4.1, which is included in 4.1.2. I think this addressed your original problem.

Juergen
Juergen Gross
2011-Nov-14 09:20 UTC
Re: [Xen-devel] Re: [Xen-users] Xen 4.1.1 crash when manipulating cpupools.
Juergen Gross
2011-Nov-14 09:58 UTC
Re: [Xen-devel] Re: [Xen-users] Xen 4.1.1 crash when manipulating cpupools.
George Dunlap
2011-Nov-14 11:13 UTC
Re: [Xen-devel] Re: [Xen-users] Xen 4.1.1 crash when manipulating cpupools.
On Mon, 2011-11-14 at 09:20 +0000, Juergen Gross wrote:
> On 11/11/2011 10:18 PM, andrewpitman@comcast.net wrote:
> > Juergen,
> >
> > It doesn't look like this has been fixed. I managed to get my 4.1.2
> > server to crash when setting the weight of Domain-0 (running in
> > Pool-0 using credit2) and again when moving some cpus into a new
> > cpupool which was set up to use the credit2 scheduler.
>
> Yeah, I could reproduce your problems on my machine.
>
> George, I see two problems in credit2:
>
> - when setting the weight of dom0, vcpu_schedule_lock_irq(current) will be
>   taken in sched_adjust() of schedule.c and again in csched_dom_cntl() of
>   sched_credit2.c resulting in a deadlock.

Yes, this one was recently brought to my attention. Unfortunately I'm not sure when I'm going to be able to get to it.

My big problem is actually testing; I don't have a good way to test xen-unstable effectively right now. Juergen, if I were to send you a prototype patch, could you test it and fix it up if it doesn't work exactly right?

> - removing a cpu from a cpupool seems not to work for credit2. I removed all
>   but cpu 0 from Pool-0 and the dom0 vcpus were all active on cpus 1-3. It
>   took some time until they moved to cpu 0 (verified via 'r' hotkey on the
>   xen console).
>
> I tested on xen-4.1-testing cs 23182.
>
> Juergen
Juergen Gross
2011-Nov-14 11:27 UTC
Re: [Xen-devel] Re: [Xen-users] Xen 4.1.1 crash when manipulating cpupools.
On 11/14/2011 12:13 PM, George Dunlap wrote:
> My big problem is actually testing; I don't have a good way to test
> xen-unstable effectively right now. Juergen, if I were to send you a
> prototype patch, could you test it and fix it up if it doesn't work
> exactly right?

Sure.

Juergen
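
For readers following the first credit2 problem reported in this thread (vcpu_schedule_lock_irq(current) taken in sched_adjust() and then again in csched_dom_cntl()), here is a minimal sketch of why taking a non-recursive lock twice on the same path hangs. It is a Python model, not Xen code: `schedule_lock`, `adjust`, `dom_cntl_relocking` and `dom_cntl_caller_holds_lock` are hypothetical names chosen for illustration, and the timeout exists only so the sketch can report the hang instead of blocking forever the way a spinlock would. The second handler shows one common way to avoid the double acquisition; nothing in the thread says this is the fix that was applied to Xen.

```python
import threading

# Hypothetical stand-in for the per-vcpu schedule lock (assumption, not Xen code).
# threading.Lock is non-recursive: the holder cannot acquire it a second time.
schedule_lock = threading.Lock()


def dom_cntl_relocking(timeout=0.1):
    """Inner handler that takes the lock itself, as in the reported pattern."""
    # A real spinlock would spin here forever; the timeout makes the hang visible.
    acquired = schedule_lock.acquire(timeout=timeout)
    if acquired:
        schedule_lock.release()
    return acquired


def dom_cntl_caller_holds_lock():
    """Alternative handler that relies on the lock the caller already holds."""
    assert schedule_lock.locked()  # documents the locking contract
    return True


def adjust(handler):
    """Outer path that holds the lock while it calls the handler."""
    with schedule_lock:
        return handler()


if __name__ == "__main__":
    print("re-locking handler got the lock:", adjust(dom_cntl_relocking))
    print("caller-holds-lock handler ran:", adjust(dom_cntl_caller_holds_lock))
```

In the first call the inner acquire can never succeed because the same path already owns the lock, which is the shape of the deadlock described above. The second call completes because only one layer is responsible for locking.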