We have been experimenting with large VCPU counts >> PCPU and have succeeded in hanging Dom0 in or during /sbin/loader.

Using xen-3.0.4-testing with an HVM booting the FC6 DVD on a Core2 Duo (i.e. 2 PCPU) on a 965 chipset

VCPU=17 - works
VCPU=20 - works (takes a very long time)
VCPU=24 - lockup (whole machine, yes I mean Dom0)
VCPU=31 - same
VCPU=32 - same

We've noted in the Xen l-apic code a hard 32 CPU limit (a uint32 used as an l-apic (vcpu?) bitmask), but this looks to be unrelated.

John Zulauf
Intel Corporation

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
lists.xensource.com/xen-devel
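The 32-CPU limit mentioned above comes down to bit width: a 32-bit mask has one bit per VCPU, so IDs 0 through 31 fit and ID 32 does not. The following is a minimal sketch of that arithmetic only. The names `MASK_BITS`, `vcpu_mask_bit` and `representable` are hypothetical and are not taken from the Xen source, and the masking merely models a value being stored in a uint32.

```python
MASK_BITS = 32  # width of a uint32 bitmask (illustrative constant, not from Xen)


def vcpu_mask_bit(vcpu_id: int) -> int:
    """Return the bit for a non-negative vcpu_id, truncated to 32 bits."""
    return (1 << vcpu_id) & ((1 << MASK_BITS) - 1)


def representable(vcpu_id: int) -> bool:
    """True if vcpu_id still has a bit left after truncation to 32 bits."""
    return vcpu_mask_bit(vcpu_id) != 0


if __name__ == "__main__":
    for vcpu_id in (0, 23, 31, 32):
        print(vcpu_id, representable(vcpu_id))
```

A guest configured with 32 VCPUs uses IDs 0 to 31, all of which fit in such a mask. That is consistent with the remark that the limit looks unrelated to the lockups seen from VCPU=24.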
On 24/1/07 6:56 pm, "Zulauf, John" <john.zulauf@intel.com> wrote:

> Using xen-3.0.4-testing with an HVM booting the FC6 DVD on a Core2 Duo (i.e. 2
> PCPU) on a 965 chipset
>
> VCPU=17 works
> VCPU=20 works (takes a very long time)
> VCPU=24 lockup (whole machine, yes I mean Dom0)
> VCPU=31 same
> VCPU=32 same

Have you worked out where it gets stuck? Do many-VCPU domU guests fare better?

-- Keir
On January 24, 2007 11:11 AM Keir Fraser [mailto:Keir.Fraser@cl.cam.ac.uk] wrote:

> On 24/1/07 6:56 pm, "Zulauf, John" <john.zulauf@intel.com> wrote:
>
> > Using xen-3.0.4-testing with an HVM booting the FC6 DVD on a Core2 Duo (i.e. 2 PCPU) on a 965 chipset
> >
> > VCPU=17 - works
> > VCPU=20 - works (takes a very long time)
> > VCPU=24 - lockup (whole machine, yes I mean Dom0)
> > VCPU=31 - same
> > VCPU=32 - same
>
> Have you worked out where it gets stuck? Do many-VCPU domU guests fare better?

It IS dying for a domU guest. Do you perhaps mean PV guests? No, we haven't tried them. I don't have any at the moment to test.

Clarifying the initial report:

Host is a 2-core (Core 2 Duo/965 chipset) box with VCPU=PCPU for Dom0.
Issue occurs when attempting to start an HVM DomU with VCPU >= 24 (though we haven't checked 21-23 yet).
The DomU never (apparently) starts, and Dom0 hard hangs (i.e. not even the numlock key works).

Given that Dom0 hangs hard when this occurs, we don't have any ideas as to the exact cause of death.
Woller, Thomas
2007-Jan-24 19:34 UTC
RE: [Xen-devel] Dom0 Hang for large VCPU counts > PCPU
Not sure if this is useful. We have a box with 8 cores, and can run 32 VCPUs without issue on AMD-V with suse10 64b smp guest.

This data though is from around January 8th, so it's a bit stale. I don't have the exact c/s that the tests were run on, or the guest config parms, but I think it was with 6Gig of RAM for the guest. :P I can try this guest again next day or 2 if useful.

tom

XEND_DEBUG = 1
Name              ID  VCPU  CPU  State  Time(s)  CPU Affinity
Domain-0           0     0    0  -b-      172.5  any cpu
Domain-0           0     1    1  -b-       60.0  any cpu
Domain-0           0     2    3  -b-       25.2  any cpu
Domain-0           0     3    3  r--       10.2  any cpu
Domain-0           0     4    2  -b-        8.7  any cpu
Domain-0           0     5    5  -b-        6.8  any cpu
Domain-0           0     6    6  -b-        3.6  any cpu
Domain-0           0     7    0  -b-        4.8  any cpu
suse10_x64_smp     4     0    6  -b-      144.2  any cpu
suse10_x64_smp     4     1    4  -b-       36.3  any cpu
suse10_x64_smp     4     2    4  -b-       26.5  any cpu
suse10_x64_smp     4     3    3  ---      883.8  any cpu
suse10_x64_smp     4     4    5  r--      885.6  any cpu
suse10_x64_smp     4     5    6  ---      883.6  any cpu
suse10_x64_smp     4     6    7  ---      884.2  any cpu
suse10_x64_smp     4     7    2  ---      884.4  any cpu
suse10_x64_smp     4     8    4  ---      886.8  any cpu
suse10_x64_smp     4     9    7  r--      885.6  any cpu
suse10_x64_smp     4    10    6  ---      885.2  any cpu
suse10_x64_smp     4    11    4  r--      884.0  any cpu
suse10_x64_smp     4    12    0  r--      884.6  any cpu
suse10_x64_smp     4    13    3  ---      883.7  any cpu
suse10_x64_smp     4    14    1  ---      887.0  any cpu
suse10_x64_smp     4    15    1  ---      884.7  any cpu
suse10_x64_smp     4    16    0  ---      885.5  any cpu
suse10_x64_smp     4    17    7  ---      884.2  any cpu
suse10_x64_smp     4    18    2  r--      885.9  any cpu
suse10_x64_smp     4    19    1  ---      886.0  any cpu
suse10_x64_smp     4    20    6  ---      885.7  any cpu
suse10_x64_smp     4    21    6  ---      886.4  any cpu
suse10_x64_smp     4    22    2  ---      885.6  any cpu
suse10_x64_smp     4    23    5  ---      888.9  any cpu
suse10_x64_smp     4    24    4  ---      885.0  any cpu
suse10_x64_smp     4    25    0  ---      885.2  any cpu
suse10_x64_smp     4    26    4  ---      885.0  any cpu
suse10_x64_smp     4    27    7  ---      885.1  any cpu
suse10_x64_smp     4    28    4  ---      882.4  any cpu
suse10_x64_smp     4    29    6  ---      884.1  any cpu
suse10_x64_smp     4    30    5  ---      883.6  any cpu
suse10_x64_smp     4    31    1  r--      885.3  any cpu
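A listing like the one above is easier to compare across runs once it is parsed. The sketch below assumes whitespace-separated columns in the order shown (Name, ID, VCPU, CPU, State, Time(s), CPU Affinity) and a domain name without spaces. The function names are hypothetical helpers, not part of any Xen tool, and the sample is four rows copied from the listing.

```python
from collections import Counter

# Four rows copied from the listing above, plus the header line.
SAMPLE = """\
Name              ID  VCPU  CPU  State  Time(s)  CPU Affinity
Domain-0           0     0    0  -b-      172.5  any cpu
Domain-0           0     3    3  r--       10.2  any cpu
suse10_x64_smp     4     0    6  -b-      144.2  any cpu
suse10_x64_smp     4     4    5  r--      885.6  any cpu
"""


def _to_int(field):
    """Convert a numeric column; return None for placeholders such as '-'."""
    return int(field) if field.isdigit() else None


def parse_vcpu_list(text):
    """Parse vcpu-list style text into one dict per VCPU row."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Skip the header line and anything too short to be a data row.
        if len(parts) < 6 or parts[0] == "Name":
            continue
        name, dom_id, vcpu, cpu, state, secs = parts[:6]
        rows.append({
            "name": name,
            "id": int(dom_id),
            "vcpu": int(vcpu),
            "cpu": _to_int(cpu),
            "state": state,
            "time": float(secs),
        })
    return rows


def vcpus_per_domain(rows):
    """Count VCPU rows per domain name."""
    return Counter(row["name"] for row in rows)


def running(rows):
    """Return rows whose state string starts with 'r' (running)."""
    return [row for row in rows if row["state"].startswith("r")]


if __name__ == "__main__":
    parsed = parse_vcpu_list(SAMPLE)
    print(vcpus_per_domain(parsed))
    print(len(running(parsed)), "VCPUs running")
```

Feeding the full 40-row listing through `parse_vcpu_list` would allow a quick check that all 32 guest VCPUs are present and how many are running at the moment of the snapshot.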
On 24/1/07 7:28 pm, "Zulauf, John" <john.zulauf@intel.com> wrote:

> Clarifying the initial report:
>
> Host is 2 core (Core 2 Duo/965 chipset) box with VCPU=PCPU for Dom0.
> Issue occurs when attempting to start an HVM DomU with VCPU >= 24 (though we
> haven't checked 21-23 yet).
> The DomU never (apparently) starts, and Dom0 hard hangs. (i.e. not even the
> numlock key works)
>
> Given that Dom0 hangs hard when this occurs, we don't have any ideas as to the
> exact cause of death.

Sorry, I misread the initial report. So the HVM guest doesn't get as far as running hvmloader as far as you know? Do the Xen debug keys work? (requires you to have a serial line connected)

-- Keir
With FC6 (or any 2.6.18 kernel) there's an issue using the tsc clocksource (we're working on a fix); may not be related but could be worth booting with "clocksource=acpi_pm".

cheers,

S.
Thanks. I don't believe we're even getting through to the boot loader, let alone to the kernel.

-----Original Message-----
From: Steven M. Hand [mailto:smh22@hermes.cam.ac.uk] On Behalf Of Steven Hand
Sent: Wednesday, January 24, 2007 11:50 AM
To: Zulauf, John; Keir Fraser; xen-devel@lists.xensource.com
Subject: Re: [Xen-devel] Dom0 Hang for large VCPU counts > PCPU

With FC6 (or any 2.6.18 kernel) there's an issue using the tsc clocksource (we're working on a fix); may not be related but could be worth booting with "clocksource=acpi_pm".

cheers,

S.
Further testing:

A brief update on the DomU "crash for large VCPU count":

For PCPU == 2 (single Core 2 Duo/965 platform), the host machine hangs with VCPUs > 20.

For PCPU == 8 (dual quad-core/Bensley), we have tested up to 24 VCPUs successfully. However, we've seen FC6 rebooting the DomU sporadically with VCPU > 16. The suggested clocksource option has no effect.
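One way to put the reports so far side by side is the VCPU-to-PCPU overcommit ratio. The sketch below only restates figures already given in this thread. The `overcommit` helper and the `REPORTS` list are illustrative names, and nothing here claims that the ratio is the cause of the hang.

```python
def overcommit(vcpus: int, pcpus: int) -> float:
    """Guest VCPUs per physical CPU."""
    return vcpus / pcpus


# (PCPUs, guest VCPUs, outcome) as reported in the thread so far.
REPORTS = [
    (2, 17, "works"),
    (2, 20, "works, takes a very long time"),
    (2, 24, "whole-machine lockup"),
    (2, 32, "whole-machine lockup"),
    (8, 24, "works, sporadic FC6 guest reboots above 16 VCPUs"),
    (8, 32, "works (AMD-V, suse10 guest)"),
]

if __name__ == "__main__":
    for pcpus, vcpus, outcome in REPORTS:
        ratio = overcommit(vcpus, pcpus)
        print(f"{vcpus:2d} VCPU / {pcpus} PCPU = {ratio:4.1f}x  {outcome}")
```

On these numbers the lockups were reported at 12x and above on the 2-PCPU host, while the 8-PCPU hosts were only exercised up to 4x, so the two sets of results do not yet cover the same range.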
Woller, Thomas
2007-Jan-28 18:52 UTC
RE: [Xen-devel] Dom0 Hang for large VCPU counts > PCPU
More info.

Using c/s 13628 64b smp hv, with SMP HVM suseLinux10 and Opensuse10.2 64b guests, w/ gfx enabled.

Guest config parms: 32 VCPUS, 2000 memory, pae/acpi/apic=1, shadow_memory=512, no vif line

Machine has 8 physical cores in AMD-V system, w/ 16G physical RAM.

SuseLinux10 (8 PCPU) = boots 32 VCPUs without issue w/ standard guest kernel config

SuseLinux10 (2 PCPU maxcpus=2) = boots 32 VCPUs without issue (quite a bit slower) w/ standard guest kernel config, guest very unresponsive, and dom0 very slow response but better than guest. Vcpu-list shows that all 32 VCPUs are running though. Unable to login to guest after 10 minutes, after entering login/passwd.

OpenSuse10.2 = hangs on boot with standard guest kernel boot options
  adding acpi=off allows boot
  adding clocksource=acpi_pm did not help, still hangs on boot (black screen, no splash ever displayed)

I'll run some overnight tests on the 8 PCPU SUSE10 setup, and see if it is stable wrt >16 VCPUs running.

tom

> -----Original Message-----
> From: Zulauf, John [mailto:john.zulauf@intel.com]
> Sent: Thursday, January 25, 2007 6:43 PM
> To: Woller, Thomas; xen-devel@lists.xensource.com
> Subject: RE: [Xen-devel] Dom0 Hang for large VCPU counts > PCPU
>
> Further testing:
>
> A brief update the DomU "crash for large VCPU count"
>
> For PCPU == 2 (single Core 2 Duo/965 platform) host machine
> hangs for with VCPU's > 20.
>
> For PCPU == 8 (Dual quadcore/Bensley)
>
> we have tested up to 24 VCPU successfully. However, we've
> seen FC6 rebooting the DomU sporadically with VCPU > 16. The
> suggested clock_source option has no effect.
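The guest parameters listed in the message above can be written down as a fragment of an xm-style domain config, which uses Python syntax. This is a sketch under assumptions: the option names follow the usual xm HVM config keys, `builder = "hvm"` is inferred from the guests being described as HVM, and everything not stated in the message (kernel, disk, device model, name) is deliberately left out rather than guessed.

```python
# Fragment of an xm-style HVM guest config (xm config files use Python syntax).
# Only values stated in the message are filled in; this is not a complete config.

builder = "hvm"        # assumption: the guests are described as HVM
vcpus = 32             # "32 VCPUS"
memory = 2000          # "2000 memory" (xm interprets this as MB)
pae = 1                # "pae/acpi/apic=1"
acpi = 1
apic = 1
shadow_memory = 512    # "shadow_memory=512"

# "no vif line": no network interface is defined, so vif is simply omitted.
```

For the OpenSuse10.2 case, `acpi=off` and `clocksource=acpi_pm` are guest kernel boot options rather than settings in this file, so they would be entered at the guest's boot loader.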