Hello,

I installed the signed drivers from
http://wiki.univention.de/index.php?title=Installing-signed-GPLPV-drivers
and I ran into a BSOD on a Windows 2008 Server R2 Enterprise domU with a
large number of vcpus. The BSOD is related to xennet.sys.

After some trials I found that it runs fine up to 15 cores. From 16 or more,
the BSOD kicks in when booting the domU.

The hardware (4 times X7550) runs Xen version 4.1.2_05-1.1.1 (abuild@)
(gcc version 4.6.2 (SUSE Linux) (openSUSE 12.1)).
Dom0: 3.1.9-1.4-xen x86_64
DomU: Windows 2008 Server R2 Enterprise 64b.

This happened with versions 0.11.0.308 and 0.11.0.356.

Is this a known problem?

Best regards,

Dion
> Hello,
>
> I installed the signed drivers from
> http://wiki.univention.de/index.php?title=Installing-signed-GPLPV-drivers
> and I ran into a BSOD on a Windows 2008 Server R2 Enterprise domU with a
> large number of vcpus. The BSOD is related to xennet.sys.
>
> After some trials I found that it runs fine up to 15 cores. From 16 or more,
> the BSOD kicks in when booting the domU.
>
> The hardware (4 times X7550) runs Xen version 4.1.2_05-1.1.1 (abuild@)
> (gcc version 4.6.2 (SUSE Linux) (openSUSE 12.1)).
> Dom0: 3.1.9-1.4-xen x86_64
> DomU: Windows 2008 Server R2 Enterprise 64b.
>
> This happened with versions 0.11.0.308 and 0.11.0.356.
>
> Is this a known problem?
>

Not until you just mentioned it. I've never tried it, but looking at the code
I'd expect that the limit would have been 32 or 64, not 16. I can't think why
xennet would be caring about the number of processors either, although that
could just be a symptom of a problem in xenpci.

I'll try and find some time to test it myself.

James
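As an aside, the most common reason for a limit to land on exactly 32 or 64
is a CPU set kept in a single 32- or 64-bit word. The snippet below is only a
guess at that kind of construct; the names are invented and it is not taken
from the GPLPV source:

    /* Hypothetical CPU mask held in one machine word: 64 bits means at
     * most 64 CPUs; a 32-bit mask caps out at 32.                       */
    typedef unsigned long long cpu_mask_t;

    #define CPU_MASK_SET(mask, cpu)   ((mask) |= (cpu_mask_t)1 << (cpu))
    #define CPU_MASK_TEST(mask, cpu)  (((mask) >> (cpu)) & 1)

    /* Shifting by the word size or more is undefined behaviour in C, so
     * any CPU numbered 64 or higher silently breaks such a scheme.      */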
> > Is this a known problem?
> >
>
> Not until you just mentioned it. I've never tried it, but looking at the code
> I'd expect that the limit would have been 32 or 64, not 16. I can't think why
> xennet would be caring about the number of processors either, although that
> could just be a symptom of a problem in xenpci.
>
> I'll try and find some time to test it myself.
>

I think I can see the problem... I'm making the assumption that the processor
numbering is linear such that in a system with NdisSystemProcessorCount()
CPUs, they are numbered from 0 to NdisSystemProcessorCount()-1, but I bet
that with >16 processors they aren't numbered linearly (eg 0-15 + 32-47 or
something).

How quickly do you need a fix?

James
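To make that failure mode concrete, here is a minimal sketch of the pattern
being described (illustrative only, not the actual xennet/xenpci code; the
function and variable names are invented and error handling is omitted):

    #include <ndis.h>

    /* Per-CPU scratch buffers, sized by the processor *count*. */
    static PVOID *per_cpu_state;
    static ULONG  cpu_count;

    NDIS_STATUS AllocPerCpuState(void)
    {
        ULONG i;

        cpu_count = NdisSystemProcessorCount();          /* e.g. 16 */
        per_cpu_state = ExAllocatePoolWithTag(NonPagedPool,
                            cpu_count * sizeof(PVOID), 'XvpG');
        if (per_cpu_state == NULL)
            return NDIS_STATUS_RESOURCES;
        for (i = 0; i < cpu_count; i++)
            per_cpu_state[i] = ExAllocatePoolWithTag(NonPagedPool,
                                   PAGE_SIZE, 'XvpG');
        return NDIS_STATUS_SUCCESS;
    }

    VOID TouchPerCpuState(void)
    {
        /* Indexed by the running processor's *number*.  If the numbers
         * are not dense (e.g. 0-15 plus 32-47), cpu can be >= cpu_count
         * and the driver touches memory it never allocated, which shows
         * up as a PAGE_FAULT_IN_NONPAGED_AREA blamed on the driver.    */
        ULONG cpu = KeGetCurrentProcessorNumber();
        RtlZeroMemory(per_cpu_state[cpu], PAGE_SIZE);
    }

If that is indeed what is going on, sizing (or clamping) by the largest
possible processor number rather than by the count would avoid the overrun.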
On 06/25/2012 02:07 AM, James Harper wrote:
>>> Is this a known problem?
>>>
>> Not until you just mentioned it. I've never tried it, but looking at the code
>> I'd expect that the limit would have been 32 or 64, not 16. I can't think why
>> xennet would be caring about the number of processors either, although that
>> could just be a symptom of a problem in xenpci.
>>
>> I'll try and find some time to test it myself.
>>
> I think I can see the problem... I'm making the assumption that the processor
> numbering is linear such that in a system with NdisSystemProcessorCount()
> CPUs, they are numbered from 0 to NdisSystemProcessorCount()-1, but I bet
> that with >16 processors they aren't numbered linearly (eg 0-15 + 32-47 or
> something).
>
> How quickly do you need a fix?
>
> James
>

I am not in a hurry with this. However, I recently ran into a performance
related problem regarding domU<-->domU communication. Searching the list, I
found that this is also related to multiple cores (I have experienced before
that going to "extremes" in software space reveals errors nobody else noticed
before).

I have the following issue (further referred to as the multicore xennet
issue) on hardware with 16 cores available:

1. With a configuration of two W2K8 servers running multiple cores each,
   domU<-->domU communication is limited to around 6-7 MB/s.
2. Configuring the two domUs with one cpu each, the domU<-->domU communication
   goes up to about 80 to 100 MB/s. Probably this is limited by the disk I/O.

However, on the "big dom0 machine" with 64 cores available and >500GB RAM, I
had not noticed this issue so far. There the configuration is one domU with
15 cores using xennet and the other domU with 48 cores, also running GPLPV,
except for the network interface, which is configured with Qemu's e1000.

I can spend time on more testing of this multicore xennet issue and publish
the results here if this is interesting for the list:

1. First I am going to capture the details of the first mentioned BSOD;
2. I'll repeat the 2 times GPLPV-ed domU multi-core configuration and repeat
   the measurements.

Thanks so far,

Dion
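For reference, the kind of domU configuration being compared would look
roughly like the following xm-style HVM config fragment (a sketch only; the
name, memory, MAC and bridge values are placeholders, not taken from the
actual config files):

    name    = "w2k8-domU"
    builder = "hvm"
    memory  = 32768
    vcpus   = 48            # 15, 48, or 1 in the configurations above

    # Emulated NIC (Qemu e1000), as on the 48-core domU:
    vif = [ 'type=ioemu, model=e1000, mac=00:16:3e:00:00:01, bridge=br0' ]

    # On the other domU the network interface is provided by the GPLPV
    # (xennet) drivers inside the guest instead of the emulated e1000.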
On 06/25/2012 08:24 AM, Dion Kant wrote:
> 1. First I am going to capture the details of the first mentioned BSOD;

When vcpus=16 (yes, already at 16) a reproducible BSOD occurs at startup.
The BSOD mentions:

xennet.sys
PAGE_FAULT_IN_NONPAGED_AREA
STOP 0x00000050 (0xFFFFFA8153977000,....

> 2. I'll repeat the 2 times GPLPV-ed domU multi core configuration and
> repeat the measurements.

Xen version 4.1.2_16-1.7.1

FV DomU A: GPLPV-ed Windows Server 2008 R2 Enterprise 64b, 32GB
FV DomU:   GPLPV-ed Windows Server 2008 R2 Enterprise 64b, 32GB
PV DomU:   openSUSE 12.1 Kernel 3.1.9-1.4-xen
Dom0:      openSUSE 12.1 Kernel 3.1.9-1.4-xen
NV:        (Non Virtual) host connected via 1GbE

Test: transfer a large file from DomU A (= server) to the client.
FV: Use Windows Explorer and copy the file from a share.
PV, Dom0, NV: Use smbclient and copy the file to /dev/null.

Client   Speed    client   server
         (MB/s)   vcpus    vcpus
FV       105      1        8
PV       298      32       8
Dom0     448      64       8
NV       114      2        8
FV       105      1        1
PV       70       32       1
Dom0     166      64       1
NV       114      2        1

Client   Speed    client   server
         (MB/s)   vcpus    vcpus
FV       105      1        8
FV       88       2        8
FV       105      4        8
FV       95       8        8
FV       95       15       8
FV       55       15       1
FV       55       1        1

No cpu pinning was configured.

There is a lot of cpu load on the FV client. The cpu load increases with an
increasing number of cpus (i.e. total cycles consumed on all cpus) whereas
the performance does not increase. The spinning cpu bounces from one to the
other, but now and then the (Windows) load approaches 100% (i.e. all cpus are
spinning). On the server the load is much less and sticks much more to one
cpu.

There is a lot of variance between runs, so the figures given are typical.

Dion
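For reference, the smbclient transfers above can be reproduced with a
one-liner along these lines (hostname, share, file name and credentials are
placeholders):

    smbclient //domu-a/share -U user%pass -c 'get bigfile /dev/null'

smbclient prints the achieved transfer rate when the get completes.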