Hello,

I installed the signed drivers from
http://wiki.univention.de/index.php?title=Installing-signed-GPLPV-drivers
and I ran into a BSOD on a Windows 2008 Server R2 Enterprise domU with a
large number of vcpus. The BSOD is related to xennet.sys.

After some trials I found that it runs fine up to 15 cores. From 16 or more,
the BSOD kicks in when booting the domU.

The hardware (4 times X7550) runs Xen version 4.1.2_05-1.1.1 (abuild@)
(gcc version 4.6.2 (SUSE Linux) (openSUSE 12.1)).
Dom0: 3.1.9-1.4-xen x86_64
DomU: Windows 2008 Server R2 Enterprise 64b.

This happened with versions 0.11.0.308 and 0.11.0.356.

Is this a known problem?

Best regards,

Dion
> Hello,
>
> I installed the signed drivers from
> http://wiki.univention.de/index.php?title=Installing-signed-GPLPV-drivers
> and I ran into a BSOD on a Windows 2008 Server R2 Enterprise domU with a
> large number of vcpus. The BSOD is related to xennet.sys.
>
> After some trials I found that it runs fine up to 15 cores. From 16 or more,
> the BSOD kicks in when booting the domU.
>
> The hardware (4 times X7550) runs Xen version 4.1.2_05-1.1.1 (abuild@)
> (gcc version 4.6.2 (SUSE Linux) (openSUSE 12.1)).
> Dom0: 3.1.9-1.4-xen x86_64
> DomU: Windows 2008 Server R2 Enterprise 64b.
>
> This happened with versions 0.11.0.308 and 0.11.0.356.
>
> Is this a known problem?
>

Not until you just mentioned it. I've never tried it, but looking at the code
I'd expect that the limit would have been 32 or 64, not 16. I can't think why
xennet would be caring about the number of processors either, although that
could just be a symptom of a problem in xenpci.

I'll try and find some time to test it myself.

James
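As an aside, the most common reason for a limit to land on exactly 32 or 64
is a CPU set kept in a single 32- or 64-bit word. The snippet below is only a
guess at that kind of construct; the names are invented and it is not taken
from the GPLPV source:

    /* Hypothetical CPU mask held in one machine word: 64 bits means at
     * most 64 CPUs; a 32-bit mask caps out at 32.                       */
    typedef unsigned long long cpu_mask_t;

    #define CPU_MASK_SET(mask, cpu)   ((mask) |= (cpu_mask_t)1 << (cpu))
    #define CPU_MASK_TEST(mask, cpu)  (((mask) >> (cpu)) & 1)

    /* Shifting by the word size or more is undefined behaviour in C, so
     * any CPU numbered 64 or higher silently breaks such a scheme.      */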
> > Is this a known problem?
> >
>
> Not until you just mentioned it. I've never tried it, but looking at the code
> I'd expect that the limit would have been 32 or 64, not 16. I can't think why
> xennet would be caring about the number of processors either, although that
> could just be a symptom of a problem in xenpci.
>
> I'll try and find some time to test it myself.
>

I think I can see the problem... I'm making the assumption that the processor
numbering is linear such that in a system with NdisSystemProcessorCount()
CPUs, they are numbered from 0 to NdisSystemProcessorCount()-1, but I bet
that with >16 processors they aren't numbered linearly (eg 0-15 + 32-47 or
something).

How quickly do you need a fix?

James
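To make that failure mode concrete, here is a minimal sketch of the pattern
being described (illustrative only, not the actual xennet/xenpci code; the
function and variable names are invented and error handling is omitted):

    #include <ndis.h>

    /* Per-CPU scratch buffers, sized by the processor *count*. */
    static PVOID *per_cpu_state;
    static ULONG  cpu_count;

    NDIS_STATUS AllocPerCpuState(void)
    {
        ULONG i;

        cpu_count = NdisSystemProcessorCount();          /* e.g. 16 */
        per_cpu_state = ExAllocatePoolWithTag(NonPagedPool,
                            cpu_count * sizeof(PVOID), 'XvpG');
        if (per_cpu_state == NULL)
            return NDIS_STATUS_RESOURCES;
        for (i = 0; i < cpu_count; i++)
            per_cpu_state[i] = ExAllocatePoolWithTag(NonPagedPool,
                                   PAGE_SIZE, 'XvpG');
        return NDIS_STATUS_SUCCESS;
    }

    VOID TouchPerCpuState(void)
    {
        /* Indexed by the running processor's *number*.  If the numbers
         * are not dense (e.g. 0-15 plus 32-47), cpu can be >= cpu_count
         * and the driver touches memory it never allocated, which shows
         * up as a PAGE_FAULT_IN_NONPAGED_AREA blamed on the driver.    */
        ULONG cpu = KeGetCurrentProcessorNumber();
        RtlZeroMemory(per_cpu_state[cpu], PAGE_SIZE);
    }

If that is indeed what is going on, sizing (or clamping) by the largest
possible processor number rather than by the count would avoid the overrun.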
On 06/25/2012 02:07 AM, James Harper wrote:
>>> Is this a known problem?
>>>
>> Not until you just mentioned it. I've never tried it, but looking at the code
>> I'd expect that the limit would have been 32 or 64, not 16. I can't think why
>> xennet would be caring about the number of processors either, although that
>> could just be a symptom of a problem in xenpci.
>>
>> I'll try and find some time to test it myself.
>>
> I think I can see the problem... I'm making the assumption that the processor
> numbering is linear such that in a system with NdisSystemProcessorCount()
> CPUs, they are numbered from 0 to NdisSystemProcessorCount()-1, but I bet
> that with >16 processors they aren't numbered linearly (eg 0-15 + 32-47 or
> something).
>
> How quickly do you need a fix?
>
> James
>

I am not in a hurry with this. However, I recently ran into a performance
related problem regarding domU<-->domU communication. Searching the list, I
found that this is also related to multiple cores (I have experienced before
that going to "extremes" in software space reveals errors nobody else noticed
before).

I have the following issue (further referred to as the multicore xennet
issue) on hardware with 16 cores available:

1. With a configuration of two W2K8 servers running multiple cores each,
   domU<-->domU communication is limited to around 6-7 MB/s.
2. Configuring the two domUs with one cpu each, the domU<-->domU communication
   goes up to about 80 to 100 MB/s. Probably this is limited by the disk I/O.

However, on the "big dom0 machine" with 64 cores available and >500GB RAM, I
had not noticed this issue so far. There the configuration is one domU with
15 cores using xennet and the other domU with 48 cores, also running GPLPV,
except for the network interface, which is configured with Qemu's e1000.

I can spend time on more testing of this multicore xennet issue and publish
the results here if this is interesting for the list:

1. First I am going to capture the details of the first mentioned BSOD;
2. I'll repeat the 2 times GPLPV-ed domU multi-core configuration and repeat
   the measurements.

Thanks so far,

Dion
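For reference, the kind of domU configuration being compared would look
roughly like the following xm-style HVM config fragment (a sketch only; the
name, memory, MAC and bridge values are placeholders, not taken from the
actual config files):

    name    = "w2k8-domU"
    builder = "hvm"
    memory  = 32768
    vcpus   = 48            # 15, 48, or 1 in the configurations above

    # Emulated NIC (Qemu e1000), as on the 48-core domU:
    vif = [ 'type=ioemu, model=e1000, mac=00:16:3e:00:00:01, bridge=br0' ]

    # On the other domU the network interface is provided by the GPLPV
    # (xennet) drivers inside the guest instead of the emulated e1000.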
On 06/25/2012 08:24 AM, Dion Kant wrote:
> 1. First I am going to capture the details of the first mentioned BSOD;

When vcpus=16 (yes, already at 16) a reproducible BSOD occurs at startup.
The BSOD mentions:

xennet.sys
PAGE_FAULT_IN_NONPAGED_AREA
STOP 0x00000050 (0xFFFFFA8153977000,....

> 2. I'll repeat the 2 times GPLPV-ed domU multi core configuration and
> repeat the measurements.

Xen version 4.1.2_16-1.7.1

FV DomU A: GPLPV-ed Windows Server 2008 R2 Enterprise 64b, 32GB
FV DomU:   GPLPV-ed Windows Server 2008 R2 Enterprise 64b, 32GB
PV DomU:   openSUSE 12.1 Kernel 3.1.9-1.4-xen
Dom0:      openSUSE 12.1 Kernel 3.1.9-1.4-xen
NV:        (Non Virtual) host connected via 1GbE

Test: transfer a large file from DomU A (= server) to the client.
FV: Use Windows Explorer and copy the file from a share.
PV, Dom0, NV: Use smbclient and copy the file to /dev/null.

Client   Speed    client   server
         (MB/s)   vcpus    vcpus
FV       105      1        8
PV       298      32       8
Dom0     448      64       8
NV       114      2        8
FV       105      1        1
PV       70       32       1
Dom0     166      64       1
NV       114      2        1

Client   Speed    client   server
         (MB/s)   vcpus    vcpus
FV       105      1        8
FV       88       2        8
FV       105      4        8
FV       95       8        8
FV       95       15       8
FV       55       15       1
FV       55       1        1

No cpu pinning was configured.

There is a lot of cpu load on the FV client. The cpu load increases with an
increasing number of cpus (i.e. total cycles consumed on all cpus) whereas
the performance does not increase. The spinning cpu bounces from one to the
other, but now and then the (Windows) load approaches 100% (i.e. all cpus are
spinning). On the server the load is much less and sticks much more to one
cpu.

There is a lot of variance between runs, so the figures given are typical.

Dion
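For reference, the smbclient transfers above can be reproduced with a
one-liner along these lines (hostname, share, file name and credentials are
placeholders):

    smbclient //domu-a/share -U user%pass -c 'get bigfile /dev/null'

smbclient prints the achieved transfer rate when the get completes.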