Anirban Chakraborty
2013-Sep-12 17:53 UTC
large packet support in netfront driver and guest network throughput
Hi All,

I am sure this has been answered somewhere on the list in the past, but I can't find it. I was wondering whether the Linux guest netfront driver has GRO support in it. tcpdump shows packets arriving in the guest at 1500 bytes, although eth0 in dom0 and the vif corresponding to the Linux guest in dom0 both show that they receive large packets:

In dom0:
eth0      Link encap:Ethernet  HWaddr 90:E2:BA:3A:B1:A4
          UP BROADCAST RUNNING PROMISC MULTICAST  MTU:1500  Metric:1

tcpdump -i eth0 -nnvv -s 1500 src 10.84.20.214
17:38:25.155373 IP (tos 0x0, ttl 64, id 54607, offset 0, flags [DF], proto TCP (6), length 29012)
    10.84.20.214.51041 > 10.84.20.213.5001: Flags [.], seq 276592:305552, ack 1, win 229, options [nop,nop,TS val 65594025 ecr 65569225], length 28960

vif4.0    Link encap:Ethernet  HWaddr FE:FF:FF:FF:FF:FF
          UP BROADCAST RUNNING NOARP PROMISC  MTU:1500  Metric:1

tcpdump -i vif4.0 -nnvv -s 1500 src 10.84.20.214
17:38:25.156364 IP (tos 0x0, ttl 64, id 54607, offset 0, flags [DF], proto TCP (6), length 29012)
    10.84.20.214.51041 > 10.84.20.213.5001: Flags [.], seq 276592:305552, ack 1, win 229, options [nop,nop,TS val 65594025 ecr 65569225], length 28960

In the guest:
eth0      Link encap:Ethernet  HWaddr CA:FD:DE:AB:E1:E4
          inet addr:10.84.20.213  Bcast:10.84.20.255  Mask:255.255.255.0
          inet6 addr: fe80::c8fd:deff:feab:e1e4/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

tcpdump -i eth0 -nnvv -s 1500 src 10.84.20.214
10:38:25.071418 IP (tos 0x0, ttl 64, id 15074, offset 0, flags [DF], proto TCP (6), length 1500)
    10.84.20.214.51040 > 10.84.20.213.5001: Flags [.], seq 17400:18848, ack 1, win 229, options [nop,nop,TS val 65594013 ecr 65569213], length 1448

Is the packet segmented down to MTU size on the transfer from netback to netfront? Is GRO not supported in the guest?

I am seeing extremely low throughput on a 10Gb/s link. Two Linux guests (CentOS 6.4 64-bit, 4 VCPUs and 4GB of memory) are running on two different XenServer 6.1 hosts, and an iperf session between them shows at most 3.2 Gbps. I am using the Linux bridge as the network backend switch. Dom0 is configured with 2940MB of RAM. In most cases, after a few runs the throughput drops to ~2.2 Gbps. top shows that the netback thread in dom0 is at about 70-80% CPU utilization. I have checked the dom0 network configuration and there is no QoS policy in place, etc.

So, my question is: is PCI passthrough the only option to get line rate in the guests? Is there any benchmark of the maximum throughput achieved in guests using PV drivers and without PCI passthrough? Also, what could be the reason for the consistent throughput drop in the guests (from ~3.2 to ~2.2 Gbps) after a few runs of iperf?

Any pointer will be highly appreciated.

thanks,
Anirban
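A quick way to see which offloads are actually in play at each hop in a setup like this is ethtool; the interface names below match the ones above, and the exact feature list printed varies with the ethtool and kernel versions:

    # in dom0: physical NIC and the guest's vif
    ethtool -k eth0
    ethtool -k vif4.0

    # inside the guest
    ethtool -k eth0

Look for generic-receive-offload (GRO) on the receive side and tcp-segmentation-offload / generic-segmentation-offload (TSO/GSO) on the transmit side.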
Wei Liu
2013-Sep-13 11:44 UTC
Re: large packet support in netfront driver and guest network throughput
On Thu, Sep 12, 2013 at 05:53:02PM +0000, Anirban Chakraborty wrote:
> [...]
> Is the packet segmented down to MTU size on the transfer from netback to netfront? Is GRO not supported in the guest?

Here is what I see in the guest, with the iperf server running in the guest and the iperf client running in Dom0. Tcpdump was run with the rune you provided:

10.80.238.213.38895 > 10.80.239.197.5001: Flags [.], seq 5806480:5818064, ack 1, win 229, options [nop,nop,TS val 21968973 ecr 21832969], length 11584

This is an upstream kernel. The throughput from Dom0 to DomU is ~7.2Gb/s.

> I am seeing extremely low throughput on a 10Gb/s link. Two Linux guests (CentOS 6.4 64-bit, 4 VCPUs and 4GB of memory) are running on two different XenServer 6.1 hosts, and an iperf session between them shows at most 3.2 Gbps.

XenServer might use a different Dom0 kernel with their own tuning. You could also try contacting XenServer support for a better idea.

In general, off-host communication can be affected by various things. It would be quite useful to identify the bottleneck first. Try to run:

1. Dom0 to Dom0 iperf (or your workload)
2. Dom0 to DomU iperf
3. DomU to Dom0 iperf

In order to get line rate, you need to at least get line rate from Dom0 to Dom0 IMHO. 10Gb/s line rate from guest to guest has not yet been achieved at the moment...

Wei.
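As a reference for the three tests above, a minimal iperf run looks like the following; the address, duration and stream count are illustrative only:

    # on the receiver (Dom0 or DomU)
    iperf -s

    # on the sender, e.g. a 30-second run with 4 parallel streams
    iperf -c <receiver-ip> -t 30 -P 4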
Anirban Chakraborty
2013-Sep-13 17:09 UTC
Re: large packet support in netfront driver and guest network throughput
On Sep 13, 2013, at 4:44 AM, Wei Liu <wei.liu2@citrix.com> wrote:
> [...]
> Here is what I see in the guest, with the iperf server running in the guest and the iperf client running in Dom0. Tcpdump was run with the rune you provided:
>
> 10.80.238.213.38895 > 10.80.239.197.5001: Flags [.], seq 5806480:5818064, ack 1, win 229, options [nop,nop,TS val 21968973 ecr 21832969], length 11584
>
> This is an upstream kernel. The throughput from Dom0 to DomU is ~7.2Gb/s.

Thanks for your reply. The tcpdump was captured on dom0 of the guest [at both the vif and the physical interface], i.e. on the receive path of the server. The iperf server was running on the guest (10.84.20.213) and the client was another guest (on a different server) with IP 10.84.20.214. The traffic was between two guests, not between dom0 and the guest.

> XenServer might use a different Dom0 kernel with their own tuning. You could also try contacting XenServer support for a better idea.

XenServer 6.1 is running a 2.6.32.43 kernel. Since the issue is in the netfront driver, as it appears from the tcpdump, that is why I thought I would post it here. Note that the checksum offloads of the interfaces (virtual and physical) were not touched; the default setting (which was on) was used.

> Try to run:
> 1. Dom0 to Dom0 iperf (or your workload)
> 2. Dom0 to DomU iperf
> 3. DomU to Dom0 iperf

I tried dom0 to dom0 and got 9.4 Gbps, which is what I expected (with GRO turned on in the physical interface). However, when I run guest to guest, things fall off. Are large packets not supported in netfront? I thought otherwise. I looked at the code and I do not see any call to napi_gro_receive(); rather it uses netif_receive_skb(). netback seems to be sending GSO packets to netfront, but the packets are segmented to 1500 bytes (as it appears from the tcpdump).

> In order to get line rate, you need to at least get line rate from Dom0 to Dom0 IMHO. 10Gb/s line rate from guest to guest has not yet been achieved at the moment...

What is the current number, without VCPU pinning etc., for 1500-byte MTU? I am getting 2.2-3.2 Gbps for a 4-VCPU guest with 4GB of memory. It is the only VM running on that server, with no other traffic.

-Anirban
Wei Liu
2013-Sep-16 14:21 UTC
Re: large packet support in netfront driver and guest network throughput
On Fri, Sep 13, 2013 at 05:09:48PM +0000, Anirban Chakraborty wrote:
> [...]
> XenServer 6.1 is running a 2.6.32.43 kernel. Since the issue is in the netfront driver, as it appears from the tcpdump, that is why I thought I would post it here. Note that the checksum offloads of the interfaces (virtual and physical) were not touched; the default setting (which was on) was used.
>
> I tried dom0 to dom0 and got 9.4 Gbps, which is what I expected (with GRO turned on in the physical interface). However, when I run guest to guest, things fall off. Are large packets not supported in netfront? I thought otherwise. I looked at the code and I do not see any call to napi_gro_receive(); rather it uses netif_receive_skb(). netback seems to be sending GSO packets to netfront, but the packets are segmented to 1500 bytes (as it appears from the tcpdump).

OK, I get your problem.

Indeed netfront doesn't make use of the GRO API at the moment. I've added this to my list to work on. I will keep you posted when I get to that.

Thanks!

Wei.
annie li
2013-Sep-17 02:09 UTC
Re: large packet support in netfront driver and guest network throughput
On 2013-9-16 22:21, Wei Liu wrote:
> [...]
>> I tried dom0 to dom0 and got 9.4 Gbps, which is what I expected (with GRO turned on in the physical interface). However, when I run guest to guest, things fall off. I looked at the code and I do not see any call to napi_gro_receive(); rather it uses netif_receive_skb(). netback seems to be sending GSO packets to netfront, but the packets are segmented to 1500 bytes (as it appears from the tcpdump).
>
> OK, I get your problem.
>
> Indeed netfront doesn't make use of the GRO API at the moment.

This is true.

But I am wondering why the large packet is not segmented into MTU size with an upstream kernel? I did see large packets with an upstream kernel on the receiving guest (a test between 2 DomUs on the same host).

Thanks
Annie
Wei Liu
2013-Sep-17 08:25 UTC
Re: large packet support in netfront driver and guest network throughput
On Tue, Sep 17, 2013 at 10:09:21AM +0800, annie li wrote:
[...]
> This is true.
> But I am wondering why the large packet is not segmented into MTU size with an upstream kernel? I did see large packets with an upstream kernel on the receiving guest (a test between 2 DomUs on the same host).

I think Anirban's setup is different. The traffic is from a DomU on another host.

I will need to set up a testing environment with a 10G link to test this.

Anirban, can you share your setup, especially the DomU kernel version? Are you using an upstream kernel in DomU?

Wei.
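For completeness, the guest-side details asked for above can usually be collected with something like the following (illustrative; what is reported depends on the distro kernel and PV driver):

    uname -r          # DomU kernel version
    ethtool -i eth0   # driver name/version, if the PV NIC exposes it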
Anirban Chakraborty
2013-Sep-17 17:53 UTC
Re: large packet support in netfront driver and guest network throughput
On 9/17/13 1:25 AM, "Wei Liu" <wei.liu2@citrix.com> wrote:
> [...]
> I think Anirban's setup is different. The traffic is from a DomU on another host.
>
> I will need to set up a testing environment with a 10G link to test this.
>
> Anirban, can you share your setup, especially the DomU kernel version? Are you using an upstream kernel in DomU?

Sure.
I have two hosts, say h1 and h2, running XenServer 6.1: h1 runs a CentOS 6.4 64-bit guest, say guest1, and h2 runs an identical guest, guest2.

The iperf server is running on guest1, with the iperf client connecting from guest2.

I haven't tried with an upstream kernel yet. However, what I found out is that netback on the receiving host is transmitting GSO segments to the guest (guest1), but the packets are segmented at the netfront interface.

Annie's setup has both guests running on the same host, in which case packets are looped back.

-Anirban
annie li
2013-Sep-18 02:28 UTC
Re: large packet support in netfront driver and guest network throughput
On 2013-9-18 1:53, Anirban Chakraborty wrote:
> [...]
> I haven't tried with an upstream kernel yet. However, what I found out is that netback on the receiving host is transmitting GSO segments to the guest (guest1), but the packets are segmented at the netfront interface.

Did you try the guests on the same host in your environment?

> Annie's setup has both guests running on the same host, in which case packets are looped back.

If the guest does not segment packets in the same-host case, it should not segment them in the different-host case either. In current upstream, the netback->netfront mechanism does not treat these two cases differently.

Thanks
Annie
Wei Liu
2013-Sep-18 15:48 UTC
Re: large packet support in netfront driver and guest network throughput
On Tue, Sep 17, 2013 at 05:53:43PM +0000, Anirban Chakraborty wrote:
> [...]
> I have two hosts, say h1 and h2, running XenServer 6.1: h1 runs a CentOS 6.4 64-bit guest, say guest1, and h2 runs an identical guest, guest2.

Do you have the exact version of your DomU's kernel? Is it available somewhere online?

> The iperf server is running on guest1, with the iperf client connecting from guest2.
>
> I haven't tried with an upstream kernel yet. However, what I found out is that netback on the receiving host is transmitting GSO segments to the guest (guest1), but the packets are segmented at the netfront interface.

I just tried: with a vanilla upstream kernel I can see large packet sizes on DomU's side.

I also tried to convert netfront to use the GRO API (hopefully I didn't get it wrong). I didn't see much improvement -- which is not surprising, because I already saw large packets even without GRO.

If you fancy trying the GRO API, see the attached patch. Note that you might need to do some contextual adjustment as this patch is for the upstream kernel.

Wei.

---8<---
From ca532dd11d7b8f5f8ce9d2b8043dd974d9587cb0 Mon Sep 17 00:00:00 2001
From: Wei Liu <wei.liu2@citrix.com>
Date: Wed, 18 Sep 2013 16:46:23 +0100
Subject: [PATCH] xen-netfront: convert to GRO API

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 drivers/net/xen-netfront.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index 36808bf..dd1011e 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -952,7 +952,7 @@ static int handle_incoming_queue(struct net_device *dev,
 		u64_stats_update_end(&stats->syncp);
 
 		/* Pass it up. */
-		netif_receive_skb(skb);
+		napi_gro_receive(&np->napi, skb);
 	}
 
 	return packets_dropped;
@@ -1051,6 +1051,8 @@ err:
 	if (work_done < budget) {
 		int more_to_do = 0;
 
+		napi_gro_flush(napi, false);
+
 		local_irq_save(flags);
 
 		RING_FINAL_CHECK_FOR_RESPONSES(&np->rx, more_to_do);
-- 
1.7.10.4
Anirban Chakraborty
2013-Sep-18 20:38 UTC
Re: large packet support in netfront driver and guest network throughput
On Sep 18, 2013, at 8:48 AM, Wei Liu <wei.liu2@citrix.com> wrote:
> [...]
> Do you have the exact version of your DomU's kernel? Is it available somewhere online?

Yes, it is 2.6.32-358.el6.x86_64. Sorry, I missed it out last time.

> I just tried: with a vanilla upstream kernel I can see large packet sizes on DomU's side.
>
> I also tried to convert netfront to use the GRO API (hopefully I didn't get it wrong). I didn't see much improvement -- which is not surprising, because I already saw large packets even without GRO.
>
> If you fancy trying the GRO API, see the attached patch. Note that you might need to do some contextual adjustment as this patch is for the upstream kernel.

I was able to see a bit of improvement (from 2.65 to 3.6 Gbps) with the following patch (your patch plus the advertisement of NETIF_F_GRO):

---------
diff --git a/xen-netfront.c.orig b/xen-netfront.c
index 23e467d..bc673d3 100644
--- a/xen-netfront.c.orig
+++ b/xen-netfront.c
@@ -818,6 +818,7 @@ static int handle_incoming_queue(struct net_device *dev,
 {
 	int packets_dropped = 0;
 	struct sk_buff *skb;
+	struct netfront_info *np = netdev_priv(dev);
 
 	while ((skb = __skb_dequeue(rxq)) != NULL) {
 		struct page *page = NETFRONT_SKB_CB(skb)->page;
@@ -846,7 +847,7 @@ static int handle_incoming_queue(struct net_device *dev,
 		dev->stats.rx_bytes += skb->len;
 
 		/* Pass it up. */
-		netif_receive_skb(skb);
+		napi_gro_receive(&np->napi, skb);
 	}
 
 	return packets_dropped;
@@ -981,6 +982,7 @@ err:
 	if (work_done < budget) {
 		int more_to_do = 0;
 
+		napi_gro_flush(napi);
 		local_irq_save(flags);
 
 		RING_FINAL_CHECK_FOR_RESPONSES(&np->rx, more_to_do);
@@ -1182,7 +1184,8 @@ static struct net_device * __devinit xennet_create_dev(struct xenbus_device *dev
 	netif_napi_add(netdev, &np->napi, xennet_poll, 64);
 
 	/* Assume all features and let xennet_set_features fix up. */
-	netdev->features        = NETIF_F_IP_CSUM | NETIF_F_SG | NETIF_F_TSO;
+	netdev->features        = NETIF_F_IP_CSUM | NETIF_F_SG | NETIF_F_TSO |
+				  NETIF_F_GRO;
 
 	SET_ETHTOOL_OPS(netdev, &xennet_ethtool_ops);
 	SET_NETDEV_DEV(netdev, &dev->dev);
-----------

tcpdump showed that the guest interface received large packets. I haven't checked an upstream kernel as the guest, though.

Anirban
Anirban Chakraborty
2013-Sep-18 21:06 UTC
Re: large packet support in netfront driver and guest network throughput
On Sep 17, 2013, at 7:28 PM, annie li <annie.li@oracle.com> wrote:
> [...]
> Did you try the guests on the same host in your environment?

Yes, I did, and it showed 5.66 Gbps as opposed to 2.6-2.8 Gbps.

> If the guest does not segment packets in the same-host case, it should not segment them in the different-host case either. In current upstream, the netback->netfront mechanism does not treat these two cases differently.

For two guests (CentOS 6.4, 2.6.32-358) on the same host, the receiving guest indeed receives large packets, while this is not true if the receiving guest is on a different host. Both vifs are connected to the same Linux bridge, so packets passed between them are forwarded without hitting the wire; this explains the higher throughput. However, I would still expect segmented packets at the receiving guest, and that is not what I am seeing.

Anirban
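As a quick sanity check of the same-host path, the bridge membership of the two vifs can be listed from dom0 (bridge names vary; XenServer typically names its bridges xenbr0, xapi0, and so on):

    brctl show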
Wei Liu
2013-Sep-19 09:41 UTC
Re: large packet support in netfront driver and guest network throughput
On Wed, Sep 18, 2013 at 08:38:01PM +0000, Anirban Chakraborty wrote:
> [...]
> Yes, it is 2.6.32-358.el6.x86_64. Sorry, I missed it out last time.

So that's a RHEL kernel; you might also want to ask Red Hat to have a look at that?

> I was able to see a bit of improvement (from 2.65 to 3.6 Gbps) with the following patch (your patch plus the advertisement of NETIF_F_GRO):
> [...]
> tcpdump showed that the guest interface received large packets. I haven't checked an upstream kernel as the guest, though.

OK, thanks for reporting back.

I'm curious about the packet size after enabling GRO. I can get 5Gb/s upstream with a packet size of ~24K on a 10G NIC. It's not line rate yet; certainly there is space for improvement.

Wei.
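One rough, illustrative way to sample the received packet sizes on the guest side during an iperf run (the interface, port and packet count are just examples; tcpdump's default output ends each TCP line with the segment length, which awk picks up as the last field):

    tcpdump -i eth0 -nn -c 2000 'tcp dst port 5001' | awk '{print $NF}' | sort -n | uniq -c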
Anirban Chakraborty
2013-Sep-19 16:59 UTC
Re: large packet support in netfront driver and guest network throughput
On Sep 19, 2013, at 2:41 AM, Wei Liu <wei.liu2@citrix.com> wrote:
> [...]
> So that's a RHEL kernel; you might also want to ask Red Hat to have a look at that?

It is not a RHEL kernel, it is CentOS 6.4. This being a netfront driver issue, I think we should address it here.

> I'm curious about the packet size after enabling GRO. I can get 5Gb/s upstream with a packet size of ~24K on a 10G NIC. It's not line rate yet; certainly there is space for improvement.

I am seeing varying packet sizes with the GRO patch, from 2K all the way up to 64K.

I do not think we can get line rate by enabling GRO only. The netback thread that handles the guest traffic is running on a different CPU (and possibly a different NUMA node) than the guest. If we can schedule the netback thread for a guest on the same node as the guest, we should see better numbers.

In any case, are you going to submit the patch upstream, or should I do it?

Thanks.

Anirban
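One way to check where the netback thread actually runs in dom0 is sketched below; thread naming differs between dom0 kernels, and <pid> is a placeholder for the thread ID found in the first command:

    ps -eLo pid,psr,comm | grep -i netback   # PSR = CPU the thread last ran on
    taskset -cp <pid>                        # its allowed CPU affinity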
Wei Liu
2013-Sep-19 18:43 UTC
Re: large packet support in netfront driver and guest network throughput
On Thu, Sep 19, 2013 at 04:59:49PM +0000, Anirban Chakraborty wrote:
[...]
> I do not think we can get line rate by enabling GRO only. The netback thread that handles the guest traffic is running on a different CPU (and possibly a different NUMA node) than the guest. If we can schedule the netback thread for a guest on the same node as the guest, we should see better numbers.

You can use vcpu pinning to pin Dom0's vCPUs and DomU's vCPUs to the same NUMA node. However, the domain's memory might still be striped across different nodes.

> In any case, are you going to submit the patch upstream, or should I do it?

I will do that once net-next is open.

Wei.
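With the upstream xl toolstack, a minimal sketch of that kind of pinning could look like the following; the domain name and CPU range are purely illustrative, and XenServer's xapi toolstack exposes the equivalent knobs differently:

    xl info -n                      # show the host's CPU/NUMA topology
    xl vcpu-pin Domain-0 all 0-3    # keep dom0's vCPUs on one node's CPUs
    xl vcpu-pin guest1 all 0-3      # pin the guest's vCPUs to the same CPUs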
Wei Liu
2013-Sep-19 19:04 UTC
Re: large packet support in netfront driver and guest network throughput
On Thu, Sep 19, 2013 at 07:43:15PM +0100, Wei Liu wrote:
[...]
>> In any case, are you going to submit the patch upstream, or should I do it?
>
> I will do that once net-next is open.

I will add your SoB to that patch, is that OK?

Wei.
Anirban Chakraborty
2013-Sep-19 20:54 UTC
Re: large packet support in netfront driver and guest network throughput
On Sep 19, 2013, at 12:04 PM, Wei Liu <wei.liu2@citrix.com> wrote:
> [...]
> I will add your SoB to that patch, is that OK?

That's fine. Thanks.

Anirban