Anirban Chakraborty
2013-Sep-12 17:53 UTC
large packet support in netfront driver and guest network throughput
Hi All,

I am sure this has been answered somewhere in the list in the past, but I can't find it. I was wondering if the Linux guest netfront driver has GRO support in it. tcpdump in the guest shows packets coming in with 1500 bytes, although eth0 in dom0 and the vif corresponding to the Linux guest in dom0 both show that they receive large packets:

In dom0:
eth0      Link encap:Ethernet  HWaddr 90:E2:BA:3A:B1:A4
          UP BROADCAST RUNNING PROMISC MULTICAST  MTU:1500  Metric:1
tcpdump -i eth0 -nnvv -s 1500 src 10.84.20.214
17:38:25.155373 IP (tos 0x0, ttl 64, id 54607, offset 0, flags [DF], proto TCP (6), length 29012)
    10.84.20.214.51041 > 10.84.20.213.5001: Flags [.], seq 276592:305552, ack 1, win 229, options [nop,nop,TS val 65594025 ecr 65569225], length 28960

vif4.0    Link encap:Ethernet  HWaddr FE:FF:FF:FF:FF:FF
          UP BROADCAST RUNNING NOARP PROMISC  MTU:1500  Metric:1
tcpdump -i vif4.0 -nnvv -s 1500 src 10.84.20.214
17:38:25.156364 IP (tos 0x0, ttl 64, id 54607, offset 0, flags [DF], proto TCP (6), length 29012)
    10.84.20.214.51041 > 10.84.20.213.5001: Flags [.], seq 276592:305552, ack 1, win 229, options [nop,nop,TS val 65594025 ecr 65569225], length 28960

In the guest:
eth0      Link encap:Ethernet  HWaddr CA:FD:DE:AB:E1:E4
          inet addr:10.84.20.213  Bcast:10.84.20.255  Mask:255.255.255.0
          inet6 addr: fe80::c8fd:deff:feab:e1e4/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
tcpdump -i eth0 -nnvv -s 1500 src 10.84.20.214
10:38:25.071418 IP (tos 0x0, ttl 64, id 15074, offset 0, flags [DF], proto TCP (6), length 1500)
    10.84.20.214.51040 > 10.84.20.213.5001: Flags [.], seq 17400:18848, ack 1, win 229, options [nop,nop,TS val 65594013 ecr 65569213], length 1448

Is the packet segmented into MTU size on transfer from netback to netfront? Is GRO not supported in the guest?

I am seeing extremely low throughput on a 10Gb/s link. Two Linux guests (CentOS 6.4 64-bit, 4 VCPUs and 4GB of memory) are running on two different XenServer 6.1 hosts, and an iperf session between them shows at most 3.2 Gbps.
I am using the Linux bridge as the network backend switch. Dom0 is configured to have 2940MB of RAM.
In most cases, after a few runs the throughput drops to ~2.2 Gbps. top shows that the netback thread in dom0 is at about 70-80% CPU utilization. I have checked the dom0 network configuration and there is no QoS policy in place.

So my questions are:
- Is PCI passthrough the only option to get line rate in the guests?
- Is there any benchmark of the maximum throughput achieved in guests using PV drivers and without PCI passthrough?
- What could be the reason for the consistent throughput drop in the guests (from ~3.2 to ~2.2 Gbps) after a few runs of iperf?

Any pointer will be highly appreciated.

thanks,
Anirban
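The comparison made by eye in the captures above can be sketched in a few lines of Python. This is only an illustration, not part of the original post: the function names `ip_lengths` and `exceeds_mtu` are made up here, and the regular expression assumes the `tcpdump -v` header format shown in the captures.

```python
import re

# Matches the verbose header line that tcpdump -v prints for these packets;
# the "length N" inside the parentheses is the IP total length.
IP_LENGTH = re.compile(r"proto TCP \(6\), length (\d+)\)")


def ip_lengths(capture):
    """Return the IP total length of every TCP packet in a tcpdump -v capture."""
    return [int(m.group(1)) for m in IP_LENGTH.finditer(capture)]


def exceeds_mtu(length, mtu=1500):
    """True when a captured packet is larger than a single MTU-sized frame."""
    return length > mtu


# Header lines copied from the three captures: dom0 eth0, dom0 vif4.0, guest eth0.
CAPTURE = """\
17:38:25.155373 IP (tos 0x0, ttl 64, id 54607, offset 0, flags [DF], proto TCP (6), length 29012)
17:38:25.156364 IP (tos 0x0, ttl 64, id 54607, offset 0, flags [DF], proto TCP (6), length 29012)
10:38:25.071418 IP (tos 0x0, ttl 64, id 15074, offset 0, flags [DF], proto TCP (6), length 1500)
"""

lengths = ip_lengths(CAPTURE)
aggregated = [exceeds_mtu(n) for n in lengths]
```

Here the two dom0 captures are flagged as aggregated packets while the guest capture is not, which is the discrepancy the post asks about.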
Wei Liu
2013-Sep-13 11:44 UTC
Re: large packet support in netfront driver and guest network throughput
On Thu, Sep 12, 2013 at 05:53:02PM +0000, Anirban Chakraborty wrote:
> [...]
> Is the packet segmented into MTU size on transfer from netback to netfront? Is GRO not supported in the guest?

Here is what I see in the guest, with the iperf server running in the guest and the iperf client running in Dom0. Tcpdump runs with the rune you provided.

10.80.238.213.38895 > 10.80.239.197.5001: Flags [.], seq 5806480:5818064, ack 1, win 229, options [nop,nop,TS val 21968973 ecr 21832969], length 11584

This is an upstream kernel. The throughput from Dom0 to DomU is ~7.2Gb/s.

> I am seeing extremely low throughput on a 10Gb/s link. Two Linux guests (CentOS 6.4 64-bit, 4 VCPUs and 4GB of memory) are running on two different XenServer 6.1 hosts, and an iperf session between them shows at most 3.2 Gbps.

XenServer might use a different Dom0 kernel with its own tuning. You could also try contacting XenServer support for a better idea.

In general, off-host communication can be affected by various things. It would be quite useful to identify the bottleneck first.

Try to run:
1. Dom0 to Dom0 iperf (or your workload)
2. Dom0 to DomU iperf
3. DomU to Dom0 iperf

In order to get line rate, you need to at least get line rate from Dom0 to Dom0 IMHO. 10G/s line rate from guest to guest has not yet been achieved at the moment...

Wei.

> [...]
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel
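The suggestion to test each leg of the path separately can be sketched as a small helper. This is a hypothetical illustration only: the function `first_bottleneck` and the 90% `tolerance` threshold are assumptions made here, and the sample figures are the Dom0-to-Dom0 and guest-to-guest numbers reported elsewhere in this thread.

```python
def first_bottleneck(legs, line_rate_gbps, tolerance=0.9):
    """Return the name of the first leg below tolerance * line rate, or None.

    legs is an ordered list of (name, measured_gbps) pairs, simplest path first.
    """
    for name, gbps in legs:
        if gbps < tolerance * line_rate_gbps:
            return name
    return None


# Measurements in Gbps, ordered from the simplest path to the full one.
legs = [
    ("dom0 -> dom0", 9.4),    # physical NICs only
    ("guest -> guest", 3.2),  # adds netback and netfront on both hosts
]

bottleneck = first_bottleneck(legs, line_rate_gbps=10.0)
```

Adding the Dom0-to-DomU and DomU-to-Dom0 legs to the list would narrow down whether the transmit or the receive side of the PV path is responsible.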
Anirban Chakraborty
2013-Sep-13 17:09 UTC
Re: large packet support in netfront driver and guest network throughput
On Sep 13, 2013, at 4:44 AM, Wei Liu <wei.liu2@citrix.com> wrote:
> Here is what I see in the guest, iperf server running in guest and iperf
> client running in Dom0. Tcpdump runs with the rune you provided.
>
> 10.80.238.213.38895 > 10.80.239.197.5001: Flags [.], seq
> 5806480:5818064, ack 1, win 229, options [nop,nop,TS val 21968973 ecr
> 21832969], length 11584
>
> This is a upstream kernel. The throughput from Dom0 to DomU is ~7.2Gb/s.

Thanks for your reply. The tcpdump was captured on the dom0 of the guest [at both the vif and the physical interface], i.e. on the receive path of the server. The iperf server was running on the guest (10.84.20.213) and the client was on another guest (on a different server) with IP 10.84.20.214. The traffic was between two guests, not between dom0 and the guest.

> XenServer might use different Dom0 kernel with their own tuning. You can
> also try to contact XenServer support for better idea?

XenServer 6.1 is running a 2.6.32.43 kernel. Since the tcpdump suggests the issue is in the netfront driver, I thought I would post it here. Note that the checksum offloads of the interfaces (virtual and physical) were not touched; the default setting (on) was used.

> In general, off-host communication can be affected by various things. It
> would be quite useful to identify the bottleneck first.
>
> Try to run:
> 1. Dom0 to Dom0 iperf (or you workload)
> 2. Dom0 to DomU iperf
> 3. DomU to Dom0 iperf

I tried dom0 to dom0 and got 9.4 Gbps, which is what I expected (with GRO turned on in the physical interface). However, when I run guest to guest, things fall off. Are large packets not supported in netfront? I thought otherwise. I looked at the code and I do not see any call to napi_gro_receive(); rather, it is using netif_receive_skb(). netback seems to be sending GSO packets to netfront, but they are being segmented to 1500 bytes (as it appears from the tcpdump).

> In order to get line rate, you need to at least get line rate from Dom0
> to Dom0 IMHO. 10G/s line rate from guest to guest has not yet been
> achieved at the moment...

What is the current number, without VCPU pinning etc., for a 1500 byte MTU? I am getting 2.2-3.2 Gbps for a 4 VCPU guest with 4GB of memory. It is the only VM running on that server, without any other traffic.

-Anirban
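To put the packet sizes in perspective, the back-of-the-envelope arithmetic below estimates how many packets per second the guest's stack has to process at the reported throughput. It is a rough sketch under stated assumptions: the iperf figure is treated as pure TCP payload rate (headers and ACK traffic are ignored), and the helper name `packets_per_second` is made up for this example.

```python
def packets_per_second(throughput_gbps, payload_bytes):
    """Packets per second needed to carry throughput_gbps of TCP payload."""
    bits_per_packet = payload_bytes * 8
    return throughput_gbps * 1e9 / bits_per_packet


MSS_PAYLOAD = 1448          # payload per packet seen inside the guest
AGGREGATED_PAYLOAD = 28960  # payload per packet seen on eth0 and the vif in dom0

small = packets_per_second(3.2, MSS_PAYLOAD)
large = packets_per_second(3.2, AGGREGATED_PAYLOAD)
ratio = small / large

print(f"{small:,.0f} pkt/s at 1448 bytes, {large:,.0f} pkt/s at 28960 bytes")
```

Under these assumptions the guest handles roughly twenty times as many packets per second as it would if the aggregated packets reached its stack intact, since 28960 is exactly 20 x 1448.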
Wei Liu
2013-Sep-16 14:21 UTC
Re: large packet support in netfront driver and guest network throughput
On Fri, Sep 13, 2013 at 05:09:48PM +0000, Anirban Chakraborty wrote:
> [...]
> I tried dom0 to dom0 and I got 9.4 Gbps, which is what I expected (with GRO turned on in the physical interface). However, when I run guest to guest, things fall off. Is large packet not supported in netfront? I thought otherwise. I looked at the code and I do not see any call to napi_gro_receive(), rather it is using netif_receive_skb(). netback seems to be sending GSO packets to the netfront, but it is being segmented to 1500 byte (as it appears from the tcpdump).

OK, I get your problem.

Indeed netfront doesn't make use of GRO API at the moment. I've added this to my list to work on. I will keep you posted when I get to that.

Thanks!

Wei.
annie li
2013-Sep-17 02:09 UTC
Re: large packet support in netfront driver and guest network throughput
On 2013-9-16 22:21, Wei Liu wrote:
> [...]
> OK, I get your problem.
>
> Indeed netfront doesn't make use of GRO API at the moment.

This is true.
But I am wondering why large packets are not segmented into MTU size with the upstream kernel? I did see large packets with the upstream kernel on the receiving guest (test between 2 DomUs on the same host).

Thanks
Annie

> I've added
> this to my list to work on. I will keep you posted when I get to that.
Wei Liu
2013-Sep-17 08:25 UTC
Re: large packet support in netfront driver and guest network throughput
On Tue, Sep 17, 2013 at 10:09:21AM +0800, annie li wrote:
[...]
> >Indeed netfront doesn't make use of GRO API at the moment.
>
> This is true.
> But I am wondering why large packet is not segmented into mtu size
> with upstream kernel? I did see large packets with upstream kernel on
> receive guest(test between 2 domus on same host).

I think Anirban's setup is different. The traffic is from a DomU on another host.

I will need to set up a testing environment with a 10G link to test this.

Anirban, can you share your setup, especially the DomU kernel version? Are you using an upstream kernel in DomU?

Wei.
Anirban Chakraborty
2013-Sep-17 17:53 UTC
Re: large packet support in netfront driver and guest network throughput
On 9/17/13 1:25 AM, "Wei Liu" <wei.liu2@citrix.com> wrote:
>I think Anirban's setup is different. The traffic is from a DomU on
>another host.
>
>I will need to setup testing environment with 10G link to test this.
>
>Anirban, can you share your setup, especially DomU kernel version, are
>you using upstream kernel in DomU?

Sure.
I have two hosts, say h1 and h2, running XenServer 6.1. h1 is running a CentOS 6.4 guest with a 64-bit kernel, say guest1, and h2 is running an identical guest, guest2.

The iperf server is running on guest1, with the iperf client connecting from guest2.

I haven't tried with an upstream kernel yet. However, what I found out is that netback on the receiving host is transmitting GSO segments to the guest (guest1), but the packets are segmented at the netfront interface.

Annie's setup has both guests running on the same host, in which case packets are looped back.

-Anirban
annie li
2013-Sep-18 02:28 UTC
Re: large packet support in netfront driver and guest network throughput
On 2013-9-18 1:53, Anirban Chakraborty wrote:
> [...]
> I haven't tried with upstream kernel yet. However, what I found out is
> that the netback on the receiving host is transmitting GSO segments to the
> guest (guest1), but the packets are segmented at the netfront interface.

Did you try the guests on the same host in your environment?

> Annie's setup has both the guests running on the same host, in which case
> packets are looped back.

If the guest does not segment packets in the same-host case, it should not segment them in the different-host case. In current upstream, the netback->netfront mechanism does not treat these two cases differently.

Thanks
Annie
Wei Liu
2013-Sep-18 15:48 UTC
Re: large packet support in netfront driver and guest network throughput
On Tue, Sep 17, 2013 at 05:53:43PM +0000, Anirban Chakraborty wrote:
> [...]
> I have two hosts, say h1 and h2 running XenServer 6.1.
> h1 running Centos 6.4, 64bit kernel, say guest1 and h2 running identical
> guest, guest2.

Do you have the exact version of your DomU's kernel? Is it available somewhere online?

> iperf server is running on guest1 with iperf client connecting from guest2.
>
> I haven't tried with upstream kernel yet. However, what I found out is
> that the netback on the receiving host is transmitting GSO segments to the
> guest (guest1), but the packets are segmented at the netfront interface.

I just tried: with a vanilla upstream kernel I can see large packet sizes on DomU's side.

I also tried to convert netfront to use the GRO API (hopefully I didn't get it wrong). I didn't see much improvement -- it's quite obvious because I already saw large packets even without GRO.

If you fancy trying the GRO API, see the attached patch. Note that you might need to do some contextual adjustment, as this patch is for the upstream kernel.

Wei.

---8<---
From ca532dd11d7b8f5f8ce9d2b8043dd974d9587cb0 Mon Sep 17 00:00:00 2001
From: Wei Liu <wei.liu2@citrix.com>
Date: Wed, 18 Sep 2013 16:46:23 +0100
Subject: [PATCH] xen-netfront: convert to GRO API

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 drivers/net/xen-netfront.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index 36808bf..dd1011e 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -952,7 +952,7 @@ static int handle_incoming_queue(struct net_device *dev,
 		u64_stats_update_end(&stats->syncp);
 
 		/* Pass it up. */
-		netif_receive_skb(skb);
+		napi_gro_receive(&np->napi, skb);
 	}
 
 	return packets_dropped;
@@ -1051,6 +1051,8 @@ err:
 	if (work_done < budget) {
 		int more_to_do = 0;
 
+		napi_gro_flush(napi, false);
+
 		local_irq_save(flags);
 
 		RING_FINAL_CHECK_FOR_RESPONSES(&np->rx, more_to_do);
-- 
1.7.10.4
Anirban Chakraborty
2013-Sep-18 20:38 UTC
Re: large packet support in netfront driver and guest network throughput
On Sep 18, 2013, at 8:48 AM, Wei Liu <wei.liu2@citrix.com> wrote:

> On Tue, Sep 17, 2013 at 05:53:43PM +0000, Anirban Chakraborty wrote:
>> <snip>
>> I have two hosts, say h1 and h2 running XenServer 6.1.
>> h1 running Centos 6.4, 64bit kernel, say guest1 and h2 running identical
>> guest, guest2.
>>
>
> Do you have exact version of your DomU's kernel? Is it available
> somewhere online?

Yes, it is 2.6.32-358.el6.x86_64. Sorry, I missed it out last time.

>> iperf server is running on guest1 with iperf client connecting from guest2.
>>
>> I haven't tried with upstream kernel yet. However, what I found out is
>> that the netback on the receiving host is transmitting GSO segments to the
>> guest (guest1), but the packets are segmented at the netfront interface.
>>
>
> I just tried, with vanilla upstream kernel I can see large packet size
> on DomU's side.
>
> I also tried to convert netfront to use GRO API (hopefully I didn't get
> it wrong), I didn't see much improvement -- it's quite obvious because I
> already saw large packet even without GRO.
>
> If you fancy trying GRO API, see attached patch. Note that you might
> need to do some contextual adjustment as this patch is for upstream
> kernel.
>
> Wei.
>
> ---8<---
> <snip>
> --
> 1.7.10.4

I was able to see a bit of improvement (from 2.65 to 3.6 Gbps) with the following patch (your patch plus the advertisement of NETIF_F_GRO):

---------
diff --git a/xen-netfront.c.orig b/xen-netfront.c
index 23e467d..bc673d3 100644
--- a/xen-netfront.c.orig
+++ b/xen-netfront.c
@@ -818,6 +818,7 @@ static int handle_incoming_queue(struct net_device *dev,
 {
 	int packets_dropped = 0;
 	struct sk_buff *skb;
+	struct netfront_info *np = netdev_priv(dev);
 
 	while ((skb = __skb_dequeue(rxq)) != NULL) {
 		struct page *page = NETFRONT_SKB_CB(skb)->page;
@@ -846,7 +847,7 @@ static int handle_incoming_queue(struct net_device *dev,
 		dev->stats.rx_bytes += skb->len;
 
 		/* Pass it up. */
-		netif_receive_skb(skb);
+		napi_gro_receive(&np->napi, skb);
 	}
 
 	return packets_dropped;
@@ -981,6 +982,7 @@ err:
 	if (work_done < budget) {
 		int more_to_do = 0;
 
+		napi_gro_flush(napi);
 		local_irq_save(flags);
 
 		RING_FINAL_CHECK_FOR_RESPONSES(&np->rx, more_to_do);
@@ -1182,7 +1184,8 @@ static struct net_device * __devinit xennet_create_dev(struct xenbus_device *dev
 	netif_napi_add(netdev, &np->napi, xennet_poll, 64);
 
 	/* Assume all features and let xennet_set_features fix up. */
-	netdev->features = NETIF_F_IP_CSUM | NETIF_F_SG | NETIF_F_TSO;
+	netdev->features = NETIF_F_IP_CSUM | NETIF_F_SG | NETIF_F_TSO |
+		NETIF_F_GRO;
 
 	SET_ETHTOOL_OPS(netdev, &xennet_ethtool_ops);
 	SET_NETDEV_DEV(netdev, &dev->dev);
-----------

tcpdump showed that the guest interface received large packets. I haven't checked upstream kernel as guest though.

Anirban
Anirban Chakraborty
2013-Sep-18 21:06 UTC
Re: large packet support in netfront driver and guest network throughput
On Sep 17, 2013, at 7:28 PM, annie li <annie.li@oracle.com> wrote:

> On 2013-9-18 1:53, Anirban Chakraborty wrote:
>> <snip>
>> I haven't tried with upstream kernel yet. However, what I found out is
>> that the netback on the receiving host is transmitting GSO segments to the
>> guest (guest1), but the packets are segmented at the netfront interface.
>
> Did you try the guests on same host in your environment?

Yes, I did, and it showed 5.66 Gbps as opposed to 2.6-2.8 Gbps.

>> Annie's setup has both the guests running on the same host, in which case
>> packets are looped back.
>
> If guests does not segment packets for same host case, it should not do segment for different host case. For current upstream, Netback->netfront mechanism does not treat differently for these two cases.

For the two guests (Centos 6.4, 2.6.32-358) on the same host, the receiving guest indeed receives large packets, while this is not true if the receiving guest is on a different host. In the same-host case both vifs are connected to the same Linux bridge, so packets passed between them are forwarded without hitting the wire. This explains the higher throughput. However, I would still expect segmented packets at the receiving guest, and that is certainly not what I am seeing.

Anirban
Wei Liu
2013-Sep-19 09:41 UTC
Re: large packet support in netfront driver and guest network throughput
On Wed, Sep 18, 2013 at 08:38:01PM +0000, Anirban Chakraborty wrote:
> On Sep 18, 2013, at 8:48 AM, Wei Liu <wei.liu2@citrix.com> wrote:
> <snip>
> > Do you have exact version of your DomU's kernel? Is it available
> > somewhere online?
>
> Yes, it is 2.6.32-358.el6.x86_64. Sorry, I missed it out last time.
>

So that's a RHEL kernel, you might also want to ask Redhat to have a
look at that?

> <snip>
> I was able to see a bit of improvement (from 2.65 to 3.6 Gbps) with the following patch (your patch plus the advertisement of NETIF_F_GRO) :

OK, thanks for reporting back.

I'm curious about the packet size after enabling GRO. I can get 5G/s
upstream with packet size ~24K on a 10G nic. It's not line rate yet,
certainly there is space for improvement.

Wei.

> <snip>
> tcpdump showed that the guest interface received large packets. I haven't checked upstream kernel as guest though.
>
> Anirban
>
Anirban Chakraborty
2013-Sep-19 16:59 UTC
Re: large packet support in netfront driver and guest network throughput
On Sep 19, 2013, at 2:41 AM, Wei Liu <wei.liu2@citrix.com> wrote:

> On Wed, Sep 18, 2013 at 08:38:01PM +0000, Anirban Chakraborty wrote:
>>
>> On Sep 18, 2013, at 8:48 AM, Wei Liu <wei.liu2@citrix.com> wrote:
>>
>>> On Tue, Sep 17, 2013 at 05:53:43PM +0000, Anirban Chakraborty wrote:
>>>>
>>>> On 9/17/13 1:25 AM, "Wei Liu" <wei.liu2@citrix.com> wrote:
>>>> <snip>
>>
>> Yes, it is 2.6.32-358.el6.x86_64. Sorry, I missed it out last time.
>>
>
> So that's a RHEL kernel, you might also want to ask Redhat to have a
> look at that?

It is not a RHEL kernel; it is Centos 6.4. Since this is a netfront driver issue, I think we should address it here.

>>> <snip>
>>> --
>>> 1.7.10.4
>>
>> I was able to see a bit of improvement (from 2.65 to 3.6 Gbps) with the following patch (your patch plus the advertisement of NETIF_F_GRO) :
>
> OK, thanks for reporting back.
>
> I'm curious about the packet size after enabling GRO. I can get 5G/s
> upstream with packet size ~24K on a 10G nic. It's not line rate yet,
> certainly there is space for improvement.

I am seeing varying packet sizes with the GRO patch, from 2K all the way up to 64K.
I do not think we can get line rate by enabling GRO only. The netback thread that handles the guest traffic is running on a different CPU (and possibly a different node) from the guest. If we can schedule netback for a guest to run on the same node as the guest, we should be able to see better numbers.

In any case, are you going to submit the patch upstream or should I do it?

Thanks.

Anirban
Wei Liu
2013-Sep-19 18:43 UTC
Re: large packet support in netfront driver and guest network throughput
On Thu, Sep 19, 2013 at 04:59:49PM +0000, Anirban Chakraborty wrote:
[...]
> >> I was able to see a bit of improvement (from 2.65 to 3.6 Gbps) with the following patch (your patch plus the advertisement of NETIF_F_GRO) :
> >
> > OK, thanks for reporting back.
> >
> > I'm curious about the packet size after enabling GRO. I can get 5G/s
> > upstream with packet size ~24K on a 10G nic. It's not line rate yet,
> > certainly there is space for improvement.
>
> I am seeing varying packet sizes with the GRO patch, from 2K to all the way upto 64K.
> I do not think we can get line rate by enabling GRO only. The netback thread that is handling the guest traffic is running on a different CPU (and possibly different node) compared to the guest. If we can schedule netback for a guest to run on the same node as the guest, we should be able to see better numbers.

You can use vcpu pin to pin Dom0's CPUs and DomU's CPUs to the same NUMA node. However, the domain's memory might still be striped across different nodes.

> In any case, are you going to submit the patch upstream or should I do it?

I will do that once net-next is open.

Wei.

> Thanks.
>
> Anirban
>
Wei Liu
2013-Sep-19 19:04 UTC
Re: large packet support in netfront driver and guest network throughput
On Thu, Sep 19, 2013 at 07:43:15PM +0100, Wei Liu wrote:
[...]
> > In any case, are you going to submit the patch upstream or should I do it?
>
> I will do that once net-next is open.
>

I will add your SoB to that patch, is that OK?

Wei.

> Wei.
>
> > Thanks.
> >
> > Anirban
> >
Anirban Chakraborty
2013-Sep-19 20:54 UTC
Re: large packet support in netfront driver and guest network throughput
On Sep 19, 2013, at 12:04 PM, Wei Liu <wei.liu2@citrix.com> wrote:

> On Thu, Sep 19, 2013 at 07:43:15PM +0100, Wei Liu wrote:
> [...]
>>
>>> In any case, are you going to submit the patch upstream or should I do it?
>>
>> I will do that once net-next is open.
>>
>
> I will add your SoB to that patch, is that OK?

That's fine. Thanks.

Anirban