Ronny.Hegewald@online.de
2009-Oct-26 23:37 UTC
[Xen-devel] huge tcp performance-regression on pvops - kernel (+Solution)
Setup: pvops dom0 kernel 2.6.31.4 from Jeremy's git repository, as of 2009-10-18.

Problem: Very slow network performance (3-10 kbs) when TCP packets are used (noticed when using scp, samba, nfs over tcp).

But this occurs only in the following situations:

1.) domU to domU on the same PC (domUs are paravirtualized Linux kernels)
2.) domU to another PC that doesn't use a pvops kernel (packets sent from another PC to the domU work fine)

domU to dom0 and the opposite way work without performance regression.

Reason: Bigger TCP packets get dropped from the domU the packets are sent from (netstat -s in the domU shows many retransmitted TCP segments).

tcpdump shows that the bigger packets leave the vif of the domU they were sent from, but never arrive at the vif of the domU they are sent to.

This is caused by these lines in drivers/xen/netback.c at line 1325:

    if (skb->data_len < skb_shinfo(skb)->gso_size) {
        skb_shinfo(skb)->gso_size = 0;
        skb_shinfo(skb)->gso_type = 0;
    }

These lines were reverted from the linux-2.6.18-xen mercurial repository on 2009-01-13 in changeset 774: 107e10e0e07c: netfront/back: do not mark packets of length < MSS as GSO
<http://xenbits.xensource.com/linux-2.6.18-xen.hg?rev/107e10e0e07c>

I used the patch on the above mentioned pvops tree and the problem was gone.

I never noticed such problems on the 2.6.18-xen kernel or the forward-ported Xen kernel 2.6.31.4 (from Andrew Lyon).

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
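For readers following the thread, here is a minimal user-space sketch of what the quoted condition does. It is not kernel code: `struct fake_skb`, `clear_gso_if_small` and `check_examples` are hypothetical names made up for illustration, and the struct models only the fields the check reads. In the kernel, `skb->len` is the total packet length while `skb->data_len` counts only the bytes held in paged fragments, so the sketch keeps both.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical, simplified stand-in for the sk_buff fields the check reads.
 * This is NOT the kernel's struct sk_buff. */
struct fake_skb {
    unsigned int   len;       /* total bytes in the packet */
    unsigned int   data_len;  /* bytes held in paged fragments only */
    unsigned short gso_size;  /* segment size (MSS) when marked GSO, else 0 */
    unsigned short gso_type;  /* non-zero when marked GSO */
};

/* Mirrors the condition quoted above: drop the GSO marking when the
 * paged data is smaller than gso_size. Returns 1 if the marking was cleared. */
int clear_gso_if_small(struct fake_skb *skb)
{
    if (skb->data_len < skb->gso_size) {
        skb->gso_size = 0;
        skb->gso_type = 0;
        return 1;
    }
    return 0;
}

/* Three illustrative packets; the sizes are invented for the example. */
void check_examples(void)
{
    struct fake_skb small = { .len = 600,  .data_len = 500,  .gso_size = 1448, .gso_type = 1 };
    struct fake_skb large = { .len = 3000, .data_len = 2900, .gso_size = 1448, .gso_type = 1 };
    struct fake_skb split = { .len = 3000, .data_len = 1000, .gso_size = 1448, .gso_type = 1 };

    assert(clear_gso_if_small(&small) == 1);  /* below MSS: marking cleared */
    assert(clear_gso_if_small(&large) == 0);  /* paged data above MSS: kept */
    assert(clear_gso_if_small(&split) == 1);  /* total above MSS, paged part below */

    printf("small cleared, large kept, split cleared\n");
}
```

To try it, call `check_examples()` from a `main` of your own. The third case shows that, in this model, a packet whose total length exceeds `gso_size` still loses its GSO marking when most of its data sits in the linear area. Whether that is the mechanism behind the dropped packets reported here is an assumption; the thread does not establish it.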
Keir Fraser
2009-Oct-27 14:53 UTC
Re: [Xen-devel] huge tcp performance-regression on pvops - kernel (+Solution)
Cc'ing Jeremy. I think the patch of interest in linux-2.6.18-xen is 776, which is the one that removes those lines from the front/back drivers in favour of a solution integrated into the protocol stack.

 -- Keir
Konrad Rzeszutek Wilk
2009-Oct-27 14:54 UTC
Re: [Xen-devel] huge tcp performance-regression on pvops - kernel (+Solution)
Kudos for discovering this.

Did you verify whether the domU has the corresponding patch:

diff -r 28acedb66302 -r 107e10e0e07c drivers/xen/netfront/netfront.c
--- a/drivers/xen/netfront/netfront.c   Wed Jan 07 12:21:54 2009 +0900
+++ b/drivers/xen/netfront/netfront.c   Tue Jan 13 15:17:54 2009 +0000
@@ -1439,6 +1439,14 @@
         np->stats.rx_packets++;
         np->stats.rx_bytes += skb->len;
 
+#if HAVE_TSO
+        if (skb->data_len < skb_shinfo(skb)->gso_size) {
+            skb_shinfo(skb)->gso_size = 0;
+#if HAVE_GSO
+            skb_shinfo(skb)->gso_type = 0;
+#endif
+        }
+#endif
         __skb_queue_tail(&rxq, skb);
 
         np->rx.rsp_cons = ++i;

Looking at the PV_OPs kernel it doesn't look to be there, but I was wondering if the domU you are using has it?

Ian,
Would it make sense to remove this changeset if most of the DomU's don't have this fix?
Ronny.Hegewald@online.de
2009-Oct-27 20:34 UTC
Re: Re: [Xen-devel] huge tcp performance-regression on pvops - kernel
> Cc'ing Jeremy. I think the patch of interest in linux-2.6.18-xen is 776,
> which is the one that removes those lines from the front/back drivers in
> favour of a solution integrated into the protocol stack.
>
> -- Keir

Right, I patched it with 776 to solve the regression, not with 774. Copied the wrong changeset in the email....

And only the netback part, because the netfront part of the patch isn't in the pvops kernel or upstream.
Jeremy Fitzhardinge
2009-Oct-27 20:42 UTC
Re: [Xen-devel] huge tcp performance-regression on pvops - kernel
On 10/27/09 13:34, Ronny.Hegewald@online.de wrote:
>> Cc'ing Jeremy. I think the patch of interest in linux-2.6.18-xen is 776,
>> which is the one that removes those lines from the front/back drivers in
>> favour of a solution integrated into the protocol stack.
>>
>> -- Keir
>
> Right, I patched it with 776 to solve the regression, not with 774.
> Copied the wrong changeset in the email....
>
> And only the netback part, because the netfront part of the patch
> isn't in the pvops kernel or upstream.

I'm about to push this into xen.git.

    J