Hello,

I recently set up a new server (an HP ProLiant with two quad-core 2 GHz Xeons and 8 GB RAM) running 64-bit Slackware 13, where I installed Xen 4.0 with kernel 2.6.31.13 (the same for dom0 and domU). On it I set up a virtual mail server from which several people retrieve their mail.

When I try to download big files or retrieve large mails from that server, the transfer often slows down to 4-5 Kb/s and finally stalls after some minutes of normal download (500-800 Kb/s). It does not happen at the same point of the download, nor to the same users or computers. A user can try to download a mail 20 times, for example, without any success, and then suddenly it downloads correctly. It seems to happen rarely with small mails.

I tried to disable Ethernet checksum offloading, which I have seen reported as a fix for this problem in earlier versions, but querying the settings gives an error for the rx value:

    # ethtool -k eth0
    Offload parameters for eth0:
    Cannot get device rx csum settings: Operation not supported
    rx-checksumming: off
    tx-checksumming: off

By the way, my dmesg has a bunch of these errors:

    "Attempting to checksum a non-TCP/UDP packet, dropping a protocol 1 packet"

I'm not sure whether this is a related problem or not.

Now I don't have any clue about what or where to look, so, any ideas?

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
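The checksum offload state shown in the message above can be pulled out of `ethtool -k` output with a small helper. This is only a minimal sketch: `offload_state` is a hypothetical helper name, `eth0` is the interface from the message, and the parser assumes the `rx-checksumming:` / `tx-checksumming:` line format shown there.

```shell
#!/bin/sh
# offload_state (hypothetical helper): read `ethtool -k` output on stdin and
# print the checksum offload state as "rx=<state> tx=<state>".
# Lines that carry no state (such as the "Cannot get device rx csum
# settings" error) are ignored.
offload_state() {
    awk -F': *' '
        /^[ \t]*rx-checksumming:/ { rx = $2 }
        /^[ \t]*tx-checksumming:/ { tx = $2 }
        END {
            if (rx == "") rx = "unknown"
            if (tx == "") tx = "unknown"
            printf "rx=%s tx=%s\n", rx, tx
        }
    '
}

# Usage (needs ethtool; changing a setting needs root):
#   ethtool -k eth0 | offload_state     # lowercase -k shows the settings
#   ethtool -K eth0 tx off              # uppercase -K changes a setting
```

Note the difference between `-k` (show) and `-K` (change). The output quoted in the message comes from the former, so it reports the state but does not change anything.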
After a bunch of tests and packet captures I found out that the problem is not due to Xen; it is caused by the 2.6.3x kernels.

It looks like all kernels after 2.6.18 have issues with congestion control that make connections stall at random. Disabling the whole congestion control stuff seems to have resolved the problem.

On Tuesday, 15 June 2010, David Diaz i Torrico wrote:
> Hello,
> [...]
> Now I don't have any clue about what or where to look, so, any ideas?
Hi!

> After a bunch of tests and sniffings I found out that the problem was not due to
> Xen, it is caused by the 2.6.3x kernels.

The one issue you mention ("Attempting to checksum a non-TCP/UDP packet, dropping a protocol 1 packet") seems to be caused by a bug in netback. See e.g. http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=583366 ; you should be using at least 2.6.32.14, IIRC.

> It looks like all kernels after 2.6.18 have issues with the congestion control
> which make stall the connections at random; Disabling the whole congestion
> control stuff, seems to have resolved the problem.

Actually, I don't think so: far too many people would stumble across such an issue in the kernel. The chances of this going unnoticed for such a long time are nil.

I am pretty sure that upgrading to the latest kernel will fix the issue for you...

-- Adi
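Whether a machine already runs at least the 2.6.32.14 kernel mentioned above can be checked roughly as follows. This is a sketch under assumptions: `version_ge` is a hypothetical helper, and it relies on `sort -V`, a GNU coreutils extension. Distribution kernels may report only a base version and carry backported fixes, so treat the result as a hint, not a verdict.

```shell
#!/bin/sh
# version_ge A B (hypothetical helper): succeed if version A >= version B.
# Uses `sort -V` (version sort): if B sorts first, or A equals B, then A >= B.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Strip any distribution suffix such as "-22-generic" before comparing.
running=$(uname -r)
running=${running%%-*}

if version_ge "$running" 2.6.32.14; then
    echo "kernel $running is at or above 2.6.32.14"
else
    echo "kernel $running is older than 2.6.32.14"
fi
```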
On Monday, 21 June 2010, Adi Kriegisch wrote:
> Hi!
>
> > After a bunch of tests and sniffings I found out that the problem was not due to
> > Xen, it is caused by the 2.6.3x kernels.
>
> The one issue you mention ("Attempting to checksum a non-TCP/UDP packet,
> dropping a protocol 1 packet") seems to be caused by an issue in netback.
> See eg. http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=583366 ; you
> should be using at least 2.6.32.14 IIRC.

I'll give it a try, but it doesn't seem to be the cause of the problem.

> > It looks like all kernels after 2.6.18 have issues with the congestion control
> > which make stall the connections at random; Disabling the whole congestion
> > control stuff, seems to have resolved the problem.
>
> Actually I don't think so: Way too many people would stumble across such an
> issue within the kernel. Chances this goes unnoticed for such a long time
> are nil.

It only seems to happen to some Windows machines behind a NAT, and at random, so it is quite difficult to catch and to reproduce. I have a proxy box running Ubuntu with a 2.6.32-22 kernel that has the same problem, but we never noticed it until I ran the same tests that make the Xen machine stall.

> I am pretty sure the upgrade to the latest kernel will fix the issue for
> you...
>
> -- Adi

--
Bye,
David Diaz i Torrico
Sistemes - DRAC telemàtic
mailto:ddiaz@drac.com
On Mon, Jun 21, 2010 at 4:12 AM, David Diaz i Torrico <ddiaz@drac.com> wrote:
> After a bunch of tests and sniffings I found out that the problem was not due to
> Xen, it is caused by the 2.6.3x kernels.
>
> It looks like all kernels after 2.6.18 have issues with the congestion control
> which make stall the connections at random; Disabling the whole congestion
> control stuff, seems to have resolved the problem.

What exactly did you do to disable the congestion control?

-Bruce
This command:

    sysctl -w net.ipv4.tcp_ecn=0

On Monday, 21 June 2010, Bruce Edge wrote:
> What exactly did you do to disable the congestion control?
>
> -Bruce
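For reference, `net.ipv4.tcp_ecn` controls TCP Explicit Congestion Notification (ECN), which is one TCP feature and not the congestion control algorithm itself. The sketch below shows how the current value could be inspected before changing it. `describe_ecn` is a hypothetical helper, and the value meanings follow the kernel's ip-sysctl documentation as I understand it.

```shell
#!/bin/sh
# describe_ecn (hypothetical helper): explain a net.ipv4.tcp_ecn value.
describe_ecn() {
    case "$1" in
        0) echo "ECN disabled" ;;
        1) echo "ECN requested on outgoing and accepted on incoming connections" ;;
        2) echo "ECN accepted on incoming connections only" ;;
        *) echo "unrecognized value: $1" ;;
    esac
}

# Report the current setting if the kernel exposes it (Linux only).
ecn_file=/proc/sys/net/ipv4/tcp_ecn
if [ -r "$ecn_file" ]; then
    describe_ecn "$(cat "$ecn_file")"
fi

# To apply the change from the message and keep it across reboots (as root):
#   sysctl -w net.ipv4.tcp_ecn=0
#   echo 'net.ipv4.tcp_ecn = 0' >> /etc/sysctl.conf
```

A value set with `sysctl -w` lasts only until the next reboot, hence the commented `/etc/sysctl.conf` line.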