Joel Richard
2009-Mar-07 13:08 UTC
[Xen-users] Massive numbers of dropped packets / ssh problems
Good morning,

I'm hoping that someone can help me out with this. I recently moved a running server from an internal network to a public-facing network (moved from our office to a data center).

Prior to this, we had no trouble with the server, but since we've put it on the net and gave it some public IPs (versus the 192.168.x.x addresses it had before) we've had trouble with SSH and Web connections onto one of the DomU servers (possibly all of them). That is, sometimes when we connect from SSH or HTTP, it will sometimes "time out" somehow (not your usual connection timed out) and we need to try again a couple of times to get the server to respond. It seems to happen most when the server has been idle for some time, like first thing in the morning.

On the Dom0 if I do an ifconfig, I get the info that's at the bottom of this message (IPs have been changed to protect the innocent). The server has 4 DomU's running on it.

Note that the peth1 interface has 79 billion dropped packets. The server has been running for a week. My first thought is "what the hell?" :) I can't find those dropped packets with tcpdump and yet I am convinced that this is a problem.

When we SSH into the server (the DomU), sometimes it will hang with this information:

    (my laptop) $ ssh -v XXX.XXX.XXX.XXX

    OpenSSH_5.0p1, OpenSSL 0.9.7l 28 Sep 2006
    debug1: Reading configuration data /Users/joel/.ssh/config
    debug1: Applying options for dev
    debug1: Reading configuration data /etc/ssh_config
    debug1: Connecting to dev.richard-group.com [XXX.XXX.XXX.XXX] port 22.
    debug1: Connection established.
    debug1: identity file /Users/joel/.ssh/identity type -1
    debug1: identity file /Users/joel/.ssh/id_rsa type 1
    debug1: identity file /Users/joel/.ssh/id_dsa type 2

Yes, that's where it stops. It's already established the connection, but it doesn't continue. It initially sounds like an SSH problem, but it seems more to me something possibly between the Dom0 and the DomU. I do not have this problem with ssh on a real server on the same network that is not using Xen.

To clarify, I am using Debian Etch, Xen 3.0.3, AMD Phenom CPUs, 2.6.18-6-xen-amd64 #1 SMP kernel. We're using the onboard NIC.

Can anyone help?

Thanks a lot.
--Joel

    (dom0) # ifconfig

    eth1      Link encap:Ethernet  HWaddr 00:1F:D0:99:C8:1D
              inet addr:XXX.XXX.XXX.XXX  Bcast:XXX.XXX.XXX.XXX  Mask:255.255.255.192
              inet6 addr: XXX.XXX.XXX.XXX/64 Scope:Link
              UP BROADCAST RUNNING PROMISC MULTICAST  MTU:1500  Metric:1
              RX packets:790038 errors:0 dropped:0 overruns:0 frame:0
              TX packets:30111 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:0
              RX bytes:259565273 (247.5 MiB)  TX bytes:8294720 (7.9 MiB)

    lo        Link encap:Local Loopback
              inet addr:127.0.0.1  Mask:255.0.0.0
              inet6 addr: ::1/128 Scope:Host
              UP LOOPBACK RUNNING  MTU:16436  Metric:1
              RX packets:0 errors:0 dropped:0 overruns:0 frame:0
              TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:0
              RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)

    peth1     Link encap:Ethernet  HWaddr FE:FF:FF:FF:FF:FF
              inet6 addr: fe80::fcff:ffff:feff:ffff/64 Scope:Link
              UP BROADCAST RUNNING NOARP  MTU:1500  Metric:1
              RX packets:3231189 errors:0 dropped:79653025568 overruns:0 frame:0
              TX packets:1839869 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:1000
              RX bytes:1834207650 (1.7 GiB)  TX bytes:617376732 (588.7 MiB)
              Interrupt:16 Base address:0xa000

    vif0.1    Link encap:Ethernet  HWaddr FE:FF:FF:FF:FF:FF
              inet6 addr: fe80::fcff:ffff:feff:ffff/64 Scope:Link
              UP BROADCAST RUNNING NOARP  MTU:1500  Metric:1
              RX packets:30111 errors:0 dropped:0 overruns:0 frame:0
              TX packets:790038 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:0
              RX bytes:8294720 (7.9 MiB)  TX bytes:259565273 (247.5 MiB)

    vif1.0    Link encap:Ethernet  HWaddr FE:FF:FF:FF:FF:FF
              inet6 addr: fe80::fcff:ffff:feff:ffff/64 Scope:Link
              UP BROADCAST RUNNING NOARP  MTU:1500  Metric:1
              RX packets:1621897 errors:0 dropped:0 overruns:0 frame:0
              TX packets:2841674 errors:0 dropped:1371 overruns:0 carrier:0
              collisions:0 txqueuelen:0
              RX bytes:437418179 (417.1 MiB)  TX bytes:2168128688 (2.0 GiB)

    vif2.0    Link encap:Ethernet  HWaddr FE:FF:FF:FF:FF:FF
              inet6 addr: fe80::fcff:ffff:feff:ffff/64 Scope:Link
              UP BROADCAST RUNNING NOARP  MTU:1500  Metric:1
              RX packets:10114591 errors:0 dropped:0 overruns:0 frame:0
              TX packets:11032176 errors:0 dropped:244457 overruns:0 carrier:0
              collisions:0 txqueuelen:0
              RX bytes:8708636564 (8.1 GiB)  TX bytes:10165626759 (9.4 GiB)

    vif4.0    Link encap:Ethernet  HWaddr FE:FF:FF:FF:FF:FF
              inet6 addr: fe80::fcff:ffff:feff:ffff/64 Scope:Link
              UP BROADCAST RUNNING NOARP  MTU:1500  Metric:1
              RX packets:10714696 errors:0 dropped:0 overruns:0 frame:0
              TX packets:10554522 errors:0 dropped:220230 overruns:0 carrier:0
              collisions:0 txqueuelen:0
              RX bytes:10211262961 (9.5 GiB)  TX bytes:8503978649 (7.9 GiB)

    vif8.0    Link encap:Ethernet  HWaddr FE:FF:FF:FF:FF:FF
              inet6 addr: fe80::fcff:ffff:feff:ffff/64 Scope:Link
              UP BROADCAST RUNNING NOARP  MTU:1500  Metric:1
              RX packets:6432 errors:0 dropped:0 overruns:0 frame:0
              TX packets:39471 errors:0 dropped:50 overruns:0 carrier:0
              collisions:0 txqueuelen:0
              RX bytes:1485151 (1.4 MiB)  TX bytes:12072170 (11.5 MiB)

    xenbr1    Link encap:Ethernet  HWaddr FE:FF:FF:FF:FF:FF
              inet6 addr: fe80::200:ff:fe00:0/64 Scope:Link
              UP BROADCAST RUNNING NOARP  MTU:1500  Metric:1
              RX packets:579865 errors:0 dropped:0 overruns:0 frame:0
              TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:0
              RX bytes:113821662 (108.5 MiB)  TX bytes:0 (0.0 b)

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
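A side note for readers comparing counters: the following is a minimal Python sketch, not part of the original post, that pulls the RX/TX counters out of an ifconfig stanza like the peth1 one above and relates drops to received packets. The function name `parse_counters` and the regular expression are illustrative assumptions based on the net-tools output format shown in this message; they are not part of any Xen tooling.

```python
import re

# Assumed pattern for the "RX packets:... errors:... dropped:..." lines
# printed by the classic net-tools ifconfig shown in this thread.
COUNTER_RE = re.compile(
    r"(?P<dir>RX|TX)\s+packets:(?P<packets>\d+)\s+errors:(?P<errors>\d+)\s+"
    r"dropped:(?P<dropped>\d+)"
)


def parse_counters(text):
    """Return {"RX": {...}, "TX": {...}} for one interface stanza."""
    result = {}
    for match in COUNTER_RE.finditer(text):
        result[match.group("dir")] = {
            "packets": int(match.group("packets")),
            "errors": int(match.group("errors")),
            "dropped": int(match.group("dropped")),
        }
    return result


# The two counter lines of the peth1 stanza, copied from the post.
PETH1 = """
RX packets:3231189 errors:0 dropped:79653025568 overruns:0 frame:0
TX packets:1839869 errors:0 dropped:0 overruns:0 carrier:0
"""

counters = parse_counters(PETH1)
rx = counters["RX"]

# How many drops are reported per packet that was actually received.
ratio = rx["dropped"] / rx["packets"]
print(f"dropped per received packet: {ratio:.0f}")

# A value above 2**32 cannot come from a plain 32-bit counter.
print("larger than a 32-bit counter:", rx["dropped"] > 2**32)
```

For the peth1 figures, the reported drops outnumber received packets by more than twenty thousand to one, and the value is larger than a 32-bit counter can hold. That may be a hint that the number should be cross-checked against another source before it is read as a literal packet count.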
tom.ashley@gmail.com
2009-Mar-07 14:39 UTC
Re: [Xen-users] Massive numbers of dropped packets / ssh problems
That actually sounds like a reverse DNS lookup issue from your server. Make sure the ssh server can resolve the IP you are coming from.

Tom
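One way to check the reverse DNS suggestion above is to time a reverse lookup of the client's address from the machine running sshd. The sketch below is only an illustration: `timed_reverse_lookup` is a hypothetical helper name, and the `resolver` parameter exists purely so the function can be exercised without a network. It relies on Python's standard `socket.gethostbyaddr`.

```python
import socket
import time


def timed_reverse_lookup(ip, resolver=socket.gethostbyaddr):
    """Return (hostname or None, seconds taken) for a reverse lookup of ip."""
    start = time.monotonic()
    try:
        hostname, _aliases, _addresses = resolver(ip)
    except OSError:
        # socket.herror and socket.gaierror are OSError subclasses; they are
        # raised when there is no PTR record or the resolver fails.
        hostname = None
    elapsed = time.monotonic() - start
    return hostname, elapsed
```

Run on the DomU with the address the SSH client connects from, a lookup that takes many seconds, or that comes back as `None` only after a long wait, would be consistent with the DNS explanation. A fast answer would point elsewhere.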
Joel Richard
2009-Mar-08 14:11 UTC
Re: [Xen-users] Massive numbers of dropped packets / ssh problems
That was my gut instinct, as well, Tom, but if I let the connection sit, it will eventually time out after a few minutes. And it still doesn't explain the barrage of dropped packets on peth1.

I have to admit that this situation is beyond my understanding and skills, which is why I came to the list for help.

Could it simply be that whatever is dealing with the now 89 billion dropped packets is overloaded and can't provide a timely response to a normal SSH request? I mean, 89 billion in a week is roughly 147,000 a second. What the heck could be going on to produce that many dropped packets? I just don't get it.

I'm now leaning towards some sort of driver problem with the NIC since, upon reboot, it shows 2 billion packets dropped. How can this be? It must be a problem with the network card.

As for the SSH problems, could it be an ARP problem? I know that my upstream router does some sort of ARP caching. I have not yet established any patterns here, but there are a few tests I want to run.

Thanks,
--Joel
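The per-second figure in the message above can be checked with a short calculation. The sketch below also compares it with the theoretical frame-rate ceiling of an Ethernet link, using the standard 84 bytes that a minimum-size frame occupies on the wire. The link speeds are assumptions for illustration only, since the thread does not say what speed the onboard NIC negotiated.

```python
SECONDS_PER_WEEK = 7 * 24 * 60 * 60  # 604800


def per_second(total, seconds=SECONDS_PER_WEEK):
    """Average events per second for a counter accumulated over `seconds`."""
    return total / seconds


def max_frames_per_second(link_bits_per_second):
    """Upper bound on frames per second for minimum-size Ethernet frames.

    A minimum frame is 64 bytes, plus 8 bytes of preamble/SFD and a
    12-byte inter-frame gap: 84 bytes (672 bits) on the wire.
    """
    return link_bits_per_second / (84 * 8)


drop_rate = per_second(89e9)
print(f"reported drops per second: {drop_rate:,.0f}")

# Link speeds below are assumed, not taken from the thread.
for name, speed in (("100 Mbit/s", 100e6), ("1 Gbit/s", 1e9)):
    print(f"{name} ceiling: {max_frames_per_second(speed):,.0f} frames/s")
```

If the link were running at 100 Mbit/s, a sustained real drop rate of this size would sit almost exactly at the physical limit of the wire. That is one more reason to question the counter itself before assuming that the traffic is real.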
Joel Richard
2009-Mar-11 13:53 UTC
Re: [Xen-users] Massive numbers of dropped packets / ssh problems
Tom,

ethtool -S gives me this, so now I'm thinking the high number of dropped packets is simply a quirk with the network card or driver, so I'm going to let it slide for the moment.

    # ethtool -S peth1
    NIC statistics:
         tx_packets: 174781
         rx_packets: 296975
         tx_errors: 0
         rx_errors: 0
         rx_missed: 0
         align_errors: 0
         tx_single_collisions: 11974
         tx_multi_collisions: 10024
         unicast: 181041
         broadcast: 30937
         multicast: 115934
         tx_aborted: 0
         tx_underrun: 0

The larger concern is the fact that when I SSH into a DomU or go to a website provided by that DomU I will often time out. But a subsequent attempt will work. I have disabled DNS lookups for sshd ("UseDNS no" in /etc/ssh/sshd_config) and Apache (of course) is not configured to do that either.

It also seems that the more people who are using that server during the day, the more likely it is that we will NOT get this to happen. The more idle the computer has been (such as first thing in the morning) the more likely it is that we WILL get this to happen.

Right now, I'm leaning towards an upstream router, which I know does ARP caching, but I can't for the life of me figure out how that would be a problem. It's not like the MAC addresses are changing during the day. Seems to me that ARP might be the trouble, but I haven't yet worked out the best way to test this reliably.

Would it be better to switch from Bridged networking to Routed networking? Sounds to me that it may make sense to at least try it.

Thanks,
--Joel

On Mar 8, 2009, at 2:27 PM, Tom Brown wrote:

> On Sun, 8 Mar 2009, Joel Richard wrote:
>
>> That was my gut instinct, as well, Tom, but if I let the connection
>> sit, it will eventually time out after a few minutes.
>
> So what? That's supposed to be a diagnosis that rules out DNS issues?
>
> ssh closes connections that don't authenticate within a reasonable
> time period. It's actually a significant problem when logging in from a
> slow box like my cell phone, which takes forever to exchange
> encryption keys (and then a while to type in a password with some
> non-standard characters).
>
>> And it still doesn't explain the barrage of dropped packets on peth1.
>
> No one suggested it did. But that's a more complex problem to
> diagnose. It could be the PROMISC flag on eth1, it could be that
> dom0 doesn't have enough RAM to buffer the network traffic, it could
> be that you've overloaded the CPU and that's causing network buffers
> to overflow, it could be some strange hardware interrupt taking too
> long, it could be just about anything.
>
> It could be perfectly normal, a result of a network card being in
> promiscuous mode to handle bridging traffic, and getting lots of
> packets that "aren't for it" because you're NOT on a switched
> network. You're not complaining about slow network traffic and
> stalled connections, which is what one would expect if you were
> losing more than a few percent of legit traffic.
>
> Try running ethtool -S and seeing if you can get any more detail on
> why the packets are dropped.
>
> -Tom
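On the open question above of how to test the idle-related stalls reliably, one possible approach is a small probe that connects after progressively longer idle periods and measures how long the SSH server takes to send its identification string. That string is the first thing an SSH server sends after the TCP connection is established, and it is what the stalled `ssh -v` client earlier in the thread appears to be waiting for. The sketch below is an assumption-laden illustration: `probe_banner` and `run_idle_test` are hypothetical names, and the `connect`, `probe` and `sleep` parameters exist only so the logic can be exercised without a real server.

```python
import socket
import time


def probe_banner(host, port=22, timeout=15.0, connect=socket.create_connection):
    """Connect, wait for the server's identification string, time both steps.

    connect_s is None if the TCP connection failed; banner is None if the
    connection succeeded but nothing arrived within `timeout` seconds.
    """
    start = time.monotonic()
    try:
        conn = connect((host, port), timeout)
    except OSError:
        return {"connect_s": None, "banner_s": None, "banner": None}
    connected = time.monotonic()
    try:
        conn.settimeout(timeout)
        data = conn.recv(256)
    except OSError:
        # Includes socket.timeout: connected, but the server stayed silent.
        data = b""
    finally:
        conn.close()
    done = time.monotonic()
    banner = data.decode("ascii", "replace").strip() or None
    return {
        "connect_s": connected - start,
        "banner_s": (done - connected) if banner else None,
        "banner": banner,
    }


def run_idle_test(host, idle_periods, probe=probe_banner, sleep=time.sleep):
    """Probe after each idle period (in seconds); return (idle, result) pairs."""
    results = []
    for idle in idle_periods:
        sleep(idle)
        results.append((idle, probe(host)))
    return results
```

Calling something like `run_idle_test("dev.example.org", [60, 300, 900, 3600])` (the host name here is a placeholder) from a machine outside the data center, while the DomU is otherwise quiet, could show whether failures cluster after the longer idle periods. A result with a `connect_s` value but no banner would match the hang described in the first message.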