Hi guys,

I've been experiencing intermittent network problems. I can't be sure if they are caused by Xen or if using Xen simply aggravates an existing problem (i.e. hardware, driver, software bug, etc.) but I thought I'd run it by you guys in hopes of getting some ideas.

I've narrowed down the problem to be related to the named daemon from bind9. This may not exclusively be a "named" problem. Other daemons could be affected but I don't have any others installed yet.

The symptoms are as follows: extremely slow network with a lot of dropped packets. I can't actually test the machine locally without having to go over the network because the server is remote from me. Based on the load averages alone, it appears it's otherwise responsive. Pings to local IP addresses aren't dropped; packets are lost only when the traffic has to leave the physical machine.

Shutting down named makes the problem go away. Sometimes I can run it for a few hours or a day before it starts acting up. Once it acts up like this, the only way I can run named again is by rebooting the physical machine (dom0) and restarting the domUs.

Pertinent system info:

Dom0 and DomUs all running Ubuntu Oneiric with kernel 3.0.0-20. I tried older kernel versions as well with the same issues.
NIC: Intel Gigabit, using the e1000 driver

Network setup is NAT (per requirements of the hosting company I'm with. Routed or bridged isn't an option for the domUs to have Internet access).

I'm focussing my troubleshooting around named and trying to figure out exactly what it's doing that triggers this network problem. Do any of you have suggestions, ideas or a fix if this is a known problem?

If you need additional information, please let me know.

Thanks,

-- 
Gerard Beekmans

_______________________________________________
Xen-users mailing list
Xen-users@lists.xen.org
http://lists.xen.org/xen-users
Gerard Beekmans
2012-Jun-06 22:33 UTC
Re: Suspected network problem after domUs come online
Allow me to reply to my own email. The issue may have little or nothing to do with Xen.

The actual problem is a "simple" DoS attack against my IP addresses. The host is receiving a huge number of incoming DNS requests. As soon as this attack starts and there's a DNS server listening on any of the domUs that happens to have that IP address, the network slows to a crawl, to the point that the machine is overloaded and just locks up.

Of note, the NIC link to the switch is set to 10 Mbit Full Duplex (that's the speed I've signed up for with the hosting plan). Even if that link is flooded, the maximum incoming data is finite. I wouldn't have expected a Linux box to keel over and die at such a speed. Perhaps it's related to the fact that the listening service runs in a DomU?

Thanks,
Gerard

On Wed, Jun 6, 2012 at 5:02 PM, Gerard Beekmans <gerard@beekmansworld.com> wrote:
> Hi guys,
>
> I've been experiencing intermittent network problems. [...]
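For a sense of scale on the 10 Mbit figure above, here is a rough back-of-the-envelope sketch. The packet sizes are assumptions picked for illustration (they are not measurements from this server), and framing overhead is ignored. It estimates how many queries per second a saturated 10 Mbit/s inbound link could deliver, and how much outbound bandwidth would be needed if every one of them were answered with a larger reply.

```python
# Rough arithmetic only. QUERY_BYTES and REPLY_BYTES are assumed values,
# not figures taken from the thread; Ethernet/IP framing is ignored.

LINK_BITS_PER_SEC = 10_000_000  # 10 Mbit/s link, as stated in the post
QUERY_BYTES = 80                # assumed size of one DNS query packet
REPLY_BYTES = 400               # assumed size of one DNS reply packet


def packets_per_second(link_bps: int, packet_bytes: int) -> float:
    """Packets of the given size that fit through the link each second."""
    return link_bps / (packet_bytes * 8)


def outbound_bps(query_rate: float, reply_bytes: int) -> float:
    """Bandwidth needed to answer every query with a reply of this size."""
    return query_rate * reply_bytes * 8


if __name__ == "__main__":
    rate = packets_per_second(LINK_BITS_PER_SEC, QUERY_BYTES)
    needed = outbound_bps(rate, REPLY_BYTES)
    print(f"inbound queries per second at line rate: {rate:.0f}")
    print(f"outbound bandwidth to answer them all: {needed / 1e6:.0f} Mbit/s")
```

Under these assumed sizes the inbound rate is finite, as the post says, but answering everything would need several times the outbound capacity of the same link, so replies would have to be queued or dropped.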
Gerard Beekmans wrote:
> The actual problem is a "simple" DoS attack against my IP addresses. The
> host is receiving a huge number of incoming DNS requests. As soon
> as this attack starts and there's a DNS server listening on any of
> the domUs that happens to have that IP address, the network slows to a
> crawl, to the point that the machine is overloaded and just locks up.
>
> Of note, the NIC link to the switch is set to 10 Mbit Full Duplex
> (that's the speed I've signed up for with the hosting plan). Even if
> that link is flooded, the maximum incoming data is finite. I
> wouldn't have expected a Linux box to keel over and die at such a
> speed. Perhaps it's related to the fact that the listening service runs
> in a DomU?

Don't forget that if your pipe is being filled, the device(s) at the other end of it will "become unresponsive" as you'll get lots of dropped packets. You've said yourself that while this is going on, internal communications seem OK.

It could just be that you are getting enough traffic that the DNS replies (typically larger than the queries) will fill the pipe. Once that happens, packets will be dropped. TCP will cope with dropped packets up to a point, but if it's severe, throughput will be terrible - responsiveness more so.

I do know exactly what that's like. I've worked remotely on machines when they've been under that sort of load - in my case, a user's website had been compromised and the ***** was using it to attempt brute force logins against FTP servers. IIRC it had 1000 threads running, each doing brute force username/password guessing against a different address! It was the traffic and the effect it had on the network that flagged it up.

In your case, there's ***-all you can do about the attack, other than to not run a DNS server on that address until the attacker gets bored and moves on.
*IF* it's from a small number of addresses then your hosting provider may be able to block them upstream, but if it's a distributed attack then that isn't practical without blocking large chunks of the internet. You could rate-limit requests with an iptables rule - but you'll still be paying (I assume) for the bandwidth consumed by the requests.

If it is a single (or small number of) address(es) then a complaint to the owner of the IP block would be in order - they may be unaware of the malware they are hosting.

-- 
Simon Hobson
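The reply above suggests rate-limiting requests with an iptables rule but does not give one, so no rule is reconstructed here. As a loose illustration of the idea only, the sketch below models a per-source token bucket in Python, which is the general mechanism the iptables `limit` match is documented to use. The class names, the rate and the burst size are all hypothetical, and this is a userspace model, not a substitute for a firewall rule.

```python
# Illustrative token-bucket model of per-source rate limiting.
# TokenBucket / PerSourceLimiter and the numbers used are hypothetical.
from dataclasses import dataclass
from typing import Dict


@dataclass
class TokenBucket:
    rate: float          # tokens added per second
    burst: float         # maximum number of tokens the bucket can hold
    tokens: float = 0.0  # tokens currently available
    last: float = 0.0    # timestamp of the previous check, in seconds

    def allow(self, now: float) -> bool:
        """Refill for the elapsed time, then spend one token if available."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class PerSourceLimiter:
    """Keeps one bucket per source address; new sources start with a full bucket."""

    def __init__(self, rate: float, burst: float) -> None:
        self.rate = rate
        self.burst = burst
        self.buckets: Dict[str, TokenBucket] = {}

    def allow(self, source: str, now: float) -> bool:
        bucket = self.buckets.get(source)
        if bucket is None:
            bucket = TokenBucket(self.rate, self.burst, tokens=self.burst, last=now)
            self.buckets[source] = bucket
        return bucket.allow(now)


if __name__ == "__main__":
    limiter = PerSourceLimiter(rate=5.0, burst=10.0)
    # 15 queries from one (documentation-range) address in the same instant.
    decisions = [limiter.allow("192.0.2.1", now=0.0) for _ in range(15)]
    print("accepted:", decisions.count(True), "dropped:", decisions.count(False))
```

As the reply notes, a limiter like this only decides which queries get answered. The queries themselves still arrive and still consume inbound bandwidth, and a per-source limit does little against a widely distributed attack.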