eric van blokland
2009-Mar-25 14:36 UTC
[Xen-users] Strange network issue; Guest/DomU outgoing traffic
Hello all,

I've recently observed some strange behaviour on several DomUs (spread over several Xen setups). While I've found a few reports of similar issues, the discussions were fairly old and were discontinued without providing a cause or solution.

Every once in a while (sometimes with an interval of several months) I get a "host down" alert for a guest host. The host is actually very responsive and working fine in every aspect but the network. Due to lack of time I reboot those guests and everything is fine again. (Read on; of course I did some checks to identify the issue, but production servers have to come back up.)

Today a development guest is having exactly the same problem and I started to investigate. Tcpdump on the guest shows incoming traffic, but no outgoing traffic is observed:

ARP requests (who-has) are being received, but we never reply.

After manually adding addresses to the ARP table:

ICMP ping/TCP test packets/UDP test packets are being received, but never replied to.

Trying to ping from the affected guest, or sending anything through the network interface, doesn't show up in tcpdump. (Writing to the socket does not give an error.)

Things I tried that did nothing:

- Restarting the network (interface).
- Changing the interface hw address.
- Bringing the Dom0 virtual interface down and up again. (Suggested as a temporary fix in an old discussion.)

Nothing of interest can be found in any log (for either the guest or Dom0).

This particular setup is Xen 3.1.? running CentOS5 as Dom0 and DomU using a network-bridge setup. Other guest(s) running simultaneously on the same hypervisor continue to work without any (networking) issue.

Does anyone have an idea about the cause or a solution to this problem?

Kind regards and thanks in advance,

Eric

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
Fajar A. Nugraha
2009-Mar-25 15:21 UTC
Re: [Xen-users] Strange network issue; Guest/DomU outgoing traffic
On Wed, Mar 25, 2009 at 9:36 PM, eric van blokland <ericvanblokland@gmail.com> wrote:
> This particular setup is Xen 3.1.? running CentOS5 as Dom0 and DomU
> using a network-bridge setup. Other guest(s) running simultaneously on
> the same hypervisor continue to work without any (networking) issue.
>
> Does anyone have an idea about the cause or a solution to this problem?

Which CentOS version? You should use at least 5.2.

How did you assign the domU MAC? Manually? Is it possible that some other host on the network uses the same MAC?

Regards,

Fajar
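The duplicate-MAC question above can be checked mechanically once the MAC addresses of the guests and other hosts have been listed. What follows is only a minimal Python sketch, assuming the (host, MAC) pairs were gathered by hand; the function names and the host names in the usage note are made up for illustration and do not come from any Xen tool.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def normalize_mac(mac: str) -> str:
    """Lower-case a MAC address and use ':' as the separator."""
    return mac.strip().lower().replace("-", ":")


def duplicate_macs(assignments: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Return {mac: [hosts...]} for every MAC claimed by more than one host."""
    owners = defaultdict(list)
    for host, mac in assignments:
        owners[normalize_mac(mac)].append(host)
    # Keep only the addresses that appear more than once.
    return {mac: hosts for mac, hosts in owners.items() if len(hosts) > 1}
```

For example, passing `("guest-a", "00:16:3E:00:00:01")` and `("guest-b", "00-16-3e-00-00-01")` reports both hosts under the same normalized address, since the two spellings differ only in case and separator.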
Luke S Crawford
2009-Mar-25 22:13 UTC
Re: [Xen-users] Strange network issue; Guest/DomU outgoing traffic
eric van blokland <ericvanblokland@gmail.com> writes:

> Tcpdump on the guest shows incoming traffic,
> however, no outgoing traffic is to be observed:
>
> Arp requests (who-has) are being received, but we never reply.
>
> After manually adding addresses to arp:
>
> ICMP ping/TCP testpackets/UDP testpackets are being received, never replied to.
>
> Trying to ping from the specific guest or sending anything through the
> network interface doesn't show up in tcpdump. (writing to the socket
> does not give an error)

Yup. A client had the same problem on actual RHEL. I never got it figured out. I poke at it a little more every time I get one, but it's ongoing.

I've only seen it using an i386 Dom0; my x86_64 Dom0s have never had the problem.

I'd suggest upgrading (the dom0/hypervisor) to the latest version. If you get any traction on the issue, please email me.
Fischer, Anna
2009-Mar-25 22:23 UTC
RE: [Xen-users] Strange network issue; Guest/DomU outgoing traffic
> Today a development guest is having exactly the same problem and I
> started to investigate. Tcpdump on the guest shows incoming traffic,
> however, no outgoing traffic is to be observed:
>
> Arp requests (who-has) are being received, but we never reply.

Are you tracing this within the guest OS, or on the bridge within Dom0?

> After manually adding addresses to arp:
>
> ICMP ping/TCP testpackets/UDP testpackets are being received, never
> replied to.
>
> Trying to ping from the specific guest or sending anything through the
> network interface doesn't show up in tcpdump. (writing to the socket
> does not give an error)

Again, are you tracing this within the guest OS? What do the stats on the guest network interface say? Do you see packet counters increasing (both on eth0 within the guest OS and on vifX.0 in Dom0) when you are sending packets?

> Things I tried but did nothing:
>
> - Restarting the network (interface).
> - Changing the interface hw address.
> - Bringing the Dom0 virtual interface down and up again. (As suggested
> as temporary fix in an old discussion).
>
> Nothing of interest can be found in any log (for either the guest or
> Dom0).
>
> This particular setup is Xen 3.1.? running CentOS5 as Dom0 and DomU
> using a network-bridge setup. Other guest(s) running simultaneously on
> the same hypervisor continue to work without any (networking) issue.

If you only have a single guest failing on Xen, then I would guess this is an issue with the guest OS and not with anything relating to Xen networking. What kernel does the guest OS run?

> Does anyone have an idea about the cause or a solution to this problem?

I have seen strange things with libpcap under certain guest OS versions when used with Xen's netfront driver. libpcap is used by tcpdump and some network testing tools (potentially the ones you are using). It might be worth also tracing packets on the backend devices in Dom0, and having a look at the packet counters on all interfaces (Dom0/DomU).
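The packet-counter comparison suggested above can be done by taking two snapshots of /proc/net/dev and checking whether the transmit count moved. The following is a minimal Python sketch, assuming a Linux system where /proc/net/dev has the usual layout (eight receive columns followed by the transmit columns); the names `Counters`, `parse_proc_net_dev` and `tx_stalled` are invented for this illustration.

```python
from typing import Dict, NamedTuple


class Counters(NamedTuple):
    rx_packets: int
    tx_packets: int
    tx_errors: int


def parse_proc_net_dev(text: str) -> Dict[str, Counters]:
    """Parse the contents of /proc/net/dev into per-interface counters."""
    result = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # the two header lines contain no colon
        name, data = line.split(":", 1)
        fields = data.split()
        if len(fields) < 11:
            continue  # not a counter line we understand
        # Receive columns come first (8 of them), then the transmit columns.
        result[name.strip()] = Counters(
            rx_packets=int(fields[1]),
            tx_packets=int(fields[9]),
            tx_errors=int(fields[10]),
        )
    return result


def tx_stalled(before: Counters, after: Counters) -> bool:
    """True when receive advanced but transmit did not move between snapshots."""
    return (after.rx_packets > before.rx_packets
            and after.tx_packets == before.tx_packets)
```

In use, one would read the file once, generate some outgoing traffic (for instance a ping), read it again, and compare the entries for eth0 in the guest; the same comparison could be run against the vifX.0 entry in Dom0.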
Luke S Crawford
2009-Mar-25 23:08 UTC
Re: [Xen-users] Strange network issue; Guest/DomU outgoing traffic
"Fischer, Anna" <anna.fischer@hp.com> writes:

> You are tracing this within the guest OS? Or are you tracing on the bridge within Dom0?

Assuming the original poster has the same problem I do, within the DomU.

Only one guest has a problem at a time, and the problem remains even if I 'xm save' and then 'xm restore' the guest.

> Again, you are tracing this within the guest OS? What do the stats on the guest network interface say? Do you see packet counters increasing (both within the guest OS eth0, and on vifX.0 in Dom0?) when you are sending packet?

After you reset the interfaces within the guest, the transmitted packet count stays at 1.

> If you only have a single guest failing on Xen, then I would guess this is an issue with the guest OS and not with anything relating to Xen networking. What kernel does the guest OS run?

In my case, CentOS 5.1 kernel-xen. (I don't remember the exact uname. Sorry, I know that is important.)

> > Does anyone have an idea about the cause or a solution to this problem?
>
> I have seen strange things with libpcap under certain guest OS versions when used with Xen's netfront driver... libpcap is used for tcpdump and some network testing tools (potentially the ones you are using). It might be worth trying to also trace packets on the backend devices in Dom0, and also have a look at packet counters on all interfaces (Dom0/DomU).

In my case, the packet counters within the DomU would increment for rx but not for tx. This looked the same whether you looked at the interface from within the DomU or from the Dom0 (which makes sense: if the DomU doesn't think it's transmitting, the Dom0 is unlikely to see a packet).
Fischer, Anna
2009-Mar-25 23:47 UTC
RE: [Xen-users] Strange network issue; Guest/DomU outgoing traffic
> -----Original Message-----
> From: Luke S Crawford [mailto:lsc@prgmr.com]
> Sent: 25 March 2009 16:08
> To: Fischer, Anna
> Cc: eric van blokland; xen-users@lists.xensource.com
> Subject: Re: [Xen-users] Strange network issue; Guest/DomU outgoing traffic
>
> the packet counters within the DomU, in my case, would increment for rx,
> for tx, they would not increment. this looked the same if you looked at
> the iface from within the domu or within the dom0 (which makes sense, if the
> DomU doesn't think it's transmitting, the dom0 is unlikely to see a packet)

If you are tracing within the DomU, and assuming you are doing a simple Linux "ping x.x.x.x" where x.x.x.x is on the same network as your DomU, then the only reason you would not see an ICMP packet would be that you do not have an ARP table entry for x.x.x.x on the DomU. What does your ARP table show? If there is no entry, then you should see an ARP packet in the trace. If you do not see an ARP packet, then it could be that your routing is not set up properly. What does ip route show?

Do your interface counters / netstat values show any TX errors at all?

From your description I don't think this issue is Xen related.

When I had problems with libpcap, I actually saw interface packet counters increasing, but no packets in libpcap. So this is a different issue.
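The ARP table question above can be answered by reading /proc/net/arp on a Linux guest and looking for entries that were never resolved. This is a minimal Python sketch under that assumption; it relies on the kernel marking resolved entries with the "completed" flag bit 0x2 (ATF_COM), and the function name `unresolved_arp_entries` is invented for this illustration.

```python
from typing import List, Tuple

ATF_COM = 0x02  # "completed entry" bit in the Flags column of /proc/net/arp


def unresolved_arp_entries(text: str) -> List[Tuple[str, str]]:
    """Return (ip, device) pairs from /proc/net/arp text whose entry is incomplete."""
    unresolved = []
    for line in text.splitlines()[1:]:  # the first line is the column header
        fields = line.split()
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        ip, _hw_type, flags, _hw_addr, _mask, device = fields[:6]
        # Flags are printed in hex, e.g. 0x0 (incomplete) or 0x2 (complete).
        if not int(flags, 16) & ATF_COM:
            unresolved.append((ip, device))
    return unresolved
```

An entry for the default gateway showing up in this list, together with no ARP request visible in a trace, would match the symptom being discussed; it does not by itself say where the request is being lost.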
Luke S Crawford
2009-Mar-26 00:08 UTC
Re: [Xen-users] Strange network issue; Guest/DomU outgoing traffic
"Fischer, Anna" <anna.fischer@hp.com> writes:

> If you are tracing within DomU, assuming you are doing a simple Linux "ping x.x.x.x" command where x.x.x.x is on the same network as your DomU, then the only reason that you would not see an ICMP packet would be that you do not have an ARP table entry for x.x.x.x on DomU. What does your ARP table show? If there is no entry, then you should see an ARP packet in the trace. If you do not see an ARP packet, then it could be that your routing is not set up properly. What does ip route show?

ip route (well, I used netstat -rn) showed the correct things. My default gateway was within the subnet defined by eth0's address and netmask, and netstat -rn showed that. The ARP table had an unresolved entry for the default gateway and nothing else. Deleting that entry and trying again would not increment the tx packet count even by 1.

I even tried arping the default gateway. The packet counters still would not increment (and I didn't see any outgoing ARP packets). The syntax was right, as I tried it after a reboot (rebooting the DomU, not the Dom0) and saw ARP packets as I would expect.

When this is happening I usually leave an inbound ping going from another host. I see the packets heading in, and nothing (not even an arp who-has) going out.

> Do your interface counters / netstat values show any TX errors at all?

None.
Fischer, Anna
2009-Mar-26 00:59 UTC
RE: [Xen-users] Strange network issue; Guest/DomU outgoing traffic
> -----Original Message-----
> From: Luke S Crawford [mailto:lsc@prgmr.com]
> Sent: 25 March 2009 17:08
> To: Fischer, Anna
> Cc: eric van blokland; xen-users@lists.xensource.com
> Subject: Re: [Xen-users] Strange network issue; Guest/DomU outgoing traffic
>
> ip route (well, I used netstat -rn) showed the correct things. my default
> gateway was on the same network as the netmask of eth0, and netstat -rn showed
> that. the arp table had an unresolved entry for the default gateway and
> nothing else. deleting that entry and trying again would not increment
> the tx packet count even by 1.
>
> I even tried arping the default gateway.

I guess you mean you do "arp -d x.x.x.x", and then "ping x.x.x.x" or "arping x.x.x.x", where x.x.x.x is configured as your default gateway? That should definitely cause an ARP request to go out. You do not have any weird arpd/kernel configuration enabled?

Also, you do not have any weird network setup within your DomU? Like a bridge, VLAN bonding, IP forwarding, IP aliases, or whatever else?

And you only have a single interface assigned (and configured!) per virtual machine? And you have /proc/sys/net/ipv4/conf/all/arp_filter set to 0?

> the packet counters still would not
> increment (and I didn't see any outgoing arp packets) the syntax was right,
> as I try it after reboot (rebooting the DomU not the Dom0) and I see arp
> packets as I would expect.
>
> when this is happening I usually leave an inbound ping going from another
> host. I see the packets heading in, nothing (not even an arp who-has)
> going out.

I guess you capture at the interface level with tcpdump, but incoming packets might still not be received at the higher level, e.g. if you have packet filtering enabled or something similar. I guess you are not running a firewall or something?

> > Do your interface counters / netstat values show any TX errors at all?
>
> None.

Then this would be a failure somewhere in the IP stack, or possibly in the ARP kernel code. If you are sure that you have not misconfigured anything, then I would probably go for a kernel upgrade.
Luke S Crawford
2009-Mar-26 02:18 UTC
Re: [Xen-users] Strange network issue; Guest/DomU outgoing traffic
"Fischer, Anna" <anna.fischer@hp.com> writes:

> I guess you mean you do "arp -d x.x.x.x", and then "ping x.x.x.x", or "arping x.x.x.x" where x.x.x.x is configured as your default gateway? That should definitely cause an ARP request to go out. You do not have any weird arpd/kernel configuration enabled?

Exactly.

I'm not doing anything with arptables or otherwise changing the ARP config for this box (and it worked just fine for a period of months until one day it just... didn't). This has happened on several DomUs; restarting the DomU fixes the problem.

> Also, you do not have any weird network setup within your DomU? Like a bridge, VLAN bonding, or IP forwarding, or IP aliases, or whatever else?

It just has one IPv4 address on eth0. Only one interface. No iptables, even. No bonding.

> And, you only have a single interface assigned (and configured!) per virtual machine? And you have /proc/sys/net/ipv4/conf/all/arp_filter set to 0?

I have not touched the arp_filter proc entry. I checked on a box with the identical image and it is in fact zero. But yes, one IP and one eth per virtual machine.

> I guess you capture at the interface level with tcpdump, but for incoming packets it could also be that they are not received on the higher level, e.g. if you have packet filtering enabled or something similar. I guess you are not running a firewall or something?

Nope, and from the domU, I see incoming packets in tcpdump just fine. Only outgoing traffic has the problem.

No packet filtering.

> > > Do your interface counters / netstat values show any TX errors at
> > > all?
> >
> > None.
>
> Then this would be a failure somewhere in the IP stack, or possibly in the ARP kernel code... If you are sure that you have not misconfigured anything, then I would probably go for a kernel upgrade...

Hm. OK. Thanks. I will try that.
eric van blokland
2009-Mar-26 10:41 UTC
Re: [Xen-users] Strange network issue; Guest/DomU outgoing traffic
Hey all,

On Thu, Mar 26, 2009 at 3:18 AM, Luke S Crawford <lsc@prgmr.com> wrote:
> "Fischer, Anna" <anna.fischer@hp.com> writes:
>
>> Then this would be a failure somewhere in the IP stack, or possibly in the ARP kernel code... If you are sure that you have not misconfigured anything, then I would probably go for a kernel upgrade...
>
> Hm. OK. thanks. I will try that.

The ARP kernel code appears to be fine. Incoming ARP packets keep the ARP table up to date. Our reply packet is just never seen "on the wire". If we try to ping a host we don't know yet, we never see our ARP packet come by, which doesn't mean arpd didn't try to send it. Pinging a host we do know just gives "unreachable", with nothing seen "on the wire".

If I send some unsolicited UDP packet to the DomU, it triggers the firewall. Anything else seems to come right through. I already tried clearing iptables once; that is not the issue.

The socket interface on the affected DomU thinks everything works fine. I can send UDP packets without errors. Again (of course), nothing "on the wire".

How about poking a developer to ask whether he could imagine some race condition, network driver/interrupt related, that could block outgoing traffic? Or would they just yell at me for using those old kernels?

Regards,

Eric
Fajar A. Nugraha
2009-Mar-26 11:17 UTC
Re: [Xen-users] Strange network issue; Guest/DomU outgoing traffic
On Thu, Mar 26, 2009 at 5:41 PM, eric van blokland <ericvanblokland@gmail.com> wrote:
> Our reply packet is just never seen "on the wire".

> How about poking a developer if he could imagine some race condition,
> network driver/interrupt related, that could block outgoing traffic?
> Or would they just yell at me for using those old kernels?

I suspect the latter.

In Xen upstream, bug fixes go into Xen 3.3.x, not the 3.1.x series. Plus, Red Hat's Xen is pretty much Red Hat-specific and does not directly correspond to any Xen release. I would have suggested you file a bug with Red Hat (the RH Bugzilla is open to the public), but since you're using an old release I imagine the first response would be "please try the latest version" :)

FYI, I actually had a similar problem on an OpenSolaris dom0 with a Windows HVM domU using GPLPV, in that reply packets never arrived at the domU. It turned out to be a dom0 problem, and upgrading the dom0 to snv_109 fixed it. So again, my suggestion is to update your system.

Regards,

Fajar
eric van blokland
2009-Mar-26 11:42 UTC
Re: [Xen-users] Strange network issue; Guest/DomU outgoing traffic
Upgrading first is probably the best step. However, I would love to hear from a developer whether he knows of any changes that might have fixed this issue. Then I would know at what version I could expect the issue to be gone. Besides that, I would like to know what information I should collect for them in order to investigate the issue in case it still occurs after the update.

I know too little about the Linux kernel, let alone the drivers, to come up with any serious theory. But could it be that the DomU missed or did not receive a transmit-complete interrupt, leaving the network driver in a weird state? If so, would it still be possible to receive packets? And what about the send buffer? Wouldn't it overrun eventually, or will packets in the buffer time out and be removed from the queue?

Then again, I haven't got any serious evidence that it's related to the driver, as Fischer said libpcap might be unreliable.

Just some thoughts...
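On the question of what to collect while a guest is in the broken state: the thread does not say what a developer would want, so the following is only a hedged Python sketch that saves the contents of a few /proc files before the guest is rebooted. The file list in `DEFAULT_SOURCES` is an assumption made for this illustration, not a list requested by anyone in the discussion, and `collect_snapshot` is a made-up name.

```python
from pathlib import Path
from typing import Dict, Iterable

# Assumed set of files worth capturing; adjust to whatever is actually asked for.
DEFAULT_SOURCES = ("/proc/net/dev", "/proc/net/arp", "/proc/interrupts")


def collect_snapshot(paths: Iterable[str] = DEFAULT_SOURCES) -> Dict[str, str]:
    """Read each file and return {path: contents}.

    Files that cannot be read map to a short error note instead of raising,
    so one missing file does not lose the rest of the snapshot.
    """
    snapshot = {}
    for path in paths:
        try:
            snapshot[path] = Path(path).read_text()
        except OSError as exc:
            snapshot[path] = f"<unreadable: {exc.strerror}>"
    return snapshot
```

Taking one snapshot, waiting while an inbound ping is running, and taking a second one would give two sets of counters that can be compared or attached to a bug report.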
Fajar A. Nugraha
2009-Mar-26 12:49 UTC
Re: [Xen-users] Strange network issue; Guest/DomU outgoing traffic
On Thu, Mar 26, 2009 at 6:42 PM, eric van blokland <ericvanblokland@gmail.com> wrote:
> Upgrading first is probably the best step, however, I would love to
> hear from a developer if he knows any changes that might have fixed

Solving problems like this is usually easiest if:
- you're using the latest available version
- the problem is easily reproducible

Neither is true in your case. Also, since it seems to be specific to Red Hat on i386 (as Luke mentioned), your best bet to obtain such information is to purchase Red Hat support and ask them in a support ticket, where they're obligated to respond in a timely fashion.

> this issue. Then I know at what version I could expect the issue to be

Yeah, I know that would be great. So far the only community lists that I know of that can provide such information are the OpenSolaris lists (like the fact that the Xen networking problem I described earlier was fixed in snv_109). That information came from Sun employees. So again, in your case your best bet is probably to purchase Red Hat support.

Regards,

Fajar