Matthew Law
2010-Feb-16 13:24 UTC
[Xen-users] domU loses incoming network after period of inactivity
I have a domU which is losing incoming network connectivity after 'some period of inactivity' - I assume overnight, since that is what appears to be happening.

If I ping it from the dom0 it magically wakes up and is accessible again. Likewise, if I console onto it from the dom0 and then ping either the dom0 or an outside host it will wake up.

In both cases the first ping response takes several seconds and then response times settle down to sub-millisecond (as I would expect). There are no firewall rules or cron jobs running on the domU. The dom0 has nothing running other than a regular NTP sync.

The dom0 has the following iptables and ebtables rules in place (ebtables is there to try and prevent IP spoofing):

iptables:

  ACCEPT all -- anywhere anywhere state RELATED,ESTABLISHED PHYSDEV match --physdev-out vif10.0
  ACCEPT udp -- anywhere anywhere PHYSDEV match --physdev-in vif10.0 udp spt:bootpc dpt:bootps
  ACCEPT all -- anywhere anywhere state RELATED,ESTABLISHED PHYSDEV match --physdev-out vif10.0
  ACCEPT all -- domu.fqdn anywhere PHYSDEV match --physdev-in vif10.0

ebtables:

  Bridge chain: vif10.0, entries: 5, policy: DROP
  -p ARP --arp-op Request -j ACCEPT
  -p IPv4 --ip-src domu.ip.address -j ACCEPT
  -p IPv4 --ip-dst domu.ip.address -j ACCEPT
  -p ARP --arp-op Reply --arp-ip-src domu.ip.address -j ACCEPT
  --log-level notice --log-prefix "arp-drop" --log-arp -j DROP

The domU is an Ubuntu Karmic image that I took from stacklet and, other than this, it has no obvious problems. It has been halted and restarted (from the domU) several times and comes up with no problems whatsoever. There are 8 other Debian Lenny and CentOS 5.4 domUs on this host which have no problems afaik. The dom0 uses bridging for all domUs and the brctl show looks like this:

brctl show:

  bridge name   bridge id           STP enabled   interfaces
  eth0          8000.003048d9edf6   no            vif10.0
                                                  vif9.0
                                                  vif8.0
                                                  vif5.0
                                                  vif6.0
                                                  vif4.0
                                                  vif2.0
                                                  vif3.0
                                                  vif1.0
                                                  peth0

The vif names are assigned in the domU config, as are the MAC addresses and static IPs. There is nothing immediately obvious in the xen logs and no messages in the system logs that look suspect either. iptables logging is currently disabled, however.

What could this be? - is there any housekeeping that xen does which could cause this or perhaps some misconfiguration on my part?

Thanks in advance,

Matt.

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
Martin Gombač
2010-Feb-16 13:34 UTC
Re: [Xen-users] domU loses incoming network after period of inactivity
I have exactly the same problem. Lots of users here have it.
A workaround is to ping the gateway from crontab every minute. But this won't keep you from losing connections to other machines on the same network. :-/
People also told me to start using a routed setup, but that's not a possible option for me.
Hope someone helps us this time.

Regards,
M.
Matthew Law
2010-Feb-16 23:49 UTC
Re: [Xen-users] domU loses incoming network after period of inactivity
On Tue, February 16, 2010 1:34 pm, Martin Gombač wrote:
> I have exactly the same problem. Lots of users here have it.
> A workaround is to ping the gateway from crontab every minute. But this
> won't keep you from losing connections to other machines on the same
> network. :-/
> People also told me to start using a routed setup, but that's not a
> possible option for me.
> Hope someone helps us this time.
>
> Regards,
> M.

Oh, dear :-(

I have to use bridged for most domUs, so routed is not an option for me either. Did you try posting it in the xen developers list? - surely something as serious as this is being or has been addressed...

Thanks,

Matt.
Martin Gombač
2010-Feb-17 09:40 UTC
Re: [Xen-users] domU loses incoming network after period of inactivity
I haven't. Been too busy working on other things. For now I ping out. When I have the time I'll go back and try to solve the problem.

Regards,
M.
Fajar A. Nugraha
2010-Feb-17 11:07 UTC
Re: [Xen-users] domU loses incoming network after period of inactivity
On Wed, Feb 17, 2010 at 6:49 AM, Matthew Law <matt@webcontracts.co.uk> wrote:
>> A workaround is to ping the gateway from crontab every minute. But this
>> won't keep you from losing connections to other machines on the same
>> network. :-/
>
> Oh, dear :-(
>
> I have to use bridged for most domUs, so routed is not an option for me
> either. Did you try posting it in the xen developers list? - surely
> something as serious as this is being or has been addressed...

At first glance it looks like a switch problem. You should be able to trace it when the problem occurs:
- ping from an outside host
- "brctl showmacs name_of_bridge", see if the domU MAC is listed there
- tcpdump on the uplink interface (usually peth0), see if ARP goes through correctly and the domU replies to it
- look at the MAC address table on the switch, see if you can find the domU's MAC on the correct port

It might be that the switch, which should be broadcasting ARP to all ports when a MAC is not cached, didn't do that, so the request from the outside world to the domU never reaches dom0's port.

--
Fajar
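The "brctl showmacs" check in the list above can be scripted so it is easy to repeat while the problem is occurring. What follows is only a minimal sketch: it assumes the usual four-column output of "brctl showmacs" (port no, mac addr, is local?, ageing timer), the function name mac_on_bridge is made up for illustration, and the sample text is illustrative rather than captured from the affected host.

```python
def mac_on_bridge(showmacs_output: str, mac: str) -> bool:
    """Return True if `mac` is listed as a learned (non-local) entry.

    Assumes rows of the form: <port no> <mac addr> <is local?> <ageing timer>.
    """
    wanted = mac.lower()
    for line in showmacs_output.splitlines():
        fields = line.split()
        # The header row and blank lines never have a MAC in column 2,
        # so they simply fail the comparison below.
        if len(fields) >= 3 and fields[1].lower() == wanted:
            return fields[2] == "no"
    return False


# Illustrative sample only (not real output from the host in this thread).
SAMPLE = """\
port no  mac addr            is local?  ageing timer
  1      00:30:48:d9:ed:f6   yes           0.00
  2      ae:00:59:15:1a:09   no           12.34
"""

if __name__ == "__main__":
    print(mac_on_bridge(SAMPLE, "ae:00:59:15:1a:09"))
```

On a real dom0 the text would come from running "brctl showmacs" against the bridge (eth0 in the brctl show listing earlier in the thread), for example via subprocess.run with capture_output=True. If the domU's MAC is missing while the domU is unreachable, that points at the bridge side; if it is present, the switch is the next place to look.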
Hi All,

I'm testing Xen 3.3 on my desktop as well as on my laptop (both have Debian 5.4) with Xenified kernels. The IPs are assigned statically to dom0 and the domUs. On the desktop the dom0 has the static IP 192.168.1.8 and the domU has 192.168.1.33; on the laptop they are 192.168.1.7 and 192.168.1.32.

The domU and dom0 can ping each other (on both the desktop and the laptop), but the domUs cannot ping any IP outside the physical machine (i.e. anything other than their host dom0), although in /etc/profile I explicitly set the http_proxy variable, such as "http_proxy=http://<Proxy address>:<Port>". For dom0 things work fine: I can browse, ping or get updates via apt-get using the same http_proxy settings on the dom0s of both physical machines.

I need to connect the domU to download updates via "apt-get", but the domUs hang with an error such as the following:

---------------------------------------
debiantest:~# apt-get update
Err http://ftp.uk.debian.org etch Release.gpg
  Temporary failure resolving 'Proxy-address'
Err http://security.debian.org etch/updates Release.gpg
  Temporary failure resolving '<Proxy-address>'
Failed to fetch http://ftp.uk.debian.org/debian/dists/etch/Release.gpg  Temporary error
============================================

The /etc/resolv.conf file has valid name server entries on both dom0 and domU (on both laptop and desktop); on any dom0 or domU it has entries like the following:

--------------------------------
domain localdomain.ac.uk
search localdomain.ac.uk
nameserver xxx.xxx.xxx.xxx
nameserver xxx.xxx.xxx.xxx
nameserver xxx.xxx.xxx.xxx
----------------------------------------------

Any ideas which conf file I might have missed, so that it doesn't connect to the http proxy?

Thanks in advance for help.

Jan Muhammad
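The "Temporary failure resolving" message means apt-get could not turn the proxy's host name into an address, which is a separate question from whether the proxy is reachable. A rough way to test name resolution on its own from inside the domU is sketched below. It assumes Python is available in the domU, the helper name can_resolve is made up for illustration, and the proxy host name in the comment is a placeholder, not a value from this thread.

```python
import socket


def can_resolve(host: str) -> bool:
    """Return True if the system resolver can turn `host` into an address."""
    try:
        socket.getaddrinfo(host, None)
        return True
    except socket.gaierror:
        # Raised for lookup failures, including the "temporary failure"
        # case that apt-get reports.
        return False


if __name__ == "__main__":
    # Replace with the real proxy host name; "proxy.example.ac.uk" is a placeholder.
    for name in ("127.0.0.1", "proxy.example.ac.uk"):
        print(name, can_resolve(name))
```

If the proxy name does not resolve in the domU but does in dom0, the name servers listed in /etc/resolv.conf are probably not reachable from the domU, which would fit with the domU being unable to ping anything outside its host.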
Martin Gombač
2010-Feb-18 13:33 UTC
Re: [Xen-users] domU loses incoming network after period of inactivity
Matthew L.,

replacing the switch might be a solution. If it works for you, please let us all know.

Tnx.

M.
Matthew Law
2010-Feb-18 13:50 UTC
Re: [Xen-users] domU loses incoming network after period of inactivity
On Thu, February 18, 2010 1:33 pm, Martin Gombač wrote:
> Matthew L.,
>
> replacing the switch might be a solution. If it works for you, please
> let us all know.
>
> Tnx.
>
> M.

Fajar, Martin, thanks for the pointer. I will get around to looking at it very soon and post results here.

Regards,

Matt.
Matthew Law
2010-Feb-22 11:31 UTC
Re: [Xen-users] domU loses incoming network after period of inactivity
On Wed, February 17, 2010 11:07 am, Fajar A. Nugraha wrote:
> At first glance it looks like a switch problem. You should be able to
> trace it when the problem occurs:
> - ping from an outside host
> - "brctl showmacs name_of_bridge", see if the domU MAC is listed there
> - tcpdump on the uplink interface (usually peth0), see if ARP goes through
> correctly and the domU replies to it
> - look at the MAC address table on the switch, see if you can find the
> domU's MAC on the correct port
>
> It might be that the switch, which should be broadcasting ARP to all
> ports when a MAC is not cached, didn't do that, so the request from the
> outside world to the domU never reaches dom0's port.

Ok, after some investigation, here's where I'm at with this little peach of a problem!

1) Confirm I can't ping or ssh onto the domU.
2) From dom0, confirmed domU is running and consoled onto it.
3) Check again that I can't ping or ssh onto it (just in case the act of consoling onto it has changed anything). No change.
4) 'brctl showmacs peth0' shows the correct domU MAC address.
5) On dom0 I ran 'tcpdump -i peth0 -e -n arp or icmp' and then pinged the domU from outside. This results in:

grep -v Broadcast tcpdump.log
  11:09:59.481106 00:d0:01:78:a4:00 > ae:00:59:15:1a:09, ethertype IPv4 (0x0800), length 98: my.ip.address > domu.ip.address: ICMP echo request, id 59096, seq 0, length 64
  11:09:59.481206 ae:00:59:15:1a:09 > 00:00:0c:07:ac:05, ethertype IPv4 (0x0800), length 98: domu.ip.address > my.ip.address: ICMP echo reply, id 59096, seq 0, length 64
  11:10:00.480863 00:d0:01:78:a4:00 > ae:00:59:15:1a:09, ethertype IPv4 (0x0800), length 98: my.ip.address > domu.ip.address: ICMP echo request, id 59096, seq 1, length 64
  11:10:00.480925 ae:00:59:15:1a:09 > 00:00:0c:07:ac:05, ethertype IPv4 (0x0800), length 98: domu.ip.address > my.ip.address: ICMP echo reply, id 59096, seq 1, length 64
  11:10:01.481083 00:d0:01:78:a4:00 > ae:00:59:15:1a:09, ethertype IPv4 (0x0800), length 98: my.ip.address > domu.ip.address: ICMP echo request, id 59096, seq 2, length 64
  11:10:01.481138 ae:00:59:15:1a:09 > 00:00:0c:07:ac:05, ethertype IPv4 (0x0800), length 98: domu.ip.address > my.ip.address: ICMP echo reply, id 59096, seq 2, length 64
  11:10:04.471206 ae:00:59:15:1a:09 > 00:00:0c:07:ac:05, ethertype ARP (0x0806), length 42: arp who-has gateway.ip.address tell domu.ip.address
  11:10:04.471773 00:00:0c:07:ac:05 > ae:00:59:15:1a:09, ethertype ARP (0x0806), length 60: arp reply gateway.ip.address is-at 00:00:0c:07:ac:05

So, the domU is replying to the pings. Interesting.

6) I try the domU again and it is now accessible - different behaviour to what I was seeing before, but it may be that just before I ran the tcpdump command, something on the domU initiated a connection to the outside world and threw a spanner in the works...

I don't have access to the switches I am immediately connected to in this case, but I do know that they are running HSRP which has been known to 'crap out' in the past, so I will get that looked into. Also, I run ebtables on the dom0 to prevent IP spoofing, so another possibility is that ebtables might be the issue - perhaps it is 'forgetting' the MAC to IP mapping?

I will enable logging and investigate some more...

Regards,

Matt.
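One detail in the capture is easier to see when the frames are grouped by source and destination MAC: the echo requests arrive from 00:d0:01:78:a4:00, while the domU sends its replies to 00:00:0c:07:ac:05. The 00:00:0c:07:ac:xx range is the HSRP version 1 virtual MAC, so this fits the HSRP setup Matt mentions. Whether the asymmetry has anything to do with the problem is not established in this thread. The sketch below only tallies the MAC pairs, assuming the "tcpdump -e" line layout shown in the capture; the names FRAME and mac_pairs are made up for illustration.

```python
import re
from collections import Counter

# Link-level header as printed by `tcpdump -e`:
#   "<timestamp> <src MAC> > <dst MAC>, ethertype ..."
FRAME = re.compile(r"^\S+ ([0-9a-f:]{17}) > ([0-9a-f:]{17}),")


def mac_pairs(capture: str) -> Counter:
    """Count frames per (source MAC, destination MAC) pair."""
    pairs = Counter()
    for line in capture.splitlines():
        match = FRAME.match(line.strip())
        if match:
            pairs[(match.group(1), match.group(2))] += 1
    return pairs


# First request/reply pair, copied from the capture in this message.
CAPTURE = """\
11:09:59.481106 00:d0:01:78:a4:00 > ae:00:59:15:1a:09, ethertype IPv4 (0x0800), length 98: my.ip.address > domu.ip.address: ICMP echo request, id 59096, seq 0, length 64
11:09:59.481206 ae:00:59:15:1a:09 > 00:00:0c:07:ac:05, ethertype IPv4 (0x0800), length 98: domu.ip.address > my.ip.address: ICMP echo reply, id 59096, seq 0, length 64
"""

if __name__ == "__main__":
    for (src, dst), count in mac_pairs(CAPTURE).items():
        print(f"{src} -> {dst}: {count} frame(s)")
```

Running the same tally over a capture taken while the domU is unreachable would show whether frames addressed to the domU's MAC reach peth0 at all, which is the question Fajar's checklist is trying to answer.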
Martin Gombač
2010-Feb-22 12:03 UTC
Re: [Xen-users] domU loses incoming network after period of inactivity
Tnx for the followup. Keep us informed if or how you solve the problem.

Regards,
M.