Matt Ayres
2006-Mar-31 20:49 UTC
[Xen-devel] ARP cache problems / slow connect times in routed mode - Bug #596 opened
Synopsis:

A user of mine has debugged this issue for me. It seems a Xen guest in routed mode wants to ARP cache any host it connects to with the MAC address FE:FF:FF:FF:FF:FF. The user also identified long connection times due to this. While a remote host is in the ARP cache, connection times are fast (30ms or so); when it is not, they can be well over 1000ms. They have provided me with tcpdump output that shows this. They also showed it is due to the ARP cache by statically adding a remote host to the ARP cache and noting that connection times are then very low.

Full debugging information is attached to the bug.

Bug URL: http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=596

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Matt Ayres
2006-Apr-01 16:05 UTC
Re: [Xen-devel] ARP cache problems / slow connect times in routed mode - Bug #596 opened
Matt Ayres wrote:
> Full debugging information is attached to the bug.
>
> Bug URL: http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=596

I have assigned this bug to myself and marked it as INVALID. It appears to be specific to CentOS / Fedora and my specific setup.

Thank you,
Matt Ayres
Keir Fraser
2006-Apr-01 16:56 UTC
Re: [Xen-devel] ARP cache problems / slow connect times in routed mode - Bug #596 opened
On 1 Apr 2006, at 17:05, Matt Ayres wrote:
> I have assigned this bug to myself and marked it as INVALID. It
> appears to be specific to CentOS / Fedora and my specific setup.

We'll be interested to learn the full details if you manage to work out what's going on. :-)

 -- Keir
Matt Ayres
2006-Apr-01 17:12 UTC
Re: [Xen-devel] ARP cache problems / slow connect times in routed mode - Bug #596 opened
Keir Fraser wrote:
> We'll be interested to learn the full details if you manage to work out
> what's going on. :-)

I know exactly what went wrong.

I chose to use 169.254.1.1 as the IP to assign to my vif interfaces. Inside the guest a static route is added for 169.254.1.1/24 via eth0 and then a default gateway to 169.254.1.1. I chose this because various proxy ARP howtos use it and it is reserved "link local" space, which made sense.

CentOS (RHEL) / Fedora add a static route for 169.254.0.0/18 for DHCP purposes. I see no reason why; it's not required by any other distribution, and removing it doesn't break DHCP. Anyhow, it appears that having the finer-grained /24 route was causing all remote IPs to be cached in the ARP table as local. Removing my /24 static route fixes everything and causes only 169.254.1.1 to be in the ARP cache.

Perhaps the community can enlighten me: who is in the wrong here, RedHat or me? We support many other distributions (Gentoo, Debian, Ubuntu, Mandriva/Mandrake, Slackware) and none of the others add the link local network as a static route.

The other oddity: why does having the /24 statically routed along with the /18 cause any IP on the Internet to be added to the ARP cache? That part is what confuses me most. I fixed it, but I'm far from completely understanding it.

Thank you,
Matt Ayres
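As a hedged illustration of the routing table described in this message, the sketch below models longest-prefix-match route selection with Python's standard `ipaddress` module. The `ROUTES` table, the `lookup` helper and the sample destination addresses are assumptions made up for this example, not output from the affected guest. It only shows what ordinary route selection would predict: an Internet destination should match the default route via the gateway, which is why seeing remote hosts in the ARP cache is surprising.

```python
import ipaddress

# Routes as described in the thread (an illustrative model, not real guest output).
# strict=False lets "169.254.1.1/24" normalise to the network 169.254.1.0/24.
ROUTES = [
    (ipaddress.ip_network("169.254.1.1/24", strict=False), "on-link eth0"),
    (ipaddress.ip_network("169.254.0.0/18"), "on-link eth0"),
    (ipaddress.ip_network("0.0.0.0/0"), "via 169.254.1.1"),
]


def lookup(dest):
    """Return (network, next_hop) for the most specific route matching dest."""
    addr = ipaddress.ip_address(dest)
    matches = [(net, hop) for net, hop in ROUTES if addr in net]
    # Longest prefix wins; the default route guarantees at least one match.
    return max(matches, key=lambda m: m[0].prefixlen)


# Hypothetical destinations: the gateway, another link-local host, a remote host.
for dest in ("169.254.1.1", "169.254.20.5", "192.0.2.10"):
    net, hop = lookup(dest)
    print(f"{dest:>14} -> {net} ({hop})")
```

In this model only destinations inside the two link-local routes are treated as on-link. It does not reproduce the reported behaviour, so the cause has to lie outside plain route selection.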
Keir Fraser
2006-Apr-01 17:32 UTC
Re: [Xen-devel] ARP cache problems / slow connect times in routed mode - Bug #596 opened
On 1 Apr 2006, at 18:12, Matt Ayres wrote:
> Perhaps the community can enlighten me, who is in the wrong here,
> RedHat or I? We support many other distributions (Gentoo, Debian,
> Ubuntu, Mandriva/Mandrake, Slackware) and no others want to add the
> link local network as a static route.

RedHat-based distros are configured with 'zeroconf' support by default, which is what adds the 169.254 route iirc. If you don't care for that (I can't imagine many people do) then it's a good idea to add 'NOZEROCONF=yes' to /etc/sysconfig/network.

> The other oddity is why does having the /24 statically routed along
> with the /18 cause any IP on the internet to be added to the ARP
> cache? That part right there is what is most confusing to myself.

Agreed, that doesn't seem to make any sense!

 -- Keir
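A minimal sketch of checking for the setting Keir mentions, assuming /etc/sysconfig/network is a plain list of KEY=value lines. The `zeroconf_disabled` helper and the sample file contents (including the `HOSTNAME` value) are hypothetical. The check looks only for the literal value `yes` as suggested above; how the distribution's own scripts interpret other values is not covered in this thread.

```python
def zeroconf_disabled(sysconfig_text):
    """Return True if the last uncommented NOZEROCONF assignment is 'yes'."""
    disabled = False
    for line in sysconfig_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        if sep and key.strip() == "NOZEROCONF":
            # Later assignments override earlier ones, as when the file is sourced.
            disabled = value.strip().strip("\"'").lower() == "yes"
    return disabled


# Hypothetical file contents, for illustration only.
sample = "NETWORKING=yes\nHOSTNAME=guest1\nNOZEROCONF=yes\n"
print(zeroconf_disabled(sample))
```

On a real system the text would be read from /etc/sysconfig/network rather than from the `sample` string.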
Matt Ayres
2006-Apr-01 19:43 UTC
Re: [Xen-devel] ARP cache problems / slow connect times in routed mode - Bug #596 opened
Keir Fraser wrote:
>> The other oddity is why does having the /24 statically routed along
>> with the /18 cause any IP on the internet to be added to the ARP
>> cache? That part right there is what is most confusing to myself.
>
> Agreed, that doesn't seem to make any sense!

I figured it out. Proxy ARP is _very_ touchy when it comes to subnets. Since the netmask on the host vif is a /32, the route works if it is added as 169.254.1.1/32 on the guest (rather than as a /24). This lets RedHat do what they want, and I need not worry if a user turns off zeroconf support on their own.

Thanks,
Matt
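To make the /24 versus /32 difference concrete, here is a small sketch using Python's standard `ipaddress` module. The variable names and the `neighbour` address are hypothetical and chosen only for illustration; the sketch shows what each prefix covers, not how proxy ARP itself behaves.

```python
import ipaddress

gateway = ipaddress.ip_address("169.254.1.1")

# The route originally used in the guest; strict=False normalises the
# host address 169.254.1.1/24 to the network 169.254.1.0/24.
wide = ipaddress.ip_network("169.254.1.1/24", strict=False)

# The route that matches the /32 netmask on the host vif.
narrow = ipaddress.ip_network("169.254.1.1/32")

# A hypothetical second address inside the /24, for comparison.
neighbour = ipaddress.ip_address("169.254.1.77")

print(wide, "covers", wide.num_addresses, "addresses")
print(narrow, "covers", narrow.num_addresses, "address")
print("gateway in /32:", gateway in narrow)
print("neighbour in /32:", neighbour in narrow)
print("neighbour in /24:", neighbour in wide)
```

With the /32 route the guest treats only the gateway address itself as directly reachable on eth0, which matches the netmask configured on the host side.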