On a 2.0.6 installation on top of Debian and kernel 2.6.11.12 (Dell PE1750, tg3 NIC), I'm seeing temporary network hangs while working on the child domains. From outside the install, pinging domain0 results in 0% packet loss, while pinging any of its child domains results in roughly 1% packet loss over a multi-hour ping. While ssh'd in, a hang appears to occur every ~10 minutes and lasts for 3-7 seconds.

Connectivity to domain0 is never interrupted and no external packet loss occurs. I'm also not seeing any interruption between the child and domain0.

I see no kernel messages, nothing in the logs, etc. Any ideas on what's going on?

Thanks,
John

--
John Madden
UNIX Systems Engineer
Ivy Tech Community College of Indiana
jmadden@ivytech.edu
_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
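As a rough sanity check on the numbers in the post above, the reported hang length and frequency can be compared with the reported ~1% ping loss. This is only a back-of-the-envelope sketch: it assumes one ping per second (the usual default interval) and that every ping sent during a hang is lost, neither of which is stated in the thread. The function name is made up for illustration.

```python
def expected_loss_percent(hang_seconds, interval_seconds, ping_period=1.0):
    """Percent of pings lost if every ping sent during a hang is dropped.

    Assumes one hang of `hang_seconds` per `interval_seconds`, and one ping
    every `ping_period` seconds (an assumption, not stated in the thread).
    """
    pings_sent = interval_seconds / ping_period
    pings_lost = hang_seconds / ping_period
    return 100.0 * pings_lost / pings_sent


# Hangs of 3-7 seconds roughly every 10 minutes (600 s), as reported.
for hang in (3, 5, 7):
    print(f"{hang}s hang every 600s -> {expected_loss_percent(hang, 600):.2f}% loss")
```

Under those assumptions the result lands between about 0.5% and 1.2%, so the observed ~1% loss is at least consistent with the hangs accounting for most of the dropped pings.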
> Connectivity to domain0 is never interrupted and no external packet loss occurs.
> I'm also not seeing any interruption between the child and domain0.

(Sorry to be replying to my own post.) I should be clearer there: there is packet interruption between the child and domain0, but I haven't yet seen any loss from outside the Xen host into domain0. The problem appears to be between the frontend and backend interfaces.

John
John Madden wrote:
> On a 2.0.6 installation on top of Debian and kernel 2.6.11.12 (Dell PE1750, tg3
> nic), I'm seeing temporary network hangs while working on the child domains. From
> outside the install, pinging domain0 results in 0% packet loss while pinging any
> of its child domains results in roughly 1% packet loss over a multi-hour ping.
> While ssh'd in, a hang appears to occur every ~10 minutes and lasts for 3-7
> seconds.

I've seen this pattern on hosts with errors in the bridge config, especially when more than one bridge was used.

The default spanning tree ageing time is set to 300 secs. That's shorter than your ~10 minutes, though.

Can you post the output of 'brctl showstp xen-br0' and 'brctl showmacs xen-br0'?

> Connectivity to domain0 is never interrupted and no external packet loss occurs.
> I'm also not seeing any interruption between the child and domain0.
>
> I see no kernel messages, nothing in the logs, etc. Any ideas on what's going on?
>
> Thanks,
> John

Mike
> I've seen this pattern on hosts with errors in the bridge config.
> Especially when more than one bridge was used.

Ah, good to hear.

> The default spanning tree ageing time is set to 300 secs. That's shorter
> than your ~10 minutes, though.

And the problem may happen more frequently than every 10 minutes; I've only noticed it roughly that often while coding. I've done nothing at all to configure or change the configuration of the bridges; as I remember it, it all just worked out of the box. All of my domains have "vif = [ 'bridge=xen-br0' ]" -- is that a problem?

> 'brctl showstp xen-br0' and
> 'brctl showmacs xen-br0'

# brctl showmacs xen-br0
port no mac addr                is local?       ageing timer
  1     00:00:0c:07:ac:01       no                 4.78
  1     00:02:b3:1e:36:15       no               296.86
  1     00:02:b3:4b:cf:1f       no                 0.15
  1     00:02:b3:4b:cf:8d       no               299.25
  1     00:03:ba:65:fe:db       no                36.73
  1     00:03:ba:d0:2a:f7       no               106.00
  1     00:03:ff:1f:36:15       no                63.28
  1     00:04:00:12:b0:4c       no                51.13
  1     00:06:5b:39:c8:20       no                 8.43
  1     00:06:5b:3e:c6:25       no               163.45
  1     00:06:5b:ec:3a:a1       no                 2.76
  1     00:06:5b:f3:42:48       no                12.03
  1     00:06:5b:f4:a9:61       no                17.32
  1     00:06:5b:f4:a9:69       no                 2.39
  1     00:0b:db:a8:a8:e9       no                34.92
  1     00:0d:56:fd:9a:bb       no                74.03
  1     00:0d:bc:8d:38:da       no                 1.57
  1     00:0f:1f:03:f9:0f       no               163.29
  1     00:0f:1f:04:7c:e3       no                36.59
  1     00:0f:1f:04:7c:e9       no                39.61
  1     00:0f:1f:64:a0:9a       yes                0.00
  1     00:0f:1f:65:b8:53       no               211.43
  1     00:0f:1f:6b:b2:89       no                39.68
  1     00:0f:1f:9d:18:30       no                77.91
  1     00:10:18:06:5d:4b       no                26.58
  1     00:11:43:de:20:58       no                 6.83
  1     00:30:48:20:ab:17       no                 0.56
  1     00:30:48:20:ae:3a       no               295.75
  1     00:30:48:20:ae:6e       no                98.88
  1     00:80:72:09:8a:f4       no                13.11
  1     00:90:27:86:f7:b9       no                25.31
  1     00:90:27:e5:3a:af       no                 2.11
  1     00:90:27:e5:3f:66       no                32.59
  1     00:b0:d0:20:5e:1a       no                95.73
  1     00:b0:d0:20:5e:79       no                22.35
  1     00:b0:d0:20:f0:43       no                30.34
  1     00:b0:d0:49:2d:2b       no                97.35
  1     00:b0:d0:49:c5:b1       no                13.12
  1     00:b0:d0:68:1c:97       no                20.86
  1     00:b0:d0:fc:01:a2       no                10.18
  1     00:d0:01:88:ec:0a       no                 0.00
  1     00:d0:b7:74:b6:1e       no                82.95
  1     00:d0:b7:74:b7:a3       no                36.52
  1     00:d0:b7:84:a8:fc       no                28.91
  1     00:d0:b7:91:5b:f6       no                39.66
  1     08:00:20:cd:84:c0       no                 0.12
  1     08:00:20:cd:b5:53       no               259.89
  1     08:00:20:d2:2e:17       no               170.59
  1     08:00:20:fe:61:16       no               127.50
  1     08:00:20:fe:a2:5c       no                 8.44
  3     aa:00:00:07:82:1e       no                 3.49
  5     aa:00:00:0b:16:e3       no                 0.01
  2     aa:00:00:0c:fc:84       no                34.63
  4     aa:00:00:50:55:f8       no                 8.44
  2     fe:ff:ff:ff:ff:ff       yes                0.00

# brctl showstp xen-br0
xen-br0
 bridge id              8000.000f1f64a09a
 designated root        8000.000f1f64a09a
 root port                 0                    path cost                  0
 max age                  20.00                 bridge max age            20.00
 hello time                2.00                 bridge hello time          2.00
 forward delay             0.00                 bridge forward delay       0.00
 ageing time             300.00
 hello timer               0.06                 tcn timer                  0.00
 topology change timer     0.00                 gc timer                   5.27
 flags

eth0 (1)
 port id                8001                    state                forwarding
 designated root        8000.000f1f64a09a       path cost                  4
 designated bridge      8000.000f1f64a09a       message age timer          0.00
 designated port        8001                    forward delay timer        0.00
 designated cost           0                    hold timer                 0.00
 flags

vif25.0 (2)
 port id                8002                    state                forwarding
 designated root        8000.000f1f64a09a       path cost                100
 designated bridge      8000.000f1f64a09a       message age timer          0.00
 designated port        8002                    forward delay timer        0.00
 designated cost           0                    hold timer                 0.00
 flags

vif26.0 (3)
 port id                8003                    state                forwarding
 designated root        8000.000f1f64a09a       path cost                100
 designated bridge      8000.000f1f64a09a       message age timer          0.00
 designated port        8003                    forward delay timer        0.00
 designated cost           0                    hold timer                 0.00
 flags

vif27.0 (4)
 port id                8004                    state                forwarding
 designated root        8000.000f1f64a09a       path cost                100
 designated bridge      8000.000f1f64a09a       message age timer          0.00
 designated port        8004                    forward delay timer        0.00
 designated cost           0                    hold timer                 0.00
 flags

vif28.0 (5)
 port id                8005                    state                forwarding
 designated root        8000.000f1f64a09a       path cost                100
 designated bridge      8000.000f1f64a09a       message age timer          0.00
 designated port        8005                    forward delay timer        0.00
 designated cost           0                    hold timer                 0.00
 flags

Thanks...
John
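For anyone wanting to check a listing like the one above for the kind of bridge misconfiguration Mike mentions, the showmacs output can be grouped by port. The following is a minimal sketch, not a diagnostic tool: the function names are invented for illustration, the sample is a small subset of the rows posted above, and the idea that a MAC learned on more than one port is a warning sign is an assumption on my part rather than something established in this thread.

```python
from collections import defaultdict

# A few rows taken from the "brctl showmacs xen-br0" output posted above.
SAMPLE = """\
port no mac addr                is local?       ageing timer
  1     00:00:0c:07:ac:01       no                 4.78
  1     00:0f:1f:64:a0:9a       yes                0.00
  3     aa:00:00:07:82:1e       no                 3.49
  5     aa:00:00:0b:16:e3       no                 0.01
  2     aa:00:00:0c:fc:84       no                34.63
  4     aa:00:00:50:55:f8       no                 8.44
  2     fe:ff:ff:ff:ff:ff       yes                0.00
"""


def parse_showmacs(text):
    """Return (port, mac, is_local, age_seconds) tuples, skipping the header."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 4 or not fields[0].isdigit():
            continue  # header or blank line
        port, mac, local, age = fields
        entries.append((int(port), mac.lower(), local == "yes", float(age)))
    return entries


def learned_macs_by_port(entries):
    """Group non-local (learned) MAC addresses by bridge port number."""
    ports = defaultdict(list)
    for port, mac, is_local, _age in entries:
        if not is_local:
            ports[port].append(mac)
    return dict(ports)


def macs_on_multiple_ports(entries):
    """Return MACs that appear on more than one port (none are expected)."""
    seen = defaultdict(set)
    for port, mac, _is_local, _age in entries:
        seen[mac].add(port)
    return {mac: sorted(p) for mac, p in seen.items() if len(p) > 1}


entries = parse_showmacs(SAMPLE)
print(learned_macs_by_port(entries))
print(macs_on_multiple_ports(entries))
```

Going by the showstp output, port 1 is eth0 and ports 2-5 are vif25.0 through vif28.0, so in the posted listing each vif port has learned exactly one aa:00:00:... address and everything else sits behind eth0.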
FWIW, this intermittent packet loss & hang problem continues. We've disabled STP on the network, enabled/disabled it on xen-br0, sniffed, and watched layer 2 with arpwatch, all showing nothing wrong -- packets just disappear on the bridge or are delayed for the external interface. Everything else seems normal though.

Any suggestions? Such an unstable network is useless for production applications and I'm afraid we won't be able to continue using Xen...

Thanks,
John
> FWIW, this intermittent packet loss & hang problem continues.
> We've disabled STP on the network, enabled/disabled it on
> xen-br0, sniffed, and watched layer 2 with arpwatch, all
> showing nothing wrong -- packets just disappear on the bridge
> or are delayed for the external interface. Everything else
> seems normal though.
>
> Any suggestions? Such an unstable network is useless for
> production applications and I'm afraid we won't be able to
> continue using Xen...

Have you tried using a routed rather than bridged setup?

See /etc/xen/scripts/network-router and vif-route

Ian
John Madden wrote:
> FWIW, this intermittent packet loss & hang problem continues. We've disabled STP
> on the network, enabled/disabled it on xen-br0, sniffed, and watched layer 2 with
> arpwatch, all showing nothing wrong -- packets just disappear on the bridge or are
> delayed for the external interface. Everything else seems normal though.
>
> Any suggestions? Such an unstable network is useless for production applications
> and I'm afraid we won't be able to continue using Xen...
>
> Thanks,
> John

John, are you seeing tcp retransmits, at least?

thanks,
Nivedita
> John, are you seeing tcp retransmits, at least?

No, although I'm no expert when it comes to tcpdump. Here's the relevant snippet of the sniff from the domU:

10:07:34.826810 IP 10.0.15.67 > helpdesk.ivytech.edu: icmp 64: echo request seq 194
10:07:35.157683 IP 10.0.15.67.26034 > helpdesk.ivytech.edu.ssh: P 27984:28032(48) ack 164033 win 15376 <nop,nop,timestamp 331977649 25997694>
10:07:35.158475 IP 10.0.15.67.26034 > helpdesk.ivytech.edu.ssh: . ack 164753 win 15376 <nop,nop,timestamp 331977650 25997769>
10:07:39.784691 IP 10.0.15.67.26034 > helpdesk.ivytech.edu.ssh: P 28032:28080(48) ack 164753 win 15376 <nop,nop,timestamp 331978253 25997769>
10:07:39.784707 IP 10.0.15.67 > helpdesk.ivytech.edu: icmp 64: echo request seq 195
10:07:39.784728 IP 10.0.15.67.26034 > helpdesk.ivytech.edu.ssh: P 28032:28080(48) ack 164753 win 15376 <nop,nop,timestamp 331978454 25997769>
10:07:39.785085 IP 10.0.15.67.26034 > helpdesk.ivytech.edu.ssh: P 28080:28320(240) ack 164753 win 15376 <nop,nop,timestamp 331982277 25998232>
10:07:39.786365 IP 10.0.15.67.26034 > helpdesk.ivytech.edu.ssh: . ack 164817 win 15376 <nop,nop,timestamp 331982278 25998232>
10:07:39.786411 IP 10.0.15.67.26034 > helpdesk.ivytech.edu.ssh: . ack 164929 win 15376 <nop,nop,timestamp 331982278 25998232>
10:07:39.826972 IP 10.0.15.67 > helpdesk.ivytech.edu: icmp 64: echo request seq 199
10:07:40.827022 IP 10.0.15.67 > helpdesk.ivytech.edu: icmp 64: echo request seq 200
10:07:41.827073 IP 10.0.15.67 > helpdesk.ivytech.edu: icmp 64: echo request seq 201

Basically, I was running tcpdump on the domU (helpdesk.ivytech.edu) while ssh'd into it and pinging it from 10.0.15.67. The dump shows the ICMP echoes for seqs 196, 197, and 198 missing, during which time there was no response to ssh traffic either (although it eventually catches up, but that's likely ssh-specific behavior anyway, not TCP retransmission).
John
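Since reading a capture like this by eye is error-prone, here is a small sketch that scans the quoted tcpdump lines for two things: gaps in the ICMP echo sequence numbers, and TCP data segments whose byte range shows up more than once. The regular expressions are written against the exact text format quoted in this thread and may not match other tcpdump versions; the function names are made up for illustration. Only the ICMP and data-carrying lines are copied into the sample.

```python
import re
from collections import defaultdict

# ICMP and TCP data lines copied from the domU capture quoted above.
DUMP = """\
10:07:34.826810 IP 10.0.15.67 > helpdesk.ivytech.edu: icmp 64: echo request seq 194
10:07:35.157683 IP 10.0.15.67.26034 > helpdesk.ivytech.edu.ssh: P 27984:28032(48) ack 164033 win 15376 <nop,nop,timestamp 331977649 25997694>
10:07:39.784691 IP 10.0.15.67.26034 > helpdesk.ivytech.edu.ssh: P 28032:28080(48) ack 164753 win 15376 <nop,nop,timestamp 331978253 25997769>
10:07:39.784707 IP 10.0.15.67 > helpdesk.ivytech.edu: icmp 64: echo request seq 195
10:07:39.784728 IP 10.0.15.67.26034 > helpdesk.ivytech.edu.ssh: P 28032:28080(48) ack 164753 win 15376 <nop,nop,timestamp 331978454 25997769>
10:07:39.785085 IP 10.0.15.67.26034 > helpdesk.ivytech.edu.ssh: P 28080:28320(240) ack 164753 win 15376 <nop,nop,timestamp 331982277 25998232>
10:07:39.826972 IP 10.0.15.67 > helpdesk.ivytech.edu: icmp 64: echo request seq 199
10:07:40.827022 IP 10.0.15.67 > helpdesk.ivytech.edu: icmp 64: echo request seq 200
10:07:41.827073 IP 10.0.15.67 > helpdesk.ivytech.edu: icmp 64: echo request seq 201
"""

ICMP_SEQ = re.compile(r"echo request seq (\d+)")
# Matches "first:last(len)" plus the sender's TCP timestamp value.
TCP_DATA = re.compile(r" (\d+:\d+)\(\d+\) .*timestamp (\d+) \d+")


def missing_icmp_seqs(dump):
    """Return echo request sequence numbers absent between the first and last seen."""
    seqs = sorted(int(m.group(1)) for m in ICMP_SEQ.finditer(dump))
    if not seqs:
        return []
    seen = set(seqs)
    return [n for n in range(seqs[0], seqs[-1] + 1) if n not in seen]


def repeated_segments(dump):
    """Map each TCP byte range seen more than once to its sender timestamp values."""
    tsvals = defaultdict(list)
    for line in dump.splitlines():
        m = TCP_DATA.search(line)
        if m:
            tsvals[m.group(1)].append(int(m.group(2)))
    return {seg: ts for seg, ts in tsvals.items() if len(ts) > 1}


print(missing_icmp_seqs(DUMP))   # [196, 197, 198]
print(repeated_segments(DUMP))   # {'28032:28080': [331978253, 331978454]}
```

The second result may be worth a closer look: the range 28032:28080 arrives twice with different sender timestamp values, which is what a retransmitted segment typically looks like in a capture. That reading is my interpretation of the quoted dump, not something confirmed in this thread.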
> Have you tried using a routed rather than bridged setup?
>
> See /etc/xen/scripts/network-router and vif-route

I haven't, although I noticed the option. I couldn't find much documentation on what's needed setup-wise, and with a couple of domains in production currently, I'd like to minimize downtime/disruption. Is there anything explaining the setup, or is it as plug-n-play as the default bridging setup?

John
John Madden wrote:
>> Have you tried using a routed rather than bridged setup?
>>
>> See /etc/xen/scripts/network-router and vif-route
>
> I haven't, although I noticed the option. I couldn't find much documentation on
> what's needed setup-wise and with a couple domains in production currently, I'd
> like to minimize downtime/disruption. Is there anything explaining the setup, or
> is it as plug-n-play as the default bridging setup?
>
> John

There isn't any documentation right now (I'm still working up a doc patch for it, but I can help you offline). It's not any more complicated than setting up bridging, though (easier, actually).

thanks,
Nivedita