Wilmer van der Gaast
2006-May-12 14:02 UTC
[Xen-users] (Network) unstability :-(( (Modified by Wilmer van der Gaast)
Hello,

I've been running Xen for a while now, and usually it works very well; I'm really impressed. But there are problems too. :-(

From time to time (it used to happen once or twice a month, but now it has happened twice in one hour) the network stack seems to break. I can reach the dom0 host perfectly from outside, but it can't communicate with the domUs anymore. (Not over IP, at least; xm console still works.)

I tried to shut down the domUs properly (using poweroff, as usual), but that doesn't seem to work very well for two of the machines. They shut down, and IIRC the previous time xm console also exited properly (I didn't check this time since I used xm destroy), but in xentop I still see this:

  xentop - 15:58:54   Xen 3.0.1
  3 domains: 1 running, 1 blocked, 0 paused, 0 crashed, 1 dying, 0 shutdown
  Mem: 458296k total, 228864k used, 229432k free    CPUs: 2 @ 548MHz
  NAME      STATE   CPU(sec) CPU(%)  MEM(k) MEM(%) MAXMEM(k) MAXMEM(%) VCPUS NETS NETTX(k) NETRX(k) SSID
            d-----        69    0.0      60    0.0     98304      21.4     1    1     2295     1910    0
            d-b---        28    0.0     176    0.0     65536      14.3     1    1      934      247    0
  Domain-0  -----r       138    0.8  209948   45.8  no limit       n/a     2    8        0        0    0

I can reboot now (I'm 200km away from the machine right now), and it will work, but it takes about ten minutes to shut everything down (it hangs for a while when Xen wants to save the machine states) and restart.

So anyway, I'm afraid this isn't really useful information. Things I can add: it's a dual-processor (P3) machine, so maybe it's an SMP issue? Or maybe Xen is not very reliable on P3 (Katmai) hardware? Would upgrading to 3.0.2 be a likely solution to this problem? Because this is really too annoying; I'm not used to having to reboot my server more than once a year. :-(

Maybe the zombie files will contain useful information for debugging?

[update: I tried to post this last Friday but I wasn't subscribed. Upgraded to 3.0.2 yesterday but the problem is still there! :-( It's especially strange that it shows up so often now, while it previously ran without any problems for a couple of weeks.]

Greetings,

Wilmer van der Gaast.

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
Anton Khalikov
2006-May-16 16:57 UTC
Re: [Xen-users] (Network) unstability :-(( (Modified by Wilmer van der Gaast)
Hello

Wilmer van der Gaast wrote:
> From time to time (it used to happen once in a month, sometimes twice,
> but now it happened twice in one hour) the network stack seems to break.
> I can reach the dom0 host perfectly from outside, but it can't
> communicate with the domus anymore. (Not over IP, at least, xm console
> still works.)

Do you see zombies in `xm list` output? Does your syslog contain anything like "unregister_netdevice: waiting for vifX.0 to become free"?

If so, it looks very similar to my problem, except that I have never lost communication with the domUs. I do get that message from time to time when I reboot or power off any of the domUs. After this happens, only a SysRq call helps me reboot the dom0. :(

--
Best regards,
Anton Khalikov
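For anyone who wants to check their logs for the message Anton quotes, here is a minimal Python sketch. It is only an illustration: the function name find_vif_waits is made up, and widening "vifX.0" to any vif<number>.<number> name is an assumption about how the interfaces are named on a given host.

```python
import re

# Pattern for the kernel message quoted above. "vifX.0" is widened to any
# vif<number>.<number> name; that widening is an assumption, not something
# stated in the thread.
VIF_WAIT = re.compile(
    r"unregister_netdevice: waiting for (vif\d+\.\d+) to become free"
)


def find_vif_waits(lines):
    """Return the vif names from every log line that carries the message."""
    found = []
    for line in lines:
        match = VIF_WAIT.search(line)
        if match:
            found.append(match.group(1))
    return found
```

It accepts any iterable of lines, so an open log file can be passed in directly. The log location differs between distributions, so the path is left to the caller.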
Wilmer van der Gaast
2006-May-21 16:24 UTC
Re: [Xen-users] (Network) unstability :-(( (Modified by Wilmer van der Gaast)
Anton Khalikov wrote:
> Do you see Zombies in `xm list` output ? Does your syslog contain
> anything like "unregister_netdevice: waiting for vifX.0 to become free"
> ?

No, nothing like that... :-/

Last Friday I decided to blame hostap (I'm running a wireless access point on PC hardware), so I unloaded that driver. That worked for a couple of days, but my domUs went down again today. :-(

It seems I really can't get any debugging information. Nothing in dmesg, nothing in "xm dmesg". I just have those zombie files in /var/lib/xen/save/, and the fact that network traffic to the domUs just doesn't work anymore. "xm console" is the only way to reach the machines, and as soon as I try to shut them down properly, most of them block and break completely (I can't destroy them anymore; they always stay in "xm list").

So what can be the cause of this problem? It can't be hostap anymore. I can't imagine it's an SMP problem, because SMP isn't rare anymore these days. I hope PIII (Katmai) CPUs aren't a problem either. Is it maybe because of IPv6? I have disabled IPv6 for all VMs now; I hope it'll help. If it doesn't, I really don't know what else it could be...

Can anyone tell me how to do more debugging on this?

Wilmer van der Gaast.
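One small piece that can be automated while hunting for the cause is spotting the stuck domains in saved xentop output. The sketch below is hedged: it assumes the six-character STATE column seen in the first message, where a "d" marks a dying domain, and it assumes that column is immediately followed by the numeric CPU(sec) value. The function name dying_rows is hypothetical.

```python
import re

# A six-character state field (lowercase flags or dashes) directly followed
# by a number, which is taken to be CPU(sec). Both points are assumptions
# based on the xentop output posted in this thread.
STATE_THEN_CPU = re.compile(r"(?:^|\s)([a-z-]{6})\s+\d+(?:\s|$)")


def dying_rows(xentop_rows):
    """Return the rows whose state field contains the 'd' (dying) flag."""
    dying = []
    for row in xentop_rows:
        match = STATE_THEN_CPU.search(row)
        if match and "d" in match.group(1):
            dying.append(row.strip())
    return dying
```

Matching on "state followed by a number" rather than on column position is deliberate, because the stuck domains in the posted output have an empty NAME column.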