I have PV domUs hanging on 3 different systems.

All are dom0 and domU with Debian lenny.
It happens randomly, roughly once a month.
There are no errors in the logs.
I cannot ping the domU.
xm console comes up but is unresponsive.
I have tried a watchdog with softdog, but it does not help.
xm shutdown does not work.
I have to restart with xm destroy, xm create.
All have on_crash = 'restart' set.

System 1
amd64
dom0 64-bit xen 2.6.18.8-xen xen 3.4.2
domU kernel 2.6.26-2-xen-amd64

System 2
Intel 64-bit
dom0 64-bit xen 2.6.18.8-xen xen 3.4.2
domU kernel 2.6.26-2-xen-amd64

System 3 (this system had some random hangs before going to Xen, so it could be hardware)
Intel 32-bit
dom0 2.6.26-2-xen-686 xen 3.2-1
domU 2.6.26-2-xen-686

While I'd like to fix the hanging, why doesn't Xen restart the domU?
I remember on_crash = 'restart' working fine in the past.
What does on_crash look for?
Any ideas?

John

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
Check memory usage. Some domU kernels have a bug with incorrect free memory reporting. In that case, in an OOM situation the kernel goes mad, kills every process, and starts to panic. And the OOM happens not at the very end of memory, but much earlier.

Just try disabling swap in the VM and 'eating' memory. I usually use interactive Python with constructions like:
a=" "*1024*1024*256
b=" "*1024*1024*128
c=" "*1024*1024*64, etc... until OOM happens. If Python is killed, the kernel is sane. If Something Horrible Happens, the kernel is buggy.

E.g. the Debian xen-686 kernel is buggy; I am talking with the maintainers right now. The Gentoo/SUSE kernel is fine with memory but has some issues with the wallclock.

On Tue, 07/09/2010 at 15:28 -0500, John McMonagle wrote:
> I have domu hanging on 3 different pv domus.
>
> All are dom0 and domu with Debian lenny.
> It happens randomly roughly once a month.
> There are no errors in the logs.
> Can not ping domu.
> xm console comes up but is unresponsive.
> Have tried watch watchdog with softdog bit it does not help.
> xm shutdown does not work.
> Have to restart with xm destroy, xm create.
> All have on_crash = 'restart' set.
>
> System1
> amd64
> dom0 64bit xen 2.6.18.8-xen xen 3.4.2
> domu kernel 2.6.26-2-xen-amd64
>
> System 2
> intel 64 bit
> dom0 64bit xen 2.6.18.8-xen xen 3.4.2
> domu kernel 2.6.26-2-xen-amd64
>
> System3 ( This system had some random hangs before going to xen so it could be
> hardware)
> intel 32 bit
> dom0 2.6.26-2-xen-686 xen 3.2-1
> domu 2.6.26-2-xen-686
>
> While I'd like to fix it from hanging why doesn't xen restart it?
> I remember the on_crash = 'restart' working fine in the past.
> What does on_crash look for?
> Any ideas?
>
> John
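The memory-eating test described above can also be run as a small script instead of typing the allocations by hand. This is only a sketch: the function name eat_memory and its parameters chunk_bytes and max_chunks are made up for illustration and are not from the thread. It keeps every chunk referenced so nothing is freed, and stops either at the optional chunk limit or when Python raises MemoryError.

```python
def eat_memory(chunk_bytes=64 * 1024 * 1024, max_chunks=None):
    """Hold on to space-filled chunks until allocation fails or max_chunks is reached.

    Hypothetical helper for illustration. Returns the number of bytes
    still held when the loop stopped.
    """
    chunks = []  # keep references so no chunk is ever freed
    try:
        while max_chunks is None or len(chunks) < max_chunks:
            # b" " * n writes a space into every byte, so the pages are
            # really touched and not just reserved.
            chunks.append(b" " * chunk_bytes)
    except MemoryError:
        # The allocation was refused inside Python; the process was not killed.
        pass
    return sum(len(c) for c in chunks)


# Safe demonstration only: four chunks of 1 MiB.
print(eat_memory(chunk_bytes=1024 * 1024, max_chunks=4))
```

To reproduce the actual test, one would call eat_memory() with no limit, inside a throwaway domU with swap disabled, and watch what happens: by the criterion given above, the Python process being killed points to a sane kernel, while a wider meltdown points to a buggy one. A return via MemoryError means the allocation was refused before the OOM killer got involved, so that outcome may not exercise the OOM path at all.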
George,

I just tried it on a test domU on one of the problem servers, using the same domU amd64 kernel.

>>> a=" "*1024*1024*256
>>> b=" "*1024*1024*256
>>> c=" "*1024*1024*256
>>> d=" "*1024*1024*256
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
MemoryError

Did I do it correctly?
The domU did not crash.
Python did not die, but it gave the error message above.

John

On Tuesday 07 September 2010 03:40:01 pm George Shuklin wrote:
> Check memory usage. Some kernels for domU have bug with incorrect free
> memory reporting. In this case in OOM situation kernel become mad and
> killing every process and start to panic. And oom happens not at very
> end of memory, but much early.
>
> Just try to disable swap in VM and 'eat' memory. I usually use
> interactive python with constructions:
> a=" "*1024*1024*256
> b=" "*1024*1024*128
> c=" "*1024*1024*64, etc... Until OOM happens. If python killed - kernel
> sane. If Something Horrible Happens - buggy kernel.
>
> F.e. debian xen-686 kernel is buggy, I right now takling with
> maintainers. Gentoo/SUSE kernel is fine with memory but have some issues
> with wallclock.
>
> On Tue, 07/09/2010 at 15:28 -0500, John McMonagle wrote:
> > I have domu hanging on 3 different pv domus.
> >
> > All are dom0 and domu with Debian lenny.
> > It happens randomly roughly once a month.
> > There are no errors in the logs.
> > Can not ping domu.
> > xm console comes up but is unresponsive.
> > Have tried watch watchdog with softdog bit it does not help.
> > xm shutdown does not work.
> > Have to restart with xm destroy, xm create.
> > All have on_crash = 'restart' set.