Chiu, PCM (Peter)
2008-Jun-03 10:38 UTC
[Xen-users] OpenSuse10.3/2.6.22.17 DomU intermittent hang (irqbalance?)
I wonder if anyone can shed some light on this.

I have set up a Xen server using OpenSuse 10.3 (x86_64) and a virtual machine also running OpenSuse 10.3 (x86_64), the latter with 4 CPUs and 10GB of memory. We have been hitting intermittent system hangs on the virtual machine. The system log does not reveal much, so I have turned on the audit log. On the last occasion, the audit log shows that the last system calls were made by irqbalance. Then, on the console, a slightly distorted message shows "unable to handle kernel NULL pointer dereference" - see below.

I would be grateful if anyone has any idea about this.

Regards,
Peter

1. Audit log

type=PATH msg=audit(02/06/08 23:47:11.005:14164084) : item=0 name=/proc/irq/264/smp_affinity inode=4026532404 dev=00:03 mode=file,600 ouid=root ogid=root rdev=00:00
type=CWD msg=audit(02/06/08 23:47:11.005:14164084) : cwd=/
type=SYSCALL msg=audit(02/06/08 23:47:11.005:14164084) : arch=x86_64 syscall=open success=yes exit=3 a0=7fff19073ff0 a1=241 a2=1b6 a3=241 items=1 ppid=1 pid=2555 auid=unset uid=root gid=root euid=root suid=root fsuid=root egid=root sgid=root fsgid=root tty=(none) comm=irqbalance exe=/usr/sbin/irqbalance key=(null)
----
type=PATH msg=audit(02/06/08 23:47:11.005:14164085) : item=0 name=/proc/irq/261/smp_affinity inode=4026532398 dev=00:03 mode=file,600 ouid=root ogid=root rdev=00:00
type=CWD msg=audit(02/06/08 23:47:11.005:14164085) : cwd=/
type=SYSCALL msg=audit(02/06/08 23:47:11.005:14164085) : arch=x86_64 syscall=open success=yes exit=3 a0=7fff19073ff0 a1=241 a2=1b6 a3=241 items=1 ppid=1 pid=2555 auid=unset uid=root gid=root euid=root suid=root fsuid=root egid=root sgid=root fsgid=root tty=(none) comm=irqbalance exe=/usr/sbin/irqbalance key=(null)
----
type=CWD msg=audit(02/06/08 23:47:11.005:14164086) : cwd=/
type=SYSCALL msg=audit(02/06/08 23:47:11.005:14164086) : arch=x86_64 syscall=open success=yes exit=3 a0=7fff19073ff0 a1=241 a2=1b6 a3=241 items=1 ppid=1 pid=2555 auid=unset uid=root gid=root euid=root suid=root fsuid=root egid=root sgid=root fsgid=root tty=(none) comm=irqbalance exe=/usr/sbin/irqbalance key=(null)

-------> The above is the last system message before the system hang.

2. Console log

audit(1211876547.970:2338): cwd="/etc/sysconfig/network"
audit(1211876547.970:2338): item=0 name="/etc/modprobe.d" inode=251861312 dev=ca:01 mode=040755 ouid=0 ogid=0 rdev=00:00
audit(1211876547.970:2339): arch=c000003e syscall=2 success=yes exit=7 a0=7fff6cbf4240 a1=0 a2=1b6 a3=0audit(1211876548.002:2423): arch=c000003e syscall=2 success=yes exit=3 a0=7fff4809c7e0 a1=0 a2=0 a3=4547415353454d5f items=1 ppid=2332 pid=2349 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) comm="rm" exe="/bin/rm" key=(null)
audit(1211876548.002:2423): cwd="/etc/sysconfig/network"
audit(1211876548.002:2423): item=0 name="/usr/lib/locale/en_US/LC_MESSAGES/SYS_LC_MESSAGES" inode=83999014 dev=ca:01 mode=0100644 ouid=0 ogid=audit(121187654audaudit(1211876548.046:2544): arch=c000003e syscall=2 success=yes exit=3 a0=2aaba2c34b96 a1=0 a2=1 a3=ffffffffffaudit(1211876548.05audiauUnable to handle kernel NULL pointer dereference at 0000000000000000 RIP:
[<ffffffff80278380>] free_bl ipt_REJECT xt_state iptablFS: 00002af5d3bd7b00(0000) GS:ffffffff804dc000(0000) knlGS:00 48 c7 46 08 00 02 20 00
RIP [<ffffffff80278380>] free_block+0x9d/0x146
RSP <ffff88027f837a50>
CR2: 0000000000000000

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
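A note on reading the SYSCALL records in the audit log: the a0..a3 fields are the raw syscall arguments in hexadecimal, so for open the a1 field holds the flags and a2 holds the mode. The sketch below is not part of the original report. It assumes the standard Linux x86_64 flag values, and decode_open is a hypothetical helper name used only for illustration.

```python
# Decode the hexadecimal a1 (flags) and a2 (mode) fields of an audit
# SYSCALL record for open(2).
# Assumption: Linux x86_64 flag values, hard-coded here so that the
# result does not depend on the platform running this script.
O_ACCMODE = 0o3
ACCESS_MODES = {0o0: "O_RDONLY", 0o1: "O_WRONLY", 0o2: "O_RDWR"}
FLAG_BITS = {
    0o100: "O_CREAT",
    0o200: "O_EXCL",
    0o1000: "O_TRUNC",
    0o2000: "O_APPEND",
}


def decode_open(a1_hex, a2_hex):
    """Return (symbolic flags, octal mode) for an open() audit record."""
    flags = int(a1_hex, 16)
    mode = int(a2_hex, 16)

    # The low two bits select the access mode.
    names = [ACCESS_MODES.get(flags & O_ACCMODE, "unknown access mode")]

    # The remaining bits are independent option flags.
    rest = flags & ~O_ACCMODE
    for bit, name in FLAG_BITS.items():
        if rest & bit:
            names.append(name)
            rest &= ~bit
    if rest:
        # Bits not listed in FLAG_BITS are reported in hexadecimal.
        names.append(hex(rest))

    return "|".join(names), oct(mode)


# Values taken from the irqbalance records above: a1=241 a2=1b6
print(decode_open("241", "1b6"))
```

Under those assumptions, a1=241 decodes to O_WRONLY|O_CREAT|O_TRUNC with mode 0666, which suggests irqbalance was opening the /proc/irq/*/smp_affinity files for writing immediately before the hang.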