I'm running SXCE b90 on a couple of Dells in an xVM cluster, and I've had two recent unexplained xVM panics on the same SXCE b90 system.

Jun 19 15:09:36 lab-xvm-00-a ^Mpanic[cpu3]/thread=ffffff06ecefd1e0:
Jun 19 15:09:36 lab-xvm-00-a genunix: [ID 103648 kern.notice] mutex_enter: bad mutex, lp=ffffff06deb5b000 owner=644c6c2d007368 thread=ffffff06ecefd1e0
Jun 19 15:09:36 lab-xvm-00-a unix: [ID 100000 kern.notice]
Jun 19 15:09:36 lab-xvm-00-a genunix: [ID 655072 kern.notice] ffffff002e265bc0 unix:mutex_panic+73 ()
Jun 19 15:09:36 lab-xvm-00-a genunix: [ID 655072 kern.notice] ffffff002e265c30 unix:mutex_vector_enter+452 ()
Jun 19 15:09:36 lab-xvm-00-a genunix: [ID 655072 kern.notice] ffffff002e265c90 genunix:close_exec+8b ()
Jun 19 15:09:36 lab-xvm-00-a genunix: [ID 655072 kern.notice] ffffff002e265e90 genunix:exec_common+71e ()
Jun 19 15:09:36 lab-xvm-00-a genunix: [ID 655072 kern.notice] ffffff002e265ec0 genunix:exece+1b ()
Jun 19 15:09:36 lab-xvm-00-a genunix: [ID 655072 kern.notice] ffffff002e265f10 unix:brand_sys_syscall32+1d0 ()
Jun 19 15:11:52 lab-xvm-00-a savecore: [ID 570001 auth.error] reboot after panic: mutex_enter: bad mutex, lp=ffffff06deb5b000 owner=644c6c2d007368 thread=ffffff06ecefd1e0

And another panic:

Jun 20 19:06:38 lab-xvm-00-a ^Mpanic[cpu0]/thread=ffffff06de785220:
Jun 20 19:06:38 lab-xvm-00-a genunix: [ID 683410 kern.notice] BAD TRAP: type=e (#pf Page fault) rp=ffffff002dee8970 addr=ffffff06deb78000
Jun 20 19:06:38 lab-xvm-00-a unix: [ID 100000 kern.notice]
Jun 20 19:06:38 lab-xvm-00-a unix: [ID 839527 kern.notice] intrd:
Jun 20 19:06:38 lab-xvm-00-a unix: [ID 753105 kern.notice] #pf Page fault
Jun 20 19:06:38 lab-xvm-00-a unix: [ID 532287 kern.notice] Bad kernel fault at addr=0xffffff06deb78000
Jun 20 19:06:38 lab-xvm-00-a unix: [ID 243837 kern.notice] pid=437, pc=0xfffffffffb84a6b9, sp=0xffffff002dee8a68, eflags=0x10246
Jun 20 19:06:38 lab-xvm-00-a unix: [ID 211416 kern.notice] cr0: 80050033<pg,wp,ne,et,mp,pe> cr4: 660<xmme,fxsr,mce,pae>
Jun 20 19:06:38 lab-xvm-00-a unix: [ID 624947 kern.notice] cr2: ffffff06deb78000
Jun 20 19:06:38 lab-xvm-00-a unix: [ID 100000 kern.notice]
Jun 20 19:06:38 lab-xvm-00-a unix: [ID 592667 kern.notice] rdi: ffffff06deb78000 rsi: a81 rdx: ffffff06de785220
Jun 20 19:06:39 lab-xvm-00-a unix: [ID 592667 kern.notice] rcx: 150 r8: 0 r9: ffffff06d32b7b78
Jun 20 19:06:39 lab-xvm-00-a unix: [ID 592667 kern.notice] rax: 0 rbx: 100001 rbp: ffffff002dee8aa0
Jun 20 19:06:39 lab-xvm-00-a unix: [ID 592667 kern.notice] r10: ffffffffc00c0f68 r11: 0 r12: a81
Jun 20 19:06:39 lab-xvm-00-a unix: [ID 592667 kern.notice] r13: ffffff06deb78000 r14: ffffff06d322c900 r15: 0
Jun 20 19:06:39 lab-xvm-00-a unix: [ID 592667 kern.notice] fsb: 0 gsb: fffffffffbc5bff0 ds: 4b
Jun 20 19:06:39 lab-xvm-00-a unix: [ID 592667 kern.notice] es: 4b fs: 0 gs: 1c3
Jun 20 19:06:39 lab-xvm-00-a unix: [ID 592667 kern.notice] trp: e err: 2 rip: fffffffffb84a6b9
Jun 20 19:06:39 lab-xvm-00-a unix: [ID 592667 kern.notice] cs: e030 rfl: 10246 rsp: ffffff002dee8a68
Jun 20 19:06:39 lab-xvm-00-a unix: [ID 266532 kern.notice] ss: e02b
Jun 20 19:06:39 lab-xvm-00-a unix: [ID 100000 kern.notice]
Jun 20 19:06:39 lab-xvm-00-a genunix: [ID 655072 kern.notice] ffffff002dee8850 unix:die+ea ()
Jun 20 19:06:39 lab-xvm-00-a genunix: [ID 655072 kern.notice] ffffff002dee8960 unix:trap+13cf ()
Jun 20 19:06:39 lab-xvm-00-a genunix: [ID 655072 kern.notice] ffffff002dee8970 unix:_cmntrap+12f ()
Jun 20 19:06:39 lab-xvm-00-a genunix: [ID 655072 kern.notice] ffffff002dee8aa0 unix:bzero+9 ()
Jun 20 19:06:39 lab-xvm-00-a genunix: [ID 655072 kern.notice] ffffff002dee8c70 kstat:read_kstat_data+111 ()
Jun 20 19:06:39 lab-xvm-00-a genunix: [ID 655072 kern.notice] ffffff002dee8cb0 kstat:kstat_ioctl+4a ()
Jun 20 19:06:39 lab-xvm-00-a genunix: [ID 655072 kern.notice] ffffff002dee8cf0 genunix:cdev_ioctl+48 ()
Jun 20 19:06:39 lab-xvm-00-a genunix: [ID 655072 kern.notice] ffffff002dee8d30 specfs:spec_ioctl+86 ()
Jun 20 19:06:39 lab-xvm-00-a genunix: [ID 655072 kern.notice] ffffff002dee8db0 genunix:fop_ioctl+7b ()
Jun 20 19:06:39 lab-xvm-00-a genunix: [ID 655072 kern.notice] ffffff002dee8ec0 genunix:ioctl+174 ()
Jun 20 19:06:39 lab-xvm-00-a genunix: [ID 655072 kern.notice] ffffff002dee8f10 unix:brand_sys_syscall32+1d0 ()
Jun 20 23:09:10 lab-xvm-00-a savecore: [ID 570001 auth.error] reboot after panic: BAD TRAP: type=e (#pf Page fault) rp=ffffff002dee8970 addr=ffffff06deb78000

Note the time jump between the panic and the savecore message. The system RTC still appears to be completely wrong. NTP corrects things, however.

Both crashes happened only after I fixed the timezone/clock problems in my domUs by running the command below. I ran it once, before the first panic. I don't see how this could be related, but it's worth mentioning (the previous timezone was EST5EDT):

# rtc -c -z UTC

This is a Dell 2970 with dual Opteron 2220 dual-core CPUs and 32G of RAM, but I've configured these machines for "spare bank" operation, so only 24G is visible to the OS and the BIOS.

The dom0 network interfaces include onboard bnx and an add-in e1000 (Intel 1000-PT).

DomUs that were running at the time:

* about a dozen 64-bit paravirt Ubuntu Hardy (8.04) domU guests running with the XenSource 2.4.18 vmlinuz kernel from the Xen 3.1 x86_64 binary tarball
* one SXCE b85 paravirt domU

The guests were all idle. They are primarily playground/development guests, and nobody was using them at the time. Of the 24G of memory, most was allocated to guests during both panics.

The domUs all run on iSCSI devices exposed by the dom0's initiator, served from another Dell 2970 acting as an iSCSI server (zfs zvols, iscsishare=on).

This is all in a lab, of course; none of it is production.

Per my last email to the list, the last time this happened, Stuart suggested that the 32G bug was likely causing my ills. Having upgraded from b87 to b90, and having limited available RAM to 24G by enabling "spare bank" operation, could this conceivably still be causing me grief?
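To put a number on the time jump mentioned after the second panic, here is a minimal Python sketch that subtracts the panic timestamp from the savecore timestamp for each crash. The timestamps are copied from the logs above; the year is not in the syslog lines, so `ASSUMED_YEAR` is a placeholder that only makes the dates parseable and does not affect the differences. The helper name `gap` is made up for this illustration.

```python
from datetime import datetime, timedelta

# Placeholder only: the syslog lines carry no year, and the
# year does not change the differences computed below.
ASSUMED_YEAR = 2008

def gap(panic_stamp: str, savecore_stamp: str) -> timedelta:
    """Return the elapsed time between two syslog-style timestamps."""
    fmt = "%Y %b %d %H:%M:%S"
    start = datetime.strptime(f"{ASSUMED_YEAR} {panic_stamp}", fmt)
    end = datetime.strptime(f"{ASSUMED_YEAR} {savecore_stamp}", fmt)
    return end - start

# First panic: panic line vs. "reboot after panic" line.
first = gap("Jun 19 15:09:36", "Jun 19 15:11:52")
# Second panic: same pair of lines.
second = gap("Jun 20 19:06:38", "Jun 20 23:09:10")

print("first panic gap: ", first)
print("second panic gap:", second)
print("extra delay:     ", second - first)
```

The first reboot shows up about two minutes after the panic, while the second shows up about four hours and two minutes later. Four hours is also the offset between EDT and UTC, which may or may not be a coincidence given the RTC change described above.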
Thanks in advance for any advice,

- Ian C. Blenke <ian@blenke.com> <iblenke@csdvrs.com>