We had a panic around noon on Saturday, from which the box mostly recovered by itself. All the ZFS NFS exports simply remounted, but the UFS on zdev NFS exports did not; they needed a manual umount && mount on all clients for some reason.

Is this a known bug we should consider a patch for?

May 10 11:49:46 x4500-01.unix ufs: [ID 912200 kern.notice] quota_ufs: over hard disk limit (pid 477, uid 127409, inum 1047211, fs /export/zero1)
May 10 11:51:26 x4500-01.unix unix: [ID 836849 kern.notice]
May 10 11:51:26 x4500-01.unix ^Mpanic[cpu3]/thread=ffffffff17b8c820:
May 10 11:51:26 x4500-01.unix genunix: [ID 335743 kern.notice] BAD TRAP: type=e (#pf Page fault) rp=ffffff001f4ca220 addr=0 occurred in module "<unknown>" due to a NULL pointer dereference
May 10 11:51:26 x4500-01.unix unix: [ID 100000 kern.notice]
May 10 11:51:26 x4500-01.unix unix: [ID 839527 kern.notice] nfsd:
May 10 11:51:26 x4500-01.unix unix: [ID 753105 kern.notice] #pf Page fault
May 10 11:51:26 x4500-01.unix unix: [ID 532287 kern.notice] Bad kernel fault at addr=0x0
May 10 11:51:26 x4500-01.unix unix: [ID 243837 kern.notice] pid=477, pc=0x0, sp=0xffffff001f4ca318, eflags=0x10246
May 10 11:51:26 x4500-01.unix unix: [ID 211416 kern.notice] cr0: 8005003b<pg,wp,ne,et,ts,mp,pe> cr4: 6f8<xmme,fxsr,pge,mce,pae,pse,de>
May 10 11:51:26 x4500-01.unix unix: [ID 354241 kern.notice] cr2: 0 cr3: 1fcbbc000 cr8: c
May 10 11:51:26 x4500-01.unix unix: [ID 592667 kern.notice] rdi: fffffffedefea000 rsi: 9 rdx: 0
May 10 11:51:26 x4500-01.unix unix: [ID 592667 kern.notice] rcx: ffffffff17b8c820 r8: 0 r9: ffffff054797dc48
May 10 11:51:26 x4500-01.unix unix: [ID 592667 kern.notice] rax: 0 rbx: 97eaffc rbp: ffffff001f4ca350
May 10 11:51:26 x4500-01.unix unix: [ID 592667 kern.notice] r10: 0 r11: fffffffec8b93868 r12: 27991000
May 10 11:51:27 x4500-01.unix unix: [ID 592667 kern.notice] r13: fffffffed1b59c00 r14: fffffffecf8d8cc0 r15: 1000
May 10 11:51:27 x4500-01.unix unix: [ID 592667 kern.notice] fsb: 0 gsb: fffffffec3d5a580 ds: 4b
May 10 11:51:27 x4500-01.unix unix: [ID 592667 kern.notice] es: 4b fs: 0 gs: 1c3
May 10 11:51:27 x4500-01.unix unix: [ID 592667 kern.notice] trp: e err: 10 rip: 0
May 10 11:51:27 x4500-01.unix unix: [ID 592667 kern.notice] cs: 30 rfl: 10246 rsp: ffffff001f4ca318
May 10 11:51:27 x4500-01.unix unix: [ID 266532 kern.notice] ss: 38
May 10 11:51:27 x4500-01.unix unix: [ID 100000 kern.notice]
May 10 11:51:27 x4500-01.unix genunix: [ID 655072 kern.notice] ffffff001f4ca100 unix:die+c8 ()
May 10 11:51:27 x4500-01.unix genunix: [ID 655072 kern.notice] ffffff001f4ca210 unix:trap+135b ()
May 10 11:51:27 x4500-01.unix genunix: [ID 655072 kern.notice] ffffff001f4ca220 unix:_cmntrap+e9 ()
May 10 11:51:27 x4500-01.unix genunix: [ID 802836 kern.notice] ffffff001f4ca350 0 ()
May 10 11:51:27 x4500-01.unix genunix: [ID 655072 kern.notice] ffffff001f4ca3d0 ufs:top_end_sync+cb ()
May 10 11:51:27 x4500-01.unix genunix: [ID 655072 kern.notice] ffffff001f4ca440 ufs:ufs_fsync+1cb ()
May 10 11:51:27 x4500-01.unix genunix: [ID 655072 kern.notice] ffffff001f4ca490 genunix:fop_fsync+51 ()
May 10 11:51:27 x4500-01.unix genunix: [ID 655072 kern.notice] ffffff001f4ca770 nfssrv:rfs3_create+604 ()
May 10 11:51:27 x4500-01.unix genunix: [ID 655072 kern.notice] ffffff001f4caa70 nfssrv:common_dispatch+444 ()
May 10 11:51:27 x4500-01.unix genunix: [ID 655072 kern.notice] ffffff001f4caa90 nfssrv:rfs_dispatch+2d ()
May 10 11:51:27 x4500-01.unix genunix: [ID 655072 kern.notice] ffffff001f4cab80 rpcmod:svc_getreq+1c6 ()
May 10 11:51:27 x4500-01.unix genunix: [ID 655072 kern.notice] ffffff001f4cabf0 rpcmod:svc_run+171 ()
May 10 11:51:28 x4500-01.unix genunix: [ID 655072 kern.notice] ffffff001f4cac30 rpcmod:svc_do_run+85 ()
May 10 11:51:28 x4500-01.unix genunix: [ID 655072 kern.notice] ffffff001f4caec0 nfs:nfssys+748 ()
May 10 11:51:28 x4500-01.unix genunix: [ID 655072 kern.notice] ffffff001f4caf10 unix:brand_sys_syscall32+1a3 ()
May 10 11:51:28 x4500-01.unix unix: [ID 100000 kern.notice]
May 10 11:51:28 x4500-01.unix genunix: [ID 672855 kern.notice] syncing file systems...
May 10 11:51:28 x4500-01.unix genunix: [ID 733762 kern.notice] 8
May 10 11:51:29 x4500-01.unix genunix: [ID 733762 kern.notice] 5
May 10 11:51:30 x4500-01.unix genunix: [ID 733762 kern.notice] 2
May 10 11:51:54 x4500-01.unix last message repeated 20 times
May 10 11:51:55 x4500-01.unix genunix: [ID 622722 kern.notice] done (not all i/o completed)
May 10 11:51:56 x4500-01.unix genunix: [ID 111219 kern.notice] dumping to /dev/dsk/c6t0d0s1, offset 65536, content: kernel

-- 
Jorgen Lundman       | <lundman at lundman.net>
Unix Administrator   | +81 (0)3 -5456-2687 ext 1017 (work)
Shibuya-ku, Tokyo    | +81 (0)90-5578-8500          (cell)
Japan                | +81 (0)3 -3375-1767          (home)
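
For anyone who wants to pull the call stack out of a syslog excerpt like this one, here is a minimal Python sketch. The regular expression and the names `FRAME_RE`, `SAMPLE` and `stack_frames` are assumptions made for illustration, based only on the frame lines in this report (frame pointer, then module:function+offset). The bare "0 ()" frame is reported as a plain address; together with pc=0x0 in the trap message it suggests, but does not prove, a call through a NULL function pointer from top_end_sync.

```python
import re

# Frame lines in this report look like:
#   "... kern.notice] ffffff001f4ca3d0 ufs:top_end_sync+cb ()"
# and, for the frame at address 0:
#   "... kern.notice] ffffff001f4ca350 0 ()"
# This pattern is an assumption derived from those lines only.
FRAME_RE = re.compile(
    r"kern\.notice\]\s+(?P<fp>[0-9a-f]{16})\s+"
    r"(?:(?P<module>\w+):(?P<func>\w+)\+(?P<off>[0-9a-f]+)|(?P<addr>[0-9a-f]+))"
    r"\s+\(\)"
)

# A few frame lines copied from the panic log in this thread.
SAMPLE = """\
May 10 11:51:27 x4500-01.unix genunix: [ID 655072 kern.notice] ffffff001f4ca220 unix:_cmntrap+e9 ()
May 10 11:51:27 x4500-01.unix genunix: [ID 802836 kern.notice] ffffff001f4ca350 0 ()
May 10 11:51:27 x4500-01.unix genunix: [ID 655072 kern.notice] ffffff001f4ca3d0 ufs:top_end_sync+cb ()
May 10 11:51:27 x4500-01.unix genunix: [ID 655072 kern.notice] ffffff001f4ca440 ufs:ufs_fsync+1cb ()
"""

def stack_frames(log_text):
    """Return the stack frames found in log_text, innermost first."""
    frames = []
    for m in FRAME_RE.finditer(log_text):
        if m.group("module"):
            frames.append("%s:%s+0x%s" % (m.group("module"), m.group("func"), m.group("off")))
        else:
            # No symbol: the kernel printed a bare address (here, 0).
            frames.append("<address 0x%s>" % m.group("addr"))
    return frames

if __name__ == "__main__":
    for frame in stack_frames(SAMPLE):
        print(frame)
```

The pattern tolerates either newlines or single spaces between log entries, so it should also cope with excerpts that a mail client has re-wrapped, as long as the frame pointer and symbol themselves were not split.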
OK, that was a pretty damn poor panic report, if I may say so; I have not had much sleep.

Solaris Express Developer Edition 9/07 snv_70b X86
Copyright 2007 Sun Microsystems, Inc. All Rights Reserved.
Use is subject to license terms.
Assembled 30 August 2007

SunOS x4500-01.unix 5.11 snv_70b i86pc i386 i86pc

Even though it dumped, it wrote nothing to /var/crash/. Perhaps because swap is mirrored.

Jorgen Lundman wrote:
> [snip]

-- 
Jorgen Lundman
Dumping to /dev/dsk/c6t0d0s1

certainly looks like a non-mirrored dump dev...

You might try a manual savecore telling it to ignore the dump valid header and see what you get...

savecore -d

and perhaps try telling it to look directly at the dump device...

savecore -f <device>

You should also, when you get the chance, deliberately panic the box to make sure you can actually capture a dump...

dumpadm is your friend as far as checking where you are going to dump to, and if it's one side of your swap mirror, that's bad, M'Kay?

:)

Nathan.

Jorgen Lundman wrote:
> [snip]
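
As a rough illustration of the check described above, the Python sketch below pulls the dump device out of the "dumping to ..." console message and guesses what kind of device it is from the path prefix. The prefix table is an assumption (/dev/md/dsk for SVM metadevices, /dev/zvol/dsk for ZFS volumes, /dev/dsk for plain slices), and the names `DUMP_RE`, `PREFIXES`, `dump_device` and `classify` are hypothetical. dumpadm remains the authoritative source for the configured dump device.

```python
import re

# Matches the console message seen in this thread, e.g.
#   "dumping to /dev/dsk/c6t0d0s1, offset 65536, content: kernel"
DUMP_RE = re.compile(r"dumping\s+to\s+(?P<dev>/\S+?),\s+offset\s+(?P<offset>\d+)")

# Assumed mapping from device path prefix to device kind.
# Longer, more specific prefixes are listed first.
PREFIXES = [
    ("/dev/md/dsk/", "SVM metadevice"),
    ("/dev/zvol/dsk/", "ZFS volume"),
    ("/dev/dsk/", "plain disk slice"),
]

def dump_device(log_text):
    """Return the dump device named in a 'dumping to' message, or None."""
    m = DUMP_RE.search(log_text)
    return m.group("dev") if m else None

def classify(device):
    """Guess the device kind from its path prefix (an assumption, see above)."""
    for prefix, kind in PREFIXES:
        if device.startswith(prefix):
            return kind
    return "unknown"

if __name__ == "__main__":
    line = ("May 10 11:51:56 x4500-01.unix genunix: [ID 111219 kern.notice] "
            "dumping to /dev/dsk/c6t0d0s1, offset 65536, content: kernel")
    dev = dump_device(line)
    print(dev, "->", classify(dev))
```

Note that the path alone cannot tell whether a plain slice is also a component of a mirror, which is exactly the bad case mentioned above; that still has to be checked by hand against the swap and mirror configuration.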
savecore -d
System dump time: Sat May 10 11:51:56 2008
Constructing namelist /var/crash/x4500/unix.1
Constructing corefile /var/crash/x4500/vmcore.1
 45% done

When we get the second x4500 in we can do more testing in that area.

But more importantly, we need to try to work out why the UFS exported file systems failed to recover properly. They are mounted "hard", so IO should wait, and yet it seems to just fail the IO and make the mount point invisible. Logging in to each and every server to remount the file systems is somewhat tedious.

# df -h
Filesystem             size   used  avail capacity  Mounted on
/                       64G    24G    39G    39%    /
/dev                    64G    24G    39G    39%    /dev
proc                     0K     0K     0K     0%    /proc
[snip]
swap                   1.1G    12K   1.1G     1%    /var/run
df: cannot statvfs /export/test: No such file or directory

# ls -l /export/
test-www01:~# ls -la /export/
total 16
drwxr-xr-x   6 root     sys          512 Mar 26 16:09 .
drwxr-xr-x  19 root     root         512 Mar 19 11:30 ..
drwxr-xr-x  23 root     root         512 Apr 14 11:54 home
drwxr-xr-x   2 root     root         512 Mar 17 16:10 nfs

No "test" directory there.

# mount
/export/test on x4500-01-vip:/export/test remote/read/write/setuid/nodevices/vers=3/hard/intr/quota/xattr/dev=4700002 on Tue Mar 25 11:10:52 2008

# mkdir -p /export/test/roo
mkdir: "/export/test/roo": No such file or directory

# umount /export/test
# mount /export/test
# df -h
test-x4500-01-vip:/export/test    98G   4.1G    93G     5%    /export/test

More info from the panic:

> $c
top_end_sync+0xcb(fffffffedefea000, ffffff001f4ca424, b, 0)
ufs_fsync+0x1cb(ffffffff3659f980, 10000, ffffffff6f0ccc70)
fop_fsync+0x51(ffffffff3659f980, 10000, ffffffff6f0ccc70)
rfs3_create+0x604(ffffff001f4ca7c8, ffffff001f4ca8b8, ffffff04e7627d80, ffffff001f4cab20, ffffffff6f0ccc70)
common_dispatch+0x444(ffffff001f4cab20, ffffffffa71cc1c0, 2, 4, fffffffff8553a78, ffffffffc039d3d0)
rfs_dispatch+0x2d(ffffff001f4cab20, ffffffffa71cc1c0)
svc_getreq+0x1c6(ffffffffa71cc1c0, fffffffec69bddc0)
svc_run+0x171(fffffffecc7581c0)
svc_do_run+0x85(1)
nfssys+0x748(e, fec80fc8)
sys_syscall32+0x101()

> ::panicinfo
             cpu                3
          thread ffffffff17b8c820
         message BAD TRAP: type=e (#pf Page fault) rp=ffffff001f4ca220 addr=0 occurred in module "<unknown>" due to a NULL pointer dereference
             rdi fffffffedefea000
             rsi                9
             rdx                0
             rcx ffffffff17b8c820
              r8                0
              r9 ffffff054797dc48
             rax                0
             rbx          97eaffc
             rbp ffffff001f4ca350
             r10                0
             r11 fffffffec8b93868
             r12         27991000
             r13 fffffffed1b59c00
             r14 fffffffecf8d8cc0
             r15             1000
          fsbase                0
          gsbase fffffffec3d5a580
              ds               4b
              es               4b
              fs                0
              gs              1c3
          trapno                e
             err               10
             rip                0
              cs               30
          rflags            10246
             rsp ffffff001f4ca318
              ss               38
          gdt_hi                0
          gdt_lo         500001ef
          idt_hi                0
          idt_lo         40000fff
             ldt                0
            task               70
             cr0         8005003b
             cr2                0
             cr3        1fcbbc000
             cr4              6f8

Nathan Kroenert - Server ESG wrote:
> [snip]

-- 
Jorgen Lundman
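
Since remounting by hand on every client is tedious, the umount && mount step could be scripted from one admin host. A minimal Python sketch follows, assuming password-less ssh as root to each client and that a plain umount followed by mount is enough, as it was in the session above. The function names `remount_commands` and `remount_all` are hypothetical, and nothing is executed unless `run=True` is passed.

```python
import shlex
import subprocess

def remount_commands(clients, mount_point):
    """Build one ssh command per client that unmounts and remounts mount_point."""
    quoted = shlex.quote(mount_point)
    remote = "umount %s && mount %s" % (quoted, quoted)
    return [["ssh", host, remote] for host in clients]

def remount_all(clients, mount_point, run=False):
    """Return {host: exit status}; the status is None in dry-run mode."""
    results = {}
    for cmd in remount_commands(clients, mount_point):
        host = cmd[1]
        if run:
            # Assumption: ssh to the client works non-interactively as root.
            results[host] = subprocess.call(cmd)
        else:
            results[host] = None
    return results

if __name__ == "__main__":
    # Dry run: only print what would be executed on each client.
    for cmd in remount_commands(["test-www01"], "/export/test"):
        print(" ".join(cmd))
```

This only automates the workaround. It does not explain why the hard-mounted UFS exports went stale while the ZFS ones recovered, which is still the real question.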