Hi,

One of my heavily loaded NFS servers, running 6.2-RC1, has been
panicking repeatedly these days. I am not sure exactly which upgrade
the panics started after, but this machine ran for almost one year
(6.0R and 6.1R) with no problems. A core file is available.

----(from here)
Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0x0
fault code              = supervisor read, page not present
instruction pointer     = 0x20:0xc069d890
stack pointer           = 0x28:0xed0ae920
frame pointer           = 0x28:0xed0ae928
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = resume, IOPL = 0
current process         = 653 (nfsd)
trap number             = 12
panic: page fault
Uptime: 46m22s
Dumping 1021 MB (2 chunks)
  chunk 0: 1MB (159 pages) ... ok
  chunk 1: 1022MB (261423 pages) 1006 990 974 958 942 926 910 894 878 862 846 830 814 798 782 766 750 734 718 702 686 670 654 638 622 606 590 574 558 542 526 510 494 478 462 446 430 414 398 382 366 350 334 318 302 286 270 254 238 222 206 190 174 158 142 126 110 94 78 62 46 30 14

#0  doadump () at pcpu.h:165
165     pcpu.h: No such file or directory.
        in pcpu.h
(kgdb) bt
#0  doadump () at pcpu.h:165
#1  0xc067c512 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:409
#2  0xc067c7d8 in panic (fmt=0xc08d0c8e "%s")
    at /usr/src/sys/kern/kern_shutdown.c:565
#3  0xc0892122 in trap_fatal (frame=0xed0ae8e0, eva=0)
    at /usr/src/sys/i386/i386/trap.c:837
#4  0xc0891866 in trap (frame=
      {tf_fs = -992346104, tf_es = 40, tf_ds = 268107816, tf_edi = 72, tf_esi = 0, tf_ebp = -318052056, tf_isp = -318052084, tf_ebx = -993986688, tf_edx = -993986688, tf_ecx = 4, tf_eax = 4, tf_trapno = 12, tf_err = 0, tf_eip = -1066805104, tf_cs = 32, tf_eflags = 589831, tf_esp = 0, tf_ss = -1063278752})
    at /usr/src/sys/i386/i386/trap.c:270
#5  0xc088012a in calltrap () at /usr/src/sys/i386/i386/exception.s:139
#6  0xc069d890 in turnstile_broadcast (ts=0x0)
    at /usr/src/sys/kern/subr_turnstile.c:726
#7  0xc06739d7 in _mtx_unlock_sleep (m=0xc09fa760, opts=0, file=0x0, line=0)
    at /usr/src/sys/kern/kern_mutex.c:690
#8  0xc077e00b in nfs_rephead (siz=0, nd=0xc5023c00, err=72, mbp=0x4,
    bposp=0x4) at /usr/src/sys/nfsserver/nfs_srvsock.c:152
#9  0xc07779f3 in nfsrv_symlink (nfsd=0xc5023c00, slp=0xc4f8ae80,
    td=0xc4c0f780, mrq=0xed0aec98) at /usr/src/sys/nfsserver/nfs_serv.c:2844
#10 0xc07819b1 in nfssvc_nfsd (td=0x4)
    at /usr/src/sys/nfsserver/nfs_syscalls.c:474
#11 0xc0781194 in nfssvc (td=0xc4c0f780, uap=0xed0aed04)
    at /usr/src/sys/nfsserver/nfs_syscalls.c:181
#12 0xc0892437 in syscall (frame=
      {tf_fs = 59, tf_es = 59, tf_ds = 59, tf_edi = 1, tf_esi = 0, tf_ebp = -1077941464, tf_isp = -318050972, tf_ebx = 12, tf_edx = 672449048, tf_ecx = 26, tf_eax = 155, tf_trapno = 12, tf_err = 2, tf_eip = 671863223, tf_cs = 51, tf_eflags = 662, tf_esp = -1077941492, tf_ss = 59}) at /usr/src/sys/i386/i386/trap.c:983
#13 0xc088017f in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:200
#14 0x00000033 in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb)
----(to here)

--
| Hiroki SATO
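
kgdb prints the trap frame fields (tf_eip, tf_ebp, and so on) as signed 32-bit decimals, which makes them hard to match against the hexadecimal addresses in the panic message. A minimal sketch in Python of the conversion, assuming a 32-bit i386 frame as in this dump; the helper name to_hex32 is made up for illustration:

```python
def to_hex32(value: int) -> str:
    """Reinterpret a signed 32-bit decimal, as printed by kgdb, as unsigned hex."""
    # Masking with 0xFFFFFFFF maps negative values onto their two's-complement form.
    return f"{value & 0xFFFFFFFF:#010x}"


# Values copied from frame #4 of the backtrace in the original report.
frame = {"tf_eip": -1066805104, "tf_ebp": -318052056, "tf_isp": -318052084}

decoded = {name: to_hex32(value) for name, value in frame.items()}
for name, value in decoded.items():
    print(f"{name} = {value}")
```

Here tf_eip decodes to 0xc069d890, which is the instruction pointer in the panic message and the address of frame #6 (turnstile_broadcast), and tf_ebp decodes to 0xed0ae928, the reported frame pointer.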
On Tue, Dec 05, 2006 at 12:43:23AM +0900, Hiroki Sato wrote:
> Hi,
>
> One of my heavily loaded NFS servers, running 6.2-RC1, has been
> panicking repeatedly these days. I am not sure exactly which upgrade
> the panics started after, but this machine ran for almost one year
> (6.0R and 6.1R) with no problems. A core file is available.

What version of sys/nfsserver/nfs_serv.c do you use? If it is older than
1.156.2.7, please update the system.
Kostik Belousov <kostikbel@gmail.com> wrote
  in <20061204160949.GM35681@deviant.kiev.zoral.com.ua>:

ko> What version of sys/nfsserver/nfs_serv.c do you use? If it is older than
ko> 1.156.2.7, please update the system.

Thanks, I updated it just now and will see how it works.

--
| Hiroki SATO
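
The check requested above amounts to comparing dotted CVS revision numbers component by component rather than as strings. A rough sketch in Python; the function names and the sample revisions other than 1.156.2.7 are invented for illustration, and the comparison is only meaningful for revisions on the same branch (1.156.2.x here). On a system of this vintage the installed revision can usually be read from the $FreeBSD$ tag near the top of the source file.

```python
from typing import Tuple


def parse_revision(rev: str) -> Tuple[int, ...]:
    """Split a dotted CVS revision such as '1.156.2.7' into integer components."""
    return tuple(int(part) for part in rev.split("."))


def is_older(installed: str, required: str) -> bool:
    """Return True if `installed` sorts before `required` on the same branch."""
    return parse_revision(installed) < parse_revision(required)


REQUIRED = "1.156.2.7"  # revision named in the thread

# Hypothetical installed revisions, for illustration only.
for candidate in ("1.156.2.5", "1.156.2.7", "1.156.2.10"):
    verdict = "needs updating" if is_older(candidate, REQUIRED) else "recent enough"
    print(f"{candidate}: {verdict}")
```

Comparing integer tuples avoids the trap of plain string comparison, which would sort 1.156.2.10 before 1.156.2.7.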
On Tue, 2006-12-05 at 12:38 +0900, Hiroki Sato wrote:
> Kostik Belousov <kostikbel@gmail.com> wrote
>   in <20061204160949.GM35681@deviant.kiev.zoral.com.ua>:
>
> ko> What version of sys/nfsserver/nfs_serv.c do you use? If it is older than
> ko> 1.156.2.7, please update the system.
>
> Thanks, I updated it just now and will see how it works.
>
> --
> | Hiroki SATO

I was, and still am, having the same issue. Updating world (6.2-stable)
to include the above update sadly did not fix the problem for me. This
is an amd64 box with only one client connecting to it via NFS.

From further reading, it seems this may be an issue with rpc.statd
and/or rpc.lockd. As I only have one client connecting, and it is being
used as mail storage (i.e. the client pops/imaps the storage), would it
be safe not to forward fcntl locks over the wire? Is this same issue
present in 6.1-RELENG?

I am really at my wits' end at this point and, for the first time, am
actually considering moving to another OS (Solaris, more than likely),
as I cannot have these types of issues interrupting services every
couple of days. What other information, specifically, can I provide to
help the devs figure out what is going on? What can I do in the
meantime to have some semblance of stability? I assume downgrading to
5.5-RELENG is out of the question, but perhaps disabling SMP?

Sven
Sven Willenberger
2007-Jan-15 18:56 UTC
bge panic (Re: Not panic in nfsd (Re: panic in nfsd on 6.2-RC1))
On Mon, 2007-01-15 at 13:22 -0500, Kris Kennaway wrote:
> On Mon, Jan 15, 2007 at 11:33:33AM -0500, Sven Willenberger wrote:
>
> > > This is indicating a problem either with your bge hardware or the driver.
> > >
> > > Kris
> >
> > I suspect the driver: This same hardware setup was being used as a
> > database server with FreeBSD 5.4. I had been using the bge driver but set
> > at base100T without any issue at all. It was when I did a clean install
> > of 6.2-Prerelease and setting bge to use the full gigE speed (via
> > autonegotiate) that these issues cropped up.
>
> Be careful before you start blaming FreeBSD - since you did not test
> the failing hardware configuration in the older version of FreeBSD you
> cannot yet determine that it is a driver regression.
>
> Kris

I will freely admit that the evidence is circumstantial and that the
hardware may simply have failed at the same time I upgraded to the
newer version of FreeBSD. It could also be that there is an issue with
the bge driver when it is used at 1000 (gigE) speeds instead of the
fastE speeds I used with the 5.4 release (same hardware).

Unfortunately, now that the fxp connection seems stable (for the
moment), I am going to take advantage of the uptime, so any
troubleshooting/debugging will have to rely on what I have already
provided in my other responses.