thr3ads.net - freebsd stable - panic in nfsd on 6.2-RC1 [Dec 2006]

Hiroki Sato

2006-Dec-04 07:45 UTC

panic in nfsd on 6.2-RC1

Hi,

 One of my heavily loaded NFS servers, running 6.2-RC1, has been
 panicking repeatedly these days.  I am not sure precisely which
 upgrade the panics started after, but the server ran for almost one
 year (6.0R and 6.1R) with no problems.  A core file is
 available.

----(from here)
Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0x0
fault code              = supervisor read, page not present
instruction pointer     = 0x20:0xc069d890
stack pointer           = 0x28:0xed0ae920
frame pointer           = 0x28:0xed0ae928
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = resume, IOPL = 0
current process         = 653 (nfsd)
trap number             = 12
panic: page fault
Uptime: 46m22s
Dumping 1021 MB (2 chunks)
  chunk 0: 1MB (159 pages) ... ok
  chunk 1: 1022MB (261423 pages) 1006 990 974 958 942 926 910 894 878 862 846
830 814 798 782 766 750 734 718 702 686 670 654 638 622 606 590 574 558 542 526
510 494 478 462 446 430 414 398 382 366 350 334 318 302 286 270 254 238 222 206
190 174 158 142 126 110 94 78 62 46 30 14

#0  doadump () at pcpu.h:165
165     pcpu.h: No such file or directory.
        in pcpu.h
(kgdb) bt
#0  doadump () at pcpu.h:165
#1  0xc067c512 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:409
#2  0xc067c7d8 in panic (fmt=0xc08d0c8e "%s")
    at /usr/src/sys/kern/kern_shutdown.c:565
#3  0xc0892122 in trap_fatal (frame=0xed0ae8e0, eva=0)
    at /usr/src/sys/i386/i386/trap.c:837
#4  0xc0891866 in trap (frame=
      {tf_fs = -992346104, tf_es = 40, tf_ds = 268107816, tf_edi = 72,
      tf_esi = 0, tf_ebp = -318052056, tf_isp = -318052084,
      tf_ebx = -993986688, tf_edx = -993986688, tf_ecx = 4, tf_eax = 4,
      tf_trapno = 12, tf_err = 0, tf_eip = -1066805104, tf_cs = 32,
      tf_eflags = 589831, tf_esp = 0, tf_ss = -1063278752})
    at /usr/src/sys/i386/i386/trap.c:270
#5  0xc088012a in calltrap () at /usr/src/sys/i386/i386/exception.s:139
#6  0xc069d890 in turnstile_broadcast (ts=0x0)
    at /usr/src/sys/kern/subr_turnstile.c:726
#7  0xc06739d7 in _mtx_unlock_sleep (m=0xc09fa760, opts=0, file=0x0, line=0)
    at /usr/src/sys/kern/kern_mutex.c:690
#8  0xc077e00b in nfs_rephead (siz=0, nd=0xc5023c00, err=72, mbp=0x4,
    bposp=0x4) at /usr/src/sys/nfsserver/nfs_srvsock.c:152
#9  0xc07779f3 in nfsrv_symlink (nfsd=0xc5023c00, slp=0xc4f8ae80,
    td=0xc4c0f780, mrq=0xed0aec98) at /usr/src/sys/nfsserver/nfs_serv.c:2844
#10 0xc07819b1 in nfssvc_nfsd (td=0x4)
    at /usr/src/sys/nfsserver/nfs_syscalls.c:474
#11 0xc0781194 in nfssvc (td=0xc4c0f780, uap=0xed0aed04)
    at /usr/src/sys/nfsserver/nfs_syscalls.c:181
#12 0xc0892437 in syscall (frame=
      {tf_fs = 59, tf_es = 59, tf_ds = 59, tf_edi = 1, tf_esi = 0,
      tf_ebp = -1077941464, tf_isp = -318050972, tf_ebx = 12,
      tf_edx = 672449048, tf_ecx = 26, tf_eax = 155, tf_trapno = 12,
      tf_err = 2, tf_eip = 671863223, tf_cs = 51, tf_eflags = 662,
      tf_esp = -1077941492, tf_ss = 59})
    at /usr/src/sys/i386/i386/trap.c:983
#13 0xc088017f in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:200
#14 0x00000033 in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb)
----(to here)
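
kgdb prints the trapframe fields in frame #4 as signed 32-bit integers, so they do not look like the addresses in the trap message. A minimal Python sketch to convert them back (the helper name `to_u32` is made up for illustration, and the values are copied from the trace above):

```python
def to_u32(value: int) -> int:
    """Reinterpret a signed 32-bit integer as its unsigned bit pattern."""
    return value & 0xFFFFFFFF


# Values copied from frame #4 of the backtrace; nothing here is new data.
trapframe = {
    "tf_eip": -1066805104,
    "tf_ebp": -318052056,
}

for name, value in trapframe.items():
    # Print as 0x followed by eight hex digits, the same way the trap message does.
    print(f"{name} = {to_u32(value):#010x}")
```

This prints tf_eip = 0xc069d890 and tf_ebp = 0xed0ae928, which match the instruction pointer and frame pointer in the trap message. 0xc069d890 is also the address of frame #6, turnstile_broadcast (ts=0x0), which fits a fault virtual address of 0x0.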

--
| Hiroki SATO

Kostik Belousov

2006-Dec-04 08:10 UTC

panic in nfsd on 6.2-RC1

On Tue, Dec 05, 2006 at 12:43:23AM +0900, Hiroki Sato wrote:
> Hi,
> 
>  One of my heavily loaded NFS servers, running 6.2-RC1, has been
>  panicking repeatedly these days.  I am not sure precisely which
>  upgrade the panics started after, but the server ran for almost one
>  year (6.0R and 6.1R) with no problems.  A core file is
>  available.
> 
> ----(from here)
> Fatal trap 12: page fault while in kernel mode
> fault virtual address   = 0x0
> fault code              = supervisor read, page not present
> instruction pointer     = 0x20:0xc069d890
> stack pointer           = 0x28:0xed0ae920
> frame pointer           = 0x28:0xed0ae928
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>                         = DPL 0, pres 1, def32 1, gran 1
> processor eflags        = resume, IOPL = 0
> current process         = 653 (nfsd)
> trap number             = 12
> panic: page fault
> Uptime: 46m22s
> Dumping 1021 MB (2 chunks)
>   chunk 0: 1MB (159 pages) ... ok
>   chunk 1: 1022MB (261423 pages) 1006 990 974 958 942 926 910 894 878 862 846
> 830 814 798 782 766 750 734 718 702 686 670 654 638 622 606 590 574 558 542 526
> 510 494 478 462 446 430 414 398 382 366 350 334 318 302 286 270 254 238 222 206
> 190 174 158 142 126 110 94 78 62 46 30 14
> 
> #0  doadump () at pcpu.h:165
> 165     pcpu.h: No such file or directory.
>         in pcpu.h
> (kgdb) bt
> #0  doadump () at pcpu.h:165
> #1  0xc067c512 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:409
> #2  0xc067c7d8 in panic (fmt=0xc08d0c8e "%s")
>     at /usr/src/sys/kern/kern_shutdown.c:565
> #3  0xc0892122 in trap_fatal (frame=0xed0ae8e0, eva=0)
>     at /usr/src/sys/i386/i386/trap.c:837
> #4  0xc0891866 in trap (frame=
>       {tf_fs = -992346104, tf_es = 40, tf_ds = 268107816, tf_edi = 72,
>       tf_esi = 0, tf_ebp = -318052056, tf_isp = -318052084,
>       tf_ebx = -993986688, tf_edx = -993986688, tf_ecx = 4, tf_eax = 4,
>       tf_trapno = 12, tf_err = 0, tf_eip = -1066805104, tf_cs = 32,
>       tf_eflags = 589831, tf_esp = 0, tf_ss = -1063278752})
>     at /usr/src/sys/i386/i386/trap.c:270
> #5  0xc088012a in calltrap () at /usr/src/sys/i386/i386/exception.s:139
> #6  0xc069d890 in turnstile_broadcast (ts=0x0)
>     at /usr/src/sys/kern/subr_turnstile.c:726
> #7  0xc06739d7 in _mtx_unlock_sleep (m=0xc09fa760, opts=0, file=0x0, line=0)
>     at /usr/src/sys/kern/kern_mutex.c:690
> #8  0xc077e00b in nfs_rephead (siz=0, nd=0xc5023c00, err=72, mbp=0x4,
>     bposp=0x4) at /usr/src/sys/nfsserver/nfs_srvsock.c:152
> #9  0xc07779f3 in nfsrv_symlink (nfsd=0xc5023c00, slp=0xc4f8ae80,
>     td=0xc4c0f780, mrq=0xed0aec98) at /usr/src/sys/nfsserver/nfs_serv.c:2844
> #10 0xc07819b1 in nfssvc_nfsd (td=0x4)
>     at /usr/src/sys/nfsserver/nfs_syscalls.c:474
> #11 0xc0781194 in nfssvc (td=0xc4c0f780, uap=0xed0aed04)
>     at /usr/src/sys/nfsserver/nfs_syscalls.c:181
> #12 0xc0892437 in syscall (frame=
>       {tf_fs = 59, tf_es = 59, tf_ds = 59, tf_edi = 1, tf_esi = 0,
>       tf_ebp = -1077941464, tf_isp = -318050972, tf_ebx = 12,
>       tf_edx = 672449048, tf_ecx = 26, tf_eax = 155, tf_trapno = 12,
>       tf_err = 2, tf_eip = 671863223, tf_cs = 51, tf_eflags = 662,
>       tf_esp = -1077941492, tf_ss = 59})
>     at /usr/src/sys/i386/i386/trap.c:983
> #13 0xc088017f in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:200
> #14 0x00000033 in ?? ()
> Previous frame inner to this frame (corrupt stack?)
> (kgdb)
> ----(to here)
> 
> --
> | Hiroki SATO

What version of sys/nfsserver/nfs_serv.c do you use? If it is older than
1.156.2.7, please update the system.
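
Comparing CVS revision numbers as plain strings can mislead, since "1.156.2.10" sorts before "1.156.2.7" as text. A minimal Python sketch of a numeric comparison follows; the function names are made up for illustration, and it assumes both revisions sit on the same branch (here 1.156.2), because revisions on different branches are not ordered by their numbers alone.

```python
from typing import Tuple

FIXED_REVISION = "1.156.2.7"  # the revision named in the reply above


def parse_revision(rev: str) -> Tuple[int, ...]:
    """Split a dotted CVS revision such as '1.156.2.7' into integers."""
    return tuple(int(part) for part in rev.strip().split("."))


def is_older(rev: str, fixed: str = FIXED_REVISION) -> bool:
    """Return True if rev precedes fixed on the same branch."""
    have, want = parse_revision(rev), parse_revision(fixed)
    # Assumption: only revisions sharing a branch prefix are comparable.
    if have[:-1] != want[:-1]:
        raise ValueError(f"{rev} and {fixed} are on different branches")
    return have[-1] < want[-1]


if __name__ == "__main__":
    for rev in ("1.156.2.6", "1.156.2.7", "1.156.2.10"):
        print(rev, "needs update" if is_older(rev) else "ok")
```

The revision to feed in is the one recorded in the $FreeBSD$ tag near the top of the installed nfs_serv.c.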


Hiroki Sato

2006-Dec-04 19:48 UTC

panic in nfsd on 6.2-RC1

Kostik Belousov <kostikbel@gmail.com> wrote
  in <20061204160949.GM35681@deviant.kiev.zoral.com.ua>:

ko> What version of sys/nfsserver/nfs_serv.c do you use? If it is older than
ko> 1.156.2.7, please update the system.

 Thanks, I updated it just now and will see how it works.

--
| Hiroki SATO

Sven Willenberger

2006-Dec-15 06:58 UTC

panic in nfsd on 6.2-RC1

On Tue, 2006-12-05 at 12:38 +0900, Hiroki Sato wrote:
> Kostik Belousov <kostikbel@gmail.com> wrote
>   in <20061204160949.GM35681@deviant.kiev.zoral.com.ua>:
> 
> ko> What version of sys/nfsserver/nfs_serv.c do you use? If it is older than
> ko> 1.156.2.7, please update the system.
> 
>  Thanks, I updated it just now and will see how it works.
> 
> --
> | Hiroki SATO

I was/am having the same issue. Updating world (6.2-stable) to include
the above update sadly did not fix the problem for me. This is an amd64
box with only one client connecting to it via NFS. From further reading,
it seems this may be an issue with rpc.statd and/or rpc.lockd. As I only
have one client connecting and it is being used as mail storage (i.e. the
client pops/imaps the storage), would it be safe not to forward fcntl
locks over the wire? Is this same issue present in 6.1-RELENG? I am really
at my wits' end at this point and for the first time am actually
considering moving to another OS (Solaris more than likely), as I cannot
have these types of issues interrupting services every couple of days.

What other information (specifically) can I provide to help the devs
figure out what is going on? What can I do in the meantime to have some
semblance of stability? I assume downgrading to 5.5-RELENG is out of the
question, but perhaps disabling SMP?

Sven

Sven Willenberger

2007-Jan-15 18:56 UTC

bge panic (Re: Not panic in nfsd (Re: panic in nfsd on 6.2-RC1))

On Mon, 2007-01-15 at 13:22 -0500, Kris Kennaway wrote:
> On Mon, Jan 15, 2007 at 11:33:33AM -0500, Sven Willenberger wrote:
> 
> > > This is indicating a problem either with your bge hardware or the
> > > driver.
> > > 
> > > Kris
> > 
> > I suspect the driver: This same hardware setup was being used as a
> > database server with FreeBSD 5.4. I had been using the bge driver but
> > set at base100T without any issue at all. It was when I did a clean
> > install of 6.2-Prerelease and set bge to use the full gigE speed (via
> > autonegotiate) that these issues cropped up.
> 
> Be careful before you start blaming FreeBSD - since you did not test
> the failing hardware configuration in the older version of FreeBSD you
> cannot yet determine that it is a driver regression.
> 
> Kris

I will freely admit that this may be a coincidence, that the hardware
failed at the same time I upgraded to the newer version of FreeBSD. It
could also be that there is an issue with the bge driver being used at
1000 (gigE) speeds instead of at fastE speeds as I used it with the 5.4
release (same hardware). Unfortunately, now that the fxp connection
seems stable (for the moment), I am going to take advantage of the uptime
and will have to leave troubleshooting/debugging/etc. to what I have
provided in the other responses I have sent.
