It seems that this mail was sent encoded... Here I try once more,
verified as 'plain text'.
**************************************************************************
Hello,
I finally managed to upgrade and test my server last weekend.
There is good news: so far kernel r261208 (FreeBSD 9.2-STABLE) runs without
problems.
I could not apply the patch you supplied, but I saw that the code was modified
nonetheless and I gave it a try :-)
It seems that the problem has been solved.
Thank you very much! :-)
With kind regards
Matthias Schündehütte
Siemens AG
Industry Sector
Drive Technologies Division
I DT IT LD BLN
Nonnendammallee 72
13629 Berlin, Deutschland
Tel: +49 30 386-29957
Mobile: +49 170 8162912
mailto:matthias.schuendehuette at siemens.com
> -----Original Message-----
> From: Rick Macklem [mailto:rmacklem at uoguelph.ca]
> Sent: Sunday, 19 January 2014 03:19
> To: Schuendehuette, Matthias
> Cc: Konstantin Belousov
> Subject: Re: Stack overflow with kernel r254683
>
> I just found a bug that causes a stack overflow in the file handle
> affinity code done by ken at . It occurs for an NFSv2 client mounting
> a server, where sizeof(fhandle_t) < 32.
>
> I've attached the patch that fixes this, in case you can test it?
>
> Since your stack trace looks completely different, I won't guess if
> this was the bug, but this bug definitely trashed the stack.
>
> rick
>
> ----- Original Message -----
> > On Mon, Aug 26, 2013 at 07:11:48PM -0400, Rick Macklem wrote:
> > > Matthias Schuendehuette wrote:
> > > > Hello,
> > > >
> > > > yesterday I got a kernel crash on my server (a ProLiant DL380 G5):
> > > >
> > > > "panic: stack overflow detected; backtrace may be corrupted"
> > > >
> > > > Kernel is "9.2-PRERELEASE FreeBSD 9.2-PRERELEASE #7 r254683"
> > > >
> > > >
> > > > The stack trace reads:
> > > >
> > > > #0 doadump (textdump=1) at pcpu.h:249
> > > > 249 pcpu.h: No such file or directory.
> > > > in pcpu.h
> > > > (kgdb) #0 doadump (textdump=1) at pcpu.h:249
> > > > #1 0xc0668a4d in kern_reboot (howto=260)
> > > > at /usr/src/sys/kern/kern_shutdown.c:449
> > > > #2 0xc0668f07 in panic (fmt=0x104 <Address 0x104 out of bounds>)
> > > > at /usr/src/sys/kern/kern_shutdown.c:637
> > > > #3 0xc0691da2 in __stack_chk_fail ()
> > > > at /usr/src/sys/kern/stack_protector.c:17
> > > > #4 0xc7fdb175 in nfsrvd_setattr (nd=0xc73b4400, isdgram=-952596480,
> > > > vp=0xc8001140, p=0xf405ecc8, exp=0xc07af7f0)
> > > > at /usr/src/sys/modules/nfsd/../../fs/nfsserver/nfs_nfsdserv.c:371
> > > > #5 0xc7fdb6e0 in nfsrvd_releaselckown (nd=0xc7442a00, isdgram=-952596480,
> > > > vp=0xc7388848, p=0xf405ecb8, exp=0x0)
> > > > at /usr/src/sys/modules/nfsd/../../fs/nfsserver/nfs_nfsdserv.c:3481
> > > > #6 0xc07af7f0 in svc_run_internal (pool=0xc7de8b80, ismaster=0)
> > > > at /usr/src/sys/rpc/svc.c:1109
> > > > #7 0xc07b006d in svc_thread_start (arg=0xc7de8b80)
> > > > at /usr/src/sys/rpc/svc.c:1200
> > > > #8 0xc06384f7 in fork_exit (callout=0xc07b0060 <svc_thread_start>,
> > > > arg=0xc7de8b80, frame=0xf405ed08)
> > > > at /usr/src/sys/kern/kern_fork.c:992
> > > > #9 0xc08787c4 in fork_trampoline ()
> > > > at /usr/src/sys/i386/i386/exception.s:279
> > > >
> > > Well, when I've looked on i386, the nfsd threads normally don't use
> > > 1 page and the stacks are 2 pages, so I doubt an nfsd thread is
> > > blowing the stack.
> > It is overflowing the frame, not the whole stack. In other words,
> > something overwrote the canary which was put on the stack between
> > local variables and the return address, possibly corrupting the
> > return address as well.
> >
> > > Also, nfsrvd_releaselckown() doesn't call nfsrvd_setattr(), so the
> > > backtrace doesn't make much sense.
> > Yes, this might be one of the consequences of the stack smashing.
> >
> > >
> > > Afraid I can't help more than this. Good luck with it, rick
> > >
> > > >
> > > > I have all the files in /var/crash, so if someone wants additional
> > > > information, I should be able to deliver it.
> > > >
> > > > The kernel config file is customized in the sense that I have
> > > > removed kernel items that aren't used on that machine.
> > > >
> > > > One major difference: I use
> > > >
> > > > < options NFSCLIENT # Network Filesystem Client
> > > > < options NFSSERVER # Network Filesystem Server
> > > >
> > > > instead of
> > > >
> > > > > options NFSCL # New Network Filesystem Client
> > > > > options NFSD # New Network Filesystem Server
> > > >
> > > > because a kernel a few weeks ago immediately crashed with the new
> > > > NFS code.
> > > >
> > > > But it now seems that the old NFS code is also somehow damaged.
> > > >
> > > > Ah, and I still have the following loader options from older
> > > > releases of FreeBSD - do they still make sense?
> > > >
> > > > geom_vinum_load="YES"
> > > > kern.maxdsiz="734003200"
> > > > vm.pmap.shpgperproc=256
> > > > vm.pmap.pv_entry_max=3145728
> > > >
> > > >
> > > > 'geom_vinum' is used as LVM only, no RAIDs are configured.
> > > >
> > > > This server is primarily a Samba server with the SMB-shares
> > > > exported as NFS-shares as well for the other *nix-servers around.
> > > >
> > > > Because this is the most loaded production server, testing is a
> > > > bit difficult, restricted to the evening and the weekends.
> > > >
> > > > On my two other FreeBSD machines I have no problems at all, one
> > > > of them is an identical ProLiant server with a nearly identical
> > > > kernel config - runs like a charm...
> > > >
> > > > Does anyone have good advice or further questions?
> > > >
> > > >
> > > >
> > > > with best regards
> > > > Matthias Schuendehuette
> > > >