thr3ads.net - freebsd stable - 6.3-RELEASE-p3 recurring panics on multiple SM PDSMi+ [Jul 2008]

Royce Williams

2008-Jul-22 20:05 UTC

6.3-RELEASE-p3 recurring panics on multiple SM PDSMi+

We have 10 SuperMicro PDSMi+ 5015M-MTs that are panic'ing every few
days.  This started shortly after upgrade from 6.2-RELEASE to
6.3-RELEASE with freebsd-update.

Other than switching to a debugging kernel, a little sysctl tuning,
and patching with freebsd-update, they are stock.  The debugging
kernel was built from source that is also being patched with
freebsd-update.

These systems are running postfix and Courier imapd for an ISP with a
userbase on the order of 10^4 users.  They use gmirror, but the
mailstore is over NFS.  That NFS server is under pretty high load.
All of the servers with this app and load pattern are panic'ing.

I have little experience with kernel debugging, but the box in
question is out of our farm and available for testing, and I am
motivated to cooperate. :-)

The full debugging kernel options I used are:

include SMP
options KDB
options KDB_TRACE
options DDB
options BREAK_TO_DEBUGGER
options WITNESS
options WITNESS_SKIPSPIN


db> trace
Tracing pid 71182 tid 100325 td 0xcc08b180
kdb_enter(c095f294) at kdb_enter+0x2b
panic(c09768ad,1000,14000000,c145bc88,1000,...) at panic+0x127
kmem_malloc(c14680c0,1000,102,eba6a8cc,c07e3fa5,...) at kmem_malloc+0x89
page_alloc(c1453780,1000,eba6a8bf,102,c06b8a84,...) at page_alloc+0x1a
slab_zalloc(c1453780,102,c14537e0,c1453780,c1460d5c,...) at slab_zalloc+0xa1
uma_zone_slab(c1453780,2) at uma_zone_slab+0xf0
uma_zalloc_bucket(c1453780,2) at uma_zalloc_bucket+0x11c
uma_zalloc_arg(c1453780,0,2) at uma_zalloc_arg+0x24c
cache_enter(cd02c220,c9e62880,eba6a9fc) at cache_enter+0xa6
nfs_readdirplusrpc(cd02c220,eba6aa60,cc0ab880) at nfs_readdirplusrpc+0x6a6
nfs_doio(cd02c220,dce59668,cc0ab880,cc08b180,dce59668,...) at nfs_doio+0x20f
nfs_bioread(cd02c220,eba6acb0,0,cc0ab880) at nfs_bioread+0xa64
nfs_readdir(eba6ac90) at nfs_readdir+0xe6
VOP_READDIR_APV(c09ebbc0,eba6ac90) at VOP_READDIR_APV+0x38
getdirentries(cc08b180,eba6ad04) at getdirentries+0x146
syscall(3b,3b,3b,9085f00,9085f00,...) at syscall+0x22f
Xint0x80_syscall() at Xint0x80_syscall+0x1f
--- syscall (196, FreeBSD ELF32, getdirentries), eip = 0xb825a79b, esp = 0xbfbfa1fc, ebp = 0xbfbfa228 ---



Royce

-- 
Royce D. Williams                                   - http://royce.ws/
  I don't like that man. I must get to know him better. - A. Lincoln

Kris Kennaway

2008-Jul-22 20:12 UTC

Royce Williams wrote:
> db> trace
> Tracing pid 71182 tid 100325 td 0xcc08b180
> kdb_enter(c095f294) at kdb_enter+0x2b
> panic(c09768ad,1000,14000000,c145bc88,1000,...) at panic+0x127
> kmem_malloc(c14680c0,1000,102,eba6a8cc,c07e3fa5,...) at kmem_malloc+0x89

You forgot to include the panic, but this is probably the "kmem_map too
small" panic.  It says that your kernel ran out of memory, and the 
solution is to fix that situation by giving more memory to the kernel. 
Increase the vm.kmem_size tunable until your system stops running out of 
memory on your workload.

Kris
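
For anyone reading a trace like the one quoted above: the frame listed directly below panic() is the function that called it, here kmem_malloc, which is what suggests the kernel ran out of memory. A minimal Python sketch of that reading follows. frame_names and panic_caller are hypothetical helper names, not FreeBSD tools, and the regular expression assumes only the "name(args) at name+0xoff" line shape seen in this thread.

```python
import re

# A DDB frame line starts with "function_name(" (optionally after a "> "
# quote marker). Wrapped continuation lines and the "--- syscall" trailer
# do not start that way, so they are skipped.
FRAME_RE = re.compile(r"^\s*(?:>\s*)?(\w+)\(")


def frame_names(trace_text):
    """Return function names from a pasted DDB trace, innermost frame first."""
    names = []
    for line in trace_text.splitlines():
        match = FRAME_RE.match(line)
        if match:
            names.append(match.group(1))
    return names


def panic_caller(trace_text):
    """Return the function that called panic(), or None if it is not found."""
    names = frame_names(trace_text)
    if "panic" in names:
        index = names.index("panic")
        if index + 1 < len(names):
            return names[index + 1]
    return None


# First few frames of the trace posted in this thread.
SAMPLE = """\
kdb_enter(c095f294) at kdb_enter+0x2b
panic(c09768ad,1000,14000000,c145bc88,1000,...) at panic+0x127
kmem_malloc(c14680c0,1000,102,eba6a8cc,c07e3fa5,...) at kmem_malloc+0x89
page_alloc(c1453780,1000,eba6a8bf,102,c06b8a84,...) at page_alloc+0x1a
"""

if __name__ == "__main__":
    print(panic_caller(SAMPLE))
```

This only names the caller; the panic message itself, which the trace does not include, is still needed to confirm the diagnosis.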

Clifton Royston

2008-Jul-23 05:07 UTC

On Tue, Jul 22, 2008 at 11:45:30AM -0800, Royce Williams wrote:
> We have 10 SuperMicro PDSMi+ 5015M-MTs that are panic'ing every few
> days.  This started shortly after upgrade from 6.2-RELEASE to
> 6.3-RELEASE with freebsd-update.

  I was having similar problems on some servers using 6.2-psomething,
which also use an NFS server heavily (for shared configuration files),
until I started using the same vm.kmem_size tunable Kris is
recommending.

  That seemed to solve the problem for me.  (I just slapped it up to
512M rather than try to binary search for some optimum value.)

  -- Clifton

-- 
    Clifton Royston  --  cliftonr@iandicomputing.com / cliftonr@lava.net
       President  - I and I Computing * http://www.iandicomputing.com/
 Custom programming, network design, systems and network consulting services
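
As a sketch of the setting discussed in this thread, assuming the tunable is set at boot time in /boot/loader.conf (the usual place for loader tunables) and reusing the 512M figure mentioned above rather than a tuned value:

```
# /boot/loader.conf
# Enlarge the kernel memory map. 512M is the value reported to work
# in this thread; it is not a measured optimum for any given machine.
vm.kmem_size="512M"
```

The tunable is read at boot, so the change takes effect only after a reboot. A workable value may depend on installed RAM and, on i386, on how much kernel address space is available.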

Jeremy Chadwick

2008-Jul-23 05:34 UTC

On Tue, Jul 22, 2008 at 11:45:30AM -0800, Royce Williams wrote:
> We have 10 SuperMicro PDSMi+ 5015M-MTs that are panic'ing every few
> days.  This started shortly after upgrade from 6.2-RELEASE to
> 6.3-RELEASE with freebsd-update.

We use the same hardware (board and chassis), and have no such problems
running both RELENG_6 and RELENG_7.

I don't think your issue is specific to the board or chassis.  Kris's
explanation makes a lot more sense.  :-)

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |
