Hi,
With INVARIANTS and WITNESS enabled, when I press ^C
to exit dd, it panics immediately. Some ddb and kgdb
messages are below (I have KDB_TRACE and KDB_UNATTENDED set).
A core file is available. Any help is appreciated :-)
UPDATE: sometimes I can't ^C or kill -9 the dd process even
with mpsafenet=0. In that situation, I get a panic with a trace
similar to the one below, which is from mpsafenet=1.
panic: VOP_STRATEGY failed bp=0xd835acd8 vp=0xc4a1baa0
cpuid = 1
KDB: stack backtrace:
kdb_backtrace(1,c05056b4,1,e7f1b7d0,1) at kdb_backtrace+0x2e
panic(c061782c,d835acd8,c4a1baa0,c4a1baa0,4) at panic+0x12b
bufstrategy(c4a1bb60,d835acd8,e7f1b80c,c471ee63,d835acd8) at bufstrategy+0x7d
bstrategy(d835acd8,c060be84,23c,a00200a6,0) at bstrategy+0x60
nfs_writebp(d835acd8,1,c4369000,e7f1b82c,c471eb73) at nfs_writebp+0xf3
nfs_bwrite(d835acd8,e7f1b904,c471e92b,d835acd8,1dd88000) at nfs_bwrite+0x13
bwrite(d835acd8,1dd88000,0,1dd86000,0) at bwrite+0x5b
nfs_flush(c4a1baa0,1,c4369000,1,e7f1b92c) at nfs_flush+0x78b
nfs_fsync(e7f1b93c) at nfs_fsync+0x1c
VOP_FSYNC_APV(c4735fc0,e7f1b93c) at VOP_FSYNC_APV+0x99
VOP_FSYNC(c4a1baa0,1,c4369000) at VOP_FSYNC+0x2e
bufsync(c4a1bb60,1,c4369000) at bufsync+0x14
bufobj_invalbuf(c4a1bb60,1,c4369000,100,0) at bufobj_invalbuf+0xda
vinvalbuf(c4a1baa0,1,c4369000,100,0) at vinvalbuf+0x1d
nfs_vinvalbuf(c4a1baa0,1,c4369000,1,c04d5738) at nfs_vinvalbuf+0xda
nfs_write(e7f1bbc8) at nfs_write+0x16f
VOP_WRITE_APV(c4735fc0,e7f1bbc8) at VOP_WRITE_APV+0x11e
VOP_WRITE(c4a1baa0,e7f1bcb0,7f0001,c49f5180) at VOP_WRITE+0x34
vn_write(c46d6ca8,e7f1bcb0,c49f5180,0,c4369000) at vn_write+0x1ad
fo_write(c46d6ca8,e7f1bcb0,c49f5180,0,c4369000) at fo_write+0x1d
dofilewrite(c4369000,4,c46d6ca8,e7f1bcb0,ffffffff,ffffffff,0) at dofilewrite+0x8e
kern_writev(c4369000,4,e7f1bcb0) at kern_writev+0x41
write(c4369000,e7f1bcf0) at write+0x58
syscall(3b,3b,3b,8076000,100000) at syscall+0x2cf
Xint0x80_syscall() at Xint0x80_syscall+0x1f
--- syscall (4, FreeBSD ELF32, write), eip = 0x880b9813, esp 0xbfbfeaac, ebp = 0xbfbfead8 ---
Uptime: 4m18s
Dumping 3062 MB (2 chunks)
[...]
(kgdb) bt full
#0 0xc04a8181 in doadump () at /usr/src/sys/kern/kern_shutdown.c:233
No locals.
#1 0xc04a8841 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:399
first_buf_printf = 1
#2 0xc04a8bf9 in panic (fmt=0xc061782c "VOP_STRATEGY failed bp=%p vp=%p")
at /usr/src/sys/kern/kern_shutdown.c:555
td = (struct thread *) 0xc4369000
bootopt = 260
newpanic = 1
ap = 0xe7f1b7d0 "?5???????\004"
buf = "VOP_STRATEGY failed bp=0xd835acd8 vp=0xc4a1baa0", '\0' <repeats 208 times>
#3 0xc0505689 in bufstrategy (bo=0xc4a1bb60, bp=0xd835acd8)
at /usr/src/sys/kern/vfs_bio.c:3690
i = 4
vp = (struct vnode *) 0xc4a1baa0
#4 0xc471ef28 in ?? ()
No symbol table info available.
#5 0xc4a1bb60 in ?? ()
No symbol table info available.
#6 0xd835acd8 in ?? ()
No symbol table info available.
#7 0xe7f1b80c in ?? ()
No symbol table info available.
#8 0xc471ee63 in ?? ()
No symbol table info available.
#9 0xd835acd8 in ?? ()
No symbol table info available.
#10 0xc060be84 in __func__.2 ()
No symbol table info available.
#11 0x0000023c in ?? ()
No symbol table info available.
#12 0xa00200a6 in ?? ()
No symbol table info available.
#13 0x00000000 in ?? ()
No symbol table info available.
#14 0xe7f1b820 in ?? ()
No symbol table info available.
#15 0xc471f2a3 in ?? ()
No symbol table info available.
#16 0xd835acd8 in ?? ()
No symbol table info available.
#17 0x00000001 in ?? ()
No symbol table info available.
#18 0xc4369000 in ?? ()
No symbol table info available.
#19 0xe7f1b82c in ?? ()
No symbol table info available.
#20 0xc471eb73 in ?? ()
No symbol table info available.
#21 0xd835acd8 in ?? ()
No symbol table info available.
#22 0xe7f1b904 in ?? ()
No symbol table info available.
#23 0xc471e92b in ?? ()
No symbol table info available.
#24 0xd835acd8 in ?? ()
No symbol table info available.
#25 0x1dd88000 in ?? ()
No symbol table info available.
#26 0x00000000 in ?? ()
No symbol table info available.
#27 0x1dd86000 in ?? ()
No symbol table info available.
#28 0x00000000 in ?? ()
No symbol table info available.
#29 0xe7f1b858 in ?? ()
No symbol table info available.
#30 0xc049ee97 in _mtx_assert (m=0xd835acd8, what=-1067401596,
file=0x23c <Address 0x23c out of bounds>, line=-1610481498)
at /usr/src/sys/kern/kern_mutex.c:754
No locals.
Previous frame inner to this frame (corrupt stack?)
(kgdb) l *0xc0505689
0xc0505689 is in bufstrategy (/usr/src/sys/kern/vfs_bio.c:3691).
3686 KASSERT(vp == bo->bo_private, ("Inconsistent vnode bufstrategy"));
3687 KASSERT(vp->v_type != VCHR && vp->v_type != VBLK,
3688 ("Wrong vnode in bufstrategy(bp=%p, vp=%p)", bp, vp));
3689 i = VOP_STRATEGY(vp, bp);
3690 KASSERT(i == 0, ("VOP_STRATEGY failed bp=%p vp=%p", bp, bp->b_vp));
3691 }
3692
3693 void
3694 bufobj_wrefl(struct bufobj *bo)
3695 {
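
For reference, a minimal userland sketch of why this panic only shows up on an INVARIANTS kernel: KASSERT is compiled in only with INVARIANTS, so the nonzero return from VOP_STRATEGY (frame #3 shows i = 4, which is EINTR on FreeBSD and would fit a ^C on an intr mount) is fatal there and simply ignored otherwise. The names sim_panic, sim_kassert and sim_bufstrategy are hypothetical stand-ins, not kernel code, and the real KASSERT is a compile-time macro rather than a runtime flag.

```c
#include <stdio.h>

/* Hypothetical userland stand-ins; none of this is the kernel's code. */
static int panic_count;		/* number of simulated panics so far */

static void
sim_panic(const char *msg)
{
	panic_count++;
	fprintf(stderr, "panic: %s\n", msg);
}

/*
 * Models KASSERT: when invariants checking is on, a false condition
 * "panics"; when it is off, the condition is not checked at all.
 */
static void
sim_kassert(int invariants, int cond, const char *msg)
{
	if (invariants && !cond)
		sim_panic(msg);
}

/*
 * Models the tail of bufstrategy(): the strategy routine's return
 * value is only examined by the assertion.
 */
static void
sim_bufstrategy(int invariants, int strategy_error)
{
	sim_kassert(invariants, strategy_error == 0, "VOP_STRATEGY failed");
}
```

Calling sim_bufstrategy(1, 4) records one simulated panic, while sim_bufstrategy(0, 4) drops the error silently, which would match seeing a hang rather than a panic on a kernel without INVARIANTS.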
On 3/10/06, Rong-En Fan <grafan@gmail.com> wrote:
> Hi,
>
> I forgot to mention that all the clients/servers here run SMP kernels.
> After some Googling, I found a post on current@ from 2005/01/12,
> "NFS problems, locking up", that is highly related to my situation.
> A workaround is to set debug.mpsafenet=0; I just verified that this
> indeed works.
>
> Now I'm turning on INVARIANTS and WITNESS to see if there
> is any output. However, I'm afraid that I cannot get
> serial console access to these machines (and thus no ddb
> output :( ).
>
> Thanks,
> Rong-En Fan
>
> On 3/10/06, Rong-En Fan <grafan@gmail.com> wrote:
> > Hi,
> >
> > After upgrading several of our NFS clients from 5.4-RELEASE to 6.0-RELEASE
> > (some are now at 6.1-PRERELEASE as of a week ago), from time to time
> > we see some processes stuck in nfsaio and unkillable. These processes
> > generate lots of traffic to the NFS server (writes to NFS, but the server's
> > disk is not really writing: from netstat, the client sends ~100Mbps, but on
> > the server, iostat does not show ~12.5MB/s). The nfsd on the server side is
> > either in RUN or in ufs state. The server is running 5.5-PRERELEASE as of
> > yesterday.
> > Client mount options: rw,nosuid,bg,intr,nodev. Both client and server
> > are running rpc.lockd and rpc.statd. I'm sure it's not related to any
> > locking problems.
> >
> > I have another NFS server/client pair, both running 6.0-RELEASE, and I can
> > easily reproduce this situation on these two boxes just by running
> >
> > dd if=/dev/zero of=/nfs/ooo bs=1m
> >
> > If I do not add bs=1m, it works fine. On all the boxes I mentioned above,
> > I did not do anything special to the kernel config, i.e., they are GENERIC
> > without unnecessary devices and with a firewall. Basically, I can do
> > anything on these two boxes (they are not in production). Any suggestions
> > are welcome.
> >
> > Thanks,
> > Rong-En Fan
> >
>