thr3ads.net - freebsd stable - panic: sbflush_locked on 5.4-p5/i386 [Jul 2005]

Rong-En Fan

2005-Jul-24 17:01 UTC

panic: sbflush_locked on 5.4-p5/i386

hello,

I have a 5.4-p5 running on i386. Got a panic:
panic: sbflush_locked: cc 0 || mb 0xc33bf000 || mbcnt 4294967040
It is a web server running Apache and Postfix as a backup MX.
I'm using gmirror on all partitions and thus cannot get a dump (swap
is on gmirror). Some ddb outputs are below.
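
The mbcnt value in the panic string is easier to read as a signed number. The following is a minimal sketch of that arithmetic and is not part of the original report. It assumes the counter is a 32-bit unsigned value (the panic format string prints it with %u) and that MSIZE, the size of one mbuf, is 256 bytes on this kernel. Under those assumptions the socket buffer's mbuf accounting appears to have gone negative by one mbuf while cc was 0 and an mbuf was still queued.

```python
# Assumption: mbcnt is a 32-bit unsigned counter and MSIZE is 256 bytes.
# Neither value is stated in the report; both are illustrative.
U32_MODULUS = 1 << 32
MSIZE = 256  # assumed size of one mbuf in bytes


def as_signed32(value):
    """Reinterpret an unsigned 32-bit value as a signed two's-complement int."""
    value &= U32_MODULUS - 1
    return value - U32_MODULUS if value >= (1 << 31) else value


mbcnt = 4294967040                 # taken from the panic message
signed_mbcnt = as_signed32(mbcnt)

print(hex(mbcnt))                  # 0xffffff00, the same value ddb shows in the panic() arguments
print(signed_mbcnt)                # -256
print(signed_mbcnt == -MSIZE)      # True under the MSIZE assumption above
```

A counter that underflows like this would be consistent with the race suspected later in the thread, but the arithmetic alone does not prove it.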

Google told me that
http://lists.freebsd.org/pipermail/freebsd-current/2004-December/044535.html
looks related. But the code path is different. Note that the patch in that mail
is already in 5.4.

If needed, I can provide the kernel config. I also tuned the following sysctls:
vfs.hirunningspace=2097152
kern.ipc.somaxconn=4096
kern.maxfiles=30000
kern.maxfilesperproc=30000
net.inet.ip.random_id=1
machdep.hyperthreading_allowed=1

The DDB messages go here:
cpuid = 3
KDB: enter: panic
[thread pid 61 tid 100061 ]
Stopped at      kdb_enter+0x2b: nop
db> wh
Tracing pid 61 tid 100061 td 0xc311e180
kdb_enter(c05f3bc6) at kdb_enter+0x2b
panic(c05f6f09,0,c33bf000,ffffff00,c3a1970c) at panic+0x127
sbflush_locked(c3a1970c,c3a19654,e74aeba4,c04e4cb4,c3a1970c) at sbflush_locked+0x6f
sbrelease_locked(c3a1970c,c3a19654) at sbrelease_locked+0xd
sofree(c3a19654) at sofree+0x26c
in_pcbdetach(c371d870,c3e996f0,c3e996f0,e74aec9c,c05355df) at in_pcbdetach+0xb6
tcp_close(c3e996f0,1,1,1042e,1) at tcp_close+0x16
tcp_input(c4513400,14,1c1e708c,0,0) at tcp_input+0x2297
ip_input(c4513400) at ip_input+0x4f1
netisr_processqueue(c0643298) at netisr_processqueue+0xa3
swi_net(0) at swi_net+0xf2
ithread_loop(c3094c80,e74aed48) at ithread_loop+0x159
fork_exit(c049c138,c3094c80,e74aed48) at fork_exit+0x75
fork_trampoline() at fork_trampoline+0x8
--- trap 0x1, eip = 0, esp = 0xe74aed7c, ebp = 0 ---
db> ps
   61 c311ce20    0     0     0 0000204 [CPU 3] swi1: net

Regards,
Rong-En Fan

Alexander S. Usov

2005-Jul-24 18:22 UTC

panic: sbflush_locked on 5.4-p5/i386

Rong-En Fan wrote:
> hello,
> 
> I have a 5.4-p5 running on i386. Got a panic:
> panic: sbflush_locked: cc 0 || mb 0xc33bf000 || mbcnt 4294967040
> It is an web server running Apache and Postfix as a backup MX.
> I'm using gmirror on all partitions and thus cannot get a dump (swap
> is on gmirror). Some ddb outputs are below.
I got a few similar panics.
It looks like I managed to get rid of them by setting mpsafenet=0, but I am
not sure -- I have to monitor it for a bit longer.
I have managed to get a few dumps, so the traces are:

========================== N 1 ========================
#0  doadump () at pcpu.h:159
#1  0xc0513885 in boot (howto=260) at ../../../kern/kern_shutdown.c:410
#2  0xc0513eca in panic (fmt=0xc06ac866 "sbflush_locked: cc %u || mb %p || mbcnt %u")
    at ../../../kern/kern_shutdown.c:566
#3  0xc05559a6 in sbflush_locked (sb=0xc28400b8)
    at ../../../kern/uipc_socket2.c:1119
#4  0xc05559ce in sbrelease_locked (sb=0xc28400b8, so=0x0)
    at ../../../kern/uipc_socket2.c:564
#5  0xc05525eb in sofree (so=0xc2840000) at ../../../kern/uipc_socket.c:405
#6  0xc05a56e1 in in_pcbdetach (inp=0xc2312654)
    at ../../../netinet/in_pcb.c:719
#7  0xc05b6284 in tcp_close (tp=0x0) at ../../../netinet/tcp_subr.c:783
#8  0xc05b2c13 in tcp_input (m=0xc1cff600, off0=-1625741474)
    at ../../../netinet/tcp_input.c:2286
#9  0xc05a9aff in ip_input (m=0xc1cff600) at ../../../netinet/ip_input.c:776
#10 0xc059214a in netisr_processqueue (ni=0xc070b0d8)
    at ../../../net/netisr.c:233
#11 0xc0592409 in swi_net (dummy=0x0) at ../../../net/netisr.c:346
#12 0xc04fb98d in ithread_loop (arg=0xc1979500)
    at ../../../kern/kern_intr.c:547
#13 0xc04fa9c8 in fork_exit (callout=0xc04fb8d6 <ithread_loop>, arg=0x0, frame=0x0)
    at ../../../kern/kern_fork.c:791
#14 0xc0656a7c in fork_trampoline () at ../../../i386/i386/exception.s:209
============================================================
and

======================== N 2 ===============================
#0  doadump () at pcpu.h:159
#1  0xc0513885 in boot (howto=260) at ../../../kern/kern_shutdown.c:410
#2  0xc0513eca in panic (fmt=0xc06989e7 "%s")
    at ../../../kern/kern_shutdown.c:566
#3  0xc0667756 in trap_fatal (frame=0xe686fa60, eva=12)
    at ../../../i386/i386/trap.c:817
#4  0xc06679e4 in trap_pfault (frame=0xe686fa60, usermode=0, eva=12)
    at ../../../i386/i386/trap.c:735
#5  0xc0667db3 in trap (frame={tf_fs = -427425768, tf_es = -1067253744,
    tf_ds = -1044447216, tf_edi = 16, tf_esi = 0, tf_ebp = -427361608,
    tf_isp = -427361652, tf_ebx = 40, tf_edx = -1044393868, tf_ecx = 0,
    tf_eax = 0, tf_trapno = 12, tf_err = 0, tf_eip = -1068176275, tf_cs = 8,
    tf_eflags = 66050, tf_esp = -1044409808, tf_ss = -1044393868})
    at ../../../i386/i386/trap.c:425
#6  0xc0656a1a in calltrap () at ../../../i386/i386/exception.s:140
#7  0xe6860018 in ?? ()
#8  0xc0630010 in zone_timeout (zone=0xc1bf9200)
    at ../../../vm/uma_core.c:418
#9  0xc05b44ad in tcp_output (tp=0xc23b6534)
    at ../../../netinet/tcp_output.c:811
#10 0xc05bc5ab in tcp_usr_send (so=0x0, flags=0, m=0xc1bf9200, nam=0x0,
    control=0x0, td=0xc1e33a80)
    at ../../../netinet/tcp_usrreq.c:699
#11 0xc0550fb4 in sosend (so=0xc228d8dc, addr=0x0, uio=0xe686fc80,
    top=0xc1bf9200, control=0x0, flags=0, td=0xc1e33a80)
    at ../../../kern/uipc_socket.c:835
#12 0xc053ed99 in soo_write (fp=0x0, uio=0xe686fc80, active_cred=0xc1fd9980,
    flags=0, td=0xc1e33a80)
    at ../../../kern/sys_socket.c:118
#13 0xc0537c15 in dofilewrite (td=0xc1e33a80, fp=0xc1fd7110, fd=0, buf=0x0,
    nbyte=56, offset=Unhandled dwarf expression opcode 0x93
)
    at file.h:245
#14 0xc0537ea8 in write (td=0xc1e33a80, uap=0xe686fd14)
    at ../../../kern/sys_generic.c:282
#15 0xc06681fa in syscall (frame={tf_fs = -1078001617, tf_es = 47,
    tf_ds = -1078001617, tf_edi = 138645504, tf_esi = 56, tf_ebp = -1077957448,
    tf_isp = -427360908, tf_ebx = 675435700, tf_edx = 0, tf_ecx = 0, tf_eax = 4,
    tf_trapno = 22, tf_err = 2, tf_eip = 675424571, tf_cs = 31, tf_eflags = 646,
    tf_esp = -1077957476, tf_ss = 47})
    at ../../../i386/i386/trap.c:1009
#16 0xc0656a6f in Xint0x80_syscall () at ../../../i386/i386/exception.s:201
#17 0xbfbf002f in ?? ()
#18 0x0000002f in ?? ()
#19 0xbfbf002f in ?? ()
#20 0x08439000 in ?? ()
#21 0x00000038 in ?? ()
 ........... a bunch more of these ................
============================================================
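
gdb prints the trap frame registers in trace N 2 as signed decimal integers, which hides the fact that several of them are kernel addresses. The sketch below is an illustration added here, not something from the original mails. It converts a few of those fields to 32-bit hex so they can be compared with the addresses in the backtrace. The helper name to_hex32 and the choice of fields are arbitrary.

```python
# Illustrative helper: show signed 32-bit register values from the gdb
# trap frame as unsigned hex, the way addresses appear in the backtrace.
U32_MASK = 0xFFFFFFFF


def to_hex32(signed_value):
    """Format a (possibly negative) 32-bit register value as 0x%08x."""
    return f"0x{signed_value & U32_MASK:08x}"


# Values copied from frame #5 of trace N 2.
trap_frame = {
    "tf_eip": -1068176275,
    "tf_esp": -1044409808,
    "tf_ebp": -427361608,
    "tf_trapno": 12,
    "tf_err": 0,
}

for name, value in trap_frame.items():
    print(f"{name:10s} {to_hex32(value)}")

# tf_eip comes out as 0xc054ec6d, an address in kernel text.
```

With tf_trapno = 12 (the page-fault trap number on i386) and eva=12 in the trap_pfault() frame, the faulting address is 0xc. That would be consistent with a read through a NULL pointer at a small structure offset, though the trace alone does not say which pointer.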


-- 
Best regards,
  Alexander.

Robert Watson

2005-Jul-28 23:18 UTC

panic: sbflush_locked on 5.4-p5/i386

On Mon, 25 Jul 2005, Rong-En Fan wrote:
> I have a 5.4-p5 running on i386. Got a panic: panic: sbflush_locked: cc 
> 0 || mb 0xc33bf000 || mbcnt 4294967040 It is an web server running 
> Apache and Postfix as a backup MX. I'm using gmirror on all partitions 
> and thus cannot get a dump (swap is on gmirror). Some ddb outputs are 
> below.
Is this system an SMP and/or HTT system?

If this problem is reproducible, could I ask you to capture the following 
serial console output from DDB:

show pcpu
show pcpu 0
show pcpu 1
show pcpu 2
show pcpu 3                 # continue until out of CPUs
ps

And then traces of interesting threads -- in particular, threads mentioned 
in the pcpu output, the current thread, and threads of network-related 
processors (most importantly, the netisr thread, but also other active 
threads -- i.e., without a wchan listed).

This sounds like a race between two threads in the TCP code, but to 
diagnose it further, I'll need to know what else is running.  If you have 
access to serial gdb, I'd be quite interested in seeing the output of 
"l *so" in the sofree() frame, *tp in a tcp-related frame, and *inp if it's 
available in one of those frames, likely the in_pcbdetach() frame or 
tcp_close() frame if it's there.

Would it be possible to add an extra ATA disk to use for swap and 
capturing a core dump?

Robert N M Watson

>
> Google told me that
> http://lists.freebsd.org/pipermail/freebsd-current/2004-December/044535.html
> looks related. But the code path is different. Note that the patch in 
> that mail is already in 5.4.
>
> If needed, I can provide kernel conf. I also tuned following sysctls:
> vfs.hirunningspace=2097152
> kern.ipc.somaxconn=4096
> kern.maxfiles=30000
> kern.maxfilesperproc=30000
> net.inet.ip.random_id=1
> machdep.hyperthreading_allowed=1
>
> The DDB messages go here:
> cpuid = 3
> KDB: enter: panic
> [thread pid 61 tid 100061 ]
> Stopped at      kdb_enter+0x2b: nop
> db> wh
> Tracing pid 61 tid 100061 td 0xc311e180
> kdb_enter(c05f3bc6) at kdb_enter+0x2b
> panic(c05f6f09,0,c33bf000,ffffff00,c3a1970c) at panic+0x127
> sbflush_locked(c3a1970c,c3a19654,e74aeba4,c04e4cb4,c3a1970c) at sbflush_locked+0x6f
> sbrelease_locked(c3a1970c,c3a19654) at sbrelease_locked+0xd
> sofree(c3a19654) at sofree+0x26c
> in_pcbdetach(c371d870,c3e996f0,c3e996f0,e74aec9c,c05355df) at in_pcbdetach+0xb6
> tcp_close(c3e996f0,1,1,1042e,1) at tcp_close+0x16
> tcp_input(c4513400,14,1c1e708c,0,0) at tcp_input+0x2297
> ip_input(c4513400) at ip_input+0x4f1
> netisr_processqueue(c0643298) at netisr_processqueue+0xa3
> swi_net(0) at swi_net+0xf2
> ithread_loop(c3094c80,e74aed48) at ithread_loop+0x159
> fork_exit(c049c138,c3094c80,e74aed48) at fork_exit+0x75
> fork_trampoline() at fork_trampoline+0x8
> --- trap 0x1, eip = 0, esp = 0xe74aed7c, ebp = 0 ---
> db> ps
>   61 c311ce20    0     0     0 0000204 [CPU 3] swi1: net
>
> Regards,
> Rong-En Fan

Rong-En Fan

2005-Aug-08 15:51 UTC

panic: sbflush_locked on 5.4-p5/i386

Hi

After upgrading to 5-STABLE (as of about Aug 6), it works very well.
With mpsafenet=1, it has run for more than one day without a
panic. On 5.4-p5, it would panic within half a day or so. This bug seems
to have been fixed after 5.4 was released. I'll keep watching this machine
and will let you know if it still has similar panics ;-)

Regards,
Rong-En Fan

On 7/29/05, Robert Watson <rwatson@freebsd.org> wrote:
> On Mon, 25 Jul 2005, Rong-En Fan wrote:
> > I have a 5.4-p5 running on i386. Got a panic: panic: sbflush_locked: cc
> > 0 || mb 0xc33bf000 || mbcnt 4294967040 It is an web server running
> > Apache and Postfix as a backup MX. I'm using gmirror on all partitions
> > and thus cannot get a dump (swap is on gmirror). Some ddb outputs are
> > below.
>
