Hi Robert and others,

I just upgraded from 5.5 (stable, btw.) to 6.1, and after 10 hours I got a
nice panic. Does this look like some tty problem?

This is the machine that had so many problems with PREEMPTION enabled in
earlier releases of 5.x. Is it possible that I'm hitting the same bugs
again, or is it clearly a tty-related problem?

kgdb /var/core/kernel.debug /var/core/vmcore.6

#0  0xc0663002 in doadump ()
#1  0xc066355e in boot ()
#2  0xc06638b5 in panic ()
#3  0xc085c6b6 in trap_fatal ()
#4  0xc085c3bf in trap_pfault ()
#5  0xc085bfb5 in trap ()
#6  0xc0848bea in calltrap ()
#7  0xc0693b51 in ttymodem ()
#8  0xc0698362 in ptcclose ()
#9  0xc0638a6f in giant_close ()
#10 0xc06162bf in devfs_close ()
#11 0xc086dc1c in VOP_CLOSE_APV ()
#12 0xc06c87e2 in vn_close ()
#13 0xc06c974a in vn_closefile ()
#14 0xc06162e7 in devfs_close_f ()
#15 0xc0642cdc in fdrop_locked ()
#16 0xc0642c29 in fdrop ()
#17 0xc06411c7 in closef ()
#18 0xc063e329 in close ()
#19 0xc085c9f7 in syscall ()
#20 0xc0848c3f in Xint0x80_syscall ()
#21 0x00000033 in ?? ()
Previous frame inner to this frame (corrupt stack?)

Unfortunately, this is all I get with the debug kernel. Does one have to
boot with kernel.debug itself to get a usable trace?

kgdb /var/core/kernel.debug /var/core/vmcore.6

kgdb: kvm_read: invalid address (0x21)
[same message repeated many more times]

Martin Blapp, <mb@imp.ch> <mbr@FreeBSD.org>
------------------------------------------------------------------
ImproWare AG, UNIXSP & ISP, Zurlindenstrasse 29, 4133 Pratteln, CH
Phone: +41 61 826 93 00 Fax: +41 61 826 93 01
PGP: <finger -l mbr@freebsd.org>
PGP Fingerprint: B434 53FC C87C FE7B 0A18 B84C 8686 EF22 D300 551E
------------------------------------------------------------------

On Wed, 21 Jun 2006, Martin Blapp wrote:

> I just upgraded from 5.5 (stable btw.) to 6.1 and after 10 hours I got a
> nice panic. Does this look like some tty problem ?

It looks like a tty or devfs problem.

> This is the machine which made that many problems with PREEMPTION enabled in
> earlier releases of 5.x. Is it possible that I'm hitting now again the same
> bugs or is it clearly a tty related problem ?

I'm not sure there's evidence it's caused by preemption, but it's not
impossible that preemption makes it more likely to happen, or facilitates it
happening on non-SMP systems. Wojciech Koszek has recently been looking into
devfs-related races that trigger for ptys, which could be relevant to what
you're seeing, so I've CC'd him.

Robert N M Watson
Computer Laboratory
University of Cambridge

> kgdb /var/core/kernel.debug /var/core/vmcore.6
>
> #0  0xc0663002 in doadump ()
> #1  0xc066355e in boot ()
> #2  0xc06638b5 in panic ()
> #3  0xc085c6b6 in trap_fatal ()
> #4  0xc085c3bf in trap_pfault ()
> #5  0xc085bfb5 in trap ()
> #6  0xc0848bea in calltrap ()
> #7  0xc0693b51 in ttymodem ()
> #8  0xc0698362 in ptcclose ()
> #9  0xc0638a6f in giant_close ()
> #10 0xc06162bf in devfs_close ()
> #11 0xc086dc1c in VOP_CLOSE_APV ()
> #12 0xc06c87e2 in vn_close ()
> #13 0xc06c974a in vn_closefile ()
> #14 0xc06162e7 in devfs_close_f ()
> #15 0xc0642cdc in fdrop_locked ()
> #16 0xc0642c29 in fdrop ()
> #17 0xc06411c7 in closef ()
> #18 0xc063e329 in close ()
> #19 0xc085c9f7 in syscall ()
> #20 0xc0848c3f in Xint0x80_syscall ()
> #21 0x00000033 in ?? ()
> Previous frame inner to this frame (corrupt stack?)
>
> Unfortunately I get this with the debug kernel.
> Does one have to boot with kernel.debug itself
> to get a trace which is usable ?

Hi,

Yesterday I installed 6.1. After rebooting, I got the beastie loader
screen. I chose option 8 (reboot) to get back into the RAID BIOS, but then
the server froze.

Does this happen only to me, or is it a common problem? The affected server
is an IBM xSeries 346 with 2 CPUs.

Martin

On Thu, 2006-06-22 at 23:27 +0200, Martin Blapp wrote:

> Hi,
>
> >>> Unfortunately I get this with the debug kernel.
> >>> Does one have to boot with kernel.debug itself
> >>> to get a trace which is usable ?
>
> Sigh. A recompile helped !
>
> (kgdb) where
> #0  doadump () at pcpu.h:165
> #1  0xc066355e in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:409
> #2  0xc06638b5 in panic (fmt=0xc0891732 "%s") at /usr/src/sys/kern/kern_shutdown.c:565
> #3  0xc085c6b6 in trap_fatal (frame=0xed6e4ab8, eva=4) at /usr/src/sys/i386/i386/trap.c:836
> #4  0xc085c3bf in trap_pfault (frame=0xed6e4ab8, usermode=0, eva=4) at /usr/src/sys/i386/i386/trap.c:744
> #5  0xc085bfb5 in trap (frame=
>       {tf_fs = 8, tf_es = 40, tf_ds = -1063714776, tf_edi = -1064042304,
>        tf_esi = 0, tf_ebp = -311538944, tf_isp = -311538972,
>        tf_ebx = -967615488, tf_edx = -1063651212, tf_ecx = -941099136,
>        tf_eax = 0, tf_trapno = 12, tf_err = 0, tf_eip = -1066845359,
>        tf_cs = 32, tf_eflags = 66194, tf_esp = -967615488, tf_ss = 0})
>     at /usr/src/sys/i386/i386/trap.c:434
> #6  0xc0848bea in calltrap () at /usr/src/sys/i386/i386/exception.s:139
> #7  0xc0693b51 in ttymodem (tp=0xc6535c00, flag=-1063651212) at /usr/src/sys/kern/tty.c:1659
> #8  0xc0698362 in ptcclose (dev=0x0, flags=3, fmt=8192, td=0xc7e7f780) at linedisc.h:136
> #9  0xc0638a6f in giant_close (dev=0xcb3c1100, fflag=3, devtype=8192, td=0xc7e7f780) at /usr/src/sys/kern/kern_conf.c:266
> #10 0xc06162bf in devfs_close (ap=0xed6e4b7c) at /usr/src/sys/fs/devfs/devfs_vnops.c:287
> #11 0xc086dc1c in VOP_CLOSE_APV (vop=0x0, a=0xc099f874) at vnode_if.c:426
> #12 0xc06c87e2 in vn_close (vp=0xc9cdf660, flags=3, file_cred=0x0, td=0xc7e7f780) at vnode_if.h:227
> #13 0xc06c974a in vn_closefile (fp=0xc6fc5438, td=0xc7e7f780) at /usr/src/sys/kern/vfs_vnops.c:865
> #14 0xc06162e7 in devfs_close_f (fp=0xc6fc5438, td=0xc7e7f780) at /usr/src/sys/fs/devfs/devfs_vnops.c:297
> #15 0xc0642cdc in fdrop_locked (fp=0xc6fc5438, td=0xc7e7f780) at file.h:295
> #16 0xc0642c29 in fdrop (fp=0xc6fc5438, td=0xc7e7f780) at /usr/src/sys/kern/kern_descrip.c:2122
> #17 0xc06411c7 in closef (fp=0xc6fc5438, td=0xc7e7f780) at /usr/src/sys/kern/kern_descrip.c:1942
> #18 0xc063e329 in close (td=0xc7e7f780, uap=0x0) at /usr/src/sys/kern/kern_descrip.c:1007
> #19 0xc085c9f7 in syscall (frame=
>       {tf_fs = 59, tf_es = 59, tf_ds = 59, tf_edi = 0, tf_esi = 673925920,
>        tf_ebp = -1077941928, tf_isp = -311538332, tf_ebx = 673852332,
>        tf_edx = 673925920, tf_ecx = 673925920, tf_eax = 6, tf_trapno = 12,
>        tf_err = 2, tf_eip = 673354727, tf_cs = 51, tf_eflags = 518,
>        tf_esp = -1077941956, tf_ss = 59})
>     at /usr/src/sys/i386/i386/trap.c:981
> #20 0xc0848c3f in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:200
> #21 0x00000033 in ?? ()
> Previous frame inner to this frame (corrupt stack?)
> (kgdb)
>
> (kgdb) frame 5
> #5  0xc085bfb5 in trap (frame=
>       {tf_fs = 8, tf_es = 40, tf_ds = -1063714776, tf_edi = -1064042304,
>        tf_esi = 0, tf_ebp = -311538944, tf_isp = -311538972,
>        tf_ebx = -967615488, tf_edx = -1063651212, tf_ecx = -941099136,
>        tf_eax = 0, tf_trapno = 12, tf_err = 0, tf_eip = -1066845359,
>        tf_cs = 32, tf_eflags = 66194, tf_esp = -967615488, tf_ss = 0})
>     at /usr/src/sys/i386/i386/trap.c:434
>
> (kgdb) frame 8
> #8  0xc0698362 in ptcclose (dev=0x0, flags=3, fmt=8192, td=0xc7e7f780) at linedisc.h:136
> 136             return ((*linesw[tp->t_line]->l_modem)(tp, flag));
> (kgdb) list
> 131
> 132     static __inline int
> 133     ttyld_modem(struct tty *tp, int flag)
> 134     {
> 135
> 136             return ((*linesw[tp->t_line]->l_modem)(tp, flag));
> 137     }
> 138
> 139     #endif /* _KERNEL */
>
> (kgdb) frame 7
>
> (kgdb) p *tp->t_session
> Cannot access memory at address 0x0
>
> (kgdb) frame 7
> #7  0xc0693b51 in ttymodem (tp=0xc6535c00, flag=-1063651212) at /usr/src/sys/kern/tty.c:1659
> 1659                            if (tp->t_session->s_leader) {
> (kgdb) list
> 1654                        !ISSET(tp->t_cflag, CLOCAL)) {
> 1655                            SET(tp->t_state, TS_ZOMBIE);
> 1656                            CLR(tp->t_state, TS_CONNECTED);
> 1657                            if (tp->t_session) {
> 1658                                    sx_slock(&proctree_lock);
> 1659                                    if (tp->t_session->s_leader) {
> 1660                                            struct proc *p;
> 1661
> 1662                                            p = tp->t_session->s_leader;
> 1663                                            PROC_LOCK(p);
>
> (kgdb) p *tp->t_session
> Cannot access memory at address 0x0
>
> So here is the problem. Why is tp->t_session NULL ? Maybe it has already
> been freed earlier and we have a race here ?

"Race" was exactly my conclusion last time I looked into this.

http://docs.FreeBSD.org/cgi/mid.cgi?20041204110815.E80797

Something, somewhere is playing with t_session without locking...

Gavin
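
The trap frame in frame #5 can be cross-checked against the rest of the
backtrace, because kgdb prints the saved i386 registers as signed 32-bit
integers. The sketch below reinterprets them as unsigned addresses: tf_eip
decodes to the ttymodem() address in frame #7, and tf_ebx happens to equal
the tp value printed there. The struct session_model layout is only an
assumption for illustration (a 4-byte counter followed by a 32-bit
s_leader pointer); under that assumption a NULL t_session plus the offset
of s_leader gives the reported fault address, eva=4. The names reg_to_addr,
session_model and check_trapframe are hypothetical and do not come from
the kernel sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * kgdb shows trapframe registers as signed 32-bit integers; converting
 * to unsigned recovers the 32-bit kernel address.
 */
uint32_t
reg_to_addr(int32_t reg)
{

	return ((uint32_t)reg);
}

/*
 * Hypothetical model of the start of struct session on i386.
 * ASSUMPTION: a 4-byte count comes first, then the s_leader pointer.
 * The pointer is modelled as uint32_t so the layout is the same on
 * any host.
 */
struct session_model {
	uint32_t s_count;	/* stand-in for the reference count */
	uint32_t s_leader;	/* stand-in for a 32-bit struct proc * */
};

/* Cross-check the decoded values against the backtrace in the thread. */
void
check_trapframe(void)
{

	/* tf_eip: the faulting instruction, frame #7 in ttymodem(). */
	assert(reg_to_addr(-1066845359) == 0xc0693b51u);
	/* tf_ebx: same value as tp in frame #7. */
	assert(reg_to_addr(-967615488) == 0xc6535c00u);
	/* NULL t_session + offset of s_leader == eva (4), given the model. */
	assert(offsetof(struct session_model, s_leader) == 4);
}
```

Calling check_trapframe() from a small main() completes without tripping
an assertion, which is consistent with t_session being NULL when line 1659
dereferenced it, even though the test at line 1657 had just passed.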

Hi,

Maybe this is the solution ? IMHO there is a race window open between the
first tp->t_session test and the locking of the proc tree.

Martin

--- src/sys/kern/tty.c
+++ src/sys/kern/tty.c
+			sx_slock(&proctree_lock);
 			if (tp->t_session) {
-				sx_slock(&proctree_lock);
 				if (tp->t_session->s_leader) {
 					struct proc *p;

 					p = tp->t_session->s_leader;
 					PROC_LOCK(p);
 					psignal(p, SIGHUP);
 					PROC_UNLOCK(p);
 				}
-				sx_sunlock(&proctree_lock);
 			}
+			sx_sunlock(&proctree_lock);
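
The ordering the patch proposes (take the lock first, then test the
pointer) can be illustrated in userland. This is only a sketch under
stated assumptions, not kernel code: a pthread mutex stands in for the
shared/exclusive proctree_lock, and tty_model, session_model,
hangup_leader and clear_session are hypothetical names. It also assumes
that every path which changes t_session takes the same lock; Gavin's
remark above suggests that may not hold in the real kernel, in which case
moving the lock alone would not close the window.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct session and struct tty. */
struct session_model {
	int s_leader_pid;		/* 0 means "no session leader" */
};

struct tty_model {
	struct session_model *t_session;
};

/* Stand-in for proctree_lock (simplified to a plain mutex). */
static pthread_mutex_t proctree_lock_model = PTHREAD_MUTEX_INITIALIZER;

/*
 * Return the pid that would receive SIGHUP, or -1 if there is none.
 * The NULL test and the dereference both happen under the lock, so a
 * writer that also takes the lock cannot clear t_session in between.
 */
int
hangup_leader(struct tty_model *tp)
{
	int pid = -1;

	assert(tp != NULL);
	pthread_mutex_lock(&proctree_lock_model);
	if (tp->t_session != NULL && tp->t_session->s_leader_pid != 0)
		pid = tp->t_session->s_leader_pid;
	pthread_mutex_unlock(&proctree_lock_model);
	return (pid);
}

/* A well-behaved writer: detaches the session while holding the lock. */
void
clear_session(struct tty_model *tp)
{

	assert(tp != NULL);
	pthread_mutex_lock(&proctree_lock_model);
	tp->t_session = NULL;
	pthread_mutex_unlock(&proctree_lock_model);
}
```

With the check inside the locked region, a concurrent clear_session() can
run before or after hangup_leader() but not between the test and the
dereference, which is the window described above.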