Hi all,

I have a mail server with 4 disks: one disk with the operating system and three disks in RAID 0, which is mounted as my /var. After 16 or 19 days, I get a kernel panic. I have swapped the memory modules, and even then the kernel panic still happens after some time. Looking at the dump below, can I be sure that /dev/ada1 is in trouble? I have run fsck -y in single-user mode several times and it did not solve the problem.

Wed Jan 8 08:13:00 BRST 2014

FreeBSD mail.xxx.com.br 9.2-STABLE FreeBSD 9.2-STABLE #2 r258208: Thu Dec 19 09:33:43 BRST 2013 root@mail.xxx.com.br:/usr/obj/usr/src/sys/TITAN amd64

panic: softdep_deallocate_dependencies: unrecovered I/O error

GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...
Unread portion of the kernel message buffer:
ada1 at ahcich1 bus 0 scbus1 target 0 lun 0
ada1: <WDC WD10EZEX-00RKKA0 80.00A80> s/n WD-WCC1S2515766 detached
g_vfs_done():stripe/var[WRITE(offset=426073161728, length=32768)]error = 6
/dev: got error 6 while accessing filesystem
panic: softdep_deallocate_dependencies: unrecovered I/O error
cpuid = 3
KDB: stack backtrace:
#0 0xffffffff8053e1c6 at kdb_backtrace+0x66
#1 0xffffffff80503ede at panic+0x1ce
#2 0xffffffff8070d290 at clear_remove+0
#3 0xffffffff80585705 at brelse+0x75
#4 0xffffffff80588599 at bufwrite+0x109
#5 0xffffffff807011f3 at ffs_update+0x273
#6 0xffffffff80726b63 at ffs_syncvnode+0x4f3
#7 0xffffffff8072775e at ffs_fsync+0x2e
#8 0xffffffff807cf768 at VOP_FSYNC_APV+0x78
#9 0xffffffff805adf5b at sys_fsync+0x18b
#10 0xffffffff8077b44a at amd64_syscall+0x5ea
#11 0xffffffff80765b67 at Xfast_syscall+0xf7
Uptime: 19d14h7m37s
Dumping 705 out of 7904 MB:..3%..12%..21%..32%..41%..53%..62%..71%..82%..91%

Reading symbols from /boot/kernel/geom_stripe.ko...Reading symbols from /boot/kernel/geom_stripe.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/geom_stripe.ko
Reading symbols from /boot/kernel/aio.ko...Reading symbols from /boot/kernel/aio.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/aio.ko
Reading symbols from /boot/kernel/accf_data.ko...Reading symbols from /boot/kernel/accf_data.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/accf_data.ko
Reading symbols from /boot/kernel/accf_dns.ko...Reading symbols from /boot/kernel/accf_dns.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/accf_dns.ko
Reading symbols from /boot/kernel/accf_http.ko...Reading symbols from /boot/kernel/accf_http.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/accf_http.ko
Reading symbols from /boot/kernel/coretemp.ko...Reading symbols from /boot/kernel/coretemp.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/coretemp.ko
Reading symbols from /boot/kernel/cc_htcp.ko...Reading symbols from /boot/kernel/cc_htcp.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/cc_htcp.ko
#0 doadump (textdump=<value optimized out>) at pcpu.h:234
234 pcpu.h: No such file or directory.
 in pcpu.h
(kgdb) #0 doadump (textdump=<value optimized out>) at pcpu.h:234
#1 0xffffffff805039b6 in kern_reboot (howto=260)
 at /usr/src/sys/kern/kern_shutdown.c:449
#2 0xffffffff80503eb7 in panic (fmt=0x1 <Address 0x1 out of bounds>)
 at /usr/src/sys/kern/kern_shutdown.c:637
#3 0xffffffff8070d290 in softdep_deallocate_dependencies (
 bp=<value optimized out>) at /usr/src/sys/ufs/ffs/ffs_softdep.c:13593
#4 0xffffffff80585705 in brelse (bp=0xffffff81de2f0210) at buf.h:430
#5 0xffffffff80588599 in bufwrite (bp=0xffffff81de2f0210)
 at /usr/src/sys/kern/vfs_bio.c:1123
#6 0xffffffff807011f3 in ffs_update (vp=0xfffffe01ad068dc8, waitfor=1)
 at buf.h:397
#7 0xffffffff80726b63 in ffs_syncvnode (vp=0xfffffe01ad068dc8, waitfor=1,
 flags=<value optimized out>) at /usr/src/sys/ufs/ffs/ffs_vnops.c:341
#8 0xffffffff8072775e in ffs_fsync (ap=0xffffff823d21c970)
 at /usr/src/sys/ufs/ffs/ffs_vnops.c:188
#9 0xffffffff807cf768 in VOP_FSYNC_APV (vop=0xffffffff80ae3220,
 a=0xffffff823d21c970) at vnode_if.c:1309
#10 0xffffffff805adf5b in sys_fsync (td=0xfffffe0155ac2920,
 uap=<value optimized out>) at vnode_if.h:549
#11 0xffffffff8077b44a in amd64_syscall (td=0xfffffe0155ac2920, traced=0)
 at subr_syscall.c:135
#12 0xffffffff80765b67 in Xfast_syscall ()
 at /usr/src/sys/amd64/amd64/exception.S:391
#13 0x00000008024f975c in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb)

Thanks and best regards

Marcelo Gondim wrote this message on Wed, Jan 08, 2014 at 08:33 -0200:
> I have a mail server with 4 disks: one disk with the operating system and
> three disks in RAID 0, which is mounted as my /var. After 16 or 19 days, I
> get a kernel panic. I have swapped the memory modules, and even then the
> kernel panic still happens after some time. Looking at the dump below, can
> I be sure that /dev/ada1 is in trouble?
> I have run fsck -y in single-user mode several times and it did not solve
> the problem.

This is simply a problem of one of your hard drives detaching:

> ada1: <WDC WD10EZEX-00RKKA0 80.00A80> s/n WD-WCC1S2515766 detached
> g_vfs_done():stripe/var[WRITE(offset=426073161728, length=32768)]error = 6
> /dev: got error 6 while accessing filesystem
> panic: softdep_deallocate_dependencies: unrecovered I/O error

And so g_stripe passes the error to FFS, and FFS panics because it can't handle the write error. I'm not sure if there is work to make FFS handle disk departures sanely or not.

I've heard that some cables can be bad; if it's the same drive that keeps detaching, try replacing that cable. It could also be a bad drive. I have a pair of drives with the same firmware, and one would detach itself every month or so, while the other drive runs fine. I know it's the drive since I hot-swap them: they have both been in the same bay, and it's still the one drive that disconnects.

-- 
John-Mark Gurney Voice: +1 415 225 5579

"All that I will do, has been done, All that I have, has not."
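
To illustrate the reading of the message buffer given in the reply, here is a minimal Python sketch that scans saved kernel messages for a device detach followed by a g_vfs_done() error, and translates the error number into its symbolic name. Error 6 is ENXIO, which FreeBSD describes as "Device not configured", consistent with the drive having gone away. The function name summarize and the two regular expressions are assumptions made for this example; the patterns are modelled only on the lines quoted in this thread and may not match messages from other drivers or releases.

```python
import errno
import re

# Patterns modelled only on the lines quoted in this thread (assumption):
# other FreeBSD releases or drivers may word these messages differently.
DETACH_RE = re.compile(
    r"^(?P<dev>[a-z]+\d+): <(?P<model>[^>]*)> s/n (?P<serial>\S+) detached$"
)
GVFS_RE = re.compile(
    r"^g_vfs_done\(\):(?P<provider>[^\[]+)\[(?P<op>READ|WRITE)"
    r"\(offset=(?P<offset>\d+), length=(?P<length>\d+)\)\]error = (?P<err>\d+)$"
)


def summarize(log_text):
    """Return (detached_devices, io_errors) found in kernel message text.

    Hypothetical helper for illustration; not part of any FreeBSD tool.
    """
    detached = []
    io_errors = []
    for raw in log_text.splitlines():
        line = raw.strip()
        m = DETACH_RE.match(line)
        if m:
            detached.append(m.groupdict())
            continue
        m = GVFS_RE.match(line)
        if m:
            code = int(m.group("err"))
            io_errors.append({
                "provider": m.group("provider"),
                "op": m.group("op"),
                "offset": int(m.group("offset")),
                "length": int(m.group("length")),
                "errno": code,
                # Symbolic name, e.g. 6 -> "ENXIO"
                "errno_name": errno.errorcode.get(code, "unknown"),
            })
    return detached, io_errors


# The four relevant lines from the message buffer quoted in this thread.
SAMPLE = """\
ada1: <WDC WD10EZEX-00RKKA0 80.00A80> s/n WD-WCC1S2515766 detached
g_vfs_done():stripe/var[WRITE(offset=426073161728, length=32768)]error = 6
/dev: got error 6 while accessing filesystem
panic: softdep_deallocate_dependencies: unrecovered I/O error
"""

if __name__ == "__main__":
    detached, io_errors = summarize(SAMPLE)
    for d in detached:
        print("detached:", d["dev"], d["model"], "s/n", d["serial"])
    for e in io_errors:
        print("I/O error:", e["op"], "on", e["provider"],
              "errno", e["errno"], "(" + e["errno_name"] + ")")
```

Running the same scan over the message buffers from several panics would show whether it is always the same device and serial number that detaches, which is the check suggested above before swapping the cable or the drive.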