Alexey Tarasov
2013-Oct-26 09:47 UTC
FreeBSD 9.2 UFS + GELI softdep_deallocate_dependencies: unrecovered I/O error
Hello.

I've upgraded the server to 9.2, and now it hangs every 2-3 hours of intensive I/O to a UFS SUJ + GELI disk. On 9.1 everything was fine for half a year.

g_vfs_done():da1.eli[WRITE(offset=614630752256, length=32768)]error = 11
g_vfs_done():da1.eli[WRITE(offset=614631211008, length=32768)]error = 11
g_vfs_done():da1.eli[WRITE(offset=614634815488, length=32768)]error = 11
g_vfs_done():da1.eli[WRITE(offset=614642319360, length=32768)]error = 11
g_vfs_done():da1.eli[WRITE(offset=614642909184, length=32768)]error = 11
g_vfs_done():da1.eli[WRITE(offset=614643007488, length=32768)]error = 11
g_vfs_done():da1.eli[WRITE(offset=614644875264, length=32768)]error = 11
g_vfs_done():da1.eli[WRITE(offset=550691995648, length=98304)]error = 11
g_vfs_done():da1.eli[WRITE(offset=550692519936, length=32768)]error = 11
g_vfs_done():da1.eli[WRITE(offset=550704152576, length=32768)]error = 11
/data/pgsql/data/base: got error 11 while accessing filesystem
panic: softdep_deallocate_dependencies: unrecovered I/O error
cpuid = 10
KDB: stack backtrace:
#0 0xffffffff80947986 at kdb_backtrace+0x66
#1 0xffffffff8090d9ae at panic+0x1ce
#2 0xffffffff80b3ff90 at clear_remove+0
#3 0xffffffff8098fb65 at brelse+0x75
#4 0xffffffff80990978 at bufdone+0x68
#5 0xffffffff8098c83e at biodone+0xae
#6 0xffffffff80872f4c at g_io_schedule_up+0xac
#7 0xffffffff808736ac at g_up_procbody+0x5c
#8 0xffffffff808db67f at fork_exit+0x11f
#9 0xffffffff80cdc23e at fork_trampoline+0xe
Uptime: 6d15h5m7s
Dumping 7664 out of 196573 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%

The full core.txt is here: http://lexasoft.ru/core.txt.1

The server is an HP ProLiant DL180 G6 with a P410 RAID controller.

--
Alexey Tarasov
Franz Schwartau
2013-Oct-26 22:35 UTC
FreeBSD 9.2 UFS + GELI softdep_deallocate_dependencies: unrecovered I/O error
Hi!

I see this kind of message, too, after upgrading from 9.1 to 9.2. E.g.:

g_vfs_done():label/var[WRITE(offset=1147863040, length=32768)]error = 11
g_vfs_done():label/var[WRITE(offset=979927040, length=32768)]error = 11

label/var is not encrypted. No panic occurs on my machine.

Best regards
Franz

On 26.10.2013 11:47, Alexey Tarasov wrote:
> Hello.
>
> I've upgraded the server to 9.2, and now it hangs every 2-3 hours of
> intensive I/O to a UFS SUJ + GELI disk. On 9.1 everything was fine for
> half a year.
> [...]
Konstantin Belousov
2013-Oct-27 18:46 UTC
FreeBSD 9.2 UFS + GELI softdep_deallocate_dependencies: unrecovered I/O error
On Sat, Oct 26, 2013 at 01:47:18PM +0400, Alexey Tarasov wrote:
> Hello.
>
> I've upgraded the server to 9.2, and now it hangs every 2-3 hours of
> intensive I/O to a UFS SUJ + GELI disk. On 9.1 everything was fine for
> half a year.
> [...]
> The full core.txt is here: http://lexasoft.ru/core.txt.1
>
> The server is an HP ProLiant DL180 G6 with a P410 RAID controller.

Check the current value of kern.bio_transient_maxcnt and increase it 4-8 times, using the tunable of the same name. If this helps, fine. If not, disable unmapped I/O with the vfs.unmapped_buf_allowed tunable.

The real solution is to convert GEOM classes like geli to use limited transient mapping windows to access the data, thus adding unmapped I/O support to them.
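A minimal sketch of how those two knobs would be applied; both are boot-time tunables set in /boot/loader.conf, and the numbers below are illustrative examples, not values taken from this thread:

  # check the current value
  sysctl kern.bio_transient_maxcnt

  # in /boot/loader.conf, raise it 4-8x; e.g. if sysctl reported 1024:
  kern.bio_transient_maxcnt="8192"

  # if that does not help, disable unmapped I/O entirely:
  vfs.unmapped_buf_allowed="0"

Being loader tunables, the new values take effect only after a reboot.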
J. Porter Clark
2013-Nov-06 14:34 UTC
FreeBSD 9.2 UFS + GELI softdep_deallocate_dependencies: unrecovered I/O error
On Sat, Oct 26, 2013 at 01:47:18PM +0400, Alexey Tarasov wrote:
> I've upgraded the server to 9.2, and now it hangs every 2-3 hours of
> intensive I/O to a UFS SUJ + GELI disk. On 9.1 everything was fine for
> half a year.

I'm glad this is not just my problem. I've been collecting data on it for a couple of months and still can't figure out how to fill out a bug report: too much data, and so little of it potentially useful.

My situation is:

A Windows 7 PC running either Outlook 2010 or SCANPST.EXE therefrom,
attempting to "repair" a .pst file that will exceed 2^31-1 bytes on
completion
i386 9.2-STABLE server (currently r256846)
Samba 3.6
UFS (either 1 or 2)
GELI
GPT partition

No other combination can be made to produce this behavior. In particular, changing from UFS to ZFS (even on this 2 GB i386 system) fixes it. I cannot reproduce the problem by running a program on the server; apparently only smbd has the necessary mojo. Adding data authentication to GELI doesn't help. Tweaking block sizes in UFS or GELI doesn't help. Turning off soft updates doesn't help. Samba AIO is as off as I can get it.

During the repair (writing) phase, the g_vfs_done() errors hit, and the system is useless until either they stop or it panics--about equal likelihood.

> g_vfs_done():da1.eli[WRITE(offset=614630752256, length=32768)]error = 11

Oh, man, have I got a fine collection of these!

I'll try the tunable, although it seems odd that a tunable would fix it. Yes, it has driven me completely crazy.

--
J. Porter Clark <jpc2112 at inbox.com>
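For anyone retracing the troubleshooting steps above, this is roughly how soft updates and SU+J journaling are toggled with tunefs(8); the device name is only an example (the poster never names his), and tunefs must be run against an unmounted (or read-only mounted) filesystem:

  # print the current filesystem tunables, including soft updates status
  tunefs -p /dev/da1.eli

  # disable SU+J journaling first, then soft updates themselves
  tunefs -j disable /dev/da1.eli
  tunefs -n disable /dev/da1.eli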