Hello everybody,
Yesterday I installed FreeBSD 8.0-BETA4 on an IBM 3650 with a
ServeRAID 8k adapter and 6 SATA disks in RAID-6. The RAID-6 volume was
"synchronizing" for a day, so the sync was running while I was
installing FreeBSD on the server. During the installation I noticed
that write performance was low, but this seemed reasonable, given that
the RAID controller had to synchronize its disks. After the system was
installed, and while the controller was still syncing, I ran
"portsnap fetch extract" to get the latest ports. During this process,
all my terminals kept lagging whenever I opened files, browsed
directories, etc., and the following kernel messages appeared in dmesg:
lock order reversal:
1st 0xffffff807c133540 bufwait (bufwait) @ /usr/src/sys/kern/vfs_bio.c:2559
2nd 0xffffff0003deb200 dirhash (dirhash) @ /usr/src/sys/ufs/ufs/ufs_dirhash.c:285
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
_witness_debugger() at _witness_debugger+0x2e
witness_checkorder() at witness_checkorder+0x81e
_sx_xlock() at _sx_xlock+0x55
ufsdirhash_acquire() at ufsdirhash_acquire+0x33
ufsdirhash_add() at ufsdirhash_add+0x19
ufs_direnter() at ufs_direnter+0x88b
ufs_makeinode() at ufs_makeinode+0x2a7
VOP_CREATE_APV() at VOP_CREATE_APV+0x8d
vn_open_cred() at vn_open_cred+0x468
kern_openat() at kern_openat+0x179
syscall() at syscall+0x1af
Xfast_syscall() at Xfast_syscall+0xe1
--- syscall (5, FreeBSD ELF64, open), rip = 0x800e32dfc, rsp = 0x7fffffffe688, rbp = 0x1a4 ---
lock order reversal:
1st 0xffffff00352cbd80 ufs (ufs) @ /usr/src/sys/kern/vfs_subr.c:2083
2nd 0xffffff807c133540 bufwait (bufwait) @ /usr/src/sys/ufs/ffs/ffs_softdep.c:6177
3rd 0xffffff00352cbba8 ufs (ufs) @ /usr/src/sys/kern/vfs_subr.c:2083
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
_witness_debugger() at _witness_debugger+0x2e
witness_checkorder() at witness_checkorder+0x81e
__lockmgr_args() at __lockmgr_args+0xcf3
ffs_lock() at ffs_lock+0x8c
VOP_LOCK1_APV() at VOP_LOCK1_APV+0x9b
_vn_lock() at _vn_lock+0x47
vget() at vget+0x7b
vfs_hash_get() at vfs_hash_get+0xd5
ffs_vgetf() at ffs_vgetf+0x48
softdep_sync_metadata() at softdep_sync_metadata+0x456
ffs_syncvnode() at ffs_syncvnode+0x210
ffs_fsync() at ffs_fsync+0x43
ufs_direnter() at ufs_direnter+0x315
ufs_makeinode() at ufs_makeinode+0x2a7
VOP_CREATE_APV() at VOP_CREATE_APV+0x8d
vn_open_cred() at vn_open_cred+0x468
kern_openat() at kern_openat+0x179
syscall() at syscall+0x1af
Xfast_syscall() at Xfast_syscall+0xe1
--- syscall (5, FreeBSD ELF64, open), rip = 0x800e32dfc, rsp = 0x7fffffffe688, rbp = 0x1a4 ---
aac0: COMMAND 0xffffff80003e08a0 TIMEOUT AFTER 40 SECONDS
aac0: COMMAND 0xffffff80003d5070 TIMEOUT AFTER 40 SECONDS
aac0: COMMAND 0xffffff80003e0d00 TIMEOUT AFTER 40 SECONDS
aac0: COMMAND 0xffffff80003d9440 TIMEOUT AFTER 40 SECONDS
....
...and it went on like that for many, many lines, with decreasing timeouts.
Once the syncing process finished, everything returned to normal (though
I have not stress-tested the machine, to be honest). But since it
happened once, during this specific procedure, it could perhaps also
happen when the RAID controller is rebuilding its volumes, and that
would be very annoying as far as the server's performance (and possibly
stability) is concerned.
The kernel is an unmodified GENERIC amd64 kernel. If anyone needs more
information (e.g., full dmesg output), please do not hesitate to ask.
Thank you all in advance for your time,
mamalos
--
George Mamalakis
IT Officer
Electrical and Computer Engineer (Aristotle Un. of Thessaloniki),
MSc (Imperial College of London)
Department of Electrical and Computer Engineering
Faculty of Engineering
Aristotle University of Thessaloniki
phone number : +30 (2310) 994379