My FreeBSD 10.3-RELEASE-p16 server crashes in the middle of a Poudriere
bulk run (see below). The crash happens even if I lower
vfs.zfs.arc_max or tweak vm.v_free_min/target/reserved/severe. I'm
looking for configuration advice in case I missed something obvious,
since this setup seems to work on Illumos- and Linux-derived OSes.
Failing that, I'd like some advice on how to go about debugging this.
I doubt the deadman timer itself is what makes the system stop
responding; a race condition elsewhere seems more likely.
The pool itself uses 4k sectors and is geli-encrypted. I configured the
swap zvol based on root-on-ZFS install instructions found in the FreeBSD
wiki:
    zfs create -V 6G -o org.freebsd:swap=on -o checksum=off \
        -o compression=off -o dedup=off -o sync=disabled \
        -o primarycache=none zroot/swap
The ZoL wiki recommends a slightly different zvol configuration:
    zfs create -V 4G -b $(getconf PAGESIZE) -o logbias=throughput \
        -o sync=always -o primarycache=metadata \
        -o com.sun:auto-snapshot=false rpool/swap
I'm not sure how much of this applies to FreeBSD due to differences in
kernel design/implementation. Does anyone have an idea of what might be
going on and how I might get this working?
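For anyone comparing setups, the settings discussed above can be
collected with a small read-only script. This is only a sketch: the
function name swap_report_cmds is made up for illustration, the dataset
defaults to zroot/swap as in this post, and the choice of properties to
query is an assumption about what is relevant, not a known diagnosis.

```sh
#!/bin/sh
# Sketch: list and run read-only commands that describe the swap zvol setup.
# swap_report_cmds is a hypothetical helper name; the dataset defaults to
# zroot/swap, the name used in this post.
swap_report_cmds() {
    dataset=${1:-zroot/swap}
    # Tunables mentioned in the post.
    echo "sysctl vfs.zfs.arc_max vm.v_free_min vm.v_free_target vm.v_free_reserved vm.v_free_severe"
    # Properties set by the two zfs create commands, plus the block size.
    echo "zfs get volblocksize,checksum,compression,sync,primarycache $dataset"
    # Current swap usage and the page size to compare against volblocksize.
    echo "swapinfo -h"
    echo "getconf PAGESIZE"
}

# Run each command only if its tool is installed, so the script still
# completes on a machine without ZFS or swapinfo.
swap_report_cmds "$@" | while read -r cmd; do
    tool=${cmd%% *}
    if command -v "$tool" >/dev/null 2>&1; then
        echo "+ $cmd"
        # Unquoted on purpose: the line is split into command and arguments.
        $cmd </dev/null || echo "(command failed: $cmd)"
    else
        echo "(skipped, $tool not found: $cmd)"
    fi
done
```

Nothing here changes any setting. Comparing the reported volblocksize
with the page size shows whether the zvol matches the -b $(getconf
PAGESIZE) choice in the ZoL recommendation.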
last pid: 35097; load averages: 0.54, 4.38, 5.99 up 0+05:23:35 03:27:19
94 processes: 1 running, 89 sleeping, 4 waiting
CPU: 0.1% user, 0.0% nice, 0.0% system, 0.0% interrupt, 99.9% idle
Mem: 911M Active, 1983M Inact, 979M Wired, 772K Cache, 320K Buf, 14M Free
ARC: 220M Total, 12M MFU, 45M MRU, 34M Anon, 6645K Header, 122M Other
Swap: 6144M Total, 574M Used, 5570M Free, 9% Inuse
panic: I/O to pool 'zroot' appears to be hung on vdev guid 13314812526404996608 at '/dev/da0p3.eli'.
cpuid = 0
KDB: stack backtrace:
#0 0xffffffff8098e3e0 at kdb_backtrace+0x60
#1 0xffffffff809510b6 at vpanic+0x126
#2 0xffffffff80950f83 at panic+0x43
#3 0xffffffff81a3ddd3 at vdev_deadman+0x123
#4 0xffffffff81a3dce0 at vdev_deadman+0x30
#5 0xffffffff81a3dce0 at vdev_deadman+0x30
#6 0xffffffff81a325a5 at spa_deadman+0x85
#7 0xffffffff80966c2b at softclock_call_cc+0x17b
#8 0xffffffff80967054 at softclock+0x94
#9 0xffffffff8091c9eb at intr_event_execute_handlers+0xab
#10 0xffffffff8091ce36 at ithread_loop+0x96
#11 0xffffffff8091a53a at fork_exit+0x9a
#12 0xffffffff80d3be0e at fork_trampoline+0xe
Uptime: 1h8m24s
--
"The lyf so short, the craft so longe to lerne."