Hi,
I'm under the illusion that I've found a bug in the FreeBSD kernel, but
since I'm new to FreeBSD, a quiet voice tells me it's probably a case of
"you're doing it wrong".
Also, I'm not sure if this is the right place to complain. So feel free
to redirect me.
I'll start with some context:
* FreeBSD storage.[...] 9.0-RELEASE FreeBSD 9.0-RELEASE #0: Tue Jan 3
07:46:30 UTC 2012
root@farrell.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC amd64
* There are 5 expansion units attached via SAS, daisy-chained. Each unit
has 12 disks, for a total of 60 disks. To provide path redundancy, the
units are connected HBA-1-2-3-4-5 and HBA-5-4-3-2-1.
* I've configured a ZFS pool on top, with 6 RAID-Z2 vdevs of 8+2 disks
each.
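The disk counts above can be checked with a few lines of Python. This is only a sketch of the arithmetic; the variable names are made up for illustration and do not correspond to any ZFS or FreeBSD setting.

```python
# Sanity check of the layout described above (names are illustrative only).
ENCLOSURES = 5            # SAS expansion units
DISKS_PER_ENCLOSURE = 12
VDEVS = 6                 # RAID-Z2 groups in the pool
DATA_DISKS = 8            # per vdev
PARITY_DISKS = 2          # RAID-Z2 keeps two parity disks per vdev

total_disks = ENCLOSURES * DISKS_PER_ENCLOSURE
disks_in_pool = VDEVS * (DATA_DISKS + PARITY_DISKS)
data_disks_total = VDEVS * DATA_DISKS

print("physical disks:", total_disks)
print("disks used by the pool:", disks_in_pool)
print("disks' worth of data capacity:", data_disks_total)
print("failures tolerated per vdev:", PARITY_DISKS)
```

Every disk is used by the pool, and each vdev should tolerate the loss of up to two of its members.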
This setup should be able to survive a disk failure. However, manually
ejecting one of the disks causes a kernel panic, which I've manually
OCR'd below. The panic is not triggered by the ejection itself: the
ejection shows up in the kernel log, and the panic only follows a few
seconds later. I think the panic is triggered by access to the (now
ejected) disk.
> fault code = supervisor read data, page not present
> instruction pointer = 0x20:0xffffffff807ced68
> stack pointer = 0x28:0xffffff80002ecb70
> frame pointer = 0x28:0xffffff80002ecbc0
> code segment = base 0x0, limit 0xfffff, type 0x1b
> = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags = interrupt enabled, resume, IOPL = 0
> current process = 13 (g_down)
> trap number = 12
> panic: page fault
> cpuid = 0
> KDB: stack backtrace:
> #0 0xffffffff808680fe at kdb_backtrace+0x5e
> #1 0xffffffff80832cb7 at panic+0x184
> #2 0xffffffff80b18400 at trap_fatal+0x290
> #3 0xffffffff80b18749 at trap_pfault+0x1f9
> #4 0xffffffff80b18c0f at trap+0x3df
> #5 0xffffffff80b0313f at calltrap+0x8
> #6 0xffffffff80g3f874 at g_io_schedule_down+0x1d4
> #7 0xffffffff807cfb7c at g_down_procbody+0x5c
> #8 0xffffffff8080682f at fork_exit+0x11f
> #9 0xffffffff80b0366e at fork_trampoline+0xe
> Uptime: 7m16s
> Automatic reboot in 15 seconds - press a key on the console to abort
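Since the trace was typed by hand, a small script can help spot transcription errors. The sketch below subtracts each frame's offset from its address to get the start of the function, and flags frames whose address is not valid hex (frame #6 as typed contains a "g"). The helper name function_base is made up for this example; it is not part of any FreeBSD tool.

```python
import re

# One backtrace frame: "#N 0xADDR at symbol+0xOFFSET"
FRAME_RE = re.compile(
    r"#(\d+)\s+0x([0-9a-fA-F]+)\s+at\s+(\w+)\+0x([0-9a-fA-F]+)\s*$"
)

def function_base(line):
    """Return (frame, symbol, base address), or None if the line is malformed."""
    m = FRAME_RE.search(line)
    if m is None:
        return None
    frame, addr, symbol, offset = m.groups()
    return int(frame), symbol, int(addr, 16) - int(offset, 16)

# Two frames copied from the trace above, including the quote prefix.
frames = [
    "> #6 0xffffffff80g3f874 at g_io_schedule_down+0x1d4",
    "> #7 0xffffffff807cfb7c at g_down_procbody+0x5c",
]

for line in frames:
    parsed = function_base(line)
    if parsed is None:
        print("cannot parse (typo?):", line)
    else:
        frame, symbol, base = parsed
        print(f"#{frame} {symbol} starts at {base:#x}")
```

A frame that fails to parse only indicates a typing mistake in the transcription, not anything about the panic itself.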
So the question is either "what am I doing wrong?" or "can anyone
confirm this is a bug?"
thanks in advance,
Niels
PS: I'm trying to post via email and read via nntp://gmane, I'm not sure
how well this works.