Dominic Marks
2005-Jun-29 15:41 UTC
graid3 + rsync + 5.4-STABLE repeatable panic (Fatal trap 12: page fault while in kernel mode)
Hello,
I'm trying to use graid3 to create a raid volume from three
250GB SATA discs. I can successfully label, format, and mount
the disc. The problem arises when I try to migrate some data
on to the new volume. I'm using rsync to do this over the
local network; unfortunately this seems to produce an
immediate and reproducible panic (hand copied):
Fatal trap 12: page fault while in kernel mode
fault virtual address = 0xc30f8000
fault code = supervisor write, page not present
instruction pointer = 0x8:0xc05e9783
stack pointer = 0x10:0xd8030c38
frame pointer = 0x10:0xd8030c80
code segment = base 0x0, limit 0xfffff type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 617 (g_raid3 raid)
trap number = 12
panic: page fault
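
As a side check on the trap output above, here is a minimal sketch (not part of the original report) that tests two properties of the hand-copied fault address: whether it lies in kernel address space, and whether it is page-aligned. It assumes the stock i386 values of a 4 KB page and a kernel base of 0xc0000000; the function name describe_fault_address is made up for illustration.

```python
PAGE_SIZE = 4096        # i386 page size
KERNBASE = 0xC0000000   # assumption: stock i386 kernel base (default KVA_PAGES)


def describe_fault_address(addr):
    """Return (in_kernel_space, page_aligned, offset_in_page) for a fault address."""
    offset = addr % PAGE_SIZE
    return addr >= KERNBASE, offset == 0, offset


if __name__ == "__main__":
    fault = 0xC30F8000  # "fault virtual address" from the trap output
    in_kernel, aligned, offset = describe_fault_address(fault)
    print(f"kernel space: {in_kernel}, page aligned: {aligned}, offset: {offset:#x}")
```

The address turns out to be the first byte of a page. A supervisor write to the start of a non-present page is what one might expect if a copy ran off the end of a buffer, but nothing in this thread confirms that reading.
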
Other programs (touch, ls, diskinfo, etc) do not seem to provoke
the panic, but rsync will kill the system within a second.
I got a dump (once), but I think it is corrupt in some way
because I have not been able to get a backtrace or any other
useful data from it.
# kgdb kernel.debug /usr/crash/vmcore.0
kgdb: kvm_read: invalid address (f9)
(This line is printed again, and again, and again ...)
This may be because I compiled my debugging kernel after I had
installed the system, although it should have been an identical
source tree ... I'm currently rebuilding the system to
the freshest available -STABLE in the hope that may give a
full backtrace.
FreeBSD mrt.helenmarks.co.uk 5.4-STABLE FreeBSD 5.4-STABLE #0
Mon Jun 27 09:34:02 BST 2005
root@mrt.helenmarks.co.uk:/usr/obj/usr/src/sys/DEV i386
The only thing slightly odd about the machine is that each
disc is on its own SATA controller. One disc is attached to an
Intel ICH6; the other two are attached to two Silicon Image (3112)
based cards. The root device is ad2, since the additional cards
have pushed themselves to the front. This is a temporary setup
to facilitate migration of data from system to system.
If I can do anything to help track the problem down, please say.
I really want this to work, and I have some time in which to run
tests.
* A side note: I have noticed that the panic is often accompanied by
an ATA DMA timeout (ad1). Could this cause the panic to occur?
Copyright (c) 1992-2005 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD 5.4-STABLE #0: Mon Jun 27 09:34:02 BST 2005
root@mrt.helenmarks.co.uk:/usr/obj/usr/src/sys/DEV
WARNING: debug.mpsafenet forced to 0 as ipsec requires Giant
WARNING: MPSAFE network stack disabled, expect reduced performance.
ACPI APIC Table: <DELL PESC420>
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Celeron(R) CPU 2.53GHz (2527.01-MHz 686-class CPU)
Origin = "GenuineIntel" Id = 0xf41 Stepping = 1
Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
real memory = 526958592 (502 MB)
avail memory = 509628416 (486 MB)
ioapic0: Changing APIC ID to 8
ioapic0 <Version 2.0> irqs 0-23 on motherboard
lapic0: Forcing LINT1 to edge trigger
npx0: <math processor> on motherboard
npx0: INT 16 interface
acpi0: <DELL PESC420> on motherboard
acpi0: Power Button (fixed)
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
cpu0: <ACPI CPU> on acpi0
acpi_button0: <Power Button> on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
pcib1: <ACPI PCI-PCI bridge> irq 16 at device 1.0 on pci0
pci1: <ACPI PCI bus> on pcib1
pci0: <display, VGA> at device 2.0 (no driver attached)
pcib2: <ACPI PCI-PCI bridge> irq 16 at device 28.0 on pci0
pci2: <ACPI PCI bus> on pcib2
bge0: <Broadcom BCM5751 Gigabit Ethernet, ASIC rev. 0x4001> mem
0xdfdf0000-0xdfdfffff irq 16 at device 0.0 on pci2
miibus0: <MII bus> on bge0
brgphy0: <BCM5750 10/100/1000baseTX PHY> on miibus0
brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX,
1000baseTX-FDX, auto
bge0: Ethernet address: 00:11:11:c3:2c:91
pcib3: <ACPI PCI-PCI bridge> irq 17 at device 28.1 on pci0
pci3: <ACPI PCI bus> on pcib3
pci0: <serial bus, USB> at device 29.0 (no driver attached)
pci0: <serial bus, USB> at device 29.1 (no driver attached)
pci0: <serial bus, USB> at device 29.2 (no driver attached)
pci0: <serial bus, USB> at device 29.3 (no driver attached)
pci0: <serial bus, USB> at device 29.7 (no driver attached)
pcib4: <ACPI PCI-PCI bridge> at device 30.0 on pci0
pci4: <ACPI PCI bus> on pcib4
atapci0: <SiI 3112 SATA150 controller> port
0xdce0-0xdcef,0xdcb4-0xdcb7,0xdcc8-0xdccf,0xdcb0-0xdcb3,0xdcc0-0xdcc7
mem 0xdfaffc00-0xdfaffdff irq 17 at device 1.0 on pci4
ata2: channel #0 on atapci0
ata3: channel #1 on atapci0
atapci1: <SiI 3112 SATA150 controller> port
0xdcf0-0xdcff,0xdcbc-0xdcbf,0xdcd8-0xdcdf,0xdcb8-0xdcbb,0xdcd0-0xdcd7
mem 0xdfaffe00-0xdfafffff irq 18 at device 2.0 on pci4
ata4: channel #0 on atapci1
ata5: channel #1 on atapci1
isab0: <PCI-ISA bridge> at device 31.0 on pci0
isa0: <ISA bus> on isab0
atapci2: <Intel ICH6 UDMA100 controller> port
0xffa0-0xffaf,0x376,0x170-0x177,0x3f6,0x1f0-0x1f7 irq 16 at device 31.1
on pci0
ata0: channel #0 on atapci2
ata1: channel #1 on atapci2
atapci3: <Intel ICH6 SATA150 controller> port
0xfea0-0xfeaf,0xfe30-0xfe33,0xfe20-0xfe27,0xfe10-0xfe13,0xfe00-0xfe07
irq 20 at device 31.2 on pci0
ata6: channel #0 on atapci3
ata7: channel #1 on atapci3
ichsmb0: <SMBus controller> port 0xece0-0xecff irq 17 at device 31.3 on
pci0
atkbdc0: <Keyboard controller (i8042)> port 0x64,0x60 irq 1 on acpi0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on
acpi0
sio0: type 16550A
pmtimer0 on isa0
orm0: <ISA Option ROMs> at iomem
0xcf800-0xcffff,0xce000-0xcf7ff,0xc9800-0xcdfff,0xc0000-0xc97ff on isa0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on
isa0
ppc0: parallel port not found.
sio1: configured irq 3 not in bitmap of probed irqs 0
sio1: port may not be enabled
Timecounter "TSC" frequency 2527010839 Hz quality 800
Timecounters tick every 1.250 msec
IPsec: Initialized Security Association Processing.
ad0: 238475MB <WDC WD2500JD-22HBB0/08.02D08> [484521/16/63] at
ata4-master SATA150
ad1: 238475MB <WDC WD2500JD-22HBB0/08.02D08> [484521/16/63] at
ata5-master SATA150
ad2: 76319MB <WDC WD800JD-60JRA0/05.01C05> [155061/16/63] at ata6-master
SATA150
ad3: 238475MB <WDC WD2500JD-00HBB0/08.02D08> [484521/16/63] at
ata7-master SATA150
Mounting root from ufs:/dev/ad2s1a
WARNING: / was not properly dismounted
WARNING: /usr was not properly dismounted
WARNING: /var was not properly dismounted
Thanks very much,
--
Dominic
GoodforBusiness.co.uk
I.T. Services for SMEs in the UK.
Kris Kennaway
2005-Jun-29 17:14 UTC
graid3 + rsync + 5.4-STABLE repeatable panic (Fatal trap 12: page fault while in kernel mode)
On Wed, Jun 29, 2005 at 04:42:49PM +0100, Dominic Marks wrote:
> This may be because I compiled my debugging kernel after I had
> installed the system, although it should have been an identical
> source tree ... I'm currently rebuilding the system to
> the freshest available -STABLE in the hope that may give a
> full backtrace.

Yes, you can have this kind of problem if you compile a kernel.debug
"after the fact" and it doesn't quite match.

Kris
Dominic Marks
2005-Jun-29 18:43 UTC
graid3 + rsync + 5.4-STABLE repeatable panic (Fatal trap 12: page fault while in kernel mode)
On Wednesday 29 June 2005 16:42, Dominic Marks wrote:
> Hello,
>
> I'm trying to use graid3 to create a raid volume from three
> 250GB SATA discs. I can successfully label, format, and mount
> the disc. The problem arises when I try to migrate some data
> on to the new volume. I'm using rsync to do this over the
> local network; unfortunately this seems to produce an
> immediate and reproducible panic (hand copied):
>
> Fatal trap 12: page fault while in kernel mode
>
> fault virtual address = 0xc30f8000
> fault code = supervisor write, page not present
> instruction pointer = 0x8:0xc05e9783
> stack pointer = 0x10:0xd8030c38
> frame pointer = 0x10:0xd8030c80
> code segment = base 0x0, limit 0xfffff type 0x1b
> = DPL 0, pres 1, def32 1, gran 1
> processor eflags = interrupt enabled, resume, IOPL = 0
> current process = 617 (g_raid3 raid)
> trap number = 12
> panic: page fault

Having recompiled, I can no longer reproduce the panic. I think I may
have caused it myself: I had forgotten that I had been tinkering with
some values in sys/sys/param.h last week, and it didn't ring a bell
when the system went down. I'd been running with MAXPHYS and DFLTPHYS
at 256, and it seems graid3 does not like one of those parameters being
raised. I suspect it's DFLTPHYS, and that perhaps graid3 depends on its
value for some calculations. This is pure speculation.

My apologies for the incorrect report.

--
Dominic
GoodforBusiness.co.uk
I.T. Services for SMEs in the UK.
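
To put rough numbers on the MAXPHYS/DFLTPHYS speculation in the last message, here is a small sketch of how large a piece each data component would see when a maximum-sized transfer is split across a three-disc RAID3 set (two data components plus one parity). The stock values of 64 KB for DFLTPHYS and 128 KB for MAXPHYS are quoted from memory of sys/sys/param.h in 5.x and should be checked against the tree; reading "256" as 256 KB is an assumption, and the helper name per_component_bytes is hypothetical. Whether graid3 actually relies on either limit is not established in this thread.

```python
KB = 1024
STOCK_DFLTPHYS = 64 * KB   # assumed stock value; verify in sys/sys/param.h
STOCK_MAXPHYS = 128 * KB   # assumed stock value; verify in sys/sys/param.h
RAISED = 256 * KB          # assumption: "256" in the message means 256 KB


def per_component_bytes(request_bytes, ncomponents):
    """Bytes sent to each data component when a request is split evenly
    across a RAID3 set that has exactly one parity component."""
    data_components = ncomponents - 1
    if data_components < 1 or request_bytes % data_components:
        raise ValueError("request must divide evenly across data components")
    return request_bytes // data_components


if __name__ == "__main__":
    for label, size in (("stock MAXPHYS", STOCK_MAXPHYS),
                        ("raised MAXPHYS", RAISED)):
        share = per_component_bytes(size, 3) // KB
        print(f"{label}: {share} KB per data component")
```

Under these assumptions, raising the limit doubles the largest per-component piece from 64 KB to 128 KB, so any code sized around the stock bound would be a natural place to look first.
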