Dominic Marks
2005-Jun-29 15:41 UTC
graid3 + rsync + 5.4-STABLE repeatable panic (Fatal trap 12: page fault while in kernel mode)
Hello,

I'm trying to use graid3 to create a RAID volume from three 250GB SATA discs. I can successfully label, format, and mount the disc. The problem arises when I try to migrate some data onto the new volume. I'm using rsync to do this over the local network; unfortunately this seems to produce an immediate and reproducible panic (hand copied):

Fatal trap 12: page fault while in kernel mode

fault virtual address   = 0xc30f8000
fault code              = supervisor write, page not present
instruction pointer     = 0x8:0xc05e9783
stack pointer           = 0x10:0xd8030c38
frame pointer           = 0x10:0xd8030c80
code segment            = base 0x0, limit 0xfffff type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 617 (g_raid3 raid)
trap number             = 12
panic: page fault

Other programs (touch, ls, diskinfo, etc.) do not seem to provoke the panic, but rsync will kill the system within a second.

I got a dump (once), but I think it is corrupt in some way because I have not been able to get a backtrace or any other useful data from it.

# kgdb kernel.debug /usr/crash/vmcore.0
kgdb: kvm_read: invalid address (f9)

(This line is printed again, and again, and again ...)

This may be because I compiled my debugging kernel after I had installed the system, although it should have been an identical source tree ... I'm currently rebuilding the system to the freshest available -STABLE in the hope that may give a full backtrace.

FreeBSD mrt.helenmarks.co.uk 5.4-STABLE FreeBSD 5.4-STABLE #0 Mon Jun 27 09:34:02 BST 2005 root@mrt.helenmarks.co.uk:/usr/obj/usr/src/sys/DEV i386

The only thing slightly odd about the machine is that each disc is on its own SATA controller. One disc is attached to an Intel ICH6; the other two are attached to two Silicon Image (3112) based cards. The root device is ad2, since the additional cards have pushed themselves to the front. This is a temporary setup to facilitate migration of data from system to system.

If I can do anything to help track the problem down, please say. I really want this to work, and I have some time in which to run tests.

* A side note: I have noticed that the panic is often accompanied by an ATA DMA timeout (ad1). Could this cause the panic to occur?

Copyright (c) 1992-2005 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD 5.4-STABLE #0: Mon Jun 27 09:34:02 BST 2005
    root@mrt.helenmarks.co.uk:/usr/obj/usr/src/sys/DEV
WARNING: debug.mpsafenet forced to 0 as ipsec requires Giant
WARNING: MPSAFE network stack disabled, expect reduced performance.
ACPI APIC Table: <DELL PESC420>
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Celeron(R) CPU 2.53GHz (2527.01-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0xf41  Stepping = 1
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
real memory  = 526958592 (502 MB)
avail memory = 509628416 (486 MB)
ioapic0: Changing APIC ID to 8
ioapic0 <Version 2.0> irqs 0-23 on motherboard
lapic0: Forcing LINT1 to edge trigger
npx0: <math processor> on motherboard
npx0: INT 16 interface
acpi0: <DELL PESC420> on motherboard
acpi0: Power Button (fixed)
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
cpu0: <ACPI CPU> on acpi0
acpi_button0: <Power Button> on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
pcib1: <ACPI PCI-PCI bridge> irq 16 at device 1.0 on pci0
pci1: <ACPI PCI bus> on pcib1
pci0: <display, VGA> at device 2.0 (no driver attached)
pcib2: <ACPI PCI-PCI bridge> irq 16 at device 28.0 on pci0
pci2: <ACPI PCI bus> on pcib2
bge0: <Broadcom BCM5751 Gigabit Ethernet, ASIC rev. 0x4001> mem 0xdfdf0000-0xdfdfffff irq 16 at device 0.0 on pci2
miibus0: <MII bus> on bge0
brgphy0: <BCM5750 10/100/1000baseTX PHY> on miibus0
brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, 1000baseTX-FDX, auto
bge0: Ethernet address: 00:11:11:c3:2c:91
pcib3: <ACPI PCI-PCI bridge> irq 17 at device 28.1 on pci0
pci3: <ACPI PCI bus> on pcib3
pci0: <serial bus, USB> at device 29.0 (no driver attached)
pci0: <serial bus, USB> at device 29.1 (no driver attached)
pci0: <serial bus, USB> at device 29.2 (no driver attached)
pci0: <serial bus, USB> at device 29.3 (no driver attached)
pci0: <serial bus, USB> at device 29.7 (no driver attached)
pcib4: <ACPI PCI-PCI bridge> at device 30.0 on pci0
pci4: <ACPI PCI bus> on pcib4
atapci0: <SiI 3112 SATA150 controller> port 0xdce0-0xdcef,0xdcb4-0xdcb7,0xdcc8-0xdccf,0xdcb0-0xdcb3,0xdcc0-0xdcc7 mem 0xdfaffc00-0xdfaffdff irq 17 at device 1.0 on pci4
ata2: channel #0 on atapci0
ata3: channel #1 on atapci0
atapci1: <SiI 3112 SATA150 controller> port 0xdcf0-0xdcff,0xdcbc-0xdcbf,0xdcd8-0xdcdf,0xdcb8-0xdcbb,0xdcd0-0xdcd7 mem 0xdfaffe00-0xdfafffff irq 18 at device 2.0 on pci4
ata4: channel #0 on atapci1
ata5: channel #1 on atapci1
isab0: <PCI-ISA bridge> at device 31.0 on pci0
isa0: <ISA bus> on isab0
atapci2: <Intel ICH6 UDMA100 controller> port 0xffa0-0xffaf,0x376,0x170-0x177,0x3f6,0x1f0-0x1f7 irq 16 at device 31.1 on pci0
ata0: channel #0 on atapci2
ata1: channel #1 on atapci2
atapci3: <Intel ICH6 SATA150 controller> port 0xfea0-0xfeaf,0xfe30-0xfe33,0xfe20-0xfe27,0xfe10-0xfe13,0xfe00-0xfe07 irq 20 at device 31.2 on pci0
ata6: channel #0 on atapci3
ata7: channel #1 on atapci3
ichsmb0: <SMBus controller> port 0xece0-0xecff irq 17 at device 31.3 on pci0
atkbdc0: <Keyboard controller (i8042)> port 0x64,0x60 irq 1 on acpi0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
sio0: type 16550A
pmtimer0 on isa0
orm0: <ISA Option ROMs> at iomem 0xcf800-0xcffff,0xce000-0xcf7ff,0xc9800-0xcdfff,0xc0000-0xc97ff on isa0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
ppc0: parallel port not found.
sio1: configured irq 3 not in bitmap of probed irqs 0
sio1: port may not be enabled
Timecounter "TSC" frequency 2527010839 Hz quality 800
Timecounters tick every 1.250 msec
IPsec: Initialized Security Association Processing.
ad0: 238475MB <WDC WD2500JD-22HBB0/08.02D08> [484521/16/63] at ata4-master SATA150
ad1: 238475MB <WDC WD2500JD-22HBB0/08.02D08> [484521/16/63] at ata5-master SATA150
ad2: 76319MB <WDC WD800JD-60JRA0/05.01C05> [155061/16/63] at ata6-master SATA150
ad3: 238475MB <WDC WD2500JD-00HBB0/08.02D08> [484521/16/63] at ata7-master SATA150
Mounting root from ufs:/dev/ad2s1a
WARNING: / was not properly dismounted
WARNING: /usr was not properly dismounted
WARNING: /var was not properly dismounted

Thanks very much,
-- 
Dominic
GoodforBusiness.co.uk
I.T. Services for SMEs in the UK.
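
A note on reading the trap output in the report above: the fault virtual address 0xc30f8000 is exactly page-aligned, and the fault code is a supervisor write to a page that is not present. That pattern is consistent with a sequential kernel write running off the end of a mapped buffer and touching the first byte of the next, unmapped page, although it does not prove it. The short Python sketch below only checks the alignment; the helper name `describe_fault` is hypothetical, and the 4096-byte page size is the i386 base page size.

```python
PAGE_SIZE = 4096  # i386 base page size in bytes


def describe_fault(addr: int, page_size: int = PAGE_SIZE) -> dict:
    """Split a faulting virtual address into its page base and in-page offset."""
    offset = addr & (page_size - 1)
    return {
        "page_base": addr - offset,
        "offset_in_page": offset,
        "page_aligned": offset == 0,
    }


# Fault virtual address copied from the panic message in the report.
fault = describe_fault(0xC30F8000)
print(hex(fault["page_base"]), fault["offset_in_page"], fault["page_aligned"])
```

An offset of zero means the very first byte of the page was the one that faulted, which is what a forward copy that overruns its buffer would hit first.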
Kris Kennaway
2005-Jun-29 17:14 UTC
graid3 + rsync + 5.4-STABLE repeatable panic (Fatal trap 12: page fault while in kernel mode)
On Wed, Jun 29, 2005 at 04:42:49PM +0100, Dominic Marks wrote:
> This may be because I compiled my debugging kernel after I had
> installed the system, although it should have been an identical
> source tree ... I'm currently rebuilding the system to
> the freshest available -STABLE in the hope that may give a
> full backtrace.

Yes, you can have this kind of problem if you compile a kernel.debug "after the fact" and it doesn't quite match.

Kris
Dominic Marks
2005-Jun-29 18:43 UTC
graid3 + rsync + 5.4-STABLE repeatable panic (Fatal trap 12: page fault while in kernel mode)
On Wednesday 29 June 2005 16:42, Dominic Marks wrote:
> Hello,
>
> I'm trying to use graid3 to create a raid volume from three
> 250GB SATA discs. I can successfully label, format, and mount
> the disc. The problem arises when I try and migrate some data
> on to the new volume. I'm using rsync to do this from over the
> local network, unfortunately this seems to be produce an
> immediate and reproduceable panic (hand copied):
>
> Fatal trap 12: page fault while in kernel mode
>
> fault virtual address = 0xc30f8000
> fault code = supervisor write, page not present
> instruction pointer = 0x8:0xc05e9783
> stack pointer = 0x10:0xd8030c38
> frame pointer = 0x10:0xd8030c80
> code segment = base 0x0, limit 0xfffff type 0x1b
> = DPL 0, pres 1, def32 1, gran 1
> processor eflags = interrupt enabled, resume, IOPL = 0
> current process = 617 (g_raid3 raid)
> trap number = 12
> panic: page fault

Having recompiled, I can no longer reproduce the panic. I think I may have caused it myself: I had forgotten that I had been tinkering with some values in sys/sys/param.h last week, but it didn't ring a bell when the system went down. I'd been running with MAXPHYS and DFLTPHYS at 256, and it seems graid3 does not like one of those parameters being raised. I suspect it's DFLTPHYS, and that perhaps graid3 depends on its value for some calculations. This is pure speculation.

My apologies for the incorrect report.

> Other programs (touch, ls, diskinfo, etc) do not seem to provoke
> the panic, but rsync will kill the system within a second.
>
> I got a dump (once), but I think it is corrupt in some way
> because I have not been able to get a backtrace or any other
> useful data from it.
>
> # kgdb kernel.debug /usr/crash/vmcore.0
> kgdb: kvm_read: invalid address (f9)
>
> (This line is printed again, and again, and again ...)
>
> This may be because I compiled my debugging kernel after I had
> installed the system, although it should have been an identical
> source tree ...
-- 
Dominic
GoodforBusiness.co.uk
I.T. Services for SMEs in the UK.
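
The arithmetic behind the speculation in this follow-up can be sketched roughly as follows. A RAID3 set keeps parity on one component and divides each request evenly across the remaining data components, so raising the system-wide transfer limit also raises the size of the pieces sent to each disc. The sketch assumes that "256" means 256 KiB and that the stock MAXPHYS on this branch is 128 KiB; both are assumptions that should be checked against sys/sys/param.h. The function name `per_component_bytes` is hypothetical, and nothing here identifies where in graid3 the fault actually occurred.

```python
KIB = 1024


def per_component_bytes(request_bytes: int, components: int) -> int:
    """Bytes sent to each data component for one request on a RAID3 set.

    One component holds parity; the request is divided evenly across the
    remaining (components - 1) data components.
    """
    data_components = components - 1
    if data_components < 1:
        raise ValueError("a RAID3 set needs at least one data component")
    if request_bytes % data_components:
        raise ValueError("request does not divide evenly across data components")
    return request_bytes // data_components


# Assumed limits: 128 KiB as the stock MAXPHYS, 256 KiB as the raised value.
for label, limit in (("stock", 128 * KIB), ("raised", 256 * KIB)):
    piece = per_component_bytes(limit, components=3)
    print(f"{label}: {limit // KIB} KiB request, {piece // KIB} KiB per data disc")
```

With three discs, a maximum-size request under the assumed stock limit yields 64 KiB per data disc, while the raised limit yields 128 KiB. Any buffer or calculation sized for the smaller figure would be exceeded by the larger one, which is one way a tuning change of this kind could surface as a page fault.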