Charles Sprickman
2010-Feb-26 08:36 UTC
7.2-p4: panic: ufsdirhash_lookup: bad offset in hash array
I have a box that has panicked two nights in a row with this error. I have a corefile from last night, but tonight's dump failed:

Uptime: 23h55m22s
Physical memory: 6130 MB
Dumping 759 MB: 744 728 712 696 680 664 648
** DUMP FAILED (ERROR 16) **

Here's some info from the core I do have:

#0 doadump () at pcpu.h:195
195 __asm __volatile("movq %%gs:0,%0" : "=r" (td));
(kgdb) where
#0 doadump () at pcpu.h:195
#1 0x0000000000000004 in ?? ()
#2 0xffffffff8034c799 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:418
#3 0xffffffff8034cba2 in panic (fmt=0x104 <Address 0x104 out of bounds>) at /usr/src/sys/kern/kern_shutdown.c:574
#4 0xffffffff8052545f in ufsdirhash_lookup (ip=0xffffff0012530398, name=0xffffff012788b000 "1266473205.M123372P75411V0000005BI08EE5F75_0.xena.bway.net,S=43650:2,S", namelen=70, offp=0xffffffff285f474c, bpp=0xffffffff285f4738, prevoffp=0x0) at /usr/src/sys/ufs/ufs/ufs_dirhash.c:599
#5 0xffffffff805278a0 in ufs_lookup (ap=0xffffffff285f4790) at /usr/src/sys/ufs/ufs/ufs_lookup.c:224
#6 0xffffffff803be024 in vfs_cache_lookup (ap=Variable "ap" is not available.) at vnode_if.h:83
#7 0xffffffff805a08bf in VOP_LOOKUP_APV (vop=0xffffffff807945c0, a=0xffffffff285f4850) at vnode_if.c:99
#8 0xffffffff803c4a4f in lookup (ndp=0xffffffff285f4960) at vnode_if.h:57
#9 0xffffffff803c58ba in namei (ndp=0xffffffff285f4960) at /usr/src/sys/kern/vfs_lookup.c:215
#10 0xffffffff803d2c94 in kern_lstat (td=0xffffff007693aa50, path=Variable "path" is not available.
) at /usr/src/sys/kern/vfs_syscalls.c:2184
#11 0xffffffff803d2f07 in lstat (td=Variable "td" is not available.
) at /usr/src/sys/kern/vfs_syscalls.c:2167
#12 0xffffffff80574e77 in syscall (frame=0xffffffff285f4c80) at /usr/src/sys/amd64/amd64/trap.c:900
#13 0xffffffff805598ab in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:330
#14 0x000000080071063c in ?? ()
Previous frame inner to this frame (corrupt stack?)
Before this, I had one panic while this box was being stress-tested, before it went into production. It's a new Dell box with a Dell/LSI RAID card (mfi driver). It's a mail server, and the ufs dirhash sysctl is pushed up to "vfs.ufs.dirhash_maxmem=33554432".

My earlier post about that panic, back in November, is here:

http://marc.info/?l=freebsd-stable&m=125901173424554&w=2

Before last night's crash, it was up for 93 days. Nothing has changed in the past few days as far as software or overall load. The crash did happen during or shortly after the daily periodic run.

Any interest in this one? Is it something to file a PR on?

dmesg is below...

Thanks,

Charles

___
Charles Sprickman
NetEng/SysAdmin
Bway.net - New York's Best Internet - www.bway.net
spork@bway.net - 212.655.9344

Copyright (c) 1992-2009 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 7.2-RELEASE-p4 #2: Mon Nov 2 21:55:12 EST 2009
spork@bigmail.bway.net:/usr/obj/usr/src/sys/BWAY7-64
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Quad-Core AMD Opteron(tm) Processor 2372 HE (2094.76-MHz K8-class CPU)
Origin = "AuthenticAMD" Id = 0x100f42 Stepping = 2
Features=0x178bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT>
Features2=0x802009<SSE3,MON,CX16,<b23>>
AMD Features=0xee500800<SYSCALL,NX,MMX+,FFXSR,Page1GB,RDTSCP,LM,3DNow!+,3DNow!>
AMD Features2=0x37ff<LAHF,CMP,SVM,ExtAPIC,CR8,<b5>,<b6>,<b7>,Prefetch,<b9>,<b10>,<b12>,<b13>>
TSC: P-state invariant
Cores per package: 4
usable memory = 6427951104 (6130 MB)
avail memory = 6198829056 (5911 MB)
ACPI APIC Table: <DELL PE_SC3 >
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
cpu0 (BSP): APIC ID: 0
cpu1 (AP): APIC ID: 1
cpu2 (AP): APIC ID: 2
cpu3 (AP): APIC ID: 3
ioapic0: Changing APIC ID to 4
ioapic1: Changing APIC ID to 5
ioapic2: Changing APIC ID to 6
MADT: Forcing active-low polarity and level trigger for SCI
ioapic0 <Version 1.1> irqs 0-15 on motherboard
ioapic1 <Version 1.1> irqs 32-47 on motherboard
ioapic2 <Version 1.1> irqs 64-79 on motherboard
kbd0 at kbdmux0
acpi0: <DELL PE_SC3> on motherboard
acpi0: [ITHREAD]
acpi0: Power Button (fixed)
ipmi0: KCS mode found at io 0xca8 on acpi
Timecounter "ACPI-safe" frequency 3579545 Hz quality 850
acpi_timer0: <32-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
acpi_hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff on acpi0
Timecounter "HPET" frequency 14318180 Hz quality 900
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
pcib1: <ACPI PCI-PCI bridge> at device 1.0 on pci0
pci8: <ACPI PCI bus> on pcib1
pcib2: <ACPI PCI-PCI bridge> at device 13.0 on pci8
pci9: <ACPI PCI bus> on pcib2
atapci0: <ServerWorks HT1000 SATA150 controller> port 0xdcb0-0xdcb7,0xdca0-0xdca3,0xdcb8-0xdcbf,0xdca4-0xdca7,0xdce0-0xdcef mem 0xee2fe000-0xee2fffff irq 11 at device 14.0 on pci8
atapci0: [ITHREAD]
ata2: <ATA channel 0> on atapci0
ata2: [ITHREAD]
ata3: <ATA channel 1> on atapci0
ata3: [ITHREAD]
ata4: <ATA channel 2> on atapci0
ata4: [ITHREAD]
ata5: <ATA channel 3> on atapci0
ata5: [ITHREAD]
isab0: <PCI-ISA bridge> at device 2.2 on pci0
isa0: <ISA bus> on isab0
ohci0: <OHCI (generic) USB controller> port 0xc000-0xc0ff mem 0xee0ed000-0xee0edfff irq 11 at device 3.0 on pci0
ohci0: [GIANT-LOCKED]
ohci0: [ITHREAD]
usb0: OHCI version 1.0, legacy support
usb0: SMM does not respond, resetting
usb0: <OHCI (generic) USB controller> on ohci0
usb0: USB revision 1.0
uhub0: <(0x1166) OHCI root hub, class 9/0, rev 1.00/1.00, addr 1> on usb0
uhub0: 2 ports with 2 removable, self powered
ohci1: <OHCI (generic) USB controller> port 0xc400-0xc4ff mem 0xee0ee000-0xee0eefff irq 11 at device 3.1 on pci0
ohci1: [GIANT-LOCKED]
ohci1: [ITHREAD]
usb1: OHCI version 1.0, legacy support
usb1: <OHCI (generic) USB controller> on ohci1
usb1: USB revision 1.0
uhub1: <(0x1166) OHCI root hub, class 9/0, rev 1.00/1.00, addr 1> on usb1
uhub1: 2 ports with 2 removable, self powered
ehci0: <EHCI (generic) USB 2.0 controller> port 0xc800-0xc8ff mem 0xee0ef000-0xee0effff irq 11 at device 3.2 on pci0
ehci0: [GIANT-LOCKED]
ehci0: [ITHREAD]
usb2: EHCI version 1.0
usb2: companion controllers, 2 ports each: usb0 usb1
usb2: <EHCI (generic) USB 2.0 controller> on ehci0
usb2: USB revision 2.0
uhub2: <(0x1166) EHCI root hub, class 9/0, rev 2.00/1.00, addr 1> on usb2
uhub2: 4 ports with 4 removable, self powered
uhub3: <vendor 0x04b4 product 0x6560, class 9/0, rev 2.00/0.0b, addr 2> on uhub2
uhub3: multiple transaction translators
uhub3: 4 ports with 4 removable, self powered
uhub4: <vendor 0x04b4 product 0x6560, class 9/0, rev 2.00/0.0b, addr 3> on uhub3
uhub4: multiple transaction translators
uhub4: 4 ports with 4 removable, self powered
vgapci0: <VGA-compatible display> port 0xcc00-0xccff mem 0xe0000000-0xe7ffffff,0xee0f0000-0xee0fffff irq 39 at device 4.0 on pci0
pcib3: <ACPI PCI-PCI bridge> irq 32 at device 7.0 on pci0
pci10: <ACPI PCI bus> on pcib3
pcib4: <ACPI PCI-PCI bridge> irq 33 at device 8.0 on pci0
pci11: <ACPI PCI bus> on pcib4
pcib5: <ACPI PCI-PCI bridge> irq 37 at device 9.0 on pci0
pci1: <ACPI PCI bus> on pcib5
pcib6: <ACPI PCI-PCI bridge> irq 37 at device 0.0 on pci1
pci2: <ACPI PCI bus> on pcib6
pcib7: <ACPI PCI-PCI bridge> irq 37 at device 1.0 on pci2
pci3: <ACPI PCI bus> on pcib7
pcib8: <PCI-PCI bridge> at device 0.0 on pci3
pci4: <PCI bus> on pcib8
bce0: <Broadcom NetXtreme II BCM5708 1000Base-T (B2)> mem 0xec000000-0xedffffff irq 37 at device 0.0 on pci4
miibus0: <MII bus> on bce0
brgphy0: <BCM5708C 10/100/1000baseTX PHY> PHY 1 on miibus0
brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
bce0: Ethernet address: 00:22:19:64:e4:d1
bce0: [ITHREAD]
bce0: ASIC (0x57081020); Rev (B2); Bus (PCI-X, 64-bit, 133MHz); B/C (0x05000405); Flags( MFW MSI )
pcib9: <ACPI PCI-PCI bridge> irq 37 at device 2.0 on pci2
pci5: <ACPI PCI bus> on pcib9
pcib10: <PCI-PCI bridge> at device 0.0 on pci5
pci6: <PCI bus> on pcib10
bce1: <Broadcom NetXtreme II BCM5708 1000Base-T (B2)> mem 0xea000000-0xebffffff irq 37 at device 0.0 on pci6
miibus1: <MII bus> on bce1
brgphy1: <BCM5708C 10/100/1000baseTX PHY> PHY 1 on miibus1
brgphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
bce1: Ethernet address: 00:22:19:64:e4:d3
bce1: [ITHREAD]
bce1: ASIC (0x57081020); Rev (B2); Bus (PCI-X, 64-bit, 133MHz); B/C (0x05000405); Flags( MFW MSI )
pcib11: <ACPI PCI-PCI bridge> irq 37 at device 3.0 on pci2
pci7: <ACPI PCI bus> on pcib11
mfi0: <Dell PERC 6> port 0xec00-0xecff mem 0xe9f80000-0xe9fbffff,0xe9fc0000-0xe9ffffff irq 37 at device 0.0 on pci7
mfi0: Megaraid SAS driver Ver 3.00
mfi0: 3047 (320486575s/0x0020/info) - Shutdown command received from host
mfi0: 3048 (boot + 3s/0x0020/info) - Firmware initialization started (PCI ID 0060/1000/1f0c/1028)
mfi0: 3049 (boot + 3s/0x0020/info) - Firmware version 1.22.02-0612
mfi0: 3050 (boot + 23s/0x0008/info) - Battery Present
mfi0: 3051 (boot + 23s/0x0020/info) - Controller hardware revision ID (0x0)
mfi0: 3052 (boot + 23s/0x0020/info) - Package version 6.2.0-0013
mfi0: 3053 (boot + 23s/0x0020/info) - Board Revision
mfi0: 3054 (boot + 37s/0x0004/info) - Enclosure PD 20(c None/p0) communication restored
mfi0: 3055 (boot + 38s/0x0002/info) - Inserted: Encl PD 20
mfi0: 3056 (boot + 38s/0x0002/info) - Inserted: PD 20(c None/p0) Info: enclPd=20, scsiType=d, portMap=09, sasAddr=5002408070005000,0000000000000000
mfi0: 3057 (boot + 38s/0x0002/info) - Inserted: PD 00(e0x20/s0)
mfi0: 3058 (boot + 38s/0x0002/info) - Inserted: PD 00(e0x20/s0) Info: enclPd=20, scsiType=0, portMap=00, sasAddr=1221000000000000,0000000000000000
mfi0: 3059 (boot + 38s/0x0002/WARN) - PD 00(e0x20/s0) is not a certified drive
mfi0: 3060 (boot + 38s/0x0002/info) - Inserted: PD 01(e0x20/s1)
mfi0: 3061 (boot + 38s/0x0002/info) - Inserted: PD 01(e0x20/s1) Info: enclPd=20, scsiType=0, portMap=01, sasAddr=1221000001000000,0000000000000000
mfi0: 3062 (boot + 38s/0x0002/WARN) - PD 01(e0x20/s1) is not a certified drive
mfi0: 3063 (boot + 38s/0x0002/info) - Inserted: PD 02(e0x20/s2)
mfi0: 3064 (boot + 38s/0x0002/info) - Inserted: PD 02(e0x20/s2) Info: enclPd=20, scsiType=0, portMap=02, sasAddr=1221000002000000,0000000000000000
mfi0: 3065 (boot + 38s/0x0002/WARN) - PD 02(e0x20/s2) is not a certified drive
mfi0: 3066 (boot + 38s/0x0002/info) - Inserted: PD 03(e0x20/s3)
mfi0: 3067 (boot + 38s/0x0002/info) - Inserted: PD 03(e0x20/s3) Info: enclPd=20, scsiType=0, portMap=03, sasAddr=1221000003000000,0000000000000000
mfi0: 3068 (boot + 38s/0x0002/WARN) - PD 03(e0x20/s3) is not a certified drive
mfi0: 3069 (boot + 38s/0x0002/info) - Inserted: PD 04(e0x20/s4)
mfi0: 3070 (boot + 38s/0x0002/info) - Inserted: PD 04(e0x20/s4) Info: enclPd=20, scsiType=0, portMap=04, sasAddr=1221000004000000,0000000000000000
mfi0: 3071 (boot + 38s/0x0002/WARN) - PD 04(e0x20/s4) is not a certified drive
mfi0: 3072 (boot + 39s/0x0042/info) - Global Hot Spare created on PD 04(e0x20/s4) (global,rev)
mfi0: 3073 (boot + 39s/0x0020/info) - Cache data recovered successfully
mfi0: 3074 (320486641s/0x0020/info) - Time established as 02/26/10 8:04:01; (40 seconds since power on)
mfi0: 3075 (320486689s/0x0008/info) - Battery temperature is normal
mfi0: [ITHREAD]
pcib12: <ACPI PCI-PCI bridge> irq 35 at device 10.0 on pci0
pci12: <ACPI PCI bus> on pcib12
pcib13: <PCI-PCI bridge> irq 36 at device 11.0 on pci0
pci13: <PCI bus> on pcib13
fdc0: <floppy drive controller> port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on acpi0
fdc0: does not respond
device_attach: fdc0 attach returned 6
sio0: configured irq 4 not in bitmap of probed irqs 0
sio0: port may not be enabled
sio0: configured irq 4 not in bitmap of probed irqs 0
sio0: port may not be enabled
sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
sio0: type 16550A, console
sio0: [FILTER]
sio1: configured irq 3 not in bitmap of probed irqs 0
sio1: port may not be enabled
sio1: configured irq 3 not in bitmap of probed irqs 0
sio1: port may not be enabled
sio1: <16550A-compatible COM port> port 0x2f8-0x2ff irq 3 on acpi0
sio1: type 16550A
sio1: [FILTER]
cpu0: <ACPI CPU> on acpi0
cpu1: <ACPI CPU> on acpi0
cpu2: <ACPI CPU> on acpi0
cpu3: <ACPI CPU> on acpi0
fdc0: <floppy drive controller> port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on acpi0
fdc0: does not respond
device_attach: fdc0 attach returned 6
ipmi1: <IPMI System Interface> on isa0
device_attach: ipmi1 attach returned 16
orm0: <ISA Option ROMs> at iomem 0xc0000-0xc8fff,0xec000-0xeffff on isa0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
Timecounters tick every 1.000 msec
acd0: DVDR <TSSTcorp DVD+/-RW TS-L633C/D150> at ata3-master SATA150
ipmi0: IPMI device rev. 0, firmware rev. 2.2, version 2.0
ipmi0: Number of channels 4
ipmi0: Attached watchdog
mfi0: 3076 (320486689s/0x0008/info) - Current capacity of the battery is above threshold
mfid0: <MFI Logical Disk> on mfi0
mfid0: 1429760MB (2928148480 sectors) RAID volume 'Virtual Disk 0' is optimal
SMP: AP CPU #3 Launched!
SMP: AP CPU #2 Launched!
SMP: AP CPU #1 Launched!
GEOM_LABEL: Label for provider mfid0s1a is ufsid/4aef7c53bd684719.
GEOM_LABEL: Label for provider mfid0s1d is ufsid/4aef7c70156cf640.
GEOM_LABEL: Label for provider mfid0s1e is ufsid/4aef7c70118f4010.
GEOM_LABEL: Label for provider mfid0s1f is ufsid/4aef7c71b1b1ea99.
GEOM_LABEL: Label for provider mfid0s1g is ufsid/4aef7c53dcd24191.
Trying to mount root from ufs:/dev/mfid0s1a
WARNING: / was not properly dismounted
GEOM_LABEL: Label ufsid/4aef7c53bd684719 removed.
GEOM_LABEL: Label for provider mfid0s1a is ufsid/4aef7c53bd684719.
GEOM_LABEL: Label ufsid/4aef7c53dcd24191 removed.
mfi0: 3077 (320486754s/0x0008/info) - Battery started charging
GEOM_LABEL: Label for provider mfid0s1g is ufsid/4aef7c53dcd24191.
GEOM_LABEL: Label ufsid/4aef7c70156cf640 removed.
GEOM_LABEL: Label for provider mfid0s1d is ufsid/4aef7c70156cf640.
GEOM_LABEL: Label ufsid/4aef7c70118f4010 removed.
GEOM_LABEL: Label for provider mfid0s1e is ufsid/4aef7c70118f4010.
GEOM_LABEL: Label ufsid/4aef7c71b1b1ea99 removed.
GEOM_LABEL: Label for provider mfid0s1f is ufsid/4aef7c71b1b1ea99.
GEOM_LABEL: Label ufsid/4aef7c53bd684719 removed.
GEOM_LABEL: Label ufsid/4aef7c53dcd24191 removed.
GEOM_LABEL: Label ufsid/4aef7c70156cf640 removed.
GEOM_LABEL: Label ufsid/4aef7c70118f4010 removed.
GEOM_LABEL: Label ufsid/4aef7c71b1b1ea99 removed.
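
For anyone reading the numbers in this report, the sizes quoted above can be sanity-checked with a few lines of Python. This is only a sketch: the helper name bytes_to_mib is made up for illustration, and treating the number in "DUMP FAILED (ERROR 16)" as an ordinary errno value is an assumption, not something the report confirms.

```python
import errno

MIB = 1024 * 1024  # bytes per mebibyte


def bytes_to_mib(n_bytes):
    """Convert a byte count to whole MiB, truncating any remainder."""
    return n_bytes // MIB


# Values copied from the report above.
dirhash_maxmem = 33554432   # vfs.ufs.dirhash_maxmem
usable_memory = 6427951104  # "usable memory" line in the dmesg
avail_memory = 6198829056   # "avail memory" line in the dmesg

print("dirhash_maxmem:", bytes_to_mib(dirhash_maxmem), "MiB")
print("usable memory: ", bytes_to_mib(usable_memory), "MB in dmesg terms")
print("avail memory:  ", bytes_to_mib(avail_memory), "MB in dmesg terms")

# Assumption: the dump failure code is a standard errno value.
dump_error = 16
print("dump error:", errno.errorcode.get(dump_error, "unknown errno"))
```

With the figures from the report, the dirhash limit works out to 32 MiB, against the 6130 MB of usable memory that the dmesg prints.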