John L. Templer
2008-Sep-27 18:48 UTC
7.1-PRERELEASE sporadically panicking with fatal trap 12
I'm running 7.1-PRERELEASE, with /usr/src and /usr/ports last csup-ed just a few days ago. After being up for about a day or so the system will panic because of a page fault. I'm not completely sure, but it seems that the system is more stable when gdm and gnome are disabled in rc.conf. At least it stayed up for several days when I did that.

I've run memtest several times, so I'm pretty confident it's not a memory problem. Also the stack trace is always the same, so I'm thinking it's not hardware related.

I've attached a stack trace from kgdb, and the output from dmesg. I'd appreciate any help you could give me with this.

-------------- next part --------------
/var/crash# kgdb -n 5
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-marcel-freebsd"...

Unread portion of the kernel message buffer:
acd1: WARNING - READ_TOC read data overrun 18>12

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0x188
fault code = supervisor read, page not present
instruction pointer = 0x20:0xc0782714
stack pointer = 0x28:0xe52aec00
frame pointer = 0x28:0xe52aec18
code segment = base 0x0, limit 0xfffff, type 0x1b
             = DPL 0, pres 1, def32 1, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 18 (swi6: task queue)
trap number = 12
panic: page fault
cpuid = 0
Uptime: 8h10m38s
Physical memory: 1779 MB
Dumping 195 MB: 180 164 148 132 116 100 84 68 52 36 20 4

Reading symbols from /boot/kernel/sound.ko...Reading symbols from /boot/kernel/sound.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/sound.ko
Reading symbols from /boot/kernel/snd_cmi.ko...Reading symbols from /boot/kernel/snd_cmi.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/snd_cmi.ko
Reading symbols from /boot/kernel/acpi.ko...Reading symbols from /boot/kernel/acpi.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/acpi.ko
Reading symbols from /boot/kernel/linux.ko...Reading symbols from /boot/kernel/linux.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/linux.ko
Reading symbols from /usr/local/modules/fuse.ko...done.
Loaded symbols for /usr/local/modules/fuse.ko
Reading symbols from /boot/kernel/mach64.ko...Reading symbols from /boot/kernel/mach64.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/mach64.ko
Reading symbols from /boot/kernel/drm.ko...Reading symbols from /boot/kernel/drm.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/drm.ko
#0 doadump () at pcpu.h:196
196 pcpu.h: No such file or directory.
        in pcpu.h
(kgdb) backtrace
#0 doadump () at pcpu.h:196
#1 0xc078fae7 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:418
#2 0xc078fda9 in panic (fmt=Variable "fmt" is not available.
) at /usr/src/sys/kern/kern_shutdown.c:572
#3 0xc0aa174c in trap_fatal (frame=0xe52aebc0, eva=392) at /usr/src/sys/i386/i386/trap.c:939
#4 0xc0aa19d0 in trap_pfault (frame=0xe52aebc0, usermode=0, eva=392) at /usr/src/sys/i386/i386/trap.c:852
#5 0xc0aa238c in trap (frame=0xe52aebc0) at /usr/src/sys/i386/i386/trap.c:530
#6 0xc0a8827b in calltrap () at /usr/src/sys/i386/i386/exception.s:159
#7 0xc0782714 in _mtx_lock_sleep (m=0xc4ff804c, tid=3302734576, opts=0, file=0x0, line=0) at /usr/src/sys/kern/kern_mutex.c:339
#8 0xc078ed66 in _sema_post (sema=0xc4ff804c, file=0x0, line=0) at /usr/src/sys/kern/kern_sema.c:79
#9 0xc0513350 in ata_completed (context=0xc4ff8000, dummy=1) at /usr/src/sys/dev/ata/ata-queue.c:481
#10 0xc07c2e15 in taskqueue_run (queue=0xc4dbab80) at /usr/src/sys/kern/subr_taskqueue.c:282
#11 0xc07c3123 in taskqueue_swi_run (dummy=0x0) at /usr/src/sys/kern/subr_taskqueue.c:324
#12 0xc076f8db in ithread_loop (arg=0xc4dadb30) at /usr/src/sys/kern/kern_intr.c:1088
#13 0xc076c449 in fork_exit (callout=0xc076f720 <ithread_loop>, arg=0xc4dadb30, frame=0xe52aed38) at /usr/src/sys/kern/kern_fork.c:804
---Type <return> to continue, or q <return> to quit---
#14 0xc0a882f0 in fork_trampoline () at /usr/src/sys/i386/i386/exception.s:264
(kgdb) up 7
#7 0xc0782714 in _mtx_lock_sleep (m=0xc4ff804c, tid=3302734576, opts=0, file=0x0, line=0) at /usr/src/sys/kern/kern_mutex.c:339
339                     owner = (struct thread *)(v & ~MTX_FLAGMASK);
(kgdb) list
334              * If the owner is running on another CPU, spin until the
335              * owner stops running or the state of the lock changes.
336              */
337             v = m->mtx_lock;
338             if (v != MTX_UNOWNED) {
339                     owner = (struct thread *)(v & ~MTX_FLAGMASK);
340     #ifdef ADAPTIVE_GIANT
341                     if (TD_IS_RUNNING(owner)) {
342     #else
343                     if (m != &Giant && TD_IS_RUNNING(owner)) {
(kgdb) q
/var/crash# dmesg
Copyright (c) 1992-2008 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California.
All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 7.1-PRERELEASE #3: Tue Sep 23 00:01:44 EDT 2008
    root@tigger:/usr/obj/usr/src/sys/GENERIC
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: AMD Athlon(TM) XP2000+ (1666.74-MHz 686-class CPU)
  Origin = "AuthenticAMD" Id = 0x662 Stepping = 2
  Features=0x383f9ff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE>
  AMD Features=0xc0400800<SYSCALL,MMX+,3DNow!+,3DNow!>
real memory = 1878966272 (1791 MB)
avail memory = 1828048896 (1743 MB)
kbd1 at kbdmux0
ath_hal: 0.9.20.3 (AR5210, AR5211, AR5212, RF5111, RF5112, RF2413, RF5413)
acpi0: <ASUS A7V266-E> on motherboard
acpi0: [ITHREAD]
acpi0: Power Button (fixed)
acpi0: reservation of 0, a0000 (3) failed
acpi0: reservation of 100000, 6ff00000 (3) failed
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0xe408-0xe40b on acpi0
acpi_button0: <Power Button> on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
agp0: <VIA 8367 (KT266/KY266x/KT333) host to PCI bridge> on hostb0
agp0: aperture size is 256M
pcib1: <ACPI PCI-PCI bridge> at device 1.0 on pci0
pci1: <ACPI PCI bus> on pcib1
pcm0: <CMedia CMI8738> port 0xd800-0xd8ff irq 10 at device 5.0 on pci0
pcm0: [ITHREAD]
ohci0: <NEC uPD 9210 USB controller> mem 0xf9000000-0xf9000fff irq 5 at device 12.0 on pci0
ohci0: [GIANT-LOCKED]
ohci0: [ITHREAD]
usb0: OHCI version 1.0, legacy support
usb0: <NEC uPD 9210 USB controller> on ohci0
usb0: USB revision 1.0
uhub0: <NEC OHCI root hub, class 9/0, rev 1.00/1.00, addr 1> on usb0
uhub0: 3 ports with 3 removable, self powered
ohci1: <NEC uPD 9210 USB controller> mem 0xf8800000-0xf8800fff irq 11 at device 12.1 on pci0
ohci1: [GIANT-LOCKED]
ohci1: [ITHREAD]
usb1: OHCI version 1.0, legacy support
usb1: <NEC uPD 9210 USB controller> on ohci1
usb1: USB revision 1.0
uhub1: <NEC OHCI root hub, class 9/0, rev 1.00/1.00, addr 1> on usb1
uhub1: 2 ports with 2 removable, self powered
ehci0: <NEC uPD 720100 USB 2.0 controller> mem 0xf8000000-0xf80000ff irq 10 at device 12.2 on pci0
ehci0: [GIANT-LOCKED]
ehci0: [ITHREAD]
usb2: EHCI version 1.0
usb2: companion controllers, 3 ports each: usb0 usb1
usb2: <NEC uPD 720100 USB 2.0 controller> on ehci0
usb2: USB revision 2.0
uhub2: <NEC EHCI root hub, class 9/0, rev 2.00/1.00, addr 1> on usb2
uhub2: 5 ports with 5 removable, self powered
umass0: <Sunplus Technology Inc. USB to Serial-ATA bridge, class 0/0, rev 2.00/c6.83, addr 2> on uhub2
uhub3: <vendor 0x050d product 0x0304, class 9/0, rev 2.00/7.02, addr 3> on uhub2
uhub3: single transaction translator
uhub3: 4 ports with 4 removable, self powered
dc0: <ADMtek AN985 10/100BaseTX> port 0xd400-0xd4ff mem 0xf7800000-0xf78003ff irq 11 at device 13.0 on pci0
miibus0: <MII bus> on dc0
ukphy0: <Generic IEEE 802.3u media interface> PHY 1 on miibus0
ukphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
dc0: Ethernet address: 00:14:bf:5d:74:f6
dc0: [ITHREAD]
vgapci0: <VGA-compatible display> port 0xd000-0xd0ff mem 0xfa000000-0xfaffffff,0xf7000000-0xf7000fff irq 10 at device 14.0 on pci0
ahc0: <Adaptec 2940 Ultra2 SCSI adapter> port 0xb800-0xb8ff mem 0xf6800000-0xf6800fff irq 15 at device 15.0 on pci0
ahc0: [ITHREAD]
aic7890/91: Ultra2 Wide Channel A, SCSI Id=7, 32/253 SCBs
isab0: <PCI-ISA bridge> at device 17.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <VIA 8233 UDMA100 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xb400-0xb40f at device 17.1 on pci0
ata0: <ATA channel 0> on atapci0
ata0: [ITHREAD]
ata1: <ATA channel 1> on atapci0
ata1: [ITHREAD]
fdc0: <floppy drive controller> port 0x3f2-0x3f5,0x3f7 irq 6 drq 2 on acpi0
fdc0: [FILTER]
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
sio0: type 16550A
sio0: [FILTER]
sio1: <16550A-compatible COM port> port 0x2f8-0x2ff irq 3 on acpi0
sio1: type 16550A
sio1: [FILTER]
atkbdc0: <Keyboard controller (i8042)> port 0x60,0x64 irq 1 on acpi0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
atkbd0: [ITHREAD]
psm0: <PS/2 Mouse> flags 0x200 irq 12 on atkbdc0
psm0: [GIANT-LOCKED]
psm0: [ITHREAD]
psm0: model Generic PS/2 mouse, device ID 0
cpu0: <ACPI CPU> on acpi0
pmtimer0 on isa0
orm0: <ISA Option ROMs> at iomem 0xc0000-0xc7fff,0xc8000-0xcd7ff pnpid ORM0000 on isa0
ppc0: <Parallel port> at port 0x378-0x37f irq 7 on isa0
ppc0: SMC-like chipset (ECP/EPP/PS2/NIBBLE) in COMPATIBLE mode
ppc0: FIFO with 16/16/9 bytes threshold
ppbus0: <Parallel port bus> on ppc0
ppbus0: [ITHREAD]
plip0: <PLIP network interface> on ppbus0
plip0: WARNING: using obsoleted IFF_NEEDSGIANT flag
lpt0: <Printer> on ppbus0
lpt0: Interrupt-driven port
ppi0: <Parallel I/O> on ppbus0
ppc0: [GIANT-LOCKED]
ppc0: [ITHREAD]
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
ugen0: <American Power Conversion Back-UPS RS 1500 LCD FW:839.H7 .D USB FW:H7, class 0/0, rev 1.10/1.01, addr 2> on uhub0
ulpt0: <Hewlett-Packard DeskJet 840C, class 0/0, rev 1.00/1.00, addr 2> on uhub1
ulpt0: using bi-directional mode
Timecounter "TSC" frequency 1666738536 Hz quality 800
Timecounters tick every 1.000 msec
acd0: DVDROM <LITE-ON DVD SHD-16P1S/GS03> at ata0-master UDMA33
acd1: DVDR <HL-DT-STDVD-RAM GSA-H22N/1.01> at ata0-slave UDMA66
Waiting 5 seconds for SCSI devices to settle
da1 at ahc0 bus 0 target 0 lun 0
da1: <QUANTUM ATLAS_V__9_WLS 0230> Fixed Direct Access SCSI-3 device
da1: 40.000MB/s transfers (20.000MHz, offset 63, 16bit)
da1: Command Queueing Enabled
da1: 8755MB (17930694 512 byte sectors: 255H 63S/T 1116C)
da2 at ahc0 bus 0 target 12 lun 0
da2: <MAXTOR ATLAS10K4_36WLS DFV0> Fixed Direct Access SCSI-3 device
da2: 40.000MB/s transfers (20.000MHz, offset 127, 16bit)
da2: Command Queueing Enabled
da2: 35074MB (71833096 512 byte sectors: 255H 63S/T 4471C)
cd0 at ahc0 bus 0 target 6 lun 0
cd0: <PLEXTOR CD-R PX-W4012S 1.04> Removable CD-ROM SCSI-2 device
cd0: 3.300MB/s transfers
cd0: Attempt to query device size failed: NOT READY, Medium not present - tray closed
da0 at umass-sim0 bus 0 target 0 lun 0
da0: <ST332062 0AS > Fixed Direct Access SCSI-2 device
da0: 40.000MB/s transfers
da0: 305245MB (625142448 512 byte sectors: 255H 63S/T 38913C)
GEOM_LABEL: Label for provider da0s2 is msdosfs/ .
Trying to mount root from ufs:/dev/da2s4a
WARNING: / was not properly dismounted
WARNING: /tmp was not properly dismounted
WARNING: /usr was not properly dismounted
WARNING: /var was not properly dismounted
fuse4bsd: version 0.3.9-pre1, FUSE ABI 7.8
drm0: <3D Rage Pro 215GP> on vgapci0
info: [drm] Initialized mach64 1.0.0 20020904

On Saturday 27 September 2008 02:37:55 pm John L. Templer wrote:
> I'm running 7.1-PRERELEASE, with /usr/src and /usr/ports last csup-ed
> just a few days ago. After being up for about a day or so the system
> will panic because of a page fault. I'm not completely sure, but it
> seems that the system is more stable when gdm and gnome are disabled in
> rc.conf. At least it stayed up for several days when I did that.
>
> I've run memtest several times, so I'm pretty confident it's not a
> memory problem. Also the stack trace is always the same, so I'm
> thinking it's not hardware related.
>
> I've attached a stack trace from kgdb, and the output from dmesg. I'd
> appreciate any help you could give me with this.

Generally when I see this panic (at this source line in mtx_lock() and with an offset of 0x188 or 0x18c), it is because the mutex has been destroyed (mtx_lock == 6, i.e. MTX_DESTROYED). In this case, since you got it in a task, I'm guessing ata destroyed a structure w/o draining the task, so the task executed after the structure containing the mutex was destroyed.

--
John Baldwin