Scott Lambert
2008-Nov-24 20:59 UTC
FreeBSD 7.0-STABLE Jul 23: panic: ffs_blkfree: freeing free frag
I have a box I am using for hosting jailed web servers. I did a test move of a jail from a FreeBSD 6 box to the FreeBSD 7 server, web1.hosting. It took forever, 30 minutes to be exact, to create the jail with the 3GB image file and restore the data from the FreeBSD 6 box into it. I created a test archive of the running jail with ezjail-admin on the FreeBSD 6 box and scp'd it to web1.hosting. That took 5 minutes. (I was timing all of this to estimate how long it would really take later.) Once the archive was on web1.hosting, I created the new jail using the archive to populate it. sudo ezjail-admin create -a test_host_tcworks_net-200811241856.40.tar.gz \ -s 3G -i testhost.tcworks.net 192.168.1.238 That step took 40 minutes. According to 'systat -vm 1', da0 tended to show around 90% utilization, da1 was about 23% and MB/s was about 1.6 for both during the creation of the jail. After about 20 to 40 minutes of ensuring that the jail was working properly with the compat6x libs, I decided to erase the test jail and get ready for doing the transfer for real during the next maintenance window. Just before the box stopped responding to me, I had run: sudo ezjail-admin delete -w testhost.tcworks.net It might have been about 30 seconds after that I noticed it wasn't responding. According to Nagios, it took about 25 minutes to panic, reboot, fsck and come back up. Funny, it felt a lot longer. The gmirror is currently degraded and 'systat -vm 1' is showing 98% utilization on da0 and 23% utilization on da1 with 35 to 50MB/s on both da0 and da1. I hadn't looked at the mirror status before the crash. 21:42:09 Mon Nov 24 $ gmirror status Name Status Components mirror/gm0 DEGRADED da0 (84%) da1 I think I'll wait for it to complete the rebuild before I put any disk load on it looking for when it degraded, if not during the crash. 21:53:34 Mon Nov 24 # gmirror status Name Status Components mirror/gm0 COMPLETE da0 da1 The disks show to be quiet in systat, as expected. I don't find any messages except for when it booted up. I think the mirror was whole before the crash. The console log files go back to July 21 2008. The messages log files only go back to Nov 22. I need to fix that. The syslog messages about gm0, the kgdb output, and /var/run/dmesg.boot are below. If you want anything else, please let me know. 22:15:02 Mon Nov 24 $ gmirror list Geom name: gm0 State: COMPLETE Components: 2 Balance: round-robin Slice: 4096 Flags: NONE GenID: 0 SyncID: 1 ID: 149269652 Providers: 1. Name: mirror/gm0 Mediasize: 146815737344 (137G) Sectorsize: 512 Mode: r6w6e7 Consumers: 1. Name: da0 Mediasize: 146815737856 (137G) Sectorsize: 512 Mode: r1w1e1 State: ACTIVE Priority: 0 Flags: DIRTY GenID: 0 SyncID: 1 ID: 779766152 2. Name: da1 Mediasize: 146815737856 (137G) Sectorsize: 512 Mode: r1w1e1 State: ACTIVE Priority: 0 Flags: DIRTY GenID: 0 SyncID: 1 ID: 1224070577 21:58:06 Mon Nov 24 $ sudo cat /var/log/console.log | grep gm0 Nov 24 21:01:23 web1 kernel: kernel dumps on /dev/mirror/gm0s1b Nov 24 21:01:23 web1 kernel: swapon: adding /dev/mirror/gm0s1b as swap device Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1a: 3505 files, 133813 used, 120002 free (2498 frags, 14688 blocks, 1.0% fragmentation) Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1g: UNREF FILE I=967747 OWNER=root MODE=100644 Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1g: SIZE=1073741824 MTIME=Oct 15 18:51 2008 (CLEARED) Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1g: UNREF FILE I=967754 OWNER=root MODE=100644 Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1g: SIZE=1073741824 MTIME=Oct 15 19:02 2008 (CLEARED) Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1g: UNREF FILE I=1978373 OWNER=root MODE=100644 Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1g: SIZE=3221225472 MTIME=Nov 24 20:42 2008 (CLEARED) Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1g: ZERO LENGTH DIR I=1978491 OWNER=root MODE=40755 Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1g: SIZE=0 MTIME=Oct 28 18:38 2008 (CLEARED) Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1g: UNREF FILE I=1978492 OWNER=root MODE=100644 Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1g: SIZE=1073741824 MTIME=Oct 15 18:58 2008 (CLEARED) Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1g: UNREF FILE I=2596868 OWNER=root MODE=100644 Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1g: SIZE=0 MTIME=Nov 22 02:54 2008 (CLEARED) Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1g: UNREF FILE I=3109973 OWNER=mysql MODE=100600 Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1g: SIZE=0 MTIME=Nov 22 02:54 2008 (CLEARED) Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1g: UNREF FILE I=3109974 OWNER=mysql MODE=100600 Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1g: SIZE=0 MTIME=Nov 22 02:54 2008 (CLEARED) Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1g: UNREF FILE I=3109975 OWNER=mysql MODE=100600 Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1g: SIZE=0 MTIME=Nov 22 02:54 2008 (CLEARED) Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1g: UNREF FILE I=3109976 OWNER=mysql MODE=100600 Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1g: SIZE=0 MTIME=Nov 22 02:54 2008 (CLEARED) Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1g: UNREF FILE I=3109977 OWNER=mysql MODE=100600 Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1g: SIZE=0 MTIME=Nov 22 02:54 2008 (CLEARED) Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1g: FREE BLK COUNT(S) WRONG IN SUPERBLK (SALVAGED) Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1g: SUMMARY INFORMATION BAD (SALVAGED) Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1g: BLK(S) MISSING IN BIT MAPS (SALVAGED) Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1g: 180957 files, 3981413 used, 51484012 free (25284 frags, 6432341 blocks, 0.0% fragmentation) Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1d: UNREF FILE I=117761 OWNER=root MODE=100644 Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1d: SIZE=730 MTIME=Nov 22 02:54 2008 (CLEARED) Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1d: FREE BLK COUNT(S) WRONG IN SUPERBLK (SALVAGED) Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1d: SUMMARY INFORMATION BAD (SALVAGED) Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1d: BLK(S) MISSING IN BIT MAPS (SALVAGED) Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1d: 18 files, 13 used, 1013002 free (74 frags, 126616 blocks, 0.0% fragmentation) Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1f: UNREF FILE I=518537 OWNER=root MODE=100555 Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1f: SIZE=24640 MTIME=May 11 22:53 2008 (CLEARED) Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1f: UNREF FILE I=519794 OWNER=root MODE=100755 Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1f: SIZE=305039 MTIME=Jun 3 22:32 2008 (CLEARED) Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1f: UNREF FILE I=519797 OWNER=root MODE=100755 Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1f: SIZE=671546 MTIME=Jun 3 22:32 2008 (CLEARED) Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1f: UNREF FILE I=519801 OWNER=root MODE=100755 Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1f: SIZE=153266 MTIME=Jun 3 22:32 2008 (CLEARED) Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1f: UNREF FILE I=519814 OWNER=root MODE=100755 Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1f: SIZE=649739 MTIME=Jun 3 22:32 2008 (CLEARED) Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1f: UNREF FILE I=520303 OWNER=root MODE=100755 Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1f: SIZE=29360 MTIME=Jun 3 22:32 2008 (CLEARED) Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1f: FREE BLK COUNT(S) WRONG IN SUPERBLK (SALVAGED) Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1f: SUMMARY INFORMATION BAD (SALVAGED) Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1f: BLK(S) MISSING IN BIT MAPS (SALVAGED) Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1f: 248702 files, 1841720 used, 1203327 free (100279 frags, 137881 blocks, 3.3% fragmentation) Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1e: UNREF FILE I=376847 OWNER=root MODE=100640 Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1e: SIZE=728 MTIME=Jul 27 21:55 2008 (CLEARED) Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1e: UNREF FILE I=612359 OWNER=root MODE=100644 Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1e: SIZE=0 MTIME=Oct 28 17:56 2008 (CLEARED) Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1e: UNREF FILE I=612363 OWNER=root MODE=140666 Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1e: SIZE=0 MTIME=Oct 28 19:03 2008 (CLEARED) Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1e: UNREF FILE I=683013 OWNER=mysql MODE=100600 Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1e: SIZE=0 MTIME=Oct 28 18:14 2008 (CLEARED) Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1e: UNREF FILE I=683014 OWNER=mysql MODE=100600 Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1e: SIZE=0 MTIME=Oct 28 18:14 2008 (CLEARED) Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1e: UNREF FILE I=683015 OWNER=mysql MODE=100600 Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1e: SIZE=0 MTIME=Oct 28 18:14 2008 (CLEARED) Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1e: UNREF FILE I=683016 OWNER=mysql MODE=100600 Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1e: SIZE=0 MTIME=Oct 28 18:14 2008 (CLEARED) Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1e: UNREF FILE I=683017 OWNER=mysql MODE=100600 Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1e: SIZE=0 MTIME=Oct 28 18:14 2008 (CLEARED) Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1e: FREE BLK COUNT(S) WRONG IN SUPERBLK (SALVAGED) Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1e: SUMMARY INFORMATION BAD (SALVAGED) Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1e: BLK(S) MISSING IN BIT MAPS (SALVAGED) Nov 24 21:01:23 web1 kernel: /dev/mirror/gm0s1e: 25966 files, 735604 used, 6880035 free (11179 frags, 858607 blocks, 0.1% fragmentation) Nov 24 21:01:23 web1 kernel: Checking for core dump on /dev/mirror/gm0s1b...>From the last time it rebooted:21:55:08 Mon Nov 24 $ sudo bzcat /var/log/console.log.* | grep gm0 Jul 23 07:05:53 web1 kernel: kernel dumps on /dev/mirror/gm0s1b Jul 23 07:05:53 web1 kernel: swapon: adding /dev/mirror/gm0s1b as swap device Jul 23 07:05:53 web1 kernel: /dev/mirror/gm0s1a: FILE SYSTEM CLEAN; SKIPPING CHECKS Jul 23 07:05:53 web1 kernel: /dev/mirror/gm0s1a: clean, 116377 free (2345 frags, 14254 blocks, 0.9% fragmentation) Jul 23 07:05:53 web1 kernel: /dev/mirror/gm0s1g: FILE SYSTEM CLEAN; SKIPPING CHECKS Jul 23 07:05:53 web1 kernel: /dev/mirror/gm0s1g: clean, 45920082 free (46434 frags, 5734206 blocks, 0.1% fragmentation) Jul 23 07:05:53 web1 kernel: /dev/mirror/gm0s1d: FILE SYSTEM CLEAN; SKIPPING CHECKS Jul 23 07:05:53 web1 kernel: /dev/mirror/gm0s1d: clean, 1013003 free (67 frags, 126617 blocks, 0.0% fragmentation) Jul 23 07:05:53 web1 kernel: /dev/mirror/gm0s1f: FILE SYSTEM CLEAN; SKIPPING CHECKS Jul 23 07:05:53 web1 kernel: /dev/mirror/gm0s1f: clean, 1241438 free (102950 frags, 142311 blocks, 3.4% fragmentation) Jul 23 07:05:53 web1 kernel: /dev/mirror/gm0s1e: FILE SYSTEM CLEAN; SKIPPING CHECKS Jul 23 07:05:53 web1 kernel: /dev/mirror/gm0s1e: clean, 6913824 free (12232 frags, 862699 blocks, 0.2% fragmentation) Jul 23 07:05:53 web1 kernel: Checking for core dump on /dev/mirror/gm0s1b... 21:56:34 Mon Nov 24 $ sudo cat /var/log/messages | grep gm0 Nov 24 21:01:23 web1 kernel: GEOM_MIRROR: Device mirror/gm0 launched (1/2). Nov 24 21:01:23 web1 kernel: GEOM_MIRROR: Device gm0: rebuilding provider da0. Nov 24 21:01:23 web1 kernel: Trying to mount root from ufs:/dev/mirror/gm0s1a Nov 24 21:51:29 web1 kernel: GEOM_MIRROR: Device gm0: rebuilding provider da0 finished. FreeBSD web1.hosting.tcworks.net 7.0-STABLE FreeBSD 7.0-STABLE #1: Wed Jul 23 03:09:31 CDT 2008 root@web1.hosting.tcworks.net:/usr/obj/usr/src/sys/GENERIC i386 I changed the fs = part of this line to avoid privacy issues for my customer whose web host jail I've been working with. I just changed the hostname: dev = md14, block = 1320903, fs = /home/ezjails/testhost.tcworks.net 21:23:17 Mon Nov 24 # kgdb /boot/kernel/kernel vmcore.0 GNU gdb 6.1.1 [FreeBSD] Copyright 2004 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "i386-marcel-freebsd"... Unread portion of the kernel message buffer: dev = md14, block = 1320903, fs = /home/ezjails/testhost.tcworks.net panic: ffs_blkfree: freeing free frag cpuid = 0 Uptime: 124d14h34m47s Physical memory: 1011 MB Dumping 199 MB: 184 168 152 136 120 104 88 72 56 40 24 8 Reading symbols from /boot/kernel/geom_mirror.ko...Reading symbols from /boot/kernel/geom_mirror.ko.symbols...done. done. Loaded symbols for /boot/kernel/geom_mirror.ko Reading symbols from /boot/kernel/acpi.ko...Reading symbols from /boot/kernel/acpi.ko.symbols...done. done. Loaded symbols for /boot/kernel/acpi.ko Reading symbols from /boot/kernel/pflog.ko...Reading symbols from /boot/kernel/pflog.ko.symbols...done. done. Loaded symbols for /boot/kernel/pflog.ko Reading symbols from /boot/kernel/pf.ko...Reading symbols from /boot/kernel/pf.ko.symbols...done. done. Loaded symbols for /boot/kernel/pf.ko Reading symbols from /boot/kernel/nullfs.ko...Reading symbols from /boot/kernel/nullfs.ko.symbols...done. done. Loaded symbols for /boot/kernel/nullfs.ko Reading symbols from /boot/kernel/fdescfs.ko...Reading symbols from /boot/kernel/fdescfs.ko.symbols...done. done. Loaded symbols for /boot/kernel/fdescfs.ko #0 doadump () at pcpu.h:195 195 pcpu.h: No such file or directory. in pcpu.h (kgdb) bt #0 doadump () at pcpu.h:195 #1 0xc077fd27 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:418 #2 0xc077ffe9 in panic (fmt=Variable "fmt" is not available. ) at /usr/src/sys/kern/kern_shutdown.c:572 #3 0xc09667a3 in ffs_blkfree (ump=0xc865d400, fs=0xc4946800, devvp=0xc7dc4880, bno=1320903, size=2048, inum=336842) at /usr/src/sys/ufs/ffs/ffs_alloc.c:1918 #4 0xc097a1af in handle_workitem_freeblocks (freeblks=0xc5aab600, flags=0) at /usr/src/sys/ufs/ffs/ffs_softdep.c:2764 #5 0xc097b1b8 in process_worklist_item (mp=0xc4859598, flags=Variable "flags" is not available. ) at /usr/src/sys/ufs/ffs/ffs_softdep.c:962 #6 0xc097cc62 in softdep_process_worklist (mp=0xc4859598, full=0) at /usr/src/sys/ufs/ffs/ffs_softdep.c:845 #7 0xc097f687 in softdep_flush () at /usr/src/sys/ufs/ffs/ffs_softdep.c:756 #8 0xc075d479 in fork_exit (callout=0xc097f210 <softdep_flush>, arg=0x0, frame=0xe44edd38) at /usr/src/sys/kern/kern_fork.c:781 #9 0xc0a6e170 in fork_trampoline () at /usr/src/sys/i386/i386/exception.s:205 (kgdb) dmesg: Copyright (c) 1992-2008 The FreeBSD Project. Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994 The Regents of the University of California. All rights reserved. FreeBSD is a registered trademark of The FreeBSD Foundation. FreeBSD 7.0-STABLE #1: Wed Jul 23 03:09:31 CDT 2008 root@web1.hosting.tcworks.net:/usr/obj/usr/src/sys/GENERIC Timecounter "i8254" frequency 1193182 Hz quality 0 CPU: Intel(R) Xeon(TM) CPU 2.40GHz (2392.29-MHz 686-class CPU) Origin = "GenuineIntel" Id = 0xf27 Stepping = 7 Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE> Features2=0x4400<CNXT-ID,xTPR> Logical CPUs per core: 2 real memory = 1073676288 (1023 MB) avail memory = 1037049856 (989 MB) ACPI APIC Table: <INTEL SWV25 > FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs cpu0 (BSP): APIC ID: 0 cpu1 (AP): APIC ID: 1 ioapic0 <Version 2.0> irqs 0-23 on motherboard ioapic1 <Version 2.0> irqs 24-47 on motherboard ioapic2 <Version 2.0> irqs 48-71 on motherboard lapic0: Forcing LINT1 to edge trigger kbd1 at kbdmux0 ath_hal: 0.9.20.3 (AR5210, AR5211, AR5212, RF5111, RF5112, RF2413, RF5413) acpi0: <INTEL SWV25> on motherboard ACPI Warning (dswload-0794): Type override - [DEB_] had invalid type (Integer) for Scope operator, changed to (Scope) [20070320] ACPI Warning (dswload-0794): Type override - [MLIB] had invalid type (Integer) for Scope operator, changed to (Scope) [20070320] ACPI Warning (dswload-0794): Type override - [DATA] had invalid type (String) for Scope operator, changed to (Scope) [20070320] ACPI Warning (dswload-0794): Type override - [SIO_] had invalid type (String) for Scope operator, changed to (Scope) [20070320] ACPI Warning (dswload-0794): Type override - [LEDP] had invalid type (String) for Scope operator, changed to (Scope) [20070320] ACPI Warning (dswload-0794): Type override - [GPEN] had invalid type (String) for Scope operator, changed to (Scope) [20070320] ACPI Warning (dswload-0794): Type override - [GPST] had invalid type (String) for Scope operator, changed to (Scope) [20070320] ACPI Warning (dswload-0794): Type override - [WUES] had invalid type (String) for Scope operator, changed to (Scope) [20070320] ACPI Warning (dswload-0794): Type override - [WUSE] had invalid type (String) for Scope operator, changed to (Scope) [20070320] ACPI Warning (dswload-0794): Type override - [SBID] had invalid type (String) for Scope operator, changed to (Scope) [20070320] ACPI Warning (dswload-0794): Type override - [SWCE] had invalid type (String) for Scope operator, changed to (Scope) [20070320] acpi0: [ITHREAD] acpi0: Power Button (fixed) Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000 acpi_timer0: <24-bit timer at 3.579545MHz> port 0x408-0x40b on acpi0 acpi_ec0: <Embedded Controller: GPE 0x8> port 0xca6,0xca7 on acpi0 pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0 pci0: <ACPI PCI bus> on pcib0 pci0: <unknown> at device 0.1 (no driver attached) pcib1: <ACPI PCI-PCI bridge> at device 3.0 on pci0 pci2: <ACPI PCI bus> on pcib1 pcib2: <ACPI PCI-PCI bridge> at device 29.0 on pci2 pci4: <ACPI PCI bus> on pcib2 ahd0: <Adaptec AIC7902 Ultra320 SCSI adapter> port 0x3800-0x38ff,0x4000-0x40ff mem 0xfe9e0000-0xfe9e1fff irq 50 at device 7.0 on pci4 ahd0: [ITHREAD] aic7902: Ultra320 Wide Channel A, SCSI Id=7, PCI-X 67-100Mhz, 512 SCBs ahd1: <Adaptec AIC7902 Ultra320 SCSI adapter> port 0x3000-0x30ff,0x3400-0x34ff mem 0xfe9f0000-0xfe9f1fff irq 49 at device 7.1 on pci4 ahd1: [ITHREAD] aic7902: Ultra320 Wide Channel B, SCSI Id=7, PCI-X 67-100Mhz, 512 SCBs pcib3: <ACPI PCI-PCI bridge> at device 31.0 on pci2 pci3: <ACPI PCI bus> on pcib3 em0: <Intel(R) PRO/1000 Network Connection 6.9.5> port 0x2040-0x207f mem 0xfe6c0000-0xfe6dffff irq 30 at device 7.0 on pci3 em0: [FILTER] em0: Ethernet address: 00:07:e9:05:61:0a em1: <Intel(R) PRO/1000 Network Connection 6.9.5> port 0x2000-0x203f mem 0xfe6e0000-0xfe6fffff irq 31 at device 7.1 on pci3 em1: [FILTER] em1: Ethernet address: 00:07:e9:05:61:0b pci0: <unknown> at device 3.1 (no driver attached) uhci0: <Intel 82801CA/CAM (ICH3) USB controller USB-A> port 0x5020-0x503f irq 16 at device 29.0 on pci0 uhci0: [GIANT-LOCKED] uhci0: [ITHREAD] usb0: <Intel 82801CA/CAM (ICH3) USB controller USB-A> on uhci0 usb0: USB revision 1.0 uhub0: <Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1> on usb0 uhub0: 2 ports with 2 removable, self powered uhci1: <Intel 82801CA/CAM (ICH3) USB controller USB-B> port 0x5000-0x501f irq 19 at device 29.1 on pci0 uhci1: [GIANT-LOCKED] uhci1: [ITHREAD] usb1: <Intel 82801CA/CAM (ICH3) USB controller USB-B> on uhci1 usb1: USB revision 1.0 uhub1: <Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1> on usb1 uhub1: 2 ports with 2 removable, self powered pcib4: <ACPI PCI-PCI bridge> at device 30.0 on pci0 pci1: <ACPI PCI bus> on pcib4 vgapci0: <VGA-compatible display> port 0x1000-0x10ff mem 0xfd000000-0xfdffffff,0xfe5f0000-0xfe5f0fff irq 17 at device 12.0 on pci1 isab0: <PCI-ISA bridge> at device 31.0 on pci0 isa0: <ISA bus> on isab0 atapci0: <Intel ICH3 UDMA100 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0x3a0-0x3af at device 31.1 on pci0 ata0: <ATA channel 0> on atapci0 ata0: [ITHREAD] ata1: <ATA channel 1> on atapci0 ata1: [ITHREAD] pci0: <serial bus, SMBus> at device 31.3 (no driver attached) cpu0: <ACPI CPU> on acpi0 p4tcc0: <CPU Frequency Thermal Control> on cpu0 cpu1: <ACPI CPU> on acpi0 p4tcc1: <CPU Frequency Thermal Control> on cpu1 atkbdc0: <Keyboard controller (i8042)> port 0x60,0x64 irq 1 on acpi0 atkbd0: <AT Keyboard> irq 1 on atkbdc0 kbd0 at atkbd0 atkbd0: [GIANT-LOCKED] atkbd0: [ITHREAD] fdc0: <floppy drive controller (FDE)> port 0x3f2-0x3f3,0x3f4-0x3f5,0x3f7 irq 6 drq 2 on acpi0 fdc0: [FILTER] fd0: <1440-KB 3.5" drive> on fdc0 drive 0 sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0 sio0: type 16550A sio0: [FILTER] sio1: <16550A-compatible COM port> port 0x2f8-0x2ff irq 3 on acpi0 sio1: type 16550A sio1: [FILTER] pmtimer0 on isa0 orm0: <ISA Option ROMs> at iomem 0xc0000-0xc7fff,0xc8000-0xc97ff,0xc9800-0xcafff pnpid ORM0000 on isa0 ppc0: parallel port not found. sc0: <System console> at flags 0x100 on isa0 sc0: VGA <16 virtual consoles, flags=0x300> vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0 Timecounters tick every 1.000 msec Waiting 5 seconds for SCSI devices to settle acd0: CDROM <SAMSUNG CD-ROM SN-124/N101> at ata1-master UDMA33 da0 at ahd1 bus 0 target 0 lun 0 da0: <SEAGATE ST3146707LC 0003> Fixed Direct Access SCSI-3 device da0: 320.000MB/s transfers (160.000MHz DT, offset 63, 16bit) da0: Command Queueing Enabled da0: 140014MB (286749488 512 byte sectors: 255H 63S/T 17849C) da1 at ahd1 bus 0 target 3 lun 0 da1: <SEAGATE ST3146707LC 0003> Fixed Direct Access SCSI-3 device da1: 320.000MB/s transfers (160.000MHz DT, offset 63, 16bit) da1: Command Queueing Enabled da1: 140014MB (286749488 512 byte sectors: 255H 63S/T 17849C) ses0 at ahd1 bus 0 target 6 lun 0 ses0: <ESG-SHV SCA HSBP M16 0.05> Fixed Processor SCSI-2 device ses0: 3.300MB/s transfers ses0: SAF-TE Compliant Device lapic1: Forcing LINT1 to edge trigger SMP: AP CPU #1 Launched! GEOM_MIRROR: Device mirror/gm0 launched (1/2). GEOM_MIRROR: Device gm0: rebuilding provider da0. Trying to mount root from ufs:/dev/mirror/gm0s1a WARNING: / was not properly dismounted -- Scott Lambert KC5MLE Unix SysAdmin lambert@lambertfam.org
Scott Lambert
2008-Nov-26 11:32 UTC
FreeBSD 7.0-STABLE Jul 23: panic: ffs_blkfree: freeing free frag
On Mon, Nov 24, 2008 at 10:31:24PM -0600, Scott Lambert wrote:> I have a box I am using for hosting jailed web servers. I did a > test move of a jail from a FreeBSD 6 box to the FreeBSD 7 server, > web1.hosting. It took forever, 30 minutes to be exact, to create the > jail with the 3GB image file and restore the data from the FreeBSD 6 box > into it. > > I created a test archive of the running jail with ezjail-admin on the > FreeBSD 6 box and scp'd it to web1.hosting. That took 5 minutes. (I > was timing all of this to estimate how long it would really take later.) > > Once the archive was on web1.hosting, I created the new jail using the > archive to populate it. > > sudo ezjail-admin create -a test_host_tcworks_net-200811241856.40.tar.gz \ > -s 3G -i testhost.tcworks.net 192.168.1.238 > > That step took 40 minutes. According to 'systat -vm 1', da0 tended to > show around 90% utilization, da1 was about 23% and MB/s was about 1.6 > for both during the creation of the jail. > > After about 20 to 40 minutes of ensuring that the jail was working > properly with the compat6x libs, I decided to erase the test jail and > get ready for doing the transfer for real during the next maintenance > window. > > Just before the box stopped responding to me, I had run: > sudo ezjail-admin delete -w testhost.tcworks.net > > It might have been about 30 seconds after that I noticed it wasn't > responding. > > FreeBSD web1.hosting.tcworks.net 7.0-STABLE FreeBSD > 7.0-STABLE #1: Wed Jul 23 03:09:31 CDT 2008 > root@web1.hosting.tcworks.net:/usr/obj/usr/src/sys/GENERIC i386 > > I changed the fs = part of this line to avoid privacy issues for my > customer whose web host jail I've been working with. I just changed the > hostname: > > dev = md14, block = 1320903, fs = /home/ezjails/testhost.tcworks.net > > 21:23:17 Mon Nov 24 # kgdb /boot/kernel/kernel vmcore.0 > GNU gdb 6.1.1 [FreeBSD] > Copyright 2004 Free Software Foundation, Inc. > GDB is free software, covered by the GNU General Public License, and you are > welcome to change it and/or distribute copies of it under certain conditions. > Type "show copying" to see the conditions. > There is absolutely no warranty for GDB. Type "show warranty" for details. > This GDB was configured as "i386-marcel-freebsd"... > > Unread portion of the kernel message buffer: > dev = md14, block = 1320903, fs = /home/ezjails/testhost.tcworks.net > panic: ffs_blkfree: freeing free frag > cpuid = 0 > Uptime: 124d14h34m47s > Physical memory: 1011 MB > Dumping 199 MB: 184 168 152 136 120 104 88 72 56 40 24 8 ><snip reading symbols>> #0 doadump () at pcpu.h:195 > 195 pcpu.h: No such file or directory. > in pcpu.h > (kgdb) bt > #0 doadump () at pcpu.h:195 > #1 0xc077fd27 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:418 > #2 0xc077ffe9 in panic (fmt=Variable "fmt" is not available. > ) at /usr/src/sys/kern/kern_shutdown.c:572 > #3 0xc09667a3 in ffs_blkfree (ump=0xc865d400, fs=0xc4946800, > devvp=0xc7dc4880, bno=1320903, size=2048, inum=336842) > at /usr/src/sys/ufs/ffs/ffs_alloc.c:1918 > #4 0xc097a1af in handle_workitem_freeblocks (freeblks=0xc5aab600, flags=0) > at /usr/src/sys/ufs/ffs/ffs_softdep.c:2764 > #5 0xc097b1b8 in process_worklist_item (mp=0xc4859598, flags=Variable "flags" is not available. > ) > at /usr/src/sys/ufs/ffs/ffs_softdep.c:962 > #6 0xc097cc62 in softdep_process_worklist (mp=0xc4859598, full=0) > at /usr/src/sys/ufs/ffs/ffs_softdep.c:845 > #7 0xc097f687 in softdep_flush () at /usr/src/sys/ufs/ffs/ffs_softdep.c:756 > #8 0xc075d479 in fork_exit (callout=0xc097f210 <softdep_flush>, arg=0x0, > frame=0xe44edd38) at /usr/src/sys/kern/kern_fork.c:781 > #9 0xc0a6e170 in fork_trampoline () at /usr/src/sys/i386/i386/exception.s:205 > (kgdb)My non-developer eye thinks this looks like some of the other ffs_blkfree panics in the PR database which tend to involve deleteing large amounts of data in FreeBSD 5 and 6 and 7-CURRENT. http://www.freebsd.org/cgi/query-pr-summary.cgi?text=ffs_blkfree Some refer to "freeing free block", some to "freeing free frag". Would disabling soft-updates be a good temporary workaround? I can't afford to attempt to reproduce the problem, on this box, until I get some production stuff shifted. I am going to try on a different box, but everything about that box is different except that it uses a gmirror. Thanks for your help and Happy Thanksgiving, for those of you who celebrate it! -- Scott Lambert KC5MLE Unix SysAdmin lambert@lambertfam.org
Clifton Royston
2008-Nov-30 12:28 UTC
FreeBSD 7.0-STABLE Jul 23: panic: ffs_blkfree: freeing free frag
On Mon, Nov 24, 2008 at 10:31:24PM -0600, Scott Lambert wrote: ...> According to Nagios, it took about 25 minutes to panic, reboot, fsck and > come back up. Funny, it felt a lot longer. > > The gmirror is currently degraded and 'systat -vm 1' is showing 98% > utilization on da0 and 23% utilization on da1 with 35 to 50MB/s on both > da0 and da1. I hadn't looked at the mirror status before the crash. > > 21:42:09 Mon Nov 24 $ gmirror status > Name Status Components > mirror/gm0 DEGRADED da0 (84%) > da1Just so you know, if you don't shut down normally, the mirror is not marked into a synchronized state, so it usually has to rebuild after a panic or other crash. That's probably unrelated to the crash. -- Clifton -- Clifton Royston -- cliftonr@iandicomputing.com / cliftonr@lava.net President - I and I Computing * http://www.iandicomputing.com/ Custom programming, network design, systems and network consulting services