This weekend I had a very interesting experience with gstripe(8) on
RELENG_6 on amd64.

Details of my setup: the machine has 4 disks, connected to a standard
SATA300 controller (nForce 4 chipset):

  ad4: 476940MB <WDC WD5000AAKS-00TMA0 12.01C01> at ata2-master SATA300
  ad6: 476940MB <WDC WD5000AAKS-00TMA0 12.01C01> at ata3-master SATA300
  ad8: 190782MB <WDC WD2000JD-00HBB0 08.02D08> at ata4-master SATA150
  ad10: 476940MB <Seagate ST3500630AS 3.AAE> at ata5-master SATA300

  /dev/ad8s1a         507630    66956    400064    14%    /
  /dev/ad8s1d       16244334    87212  14857576     1%    /var
  /dev/ad8s1e        4058062     1778   3731640     0%    /tmp
  /dev/ad8s1f       32494668  2335866  27559230     8%    /usr
  /dev/ad8s1g      127763620     6422 117536110     0%    /home
  /dev/stripe/st0a 946030390 71642044 798705916     8%    /storage
  /dev/ad10s1d     473009638 70446308 364722560    16%    /backups

  ad4  = drive #1 in gstripe set (makes /dev/stripe/st0)
  ad6  = drive #2 in gstripe set (makes /dev/stripe/st0)
  ad8  = boot/OS drive
  ad10 = drive used for periodic backups (dump(8) dumps to this disk)

All filesystems, except /, have softupdates enabled. I did not pick
custom block sizes when newfs'ing /storage and /backups.

I have a set of automated backups which run at 02:45 every day. Full
level 0 backups are on Sunday, and incrementals (levels 1-6) are
Mon-Sat. Backups are done using the following command set:

  /sbin/dump -{level} -a -h0 -u -C16 -L -f- /backups/foo.{level}.dump

The incident I'm about to describe happened on Sunday. I was dealing
with an unrelated issue (some Ethernet problems), and I had to reboot
the FreeBSD box in the process. I rebooted it using reboot(8). This was
around 03:05 -- in the middle of the backups.

The first thing I noticed was that the ATA "flush-to-disk" stuff was
taking a long time to settle at repeated zeros (that is: 4 4 4 3 4 2 2
1 1 1 0 0 0). After a few seconds, I saw "0 1 0 1 0 1" start flying by
on the screen over and over at a very fast rate, and after a few more
seconds, I saw the system say "Giving up..." or something like that.
Then it rebooted.
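
As an aside, the backup schedule described above (level 0 on Sunday,
levels 1-6 Monday through Saturday) maps directly onto the weekday
number printed by `date +%w`. A minimal sketch of that mapping follows.
The helper name dump_level_for_day is hypothetical, and this is an
illustration only, not the actual backup script; it only computes the
level and does not invoke dump(8).

```sh
#!/bin/sh
# Hypothetical helper (not from the original setup): map a weekday
# number, as printed by `date +%w` (0 = Sunday ... 6 = Saturday),
# to a dump level.
#   Sunday           -> level 0 (full backup)
#   Monday..Saturday -> levels 1..6 (incrementals)
dump_level_for_day() {
    case "$1" in
        [0-6]) echo "$1" ;;
        *)     echo "invalid weekday: $1" >&2; return 1 ;;
    esac
}

# Level a run started today would use.
level=$(dump_level_for_day "$(date +%w)")
echo "dump level for today: ${level}"
```

Because the level is simply the weekday number, a reboot in the middle
of the Sunday run interrupts the level 0 dump, which is the largest of
the week.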

When the machine came back up, every filesystem on every disk was marked
dirty. fsck(8) ran in the background, but took an *incredible* amount of
time to complete on /dev/stripe/st0a (the gstripe set). "Incredible"
means at least an hour, maybe more. I was running gstat during that
time, and the gstripe set was pretty much at 100% utilisation, split
50/50 between ad4 and ad6; nothing odd there.

The reason I'm mailing -stable about this is that there may be some
sort of "deadlock" condition which can happen when running dump -L on a
system and then shutting it down.

Maybe all of this points back to the ATA subsystem and how long it'll
wait for buffers to be flushed to disk before actually shutting down. In
my case, it obviously did not wait long enough. There don't seem to be
any tunables for how long to continue trying/waiting either.

-- 
| Jeremy Chadwick                                    jdc at parodius.com |
| Parodius Networking                           http://www.parodius.com/ |
| UNIX Systems Administrator                      Mountain View, CA, USA |
| Making life hard for others since 1977.                  PGP: 4BD6C0CB |