Gavin Atkinson
2006-Jul-26 12:52 UTC
Filesystem full: out of inodes - possibly ataraid bug?
Hi all,

I'm seeing a strange problem on two of my servers (both 6.0-RELEASE). Both seem to have corrupted their filesystems in such a way that no more files can be created because they are apparently out of inodes, even though "df -i" suggests otherwise... A reboot doesn't fix this:

system1:

system1# touch /usr/test

/usr: create/symlink failed, no inodes free
touch: /usr/test: No space left on device
system1# uptime
12:42PM up 6 mins, 1 user, load averages: 0.00, 0.07, 0.05
system1# df -i
Filesystem  1K-blocks   Used    Avail Capacity iused   ifree %iused  Mounted on
/dev/ar0s1a    594926  64174   483158      12%   931   76123     1%  /
devfs               1      1        0     100%     0       0   100%  /dev
/dev/ar0s1h  47164554      4 43391386       0%     2 6099964     0%  /spare
/dev/ar0s1d    507630     22   466998       0%    11   65779     0%  /tmp
/dev/ar0s1g  15231278 322924 13689852       2% 18411 1959955     1%  /usr
/dev/ar0s1f   6090094   1626  5601262       0%   158  800608     0%  /var
/dev/ar0s1e   2026030      6  1863942       0%     3  282619     0%  /var/tmp
system1# tail -20 /var/log/messages
Jul 26 12:42:03 system1 kernel: pid 539 (touch), uid 0 inumber 2 on /usr: out of inodes
system1# uname -a
FreeBSD xxx 6.0-RELEASE FreeBSD 6.0-RELEASE #0: Wed Nov 2 19:07:38 UTC 2005 root@rat.samsco.home:/usr/obj/usr/src/sys/GENERIC amd64

system2:

root@system2:/root# touch /test

/: create/symlink failed, no inodes free
touch: /test: No space left on device
root@system2:/root# df -i /
Filesystem  1K-blocks   Used    Avail Capacity iused   ifree %iused  Mounted on
/dev/ar0s1a    495726 132474   323594      29%  1519   62735     2%  /
root@system2:/root# uname -a
FreeBSD xxx 6.0-RELEASE-p5 FreeBSD 6.0-RELEASE-p5 #3: Thu Mar 2 18:26:48 GMT 2006 root@xxx:/usr/obj/usr/src/sys/XXX i386

The only connection between these two systems is that both are mirrored using ataraid - system1 uses FreeBSD PseudoRAID which is currently running in DEGRADED mode (another disk is on order), and system2 has an Adaptec 1200A soft-RAID card (Promise chipset), which lost a disk (but has since been repaired) at some point in the past.
With system2, the problems started at the same time as the disk replacement; I have no idea whether the timing correlates on system1, though. I run many FreeBSD machines without ataraid, and several with ataraid that have never lost a disk, and none of these systems have had any issues. Hence I'm wondering whether some aspect of ataraid is flawed when it comes to handling degraded arrays.

With system1, background fscks during reboot fail with:

Jul 21 10:24:23 system1 kernel: pid 525 (fsck_ufs), uid 0 inumber 3 on /usr: out of inodes

A full fsck (by setting background_fsck="NO" and fsck_y_enable="YES" in /etc/rc.conf and rebooting) fixed the problems. I can't take system2 down to perform a full fsck at the moment, nor can it be taken to 6.1-R, due to the NFS slowdown bug. I will take system1 to 6.1-R, although I guess it won't make any difference, as presumably the damage is already done.

I can provide dumps and images of the affected filesystems to trusted developers.

Gavin