I had a failure over the weekend of my ext3 filesystems.

Hardware:
Dell 2550 (dual PIII 1 Ghz, 1GB ram, Acenic GigE card)

Software:
linux-2.4.14
ext3-2.4-0.9.15-2414
Trond's NFS patches (pathconf, tune,read,rpc_blk)

The filesystem is RAID0 using LVM (1.0.1-rc4). The disks are connected using fiber channel (qlogic 2200, qlax00-4.27beta).

I was generating high load on my nfs server that is using ext3 filesystems. I started to get the following error:

EXT3-fs error (device lvm(58,6)): ext3_free_blocks: bit already cleared for block 290754
EXT3-fs error (device lvm(58,6)): ext3_free_blocks: bit already cleared for block 289980
EXT3-fs error (device lvm(58,6)): ext3_free_blocks: bit already cleared for block 289981

(Many more errors than this)

And then I lost that NFS filesystem.

The load was generated by 16 processes each from a different client doing the following repetitively:

fn=/scratchx/file.$$
dd if=/dev/zero of=$fn bs=1024k count=1024
dd if=$fn of=/dev/null bs=1024k count=1024
rm -f $fn

The nfs clients are running linux-2.4.12 under alpha with all of Trond's NFS patches applied. Each client job is submitted through PBS (queuing system) so it could be entirely possible that after days of this test the two processes on separate nodes had the same process id (so $$ is the same).

I read that before ext3-0.9.9 that this could happen if the file was deleted underneath the process that is writing to the file. Is this still true?

Craig

--
Craig Tierney (ctierney@hpti.com)
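On the `$$` concern in the message above: `$$` is only unique among the processes of one host, so two clients writing to the same NFS directory can end up with the same file name. A minimal sketch of one way to rule that out in a test like this is to put the client's host name into the name as well. The helper `unique_name` is hypothetical and not part of the original test; the sketch only builds and prints the name, it does not run the `dd` load.

```shell
#!/bin/sh
# Hypothetical helper (not from the original test script): build a scratch
# file name from a directory, a host name and a process id. The pid alone
# is unique per host only; adding the host name keeps names from different
# clients apart.
unique_name() {
    # $1 = directory, $2 = host name, $3 = process id
    printf '%s/file.%s.%s\n' "$1" "$2" "$3"
}

# Name this process would use on this client (uname -n prints the node name).
fn=$(unique_name /scratchx "$(uname -n)" "$$")
echo "$fn"
```

With a name built this way, the `rm -f $fn` on one client cannot remove a file that a process with the same pid on another client is still writing, as long as the host names differ.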
Stephen C. Tweedie
2001-Nov-26 21:54 UTC
Re: EXT3 crash: ext3_free_blocks: bit already cleared
Hi,

On Mon, Nov 26, 2001 at 09:05:38AM -0700, Craig Tierney wrote:
> I had a failure over the weekend of
> my ext3 filesystems.
>
> Hardware:
> Dell 2550 (dual PIII 1 Ghz, 1GB ram, Acenic GigE card)
>
> Software:
> linux-2.4.14
> ext3-2.4-0.9.15-2414
> Trond's NFS patches (pathconf, tune,read,rpc_blk)
>
> The filesystem is RAID0 using LVM (1.0.1-rc4).

Have you moved the LVM volume around at all while live? LVM still has data-corrupting bugs affecting ext3 in its volume-move code.

> I was generating high load on my nfs server
> that is using ext3 filesystems. I started to
> get the following error:
>
> EXT3-fs error (device lvm(58,6)): ext3_free_blocks: bit already cleared for block 290754
> EXT3-fs error (device lvm(58,6)): ext3_free_blocks: bit already cleared for block 289980
> EXT3-fs error (device lvm(58,6)): ext3_free_blocks: bit already cleared for block 289981
>
> (Many more errors than this)

Is it repeatable?

> I read that before ext3-0.9.9 that this could
> happen if the file was deleted underneath the
> process that is writing to the file. Is this still
> true?

It shouldn't be, no.

Cheers,
Stephen