Maurice Volaski
2007-Sep-11 00:32 UTC
Spontaneous development of supremely large files on different ext3 filesystems
I have come across two files, essentially untouched in years, on two different ext3 filesystems on the same server (Gentoo AMD 64-bit, currently with kernel 2.6.22 and fsck version 1.40.2), that spontaneously became supremely large:

Filesystem one
Inode 16257874, i_size is 18014398562775391, should be 53297152

Filesystem two
Inode 2121855, i_size is 35184386120704, should be 14032896

Both were discovered during an ordinary backup operation (via EMC Insignia's Retrospect Linux client).

The backup runs daily, so one day one file must have grown spontaneously to this size, and on another day the same thing happened to the second file, which is on a second filesystem. The backup attempt generated repeated errors:

EXT3-fs warning (device dm-2): ext3_block_to_path: block > big

The two filesystems are on different logical volumes, but underlying those are DRBD network RAID devices, and underlying those is a RAID 6-based SATA disk array.
--
Maurice Volaski, mvolaski at aecom.yu.edu
Computing Support, Rose F. Kennedy Center
Albert Einstein College of Medicine of Yeshiva University
Stephen Samuel
2007-Sep-12 00:01 UTC
Spontaneous development of supremely large files on different ext3 filesystems
One simple point:

bash-3.2$ bc -ql
obase=16
35184386120704; 14032896
200000D61C00
D62000
18014398562775391; 53297152
400000032D315F
32D4000

The file size is basically the same, except for the addition of a stray bit, way off in left field.
(( Note that both of the 'old' file sizes are multiples of 8K ))

--
Stephen Samuel http://www.bcgreen.com
778-861-7641
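For readers without bc to hand, the same comparison can be sketched in Python. The four sizes are the ones reported in this thread. The helper name `stray_bit` is made up for illustration, and treating the highest set bit as the corrupted one is an assumption, not something e2fsck reports.

```python
# Sizes quoted in the thread, keyed by inode number:
# (corrupted i_size, size e2fsck proposed).
REPORTED = {
    16257874: (18014398562775391, 53297152),
    2121855: (35184386120704, 14032896),
}


def stray_bit(i_size):
    """Split i_size into (index of its highest set bit, the remaining low bits).

    Assumption: the highest set bit is the spurious one.
    """
    top = i_size.bit_length() - 1
    return top, i_size & ~(1 << top)


for inode, (bad, proposed) in REPORTED.items():
    top, low = stray_bit(bad)
    print(f"inode {inode}: i_size={bad:X} proposed={proposed:X}")
    print(f"  highest set bit: {top}, remaining low bits: {low:X} ({low})")
```

Run as is, this reports bit 54 for the first inode and bit 45 for the second, leaving low bits 32D315F and D61C00, which matches the bc output above.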
Maurice Volaski
2007-Sep-12 07:05 UTC
Spontaneous development of supremely large files on different ext3 filesystems
> > (( Note that both of the 'old' file sizes are multiples of 8K ))
>
> That is because e2fsck doesn't know the correct size, so just uses
> the end of the last valid block (it isn't possible to have a "hole"
> at the end of the file).

It looks like more than 1 bit was different and, if I understand this correctly, those other bit changes are the result of this after-the-fact padding by e2fsck.

> > The filesize is basically the same, except for the addition of a stray
> > bit, way off in left field. (( Note that both of the 'old' file
>
> Yes, it looks like single-bit corruption of some kind.

So does this imply a spontaneous bit flip on a platter? Shouldn't that have been picked up by the RAID, and twice, because there is dual parity (RAID 6)?
--
Maurice Volaski, mvolaski at aecom.yu.edu
Computing Support, Rose F. Kennedy Center
Albert Einstein College of Medicine of Yeshiva University
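The padding explanation can be checked with a little arithmetic. The sketch below assumes a 4 KiB block size, which the thread never states, and assumes the stray bit is the highest set bit of each i_size. The name `round_up_to_block` is made up here.

```python
BLOCK_SIZE = 4096  # assumption: the thread does not give the filesystem block size


def round_up_to_block(size, block_size=BLOCK_SIZE):
    """Round size up to the next multiple of block_size."""
    return -(-size // block_size) * block_size


# (corrupted i_size, size e2fsck proposed), as quoted in the thread.
CASES = [
    (18014398562775391, 53297152),
    (35184386120704, 14032896),
]

for bad, proposed in CASES:
    low = bad & ~(1 << (bad.bit_length() - 1))  # drop the highest set bit
    differing = bin(bad ^ proposed).count("1")  # bits that differ from e2fsck's value
    print(f"low bits {low} -> padded {round_up_to_block(low)}, "
          f"e2fsck proposed {proposed}, {differing} bits differ in total")
```

Under those assumptions, rounding the low bits up to the next block boundary gives exactly the sizes e2fsck proposed. That would account for the extra differing bits: they come from the padding rather than from further corruption. It also suggests the files originally ended at 53293407 and 14031872 bytes, though that remains an inference.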
Stephen Samuel
2007-Sep-12 07:30 UTC
Spontaneous development of supremely large files on different ext3 filesystems
It's not clear where the error occurred. It may actually be that there was a multi-bit error, and that it was incorrectly 'fixed'.

It's also still possible that the spurious bit was flipped somewhere in software, which wouldn't be picked up by the RAID parity, because the RAID parity took that flipped bit into account. If you have hardware RAID, then it's possible that the bit was flipped during or after transmission, but before parity was calculated. Check for disk errors in your log files.

Generically speaking, I'd be inclined to believe that the lower bits in the large file size are actually the precise size of the file. Check if the size minus the high-order flipped bit is consistent with a logical place to end the file.

Note that the bit could have been flipped when a nearby inode (on the same disk/RAID block) was updated: the block was read, modified and re-written, and during that process the bit could have been magically flipped.

--
Stephen Samuel http://www.bcgreen.com
778-861-7641
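One way to act on the suggestion to check the size minus the flipped bit is to compare each file's size with the space actually allocated to it. The following is only a heuristic sketch. The function names are invented for this example, the check assumes the stray bit is the highest set bit of the size, and sparse files can trigger false positives. It relies on `st_blocks`, which Python exposes on Unix systems and which is counted in 512-byte units.

```python
import os


def looks_like_stray_high_bit(size, allocated_bytes):
    """Heuristic: size exceeds the allocated space, but fits once its top bit is cleared."""
    if size == 0 or size <= allocated_bytes:
        return False
    cleared = size & ~(1 << (size.bit_length() - 1))
    return 0 < cleared <= allocated_bytes


def check_file(path):
    """Apply the heuristic to one file using lstat(); st_blocks is in 512-byte units."""
    st = os.lstat(path)
    return looks_like_stray_high_bit(st.st_size, st.st_blocks * 512)


def scan(top):
    """Yield paths under top whose size trips the heuristic."""
    for root, _dirs, files in os.walk(top):
        for name in files:
            path = os.path.join(root, name)
            try:
                if check_file(path):
                    yield path
            except OSError:
                continue  # unreadable or vanished entries are skipped
```

For example, `list(scan("/srv/data"))` would list candidate files under a hypothetical /srv/data. Anything it reports still needs checking by hand, since a legitimately sparse file looks the same to this test.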