Hi -

We have an ext3 file system which is 3.5TB in size (on top of LVM). Free are 172049011 out of 854473728 4096-byte blocks, and 396540654 out of 427245568 inodes. This is using Scientific Linux 4.4 (a RHEL clone). The file system consists of multiple backups created with rsync using --link-dest, which hard links files that haven't been modified to the previous copy. There are several hundred days' worth of these backups.

I decided to fsck the file system, but unfortunately fsck is extremely slow. It has been going now for 67 hours and appears to be completely CPU bound (no obvious disk access) and stuck at the "Pass 2: Checking directory structure" stage. It doesn't respond to a normal kill or ctrl+c.

Does anybody know whether it has got stuck in a loop, or does it really take so long to check so many hard links? Would it help to move to a newer e2fsck than RHEL provides (it has version number e2fsprogs-1.35-12.4.EL4)?

Thanks

Jeremy
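As a sanity check on the figures in the message above, here is a minimal Python sketch of the arithmetic. It assumes the block size is 4096 bytes (which is what makes the block count match the stated 3.5TB); the function name `summarize` is made up for illustration and is not part of any e2fsprogs tool.

```python
# Rough usage arithmetic for the file system described above.
# Assumption: blocks are 4096 bytes each.
BLOCK_SIZE = 4096


def summarize(total_blocks, free_blocks, total_inodes, free_inodes,
              block_size=BLOCK_SIZE):
    """Return size and usage figures derived from block and inode counts."""
    used_blocks = total_blocks - free_blocks
    used_inodes = total_inodes - free_inodes
    return {
        "size_bytes": total_blocks * block_size,
        "used_bytes": used_blocks * block_size,
        "used_inodes": used_inodes,
        "block_use_pct": 100.0 * used_blocks / total_blocks,
        "inode_use_pct": 100.0 * used_inodes / total_inodes,
    }


if __name__ == "__main__":
    # Counts as reported in the message.
    s = summarize(854473728, 172049011, 427245568, 396540654)
    print("size:        %.2f TB" % (s["size_bytes"] / 1e12))
    print("used:        %.2f TB (%.1f%% of blocks)"
          % (s["used_bytes"] / 1e12, s["block_use_pct"]))
    print("inodes used: %d (%.1f%% of inodes)"
          % (s["used_inodes"], s["inode_use_pct"]))
```

With the reported counts this gives a file system of roughly 3.5 TB with about 30.7 million inodes in use. Because of the rsync --link-dest hard links, the number of directory entries that e2fsck has to walk is likely much larger than the inode count.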
Jeremy Sanders wrote:
> Does anybody know whether it has got stuck in a loop, or does it really
> take so long to check so many hardlinks? Would it help moving to a newer
> e2fsck than RHEL provides (it has version number e2fsprogs-1.35-12.4.EL4).

I should also add that strace produces no output on the process, so it's apparently not making any system calls.

Jeremy

-- 
Jeremy Sanders <jss at ast.cam.ac.uk>
http://www-xray.ast.cam.ac.uk/~jss/
X-Ray Group, Institute of Astronomy, University of Cambridge, UK.
Public Key Server PGP Key ID: E1AAE053
On Thu, Feb 22, 2007 at 10:34:26AM +0000, Jeremy Sanders wrote:
> Hi -
>
> We have an ext3 file system which is 3.5TB in size (on top of lvm). Free are
> 172049011 out of 854473728 4096-byte blocks, and 396540654 out of 427245568
> inodes. This is using Scientific Linux 4.4 (a RHEL clone). The filesystem
> consists of multiple backups created with rsync using --link-dest, which
> hard links files which haven't been modified to the previous copy. There
> are several hundred days worth of these backups.
>
> I decided to fsck the file system, but unfortunately fsck is extremely slow.
> It has been going now for 67 hours and appears to be completely cpu bound
> (no obvious disk access) and stuck at the "Pass 2: Checking directory
> structure" stage. It doesn't respond to a normal kill or ctrl+c.

Did you run fsck from the command line? It should respond to a normal kill or ctrl-c. If it doesn't, I have to wonder whether the device driver is locked up for some reason.

Can you log in via ssh or a second console? If so, run "ps aux" and "ps lx" and report back what the e2fsck ps lines show.

Also, how much memory do you have? 3.5TB is pretty big, and if you don't have enough memory, it could simply be a matter of the system paging its brains out.

- Ted
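The information asked for above (process state, memory use of e2fsck, and whether the machine is into swap) can also be read directly from /proc. The following is a minimal sketch, assuming a Linux /proc layout and that the e2fsck PID is given as the first argument; the helper names `parse_fields`, `kib` and `describe` are hypothetical and only for illustration.

```python
import sys


def parse_fields(text):
    """Parse 'Key: value' lines (as in /proc/<pid>/status or /proc/meminfo)."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def kib(value):
    """Convert a field such as '2048 kB' to an integer number of KiB."""
    return int(value.split()[0])


def describe(status, meminfo):
    """Collect the figures that matter for a slow or stuck process."""
    return {
        # First letter of State: R = running (CPU bound),
        # D = uninterruptible sleep (signals are not acted on until it wakes).
        "state": status.get("State", "?")[:1],
        "rss_kib": kib(status.get("VmRSS", "0 kB")),
        "mem_total_kib": kib(meminfo.get("MemTotal", "0 kB")),
        "swap_used_kib": kib(meminfo.get("SwapTotal", "0 kB"))
                         - kib(meminfo.get("SwapFree", "0 kB")),
    }


if __name__ == "__main__":
    # PID of e2fsck as the first argument; falls back to this process.
    pid = sys.argv[1] if len(sys.argv) > 1 else "self"
    try:
        with open("/proc/%s/status" % pid) as f:
            status = parse_fields(f.read())
        with open("/proc/meminfo") as f:
            meminfo = parse_fields(f.read())
    except OSError as exc:
        print("could not read /proc: %s" % exc)
    else:
        for name, value in sorted(describe(status, meminfo).items()):
            print("%-14s %s" % (name, value))
```

A state of R with a resident size well below total memory would point to a CPU-bound process, whereas a resident size close to total memory together with heavy swap use would fit the paging explanation.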
Jeremy Sanders wrote:
> We have an ext3 file system which is 3.5TB in size (on top of lvm). Free
> are 172049011 out of 854473728 4096-byte blocks, and 396540654 out of
> 427245568 inodes. This is using Scientific Linux 4.4 (a RHEL clone). The
> filesystem consists of multiple backups created with rsync using
> --link-dest, which hard links files which haven't been modified to the
> previous copy. There are several hundred days worth of these backups.

Just to say I've also tried the e2fsck from e2fsprogs-1.39, and it hangs indefinitely too :-(

Jeremy