I'm trying to run e2fsck on a ~6TB filesystem which is about 90% full. We're doing backup to disk to this filesystem, and have a number of hard links (link counts up to 90).

strace shows:

    write(1, "Pass 2: Checking ", 17) = 17
    write(1, "directory", 9) = 9
    write(1, " structure\n", 11) = 11
    mmap(NULL, 91574272, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2b4299dbd000
    mmap(NULL, 91574272, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2b429f512000
    mmap(NULL, 506724352, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2b42a4c67000
    mmap(NULL, 596029440, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
    brk(0x23e56000) = 0x5eb000
    mmap(NULL, 596164608, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
    mmap(NULL, 2097152, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = 0x2b430a09e000
    munmap(0x2b430a09e000, 401408) = 0
    munmap(0x2b430a200000, 647168) = 0
    mprotect(0x2b430a100000, 135168, PROT_READ|PROT_WRITE) = 0
    mmap(NULL, 596029440, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
    lseek(3, 6303744, SEEK_SET) = 6303744
    read(3, "\2\0\0\0\f\0\1\2.\0\0\0\2\0\0\0\f\0\2\2..\0\0\v\0\0\0 \24"..., 4096) = 4096
    lseek(3, 6307840, SEEK_SET) = 6307840
    read(3, "\v\0\0\0\f\0\1\2.\0\0\0\2\0\0\0\364\17\2\2..\0\0\0\0\0"..., 4096) = 4096
    lseek(3, 6311936, SEEK_SET) = 6311936
    read(3, "\0\0\0\0\0\20\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
    lseek(3, 6316032, SEEK_SET) = 6316032
    read(3, "\0\0\0\0\0\20\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
    lseek(3, 6320128, SEEK_SET) = 6320128
    read(3, "\0\0\0\0\0\20\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
    lseek(3, 41709568, SEEK_SET) = 41709568
    read(3, "\323\0\0\0\f\0\1\2.\0\0\0\226\2\252+\f\0\2\2..\0\0\324"..., 4096) = 4096
    lseek(3, 41713664, SEEK_SET) = 41713664
    read(3, "\324\0\0\0\f\0\1\2.\0\0\0\323\0\0\0\f\0\2\2..\0\0\214 \300"..., 4096) = 4096
    lseek(3, 41717760, SEEK_SET) = 41717760
    read(3, "\325\0\0\0\f\0\1\2.\0\0\0\226\2\252+\f\0\2\2..\0\0\326"..., 4096) = 4096

And that's it. No more output. A backtrace from gdb shows:

    (gdb) bt
    #0 0x0000000000418aa5 in get_icount_el (icount=0x5cf170, ino=732562070, create=1) at icount.c:251
    #1 0x0000000000418dd7 in ext2fs_icount_increment (icount=0x5cf170, ino=732562070, ret=0x7fffffa79a96) at icount.c:339
    #2 0x000000000040a3cf in check_dir_block (fs=0x5af560, db=0x2b7070cc6064, priv_data=0x7fffffa79c90) at pass2.c:1021
    #3 0x0000000000416c69 in ext2fs_dblist_iterate (dblist=0x5c3f20, func=0x409980 <check_dir_block>, priv_data=0x7fffffa79c90) at dblist.c:234
    #4 0x0000000000408d9d in e2fsck_pass2 (ctx=0x5ae700) at pass2.c:149
    #5 0x0000000000403102 in e2fsck_run (ctx=0x5ae700) at e2fsck.c:193
    #6 0x0000000000401e50 in main (argc=Variable "argc" is not available. ) at unix.c:1075

It's stuck inside the while loop in get_icount_el() (line 251).

I've added more memory to the server (up to 6 GB now), and am re-running e2fsck. Additionally, I upped /proc/sys/vm/max_map_count to 20,000,000 (just pulled that number out of the air). It takes 6 or 7 hours to get to the part where it locks up, so I'm not sure if this is going to help or not. I figured while it's running I would post here to see if anyone has any additional insights.

Thanks!

Brian Davidson
George Mason University
Here's strace when running w/ 6GB of memory & with max_map_count set to 20000000. It looks like that got rid of the ENOMEM's from mmap, but it's still hanging in the same place...

    write(1, "Pass 2: Checking ", 17) = 17
    write(1, "directory", 9) = 9
    write(1, " structure\n", 11) = 11
    mmap(NULL, 91574272, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2b1078c55000
    mmap(NULL, 91574272, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2b107e3aa000
    mmap(NULL, 501645312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2b1083aff000
    mmap(NULL, 588230656, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2b10a1967000
    munmap(0x2b10a1967000, 588230656) = 0
    lseek(5, 6303744, SEEK_SET) = 6303744
    read(5, "\2\0\0\0\f\0\1\2.\0\0\0\2\0\0\0\f\0\2\2..\0\0\v\0\0\0 \24"..., 4096) = 4096
    lseek(5, 6307840, SEEK_SET) = 6307840
    read(5, "\v\0\0\0\f\0\1\2.\0\0\0\2\0\0\0\364\17\2\2..\0\0\0\0\0"..., 4096) = 4096
    lseek(5, 6311936, SEEK_SET) = 6311936
    read(5, "\0\0\0\0\0\20\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
    lseek(5, 6316032, SEEK_SET) = 6316032
    read(5, "\0\0\0\0\0\20\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
    lseek(5, 6320128, SEEK_SET) = 6320128
    read(5, "\0\0\0\0\0\20\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
    lseek(5, 41709568, SEEK_SET) = 41709568
    read(5, "\323\0\0\0\f\0\1\2.\0\0\0\226\2\252+\f\0\2\2..\0\0\324"..., 4096) = 4096
    lseek(5, 41713664, SEEK_SET) = 41713664
    read(5, "\324\0\0\0\f\0\1\2.\0\0\0\323\0\0\0\f\0\2\2..\0\0\214 \300"..., 4096) = 4096
    lseek(5, 41717760, SEEK_SET) = 41717760
    read(5, "\325\0\0\0\f\0\1\2.\0\0\0\226\2\252+\f\0\2\2..\0\0\326"..., 4096) = 4096

The backtrace seems to be essentially the same:

    (gdb) bt
    #0 0x0000000000418aa5 in get_icount_el (icount=0x5cf170, ino=732562070, create=1) at icount.c:251
    #1 0x0000000000418dd7 in ext2fs_icount_increment (icount=0x5cf170, ino=732562070, ret=0x7fffffad6e06) at icount.c:339
    #2 0x000000000040a3cf in check_dir_block (fs=0x5af560, db=0x2b1011a88064, priv_data=0x7fffffad7000) at pass2.c:1021
    #3 0x0000000000416c69 in ext2fs_dblist_iterate (dblist=0x5c3f20, func=0x409980 <check_dir_block>, priv_data=0x7fffffad7000) at dblist.c:234
    #4 0x0000000000408d9d in e2fsck_pass2 (ctx=0x5ae700) at pass2.c:149
    #5 0x0000000000403102 in e2fsck_run (ctx=0x5ae700) at e2fsck.c:193
    #6 0x0000000000401e50 in main (argc=Variable "argc" is not available. ) at unix.c:1075
    #7 0x0000000000421161 in __libc_start_main ()
    #8 0x000000000040018a in _start ()
    #9 0x00007fffffad7508 in ?? ()
    #10 0x0000000000000000 in ?? ()

Additional info:

    $ cat /etc/redhat-release
    Red Hat Enterprise Linux AS release 4 (Nahant Update 4)
    $ uname -a
    Linux XXXXX.gmu.edu 2.6.16 #1 SMP Mon Mar 27 16:56:51 EST 2006 x86_64 x86_64 x86_64 GNU/Linux
    $ e2fsck -V
    e2fsck 1.35 (28-Feb-2004)
    Using EXT2FS Library version 1.35, 28-Feb-2004
    $ rpm -q e2fsprogs
    e2fsprogs-1.35-12.4.EL4

Brian Davidson
George Mason University
There are a few issues with the get_icount_el() code. First, a simple binary search may be sufficient. Also, we now know the float type is not sufficient to handle the large or small values handled by this code.

One problem with using float is that it does not have the precision to divide two sufficiently large numbers whose difference is small. The other issue is that float approximation can cause 'mid' to be larger than 'high'.

The approximation is due to the 23-bit mantissa of IEEE 754 single-precision float (24 bits of precision counting the implicit leading bit). Integers up to 16,777,216 (2^24) are represented exactly, but above that not every integer is representable, so the least significant bits are rounded away, producing an approximation. The approximation could be more or less than what is expected. This is inherent in using float.

The double type (IEEE 754 double precision, 64 bits) provides a 52-bit mantissa to play with. That is a large number. Since the e2fsck code must handle large numbers, the float type should be used with caution.

Reference:
http://steve.hollasch.net/cgindex/coding/ieeefloat.html
http://en.wikipedia.org/wiki/IEEE_754
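
To make the precision point concrete, here is a minimal sketch in C. It is not the e2fsprogs code and not a proposed patch. The names float_roundtrip, find_ino and FLOAT_DEMO_MAIN are hypothetical, made up for this illustration. The first function shows what happens to an inode number such as the 732562070 from the backtrace when it passes through a float. The second is a plain integer-only binary search of the kind suggested above, where 'mid' can never leave the [low, high) range.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: push a 32-bit inode number through a float and back.
 * Anything above 2^24 may come back changed, because float keeps only
 * 24 significant bits. Only meant for inputs well below 2^32, since a value
 * that rounds up to 2^32 cannot be converted back to uint32_t.
 */
uint32_t float_roundtrip(uint32_t ino)
{
    volatile float f = (float)ino;  /* volatile: force real float storage */
    return (uint32_t)f;
}

/*
 * Hypothetical integer-only binary search over a sorted list of inode
 * numbers. Returns the index of ino, or -1 if it is not present.
 */
long find_ino(const uint32_t *list, size_t count, uint32_t ino)
{
    size_t low = 0, high = count;   /* search the half-open range [low, high) */

    assert(list != NULL || count == 0);
    while (low < high) {
        /* Integer midpoint: always satisfies low <= mid < high. */
        size_t mid = low + (high - low) / 2;

        if (list[mid] == ino)
            return (long)mid;
        if (list[mid] < ino)
            low = mid + 1;          /* range shrinks on every iteration */
        else
            high = mid;
    }
    return -1;
}

#ifdef FLOAT_DEMO_MAIN  /* hypothetical guard: build with -DFLOAT_DEMO_MAIN */
int main(void)
{
    static const uint32_t list[] = { 2, 11, 16777216u, 732562048u, 732562070u };

    printf("%u -> %u\n", 732562070u, (unsigned)float_roundtrip(732562070u));
    printf("index of 732562070: %ld\n", find_ino(list, 5, 732562070u));
    return 0;
}
#endif
```

Between 2^29 and 2^30 adjacent float values are 64 apart, so 732562070 comes back as 732562048, a different inode number. The integer search shrinks its range on every pass, so it terminates regardless of how large the inode numbers are.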