Neal McBurnett
2002-Aug-15 16:26 UTC
sys_ftruncate call lasting 17 hours on ext3 filesystem from mutt
Several times recently my "mutt" email program has looped for hours at a time in the middle of a sys_ftruncate call. This happens when I use the "$" command to write changes out to my mailbox. It does eventually return from the call and everything seems to have worked ok. But in the meantime the CPU is pegged, $MAIL is locked so I can't receive new mail, and signals to the program (like kill -9) don't take effect for hours. Once it was 17 hours, once 3, etc.

The problem showed up shortly after upgrading from Red Hat 7.1 and converting the file systems to ext3. I'm running Red Hat 7.3, kernel 2.4.18-3, mutt-1.2.5.1-1.

Strace didn't help at all, but thanks to a tip from Kevin Fenzi I learned how to use sysrq to find out where the process was, viz:

18:03:10 kernel: mutt R current 1024 8893 7929 (NOTLB)
18:03:10 kernel: Call Trace: [<c0127061>] truncate_list_pages [kernel] 0x79
18:03:10 kernel: [<c01271ff>] truncate_inode_pages [kernel] 0x3b
18:03:10 kernel: [<c0124f2e>] vmtruncate [kernel] 0x96
18:03:10 kernel: [<c01491f0>] inode_setattr [kernel] 0x24
18:03:10 kernel: [<d401f963>] ext3_setattr [ext3] 0x1c3
18:03:10 kernel: [<d401d810>] ext3_get_block [ext3] 0x0
18:03:10 kernel: [<c01281db>] do_generic_file_read [kernel] 0x2c3
18:03:10 kernel: [<c0149359>] notify_change [kernel] 0x5d
18:03:10 kernel: [<c012a2aa>] generic_file_write [kernel] 0x5c2
18:03:10 kernel: [<c01348ce>] do_truncate [kernel] 0x46
18:03:10 kernel: [<c0134bd1>] sys_ftruncate [kernel] 0x12d
18:03:10 kernel: [<c01085f7>] system_call [kernel] 0x33

I noticed that an fsck hadn't been done for months, so I did one with this result, indicating some sort of problem with $MAIL:

13:25:31 fsck: /var:
13:25:31 fsck: Truncating orphaned inode 44891 (uid=6265, gid=6265, mode=0100600, size=175526062)
13:25:36 fsck: /var has gone 69 days without being checked, check forced.
13:25:43 fsck: /var: 1057/104040 files (24.0% non-contiguous), 281356/415768 blocks

The file in question is large:

44891 -rw------- 1 neal neal 175694250 Aug 13 13:50 /var/mail/neal

I haven't seen the problem in the last day, but I've had successful days in the past also.

I would hope that even in the face of a file system problem, the kernel shouldn't take so long to do a system call.

Any ideas? Is there a bug-tracking system (bugzilla?) for ext3 or the kernel?

Thanks,

Neal McBurnett
http://bcn.boulder.co.us/~neal/
GPG/PGP signed and/or sealed mail encouraged. Keyid: 2C9EBA60