Hi all,
I have a bunch of mail servers running postfix (external SMTP),
qmail (LDA) and courier IMAP/POP. Frequently, an ext3 filesystem on one
of them is remounted read-only, forcing recovery with fsck.
Below are the errors we have seen so far on these systems, along with each
system's configuration. The same ext3 errors recur on several of them.
1.
Red Hat Enterprise Linux Server release 5.2 (Tikanga)
2.6.18-92.1.10.el5 #1 SMP x86_64
Adaptec 2420SA RAID controller
Logical disk's write cache disabled, physical disk write cache enabled
Battery not installed
2 * 500GB ST373307LW in RAID1
Instances of EXT3 errors that caused a read-only FS:
a. "EXT3-fs error (device sdb1): ext3_lookup: unlinked inode 8766158 in dir
#8765708"
2.
Red Hat Enterprise Linux Server release 5.2 (Tikanga)
2.6.18-92.1.10.el5 #1 SMP x86_64
Dell PERC 6/i - Write back cache enabled
Battery available
2 * ST3500630AS 500GB in RAID1
Instances of EXT3 errors that caused a read-only FS:
a. "EXT3-fs error (device sda3): ext3_lookup: unlinked inode 89065027 in dir
#89065024"
b. "EXT3-fs error (device sda3): htree_dirblock_to_tree: bad entry in directory
#65077525: rec_len is smaller than minimal - offset=0, inode=0, rec_len=0,
name_len=0"
c. "EXT3-fs error (device sda3): htree_dirblock_to_tree: bad entry in directory
#65077525: rec_len is smaller than minimal - offset=0, inode=0, rec_len=0,
name_len=0"
d. "kernel: EXT3-fs error (device sda3): htree_dirblock_to_tree: bad entry in
directory #65077525: rec_len is smaller than minimal - offset=0, inode=0,
rec_len=0, name_len=0"
3.
Red Hat Enterprise Linux Server release 5.2 (Tikanga)
2.6.18-92.1.13.el5 #1 SMP x86_64
Dell PERC 6/i - Write back cache enabled
Battery available
2 * ST3500630AS 500GB in RAID1
Instances of EXT3 errors that caused a read-only FS:
a. "EXT3-fs error (device sdb1): ext3_lookup: unlinked inode 26968135 in dir
#35127737"
b. "EXT3-fs error (device sdb1): ext3_lookup: unlinked inode 9994260 in dir
#39518393"
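
To compare these events across servers, the error lines can be tallied by
device, kernel function and directory inode. The following is only a minimal
sketch: it assumes each error sits on a single unwrapped line, as it would in
syslog, and the names parse_ext3_errors and summarize are made up for
illustration. The patterns cover only the two message types quoted above.

```python
import re
from collections import Counter

# Matches lines such as:
#   EXT3-fs error (device sdb1): ext3_lookup: unlinked inode 8766158 in dir #8765708
EXT3_ERROR = re.compile(
    r"EXT3-fs error \(device (?P<device>[^)]+)\): "
    r"(?P<function>\w+): (?P<message>.*)"
)
# Detail patterns for the two message types seen in this thread.
UNLINKED = re.compile(r"unlinked inode (?P<inode>\d+) in dir #(?P<dir>\d+)")
BAD_ENTRY = re.compile(r"bad entry in directory #(?P<dir>\d+)")


def parse_ext3_errors(lines):
    """Yield one dict per EXT3-fs error line; all other lines are skipped."""
    for line in lines:
        match = EXT3_ERROR.search(line)
        if not match:
            continue
        record = match.groupdict()
        message = record["message"]
        detail = UNLINKED.search(message) or BAD_ENTRY.search(message)
        record["dir"] = int(detail.group("dir")) if detail else None
        record["inode"] = None
        # Only the "unlinked inode" message carries a separate inode number.
        if detail and "inode" in detail.groupdict():
            record["inode"] = int(detail.group("inode"))
        yield record


def summarize(lines):
    """Count errors per (device, function, directory inode)."""
    return Counter(
        (r["device"], r["function"], r["dir"]) for r in parse_ext3_errors(lines)
    )
```

Passing an open log file (for example /var/log/messages, if that is where
kernel messages end up on these hosts) to summarize() gives a quick view of
whether the same directories keep showing up.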
* I am not sure what is causing these errors. These crashes have been happening
on a fairly regular basis for over 6 months now, but on different servers.
* The disks on these systems seem to be fine, though I haven't checked them
for bad blocks yet.
* Each of those servers has its own disks for mail storage - there is no NFS
or cluster FS involved.
* The inode numbers in "ext3_lookup: unlinked inode" seem to refer to a
non-existent cache file of the courier POP/IMAP server (courierpop3dsizelist).
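
One way to map the numbers in these messages back to paths is to walk the
mounted filesystem and compare inode numbers. The sketch below is only an
illustration under assumptions: find_paths_by_inode is a hypothetical helper,
and the filesystem has to be mounted and readable. An inode that really is
unlinked will not turn up, so the directory number ("in dir #...") is usually
the more useful one to search for. The ncheck command in debugfs does a
similar lookup directly against the device.

```python
import os


def find_paths_by_inode(root, inode_numbers):
    """Return {inode: [paths]} for entries under root with a matching inode.

    The walk stays on the filesystem that holds root, because inode numbers
    are only unique within a single filesystem.
    """
    wanted = set(inode_numbers)
    found = {inode: [] for inode in wanted}
    root_stat = os.lstat(root)
    if root_stat.st_ino in wanted:
        found[root_stat.st_ino].append(root)

    for dirpath, dirnames, filenames in os.walk(root):
        kept = []
        for name in dirnames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue  # vanished or unreadable: skip it and do not descend
            if st.st_dev != root_stat.st_dev:
                continue  # mount point of another filesystem: do not descend
            kept.append(name)
            if st.st_ino in wanted:
                found[st.st_ino].append(path)
        dirnames[:] = kept  # prune the walk in place

        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue
            if st.st_ino in wanted:
                found[st.st_ino].append(path)
    return found
```

For the first server this could be called as
find_paths_by_inode(mount_point, [8765708, 8766158]), where mount_point is
wherever sdb1 is mounted on that host.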
At this point, I am trying to figure out the possible causes of these ext3
errors. Any pointers/recommendations will be of great help.
P.S.: The logs provided here are not complete; I would be glad to dig up and
post the complete logs of these events as required.
TIA
Dushyanth