I am testing EXT3 as a filesystem for a server whose
power supply is failure-prone.
In order to do the test, I have a lever that I can
control from PC1 that can press the reset button on
PC2. PC2's reset button is automatically pressed once
every 120 seconds (the boot sequence on PC2 takes 80
seconds).
While PC2 is booted, PC1 directs email and web
requests at PC2, so that the PC2 disks are busy.
PC1 o=lever==== PC2 (Raid1, Journaled EXT3)
| |
------network-----
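
The cycle described above can be sketched roughly as follows. This is only an illustration of the timing, not the actual control program on PC1: `press_reset` and `send_load` are hypothetical placeholders for the lever control and the email/web request generator, and `send_load` is assumed to block for the number of seconds it is given.

```python
import time

RESET_INTERVAL_S = 120  # from the post: reset is pressed every 120 seconds
BOOT_TIME_S = 80        # from the post: PC2 takes 80 seconds to boot


def run_cycles(cycles, press_reset, send_load, sleep=time.sleep):
    """Run `cycles` reset cycles and return the load window per cycle.

    press_reset and send_load are hypothetical callables (assumptions),
    standing in for the lever and the request generator.
    """
    load_window = RESET_INTERVAL_S - BOOT_TIME_S  # 40 s of disk activity
    for _ in range(cycles):
        press_reset()            # reset PC2, possibly in the middle of a write
        sleep(BOOT_TIME_S)       # wait for PC2 to finish booting
        send_load(load_window)   # keep PC2's disks busy until the next reset
    return load_window
```

With these numbers PC2 is under load for only 40 of every 120 seconds, so each reset lands at the end of a 40-second window of disk activity.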
PC2 setup:
.2 x RAID-1 (mirrored) IDE disks (md, software RAID)
.redhat
.kernel 2.4.18-10
.e2fsck 1.23
/etc/fstab:
-----------
/dev/md2 / ext3 defaults 1 1
/dev/md0 /home ext3 data=journal 1 2
/dev/md1 /var ext3 date=journal 1 2
/dev/md3 /usr ext3 data=journal 1 2
Results:
-------
.After 82 successful resets, e2fsck reported during
boot that /var was damaged, and the system dropped into
a 'repair' shell.
.I ran fsck and it reported lots of the
following types of errors:
'Directory inode does not contain ..'
'Orphaned inode'
While 82 resets might sound good, it's not perfect, and
the problem may increase in line with disk I/O load.
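
To put the 82 resets in perspective, here is a quick calculation of how much test time they represent. It assumes every cycle ran exactly as described (120 seconds per reset, 80 seconds of boot), which is an idealisation.

```python
RESET_INTERVAL_S = 120   # seconds between resets (from the post)
BOOT_TIME_S = 80         # seconds PC2 needs to boot (from the post)
SUCCESSFUL_RESETS = 82   # resets survived before /var was reported damaged

# Total wall-clock time and total time PC2 was actually under load,
# assuming every cycle ran exactly on schedule (an assumption).
elapsed_s = SUCCESSFUL_RESETS * RESET_INTERVAL_S
load_s = SUCCESSFUL_RESETS * (RESET_INTERVAL_S - BOOT_TIME_S)


def hms(seconds):
    """Format a number of seconds as e.g. 2h44m00s."""
    hours, rem = divmod(seconds, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours}h{minutes:02d}m{secs:02d}s"


print(hms(elapsed_s))  # 2h44m00s
print(hms(load_s))     # 0h54m40s
```

So the damage showed up after under three hours of testing, of which less than one hour was spent under load.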
Questions
----------
Would anyone have any advice on why these errors might
be happening? They look to me like 'meta-data' errors
- which journalling should prevent.
Might there be an interaction occurring between RAID1
and Ext3 perhaps? RAID was set up using the RedHat
Installation CD option 'Make Raid'.
Might the errors not be 'real', with fsck not
understanding Ext3 journalling sufficiently? If this
is the case, should the last parameter (fs_passno) in
/etc/fstab be '0', preventing filesystem checks?
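
For anyone unsure which field is meant: the sixth field of an fstab line is fs_passno, and a value of 0 (or a missing field) means fsck skips that filesystem at boot. A minimal sketch of reading it, using one of the lines from the fstab above; the `FstabEntry` and `parse_fstab_line` names are made up for this illustration, and comment or blank lines are not handled.

```python
from typing import NamedTuple


class FstabEntry(NamedTuple):
    # Field names follow the fstab(5) man page.
    fs_spec: str
    fs_file: str
    fs_vfstype: str
    fs_mntops: str
    fs_freq: int
    fs_passno: int


def parse_fstab_line(line):
    """Split one non-comment fstab line into its six fields.

    fs_freq and fs_passno default to 0 when they are absent.
    """
    fields = line.split()
    spec, mount_point, vfstype, mntops = fields[:4]
    freq = int(fields[4]) if len(fields) > 4 else 0
    passno = int(fields[5]) if len(fields) > 5 else 0
    return FstabEntry(spec, mount_point, vfstype, mntops, freq, passno)


entry = parse_fstab_line("/dev/md0 /home ext3 data=journal 1 2")
print(entry.fs_file, "checked at boot:", entry.fs_passno != 0)
```

Changing the final `2` to `0` on that line would make `fs_passno` zero, so the boot-time check of /home would be skipped.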
Might there be a bug in ext3?
Might Ext3 not cope with power resets in the way
I imagined?
Thanks in advance for any comments.
/Mark