Hi,
I'm encountering a strange situation. 
I'm writing to an ext3 filesystem mounted like this:
/dev/scsi/host2/bus0/target0/lun0/part3 on /tmp/volumes/md0 type ext3
(rw,noatime,data=ordered)
with the physical device being a FC-connected RAID (IBM DF4000)
no software raid personality is used
I'm using 2.4.18-19.7
My test program writes files in a loop, with the essential system calls being:
[pid 13335] mkdir("/tmp/volumes/md0/1/13335/00028", 0777) = 0
[pid 13335] fsync(4)                    = 0
[pid 13335] time(NULL)                  = 1044515538
[pid 13335] gettimeofday({1044515538, 96401}, {4294967176, 0}) = 0
[pid 13335] getpid()                    = 13335
[pid 13335] sendto(3, "[Feb  6  9:12:18.096401 13335]: mkdir
/tmp/volumes/md0/1/13335/00028\n", 69, 0, {sin_family=AF_INET,
sin_port=htons(12345), sin_addr=inet_addr("127.0.0.1")}}, 16) = 69
[pid 13335] open("/tmp/volumes/md0/1/13335/00028", O_RDONLY|O_SYNC) =
5
[pid 13335] open("/tmp/volumes/md0/1/13335/00028/file",
O_RDWR|O_CREAT|O_SYNC, 027777774120) = 6
[pid 13335] fsync(5)                    = 0
[pid 13335] time(NULL)                  = 1044515538
[pid 13335] gettimeofday({1044515538, 112558}, {4294967176, 0}) = 0
[pid 13335] getpid()                    = 13335
[pid 13335] sendto(3, "[Feb  6  9:12:18.112558 13335]: open
/tmp/volumes/md0/1/13335/00028/file\n", 73, 0, {sin_family=AF_INET,
sin_port=htons(12345), sin_addr=inet_addr("127.0.0.1")}}, 16) = 73
[pid 13335] getpid()                    = 13335
[pid 13335] write(6,
"\31\0\1\2\3\4\5\6\7\10\t\n\v\f\r\16\17\20\21\22\23\24\25\26\27\30\31\0\1\2\3\4\5\6\7\10\t\n\v\f\r\16\17\20\21\22\23\24\25\26\27\30\31\0\1\2\3\4\5\6\7\10\t\n\v\f\r\16\17\20\21\22\23\24\25\26\27\30\31\0\1\2\3\4\5\6\7\10\t\n\v\f\r\16\17\20\21\22\23\24\25\26\27\30\31\0\1\2\3\4\5\6\7\10\t\n\v\f\r\16\17\20\21\22\23\24\25\26\27\30\31\0\1\2\3\4\5\6\7\10\t\n\v\f\r\16\17\20\21\22\23\24\25\26\27\30\31\0\1\2\3\4\5\6\7\10\t\n\v\f\r\16\17\20\21\22\23\24\25\26\27\30\31\0\1\2\3\4\5\6\7\10\t\n\v\f\r\16\17\20\21\22\23\24\25\26\27\30\31\0\1\2\3\4\5\6\7\10\t\n\v\f\r\16\17\20\21\22\23\24\25\26\27\30\31\0\1\2\3\4\5\6\7\10\t\n\v\f\r\16\17\20\21\22\23\24\25\26\27\30\31\0\1\2\3\4\5\6\7\10\t\n\v\f\r\16\17\20\21\22\23\24\25\26\27\30\31\0\1\2\3\4\5\6\7\10\t\n\v\f\r\16\17\20\21\22\23\24\25\26\27\30\31\0\1\2\3\4\5\6\7\10\t\n\v\f\r\16\17\20\21\22\23\24\25\26\27\30\31\0\1\2\3\4\5\6\7\10\t\n\v\f\r\16\17\20\21\22\23\24\25\26\27\30\31\0\1\2\3\4\5\6\7\10\t\n\v\f\r\16\17\20\21\22\23\24\25\26\27\30\31\0\1\2\3\4\5"...,
8192) = 8192
[pid 13335] fsync(6)                    = 0
[pid 13335] time(NULL)                  = 1044515538
[pid 13335] gettimeofday({1044515538, 160822}, {4294967176, 0}) = 0
[pid 13335] getpid()                    = 13335
[pid 13335] sendto(3, "[Feb  6  9:12:18.160822 13335]: write
/tmp/volumes/md0/1/13335/00028/file\n", 74, 0, {sin_family=AF_INET,
sin_port=htons(12345), sin_addr=inet_addr("127.0.0.1")}}, 16) = 74
[pid 13335] close(6)                    = 0
[pid 13335] close(5)                    = 0
[pid 13335] time(NULL)                  = 1044515538
[pid 13335] gettimeofday({1044515538, 161148}, {4294967176, 0}) = 0
[pid 13335] getpid()                    = 13335
[pid 13335] sendto(3, "[Feb  6  9:12:18.161148 13335]: finished
/tmp/volumes/md0/1/13335/00028/file\n", 77, 0, {sin_family=AF_INET,
sin_port=htons(12345), sin_addr=inet_addr("127.0.0.1")}}, 16) = 77
[
In the middle of this loop of writing I turn down the power to my RAID
in the first minute the last writes/actions seem to be pending. After that the
write actions return successfully (which is very strange) and all following
actions return with errors.
When I turn on the power again, the fs seems to restore, and the file is
available with the correct data.
But when I do e2fsck to the device I get:
[root@node2-integ /]#  e2fsck -y -f /dev/scsi/host2/bus0/target0/lun0/part3
e2fsck 1.27 (8-Mar-2002)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Entry '8075' in /1 (1687558) has an incorrect filetype (was 2, should be
1).
Fix? yes
Entry '00036' in /1/8022 (1736710) has deleted/unused inode 1179660. 
Clear? yes
Entry '00037' in /1/8024 (1802246) has deleted/unused inode 1146892. 
Clear? yes
Entry '00036' in /1/8027 (1998854) has deleted/unused inode 2654219. 
Clear? yes
Entry '00036' in /1/8035 (2097158) has deleted/unused inode 2588683. 
Clear? yes
Entry '00036' in /1/8033 (2228230) has deleted/unused inode 1212428. 
Clear? yes
Entry '00037' in /1/8038 (2326534) has deleted/unused inode 1114124. 
Clear? yes
Entry '00009' in /2/8179 (2981900) is a link to directory /1/8069
(1785868).
Clear? yes
Entry '00013' in /2/8176 (2998284) is a link to directory /1/8045/00036
(1261580).
Clear? yes
Entry '00013' in /2/8178 (3047436) is a link to directory /1/8039/00036
(1196044).
Clear? yes
Entry '00009' in /2/8184 (3227660) is a link to directory /1/8071
(1949708).
Clear? yes
Entry '00012' in /2/8182 (3293196) is a link to directory /1/8029/00037
(1277964).
Clear? yes
Entry '00012' in /2/8193 (3375116) is a link to directory /1/8021/00037
(1228812).
Clear? yes
Entry '00008' in /2/8188 (3522572) is a link to directory /1/8065
(1523724).
Clear? yes
Entry '00009' in /2/8186 (3588108) is a link to directory /1/8070
(1884172).
Clear? yes
Entry '00012' in /2/8198 (3637260) is a link to directory /1/8025/00038
(1294348).
Clear? yes
Entry '00008' in /2/8195 (3735564) is a link to directory /1/8066
(1556492).
Clear? yes
Entry '00009' in /2/8204 (3964940) is a link to directory /1/8068
(1753100).
Clear? yes
Entry '00008' in /2/8205 (3981319) is a link to directory /1/8067
(1589260).
Clear? yes
Entry '00012' in /2/8205 (3981319) is a link to directory /1/8034/00036
(1245196).
Clear? yes
Pass 3: Checking directory connectivity
'..' in /1/8039/00036 (1196044) is /2/8178 (3047436), should be /1/8039
(2359302).
Fix? yes
'..' in /1/8021/00037 (1228812) is /2/8193 (3375116), should be /1/8021
(1703942).
Fix? yes
'..' in /1/8034/00036 (1245196) is /2/8205 (3981319), should be /1/8034
(2179078).
Fix? yes
'..' in /1/8045/00036 (1261580) is /2/8176 (2998284), should be /1/8045
(2703366).
Fix? yes
'..' in /1/8029/00037 (1277964) is /2/8182 (3293196), should be /1/8029
(1900550).
Fix? yes
'..' in /1/8025/00038 (1294348) is /2/8198 (3637260), should be /1/8025
(1867782).
Fix? yes
'..' in /1/8065 (1523724) is /2/8188 (3522572), should be /1 (1687558).
Fix? yes
'..' in /1/8066 (1556492) is /2/8195 (3735564), should be /1 (1687558).
Fix? yes
'..' in /1/8067 (1589260) is /2/8205 (3981319), should be /1 (1687558).
Fix? yes
'..' in /1/8068 (1753100) is /2/8204 (3964940), should be /1 (1687558).
Fix? yes
'..' in /1/8069 (1785868) is /2/8179 (2981900), should be /1 (1687558).
Fix? yes
Unconnected directory inode 1835020 (/1/8069/???)
Connect to /lost+found? yes
'..' in /1/8070 (1884172) is /2/8186 (3588108), should be /1 (1687558).
Fix? yes
'..' in /1/8071 (1949708) is /2/8184 (3227660), should be /1 (1687558).
Fix? yes
Unconnected directory inode 2064396 (/1/8071/???)
Connect to /lost+found? yes
Pass 4: Checking reference counts
Inode 1671181 ref count is 1, should be 2.  Fix? yes
Inode 1687558 ref count is 73, should be 65.  Fix? yes
Inode 1703942 ref count is 41, should be 40.  Fix? yes
Inode 1736710 ref count is 39, should be 38.  Fix? yes
Inode 1785868 ref count is 1, should be 2.  Fix? yes
Inode 1802246 ref count is 40, should be 39.  Fix? yes
Inode 1835020 ref count is 3, should be 2.  Fix? yes
Inode 1867782 ref count is 42, should be 41.  Fix? yes
Inode 1900550 ref count is 41, should be 40.  Fix? yes
Inode 1949708 ref count is 1, should be 2.  Fix? yes
Inode 1998854 ref count is 39, should be 38.  Fix? yes
Inode 2064396 ref count is 3, should be 2.  Fix? yes
Inode 2097158 ref count is 39, should be 38.  Fix? yes
Inode 2179078 ref count is 40, should be 39.  Fix? yes
Inode 2228230 ref count is 39, should be 38.  Fix? yes
Inode 2326534 ref count is 40, should be 39.  Fix? yes
Inode 2359302 ref count is 40, should be 39.  Fix? yes
Inode 2703366 ref count is 40, should be 39.  Fix? yes
Pass 5: Checking group summary information
Block bitmap differences:  -2228755 -2294289 -2359827 -2392595 -2425363 -2458131
-2490899 -2523667 -2556435 -2589203 -(3441172--3441173) -(5177874--5177876)
-(5308946--5308948) -11862554 -12157466 -12223002 -12255770 -12288538
Fix? yes
Free blocks count wrong for group #68 (32232, counted=32235).
Fix? yes
Free blocks count wrong for group #70 (32232, counted=32235).
Fix? yes
Free blocks count wrong for group #72 (32232, counted=32235).
Fix? yes
Free blocks count wrong for group #73 (32232, counted=32235).
Fix? yes
<snipped groups >
Free blocks count wrong for group #74 (32232, counted=32235).
Fix? yes
Free blocks count wrong for group #75 (32232, counted=32235).
Fix? yes
Free blocks count wrong for group #76 (32229, counted=32232).
Fix? yes
Free blocks count wrong for group #77 (32232, counted=32235).
Fix? yes
Free blocks count wrong for group #78 (32232, counted=32235).
Fix? yes
Free blocks count wrong for group #79 (32232, counted=32235).
Fix? yes
Free blocks count wrong for group #81 (32233, counted=32235).
Fix? yes
Free blocks count wrong for group #82 (32233, counted=32235).
Fix? yes
Free blocks count wrong for group #83 (32233, counted=32235).
Fix? yes
Free blocks count wrong for group #84 (32233, counted=32235).
Fix? yes
Free blocks count wrong for group #85 (32233, counted=32235).
Fix? yes
Free blocks count wrong for group #86 (32233, counted=32235).
Fix? yes
Free blocks count wrong for group #87 (32233, counted=32235).
Fix? yes
Free blocks count wrong for group #88 (32233, counted=32235).
Fix? yes
Free blocks count wrong for group #89 (32233, counted=32234).
Fix? yes
Free blocks count wrong for group #90 (32233, counted=32234).
Fix? yes
Free blocks count wrong for group #91 (32233, counted=32234).
Fix? yes
Free blocks count wrong for group #92 (32233, counted=32234).
Fix? yes
Free blocks count wrong for group #93 (32233, counted=32234).
Fix? yes
Free blocks count wrong for group #94 (32233, counted=32234).
Fix? yes
Free blocks count wrong for group #95 (32233, counted=32234).
Fix? yes
Free blocks count wrong for group #96 (32233, counted=32234).
Fix? yes
Free blocks count wrong for group #97 (32233, counted=32234).
Fix? yes
Free blocks count wrong for group #101 (32233, counted=32235).
Fix? yes
Free blocks count wrong for group #102 (32233, counted=32235).
Fix? yes
Free blocks count wrong for group #105 (32231, counted=32234).
Fix? yes
Free blocks count wrong for group #107 (32233, counted=32234).
Fix? yes
Free blocks count wrong for group #109 (32233, counted=32234).
Fix? yes
Free blocks count wrong for group #113 (32233, counted=32234).
Fix? yes
Free blocks count wrong for group #115 (32233, counted=32234).
Fix? yes
Free blocks count wrong for group #118 (32233, counted=32234).
Fix? yes
Free blocks count wrong for group #119 (32233, counted=32234).
Fix? yes
Free blocks count wrong for group #158 (32233, counted=32236).
Fix? yes
Free blocks count wrong for group #162 (32233, counted=32236).
Fix? yes
Free blocks count wrong for group #256 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #257 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #258 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #259 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #260 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #261 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #262 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #263 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #264 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #265 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #266 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #267 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #268 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #269 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #270 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #271 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #273 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #274 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #275 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #276 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #277 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #278 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #279 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #280 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #281 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #282 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #283 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #284 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #285 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #286 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #287 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #288 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #289 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #290 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #291 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #292 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #293 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #294 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #295 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #296 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #297 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #298 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #299 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #300 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #301 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #302 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #303 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #304 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #305 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #306 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #307 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #308 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #309 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #310 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #311 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #312 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #313 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #314 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #315 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #316 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #317 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #318 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #319 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #320 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #321 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #322 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #323 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #324 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #325 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #326 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #327 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #328 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #329 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #330 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #331 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #332 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #333 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #334 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #335 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #336 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #337 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #338 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #339 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #340 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #341 (32234, counted=32231).
Fix? yes
Free blocks count wrong for group #342 (32234, counted=32231).
Fix? yes
<snipped again>
and then the file is missing.
This seems to be incorrect behaviour. How should my application know that there
was an error writing the file and provide the correct recovery ?
I've yet to test this on other filesystems or journal modes. 
Does anyone have any explanation for this behaviour, or suggestion for
workarounds ?
I can provide the test program if anyone is interested in reproducing. 
Regards,
Yuval
--
Yuval Yeret
Exanet
yuval@exanet.com
http://www.exanet.com
Tel.  972-9-9717782
Fax. 972-9-9717778
 <<Yuval Yeret (E-mail).vcf>>
Hi, normally we do use write back cache on the RAID (which is backed up by battery so should be ok...) but in order to verify this isn't the cause we switched to write-through and it didn't help. The best behaviour I've seen so far was when I configured DATA=JOURNAL, but even then it's all very timing-sensitive and I still get writes/closes which finish ok but are not on the disk later on... And I've yet to see a real explanation/coverage of how journalling filesystems in general, and EXT3 specifically, handle disk failure situations like power loss or FC cable disconnection. Currently our direction is to monitor the loop state (via /proc) and initiate a killall on the application and umount once we see a loop down indication. A brute-force test of this mechanism seems to work... Anyhow thanks for the sugggestion, will be happy to continue experimenting. Yuval -----Original Message----- From: Robert Liu [mailto:rgliu@sr5tech.com] Sent: Friday, February 14, 2003 8:13 AM To: ext3-users@redhat.com Cc: Yuval Yeret Subject: RE: Lost writes upon disk power failure Response to the following posting: http://www.redhat.com/mailing-lists/ext3-users/msg09093.html Hi Yuval, Do you know if write back cache is enabled on the disks? If so, the problem you're observing maybe due to lost data in the cache when you powered down the RAID array. We've performed power cycle experiments on IDE RAID arrays and have observed similar problems: http://www.sr5tech.com/write_back_cache_experiments.htm Rob ------------------------------------------------------------ Robert Liu Boon Storage Technologies, Inc. rgliu@sr5tech.com (408)393-8998 www.sr5tech.com ------------------------------------------------------------>Hi, > >I'm encountering a strange situation. > >I'm writing to an ext3 filesystem mounted like this: >/dev/scsi/host2/bus0/target0/lun0/part3 on /tmp/volumes/md0 type ext3 >(rw,noatime,data=ordered) > >with the physical device being a FC-connected RAID (IBM DF4000) >no software raid personality is used >I'm using 2.4.18-19.7 > >My test program writes files in a loop, with the essential system calls being: >[pid 13335] mkdir("/tmp/volumes/md0/1/13335/00028", 0777) = 0 >[pid 13335] fsync(4) = 0 >[pid 13335] time(NULL) = 1044515538 >[pid 13335] gettimeofday({1044515538, 96401}, {4294967176, 0}) = 0 >[pid 13335] getpid() = 13335 >[pid 13335] sendto(3, "[Feb 6 9:12:18.096401 13335]: mkdir >/tmp/volumes/md0/1/13335/00028\n", 69, 0, {sin_family=AF_INET, >sin_port=htons(12345), sin_addr=inet_addr("127.0.0.1")}}, 16) = 69 >[pid 13335] open("/tmp/volumes/md0/1/13335/00028", O_RDONLY|O_SYNC) = 5 >[pid 13335] open("/tmp/volumes/md0/1/13335/00028/file", O_RDWR|O_CREAT|O_SYNC,>027777774120) = 6 >[pid 13335] fsync(5) = 0 >[pid 13335] time(NULL) = 1044515538 >[pid 13335] gettimeofday({1044515538, 112558}, {4294967176, 0}) = 0 >[pid 13335] getpid() = 13335 >[pid 13335] sendto(3, "[Feb 6 9:12:18.112558 13335]: open >/tmp/volumes/md0/1/13335/00028/file\n", 73, 0, {sin_family=AF_INET, >sin_port=htons(12345), sin_addr=inet_addr("127.0.0.1")}}, 16) = 73 >[pid 13335] getpid() = 13335 >[pid 13335] write(6, >"\31\0\1\2\3\4\5\6\7\10\t\n\v\f\r\16\17\20\21\22\23\24\25\26\27\30\31\0\1\2\3\4\5\6\7\10\t\n\v\f\r\16\17\20\21\22\23\24\25\26\27\30\31\0\1\2\3\4\5\6\7\10\t\n\v \f\r\16\17\20\21\22\23\24\25\26\27\30\31\0\1\2\3\4\5\6\7\10\t\n\v\f\r\16\17\20\2 1\22\23\24\25\26\27\30\31\0\1\2\3\4\5\6\7\10\t\n\v\f\r\16\17\20\21\22\23\24\25\2 6\27\30\31\0\1\2\3\4\5\6\7\10\t\n\v\f\r\16\17\20\21\22\23\24\25\26\27\30\31\0\1\ 2\3\4\5\6\7\10\t\n\v\f\r\16\17\20\21\22\23\24\25\26\27\30\31\0\1\2\3\4\5\6\7\10\ t\n\v\f\r\16\17\20\21\22\23\24\25\26\27\30\31\0\1\2\3\4\5\6\7\10\t\n\v\f\r\16\17 \20\21\22\23\24\25\26\27\30\31\0\1\2\3\4\5\6\7\10\t\n\v\f\r\16\17\20\21\22\23\24 \25\26\27\30\31\0\1\2\3\4\5\6\7\10\t\n\v\f\r\16\17\20\21\22\23\24\25\26\27\30\31 \0\1\2\3\4\5\6\7\10\t\n\v\f\r\16\17\20\21\22\23\24\25\26\27\30\31\0\1\2\3\4\5\6\ 7\10\t\n\v\f\r\16\17\20\21\22\23\24\25\26\27\30\31\0\1\2\3\4\5\6\7\10\t\n\v\f\r\ 16\17\20\21\22\23\24\25\26\27\30\31\0\1\2\3\4\5\6\7\10\t\n\v\f\r\16\17\20\21\22\ 23\24\25\26\27\30\31\0\1\2\3\4\5"...,> 8192) = 8192 > [pid 13335] fsync(6) = 0 > [pid 13335] time(NULL) = 1044515538 > [pid 13335] gettimeofday({1044515538, 160822}, {4294967176, 0}) = 0 > [pid 13335] getpid() = 13335 > [pid 13335] sendto(3, "[Feb 6 9:12:18.160822 13335]: write > /tmp/volumes/md0/1/13335/00028/file\n", 74, 0, {sin_family=AF_INET, > sin_port=htons(12345), sin_addr=inet_addr("127.0.0.1")}}, 16) = 74 > [pid 13335] close(6) = 0 > [pid 13335] close(5) = 0 > [pid 13335] time(NULL) = 1044515538 > [pid 13335] gettimeofday({1044515538, 161148}, {4294967176, 0}) = 0 > [pid 13335] getpid() = 13335 > [pid 13335] sendto(3, "[Feb 6 9:12:18.161148 13335]: finished > /tmp/volumes/md0/1/13335/00028/file\n", 77, 0, {sin_family=AF_INET, > sin_port=htons(12345), sin_addr=inet_addr("127.0.0.1")}}, 16) = 77 > [ > > > In the middle of this loop of writing I turn down the power to my RAID > > in the first minute the last writes/actions seem to be pending. After that > the > write actions return successfully (which is very strange) and all following > actions return with errors. > > When I turn on the power again, the fs seems to restore, and the file is > available with the correct data. > > But when I do e2fsck to the device I get: > [root@node2-integ /]# e2fsck -y -f /dev/scsi/host2/bus0/target0/lun0/part3 > e2fsck 1.27 (8-Mar-2002) > Pass 1: Checking inodes, blocks, and sizes > Pass 2: Checking directory structure > Entry '8075' in /1 (1687558) has an incorrect filetype (was 2, should be 1). > Fix? yes > > Entry '00036' in /1/8022 (1736710) has deleted/unused inode 1179660. Clear? > yes > > Entry '00037' in /1/8024 (1802246) has deleted/unused inode 1146892. Clear? > yes > > Entry '00036' in /1/8027 (1998854) has deleted/unused inode 2654219. Clear? > yes > > Entry '00036' in /1/8035 (2097158) has deleted/unused inode 2588683. Clear? > yes > > Entry '00036' in /1/8033 (2228230) has deleted/unused inode 1212428. Clear? > yes > > Entry '00037' in /1/8038 (2326534) has deleted/unused inode 1114124. Clear? > yes > > Entry '00009' in /2/8179 (2981900) is a link to directory /1/8069 (1785868). > Clear? yes > > Entry '00013' in /2/8176 (2998284) is a link to directory /1/8045/00036 > (1261580). > Clear? yes > > Entry '00013' in /2/8178 (3047436) is a link to directory /1/8039/00036 > (1196044). > Clear? yes > > Entry '00009' in /2/8184 (3227660) is a link to directory /1/8071 (1949708). > Clear? yes > > Entry '00012' in /2/8182 (3293196) is a link to directory /1/8029/00037 > (1277964). > Clear? yes > > Entry '00012' in /2/8193 (3375116) is a link to directory /1/8021/00037 > (1228812). > Clear? yes > > Entry '00008' in /2/8188 (3522572) is a link to directory /1/8065 (1523724). > Clear? yes > > Entry '00009' in /2/8186 (3588108) is a link to directory /1/8070 (1884172). > Clear? yes > > Entry '00012' in /2/8198 (3637260) is a link to directory /1/8025/00038 > (1294348). > Clear? yes > > Entry '00008' in /2/8195 (3735564) is a link to directory /1/8066 (1556492). > Clear? yes > > Entry '00009' in /2/8204 (3964940) is a link to directory /1/8068 (1753100). > Clear? yes > > Entry '00008' in /2/8205 (3981319) is a link to directory /1/8067 (1589260). > Clear? yes > > Entry '00012' in /2/8205 (3981319) is a link to directory /1/8034/00036 > (1245196). > Clear? yes > > Pass 3: Checking directory connectivity > '..' in /1/8039/00036 (1196044) is /2/8178 (3047436), should be /1/8039 > (2359302). > Fix? yes > > '..' in /1/8021/00037 (1228812) is /2/8193 (3375116), should be /1/8021 > (1703942). > Fix? yes > > '..' in /1/8034/00036 (1245196) is /2/8205 (3981319), should be /1/8034 > (2179078). > Fix? yes > > '..' in /1/8045/00036 (1261580) is /2/8176 (2998284), should be /1/8045 > (2703366). > Fix? yes > > '..' in /1/8029/00037 (1277964) is /2/8182 (3293196), should be /1/8029 > (1900550). > Fix? yes > > '..' in /1/8025/00038 (1294348) is /2/8198 (3637260), should be /1/8025 > (1867782). > Fix? yes > > '..' in /1/8065 (1523724) is /2/8188 (3522572), should be /1 (1687558). > Fix? yes > > '..' in /1/8066 (1556492) is /2/8195 (3735564), should be /1 (1687558). > Fix? yes > > '..' in /1/8067 (1589260) is /2/8205 (3981319), should be /1 (1687558). > Fix? yes > > '..' in /1/8068 (1753100) is /2/8204 (3964940), should be /1 (1687558). > Fix? yes > > '..' in /1/8069 (1785868) is /2/8179 (2981900), should be /1 (1687558). > Fix? yes > > Unconnected directory inode 1835020 (/1/8069/???) > Connect to /lost+found? yes > > '..' in /1/8070 (1884172) is /2/8186 (3588108), should be /1 (1687558). > Fix? yes > > '..' in /1/8071 (1949708) is /2/8184 (3227660), should be /1 (1687558). > Fix? yes > > Unconnected directory inode 2064396 (/1/8071/???) > Connect to /lost+found? yes > > Pass 4: Checking reference counts > Inode 1671181 ref count is 1, should be 2. Fix? yes > > Inode 1687558 ref count is 73, should be 65. Fix? yes > > Inode 1703942 ref count is 41, should be 40. Fix? yes > > Inode 1736710 ref count is 39, should be 38. Fix? yes > > Inode 1785868 ref count is 1, should be 2. Fix? yes > > Inode 1802246 ref count is 40, should be 39. Fix? yes > > Inode 1835020 ref count is 3, should be 2. Fix? yes > > Inode 1867782 ref count is 42, should be 41. Fix? yes > > Inode 1900550 ref count is 41, should be 40. Fix? yes > > Inode 1949708 ref count is 1, should be 2. Fix? yes > > Inode 1998854 ref count is 39, should be 38. Fix? yes > > Inode 2064396 ref count is 3, should be 2. Fix? yes > > Inode 2097158 ref count is 39, should be 38. Fix? yes > > Inode 2179078 ref count is 40, should be 39. Fix? yes > > Inode 2228230 ref count is 39, should be 38. Fix? yes > > Inode 2326534 ref count is 40, should be 39. Fix? yes > > Inode 2359302 ref count is 40, should be 39. Fix? yes > > Inode 2703366 ref count is 40, should be 39. Fix? yes > > Pass 5: Checking group summary information > Block bitmap differences: -2228755 -2294289 -2359827 -2392595 -2425363 > -2458131 -2490899 -2523667 -2556435 -2589203 -(3441172--3441173) > -(5177874--5177876) -(5308946--5308948) -11862554 -12157466 -12223002 > -12255770 > -12288538 > Fix? yes > > Free blocks count wrong for group #68 (32232, counted=32235). > Fix? yes > > Free blocks count wrong for group #70 (32232, counted=32235). > Fix? yes > > Free blocks count wrong for group #72 (32232, counted=32235). > Fix? yes > > Free blocks count wrong for group #73 (32232, counted=32235). > Fix? yes > > <snipped groups > > > Free blocks count wrong for group #74 (32232, counted=32235). > Fix? yes > > Free blocks count wrong for group #75 (32232, counted=32235). > Fix? yes > > Free blocks count wrong for group #76 (32229, counted=32232). > Fix? yes > > Free blocks count wrong for group #77 (32232, counted=32235). > Fix? yes > > Free blocks count wrong for group #78 (32232, counted=32235). > Fix? yes > > Free blocks count wrong for group #79 (32232, counted=32235). > Fix? yes > > Free blocks count wrong for group #81 (32233, counted=32235). > Fix? yes > > Free blocks count wrong for group #82 (32233, counted=32235). > Fix? yes > > Free blocks count wrong for group #83 (32233, counted=32235). > Fix? yes > > Free blocks count wrong for group #84 (32233, counted=32235). > Fix? yes > > Free blocks count wrong for group #85 (32233, counted=32235). > Fix? yes > > Free blocks count wrong for group #86 (32233, counted=32235). > Fix? yes > > Free blocks count wrong for group #87 (32233, counted=32235). > Fix? yes > > Free blocks count wrong for group #88 (32233, counted=32235). > Fix? yes > > Free blocks count wrong for group #89 (32233, counted=32234). > Fix? yes > > Free blocks count wrong for group #90 (32233, counted=32234). > Fix? yes > > Free blocks count wrong for group #91 (32233, counted=32234). > Fix? yes > > Free blocks count wrong for group #92 (32233, counted=32234). > Fix? yes > > Free blocks count wrong for group #93 (32233, counted=32234). > Fix? yes > > Free blocks count wrong for group #94 (32233, counted=32234). > Fix? yes > > Free blocks count wrong for group #95 (32233, counted=32234). > Fix? yes > > Free blocks count wrong for group #96 (32233, counted=32234). > Fix? yes > > Free blocks count wrong for group #97 (32233, counted=32234). > Fix? yes > > Free blocks count wrong for group #101 (32233, counted=32235). > Fix? yes > > Free blocks count wrong for group #102 (32233, counted=32235). > Fix? yes > > Free blocks count wrong for group #105 (32231, counted=32234). > Fix? yes > > Free blocks count wrong for group #107 (32233, counted=32234). > Fix? yes > > Free blocks count wrong for group #109 (32233, counted=32234). > Fix? yes > > Free blocks count wrong for group #113 (32233, counted=32234). > Fix? yes > > Free blocks count wrong for group #115 (32233, counted=32234). > Fix? yes > > Free blocks count wrong for group #118 (32233, counted=32234). > Fix? yes > > Free blocks count wrong for group #119 (32233, counted=32234). > Fix? yes > > Free blocks count wrong for group #158 (32233, counted=32236). > Fix? yes > > Free blocks count wrong for group #162 (32233, counted=32236). > Fix? yes > > Free blocks count wrong for group #256 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #257 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #258 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #259 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #260 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #261 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #262 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #263 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #264 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #265 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #266 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #267 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #268 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #269 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #270 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #271 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #273 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #274 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #275 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #276 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #277 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #278 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #279 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #280 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #281 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #282 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #283 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #284 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #285 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #286 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #287 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #288 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #289 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #290 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #291 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #292 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #293 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #294 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #295 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #296 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #297 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #298 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #299 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #300 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #301 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #302 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #303 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #304 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #305 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #306 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #307 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #308 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #309 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #310 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #311 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #312 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #313 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #314 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #315 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #316 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #317 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #318 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #319 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #320 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #321 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #322 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #323 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #324 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #325 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #326 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #327 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #328 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #329 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #330 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #331 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #332 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #333 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #334 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #335 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #336 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #337 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #338 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #339 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #340 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #341 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #342 (32234, counted=32231). > Fix? yes > > <snipped again> > > > > and then the file is missing. > > This seems to be incorrect behaviour. How should my application know that > there > was an error writing the file and provide the correct recovery ? > > I've yet to test this on other filesystems or journal modes. > > Does anyone have any explanation for this behaviour, or suggestion for > workarounds ? > > I can provide the test program if anyone is interested in reproducing. > > > Regards, > Yuval > > -- > Yuval Yeret > Exanet > yuval@exanet.com > http://www.exanet.com > Tel. 972-9-9717782 > Fax. 972-9-9717778 > >
Hi, normally we do use write back cache on the RAID (which is backed up by battery so should be ok...) but in order to verify this isn't the cause we switched to write-through and it didn't help. The best behaviour I've seen so far was when I configured DATA=JOURNAL, but even then it's all very timing-sensitive and I still get writes/closes which finish ok but are not on the disk later on... And I've yet to see a real explanation/coverage of how journalling filesystems in general, and EXT3 specifically, handle disk failure situations like power loss or FC cable disconnection. Currently our direction is to monitor the loop state (via /proc) and initiate a killall on the application and umount once we see a loop down indication. A brute-force test of this mechanism seems to work... Anyhow thanks for the sugggestion, will be happy to continue experimenting. Yuval -----Original Message----- From: Robert Liu [mailto:rgliu@sr5tech.com] Sent: Friday, February 14, 2003 8:13 AM To: ext3-users@redhat.com Cc: Yuval Yeret Subject: RE: Lost writes upon disk power failure Response to the following posting: http://www.redhat.com/mailing-lists/ext3-users/msg09093.html Hi Yuval, Do you know if write back cache is enabled on the disks? If so, the problem you're observing maybe due to lost data in the cache when you powered down the RAID array. We've performed power cycle experiments on IDE RAID arrays and have observed similar problems: http://www.sr5tech.com/write_back_cache_experiments.htm Rob ------------------------------------------------------------ Robert Liu Boon Storage Technologies, Inc. rgliu@sr5tech.com (408)393-8998 www.sr5tech.com ------------------------------------------------------------>Hi, > >I'm encountering a strange situation. >[Yuval Yeret] Snipped >> Regards, > Yuval > > -- > Yuval Yeret > Exanet > yuval@exanet.com > http://www.exanet.com > Tel. 972-9-9717782 > Fax. 972-9-9717778 > >
Hi, normally we do use write back cache on the RAID (which is backed up by battery so should be ok...) but in order to verify this isn't the cause we switched to write-through and it didn't help. The best behaviour I've seen so far was when I configured DATA=JOURNAL, but even then it's all very timing-sensitive and I still get writes/closes which finish ok but are not on the disk later on... And I've yet to see a real explanation/coverage of how journalling filesystems in general, and EXT3 specifically, handle disk failure situations like power loss or FC cable disconnection. Currently our direction is to monitor the loop state (via /proc) and initiate a killall on the application and umount once we see a loop down indication. A brute-force test of this mechanism seems to work... Anyhow thanks for the sugggestion, will be happy to continue experimenting. Yuval -----Original Message----- From: Robert Liu [mailto:rgliu@sr5tech.com] Sent: Friday, February 14, 2003 8:13 AM To: ext3-users@redhat.com Cc: Yuval Yeret Subject: RE: Lost writes upon disk power failure Response to the following posting: http://www.redhat.com/mailing-lists/ext3-users/msg09093.html Hi Yuval, Do you know if write back cache is enabled on the disks? If so, the problem you're observing maybe due to lost data in the cache when you powered down the RAID array. We've performed power cycle experiments on IDE RAID arrays and have observed similar problems: http://www.sr5tech.com/write_back_cache_experiments.htm Rob ------------------------------------------------------------ Robert Liu Boon Storage Technologies, Inc. rgliu@sr5tech.com (408)393-8998 www.sr5tech.com ------------------------------------------------------------