Hi, I'm encountering a strange situation. I'm writing to an ext3 filesystem mounted like this: /dev/scsi/host2/bus0/target0/lun0/part3 on /tmp/volumes/md0 type ext3 (rw,noatime,data=ordered) with the physical device being a FC-connected RAID (IBM DF4000) no software raid personality is used I'm using 2.4.18-19.7 My test program writes files in a loop, with the essential system calls being: [pid 13335] mkdir("/tmp/volumes/md0/1/13335/00028", 0777) = 0 [pid 13335] fsync(4) = 0 [pid 13335] time(NULL) = 1044515538 [pid 13335] gettimeofday({1044515538, 96401}, {4294967176, 0}) = 0 [pid 13335] getpid() = 13335 [pid 13335] sendto(3, "[Feb 6 9:12:18.096401 13335]: mkdir /tmp/volumes/md0/1/13335/00028\n", 69, 0, {sin_family=AF_INET, sin_port=htons(12345), sin_addr=inet_addr("127.0.0.1")}}, 16) = 69 [pid 13335] open("/tmp/volumes/md0/1/13335/00028", O_RDONLY|O_SYNC) = 5 [pid 13335] open("/tmp/volumes/md0/1/13335/00028/file", O_RDWR|O_CREAT|O_SYNC, 027777774120) = 6 [pid 13335] fsync(5) = 0 [pid 13335] time(NULL) = 1044515538 [pid 13335] gettimeofday({1044515538, 112558}, {4294967176, 0}) = 0 [pid 13335] getpid() = 13335 [pid 13335] sendto(3, "[Feb 6 9:12:18.112558 13335]: open /tmp/volumes/md0/1/13335/00028/file\n", 73, 0, {sin_family=AF_INET, sin_port=htons(12345), sin_addr=inet_addr("127.0.0.1")}}, 16) = 73 [pid 13335] getpid() = 13335 [pid 13335] write(6, "\31\0\1\2\3\4\5\6\7\10\t\n\v\f\r\16\17\20\21\22\23\24\25\26\27\30\31\0\1\2\3\4\5\6\7\10\t\n\v\f\r\16\17\20\21\22\23\24\25\26\27\30\31\0\1\2\3\4\5\6\7\10\t\n\v\f\r\16\17\20\21\22\23\24\25\26\27\30\31\0\1\2\3\4\5\6\7\10\t\n\v\f\r\16\17\20\21\22\23\24\25\26\27\30\31\0\1\2\3\4\5\6\7\10\t\n\v\f\r\16\17\20\21\22\23\24\25\26\27\30\31\0\1\2\3\4\5\6\7\10\t\n\v\f\r\16\17\20\21\22\23\24\25\26\27\30\31\0\1\2\3\4\5\6\7\10\t\n\v\f\r\16\17\20\21\22\23\24\25\26\27\30\31\0\1\2\3\4\5\6\7\10\t\n\v\f\r\16\17\20\21\22\23\24\25\26\27\30\31\0\1\2\3\4\5\6\7\10\t\n\v\f\r\16\17\20\21\22\23\24\25\26\27\30\31\0\1\2\3\4\5\6\7\10\t\n\v\f\r\16\17\20\21\22\23\24\25\26\27\30\31\0\1\2\3\4\5\6\7\10\t\n\v\f\r\16\17\20\21\22\23\24\25\26\27\30\31\0\1\2\3\4\5\6\7\10\t\n\v\f\r\16\17\20\21\22\23\24\25\26\27\30\31\0\1\2\3\4\5\6\7\10\t\n\v\f\r\16\17\20\21\22\23\24\25\26\27\30\31\0\1\2\3\4\5\6\7\10\t\n\v\f\r\16\17\20\21\22\23\24\25\26\27\30\31\0\1\2\3\4\5\6\7\10\t\n\v\f\r\16\17\20\21\22\23\24\25\26\27\30\31\0\1\2\3\4\5"..., 8192) = 8192 [pid 13335] fsync(6) = 0 [pid 13335] time(NULL) = 1044515538 [pid 13335] gettimeofday({1044515538, 160822}, {4294967176, 0}) = 0 [pid 13335] getpid() = 13335 [pid 13335] sendto(3, "[Feb 6 9:12:18.160822 13335]: write /tmp/volumes/md0/1/13335/00028/file\n", 74, 0, {sin_family=AF_INET, sin_port=htons(12345), sin_addr=inet_addr("127.0.0.1")}}, 16) = 74 [pid 13335] close(6) = 0 [pid 13335] close(5) = 0 [pid 13335] time(NULL) = 1044515538 [pid 13335] gettimeofday({1044515538, 161148}, {4294967176, 0}) = 0 [pid 13335] getpid() = 13335 [pid 13335] sendto(3, "[Feb 6 9:12:18.161148 13335]: finished /tmp/volumes/md0/1/13335/00028/file\n", 77, 0, {sin_family=AF_INET, sin_port=htons(12345), sin_addr=inet_addr("127.0.0.1")}}, 16) = 77 [ In the middle of this loop of writing I turn down the power to my RAID in the first minute the last writes/actions seem to be pending. After that the write actions return successfully (which is very strange) and all following actions return with errors. When I turn on the power again, the fs seems to restore, and the file is available with the correct data. But when I do e2fsck to the device I get: [root@node2-integ /]# e2fsck -y -f /dev/scsi/host2/bus0/target0/lun0/part3 e2fsck 1.27 (8-Mar-2002) Pass 1: Checking inodes, blocks, and sizes Pass 2: Checking directory structure Entry '8075' in /1 (1687558) has an incorrect filetype (was 2, should be 1). Fix? yes Entry '00036' in /1/8022 (1736710) has deleted/unused inode 1179660. Clear? yes Entry '00037' in /1/8024 (1802246) has deleted/unused inode 1146892. Clear? yes Entry '00036' in /1/8027 (1998854) has deleted/unused inode 2654219. Clear? yes Entry '00036' in /1/8035 (2097158) has deleted/unused inode 2588683. Clear? yes Entry '00036' in /1/8033 (2228230) has deleted/unused inode 1212428. Clear? yes Entry '00037' in /1/8038 (2326534) has deleted/unused inode 1114124. Clear? yes Entry '00009' in /2/8179 (2981900) is a link to directory /1/8069 (1785868). Clear? yes Entry '00013' in /2/8176 (2998284) is a link to directory /1/8045/00036 (1261580). Clear? yes Entry '00013' in /2/8178 (3047436) is a link to directory /1/8039/00036 (1196044). Clear? yes Entry '00009' in /2/8184 (3227660) is a link to directory /1/8071 (1949708). Clear? yes Entry '00012' in /2/8182 (3293196) is a link to directory /1/8029/00037 (1277964). Clear? yes Entry '00012' in /2/8193 (3375116) is a link to directory /1/8021/00037 (1228812). Clear? yes Entry '00008' in /2/8188 (3522572) is a link to directory /1/8065 (1523724). Clear? yes Entry '00009' in /2/8186 (3588108) is a link to directory /1/8070 (1884172). Clear? yes Entry '00012' in /2/8198 (3637260) is a link to directory /1/8025/00038 (1294348). Clear? yes Entry '00008' in /2/8195 (3735564) is a link to directory /1/8066 (1556492). Clear? yes Entry '00009' in /2/8204 (3964940) is a link to directory /1/8068 (1753100). Clear? yes Entry '00008' in /2/8205 (3981319) is a link to directory /1/8067 (1589260). Clear? yes Entry '00012' in /2/8205 (3981319) is a link to directory /1/8034/00036 (1245196). Clear? yes Pass 3: Checking directory connectivity '..' in /1/8039/00036 (1196044) is /2/8178 (3047436), should be /1/8039 (2359302). Fix? yes '..' in /1/8021/00037 (1228812) is /2/8193 (3375116), should be /1/8021 (1703942). Fix? yes '..' in /1/8034/00036 (1245196) is /2/8205 (3981319), should be /1/8034 (2179078). Fix? yes '..' in /1/8045/00036 (1261580) is /2/8176 (2998284), should be /1/8045 (2703366). Fix? yes '..' in /1/8029/00037 (1277964) is /2/8182 (3293196), should be /1/8029 (1900550). Fix? yes '..' in /1/8025/00038 (1294348) is /2/8198 (3637260), should be /1/8025 (1867782). Fix? yes '..' in /1/8065 (1523724) is /2/8188 (3522572), should be /1 (1687558). Fix? yes '..' in /1/8066 (1556492) is /2/8195 (3735564), should be /1 (1687558). Fix? yes '..' in /1/8067 (1589260) is /2/8205 (3981319), should be /1 (1687558). Fix? yes '..' in /1/8068 (1753100) is /2/8204 (3964940), should be /1 (1687558). Fix? yes '..' in /1/8069 (1785868) is /2/8179 (2981900), should be /1 (1687558). Fix? yes Unconnected directory inode 1835020 (/1/8069/???) Connect to /lost+found? yes '..' in /1/8070 (1884172) is /2/8186 (3588108), should be /1 (1687558). Fix? yes '..' in /1/8071 (1949708) is /2/8184 (3227660), should be /1 (1687558). Fix? yes Unconnected directory inode 2064396 (/1/8071/???) Connect to /lost+found? yes Pass 4: Checking reference counts Inode 1671181 ref count is 1, should be 2. Fix? yes Inode 1687558 ref count is 73, should be 65. Fix? yes Inode 1703942 ref count is 41, should be 40. Fix? yes Inode 1736710 ref count is 39, should be 38. Fix? yes Inode 1785868 ref count is 1, should be 2. Fix? yes Inode 1802246 ref count is 40, should be 39. Fix? yes Inode 1835020 ref count is 3, should be 2. Fix? yes Inode 1867782 ref count is 42, should be 41. Fix? yes Inode 1900550 ref count is 41, should be 40. Fix? yes Inode 1949708 ref count is 1, should be 2. Fix? yes Inode 1998854 ref count is 39, should be 38. Fix? yes Inode 2064396 ref count is 3, should be 2. Fix? yes Inode 2097158 ref count is 39, should be 38. Fix? yes Inode 2179078 ref count is 40, should be 39. Fix? yes Inode 2228230 ref count is 39, should be 38. Fix? yes Inode 2326534 ref count is 40, should be 39. Fix? yes Inode 2359302 ref count is 40, should be 39. Fix? yes Inode 2703366 ref count is 40, should be 39. Fix? yes Pass 5: Checking group summary information Block bitmap differences: -2228755 -2294289 -2359827 -2392595 -2425363 -2458131 -2490899 -2523667 -2556435 -2589203 -(3441172--3441173) -(5177874--5177876) -(5308946--5308948) -11862554 -12157466 -12223002 -12255770 -12288538 Fix? yes Free blocks count wrong for group #68 (32232, counted=32235). Fix? yes Free blocks count wrong for group #70 (32232, counted=32235). Fix? yes Free blocks count wrong for group #72 (32232, counted=32235). Fix? yes Free blocks count wrong for group #73 (32232, counted=32235). Fix? yes <snipped groups > Free blocks count wrong for group #74 (32232, counted=32235). Fix? yes Free blocks count wrong for group #75 (32232, counted=32235). Fix? yes Free blocks count wrong for group #76 (32229, counted=32232). Fix? yes Free blocks count wrong for group #77 (32232, counted=32235). Fix? yes Free blocks count wrong for group #78 (32232, counted=32235). Fix? yes Free blocks count wrong for group #79 (32232, counted=32235). Fix? yes Free blocks count wrong for group #81 (32233, counted=32235). Fix? yes Free blocks count wrong for group #82 (32233, counted=32235). Fix? yes Free blocks count wrong for group #83 (32233, counted=32235). Fix? yes Free blocks count wrong for group #84 (32233, counted=32235). Fix? yes Free blocks count wrong for group #85 (32233, counted=32235). Fix? yes Free blocks count wrong for group #86 (32233, counted=32235). Fix? yes Free blocks count wrong for group #87 (32233, counted=32235). Fix? yes Free blocks count wrong for group #88 (32233, counted=32235). Fix? yes Free blocks count wrong for group #89 (32233, counted=32234). Fix? yes Free blocks count wrong for group #90 (32233, counted=32234). Fix? yes Free blocks count wrong for group #91 (32233, counted=32234). Fix? yes Free blocks count wrong for group #92 (32233, counted=32234). Fix? yes Free blocks count wrong for group #93 (32233, counted=32234). Fix? yes Free blocks count wrong for group #94 (32233, counted=32234). Fix? yes Free blocks count wrong for group #95 (32233, counted=32234). Fix? yes Free blocks count wrong for group #96 (32233, counted=32234). Fix? yes Free blocks count wrong for group #97 (32233, counted=32234). Fix? yes Free blocks count wrong for group #101 (32233, counted=32235). Fix? yes Free blocks count wrong for group #102 (32233, counted=32235). Fix? yes Free blocks count wrong for group #105 (32231, counted=32234). Fix? yes Free blocks count wrong for group #107 (32233, counted=32234). Fix? yes Free blocks count wrong for group #109 (32233, counted=32234). Fix? yes Free blocks count wrong for group #113 (32233, counted=32234). Fix? yes Free blocks count wrong for group #115 (32233, counted=32234). Fix? yes Free blocks count wrong for group #118 (32233, counted=32234). Fix? yes Free blocks count wrong for group #119 (32233, counted=32234). Fix? yes Free blocks count wrong for group #158 (32233, counted=32236). Fix? yes Free blocks count wrong for group #162 (32233, counted=32236). Fix? yes Free blocks count wrong for group #256 (32234, counted=32231). Fix? yes Free blocks count wrong for group #257 (32234, counted=32231). Fix? yes Free blocks count wrong for group #258 (32234, counted=32231). Fix? yes Free blocks count wrong for group #259 (32234, counted=32231). Fix? yes Free blocks count wrong for group #260 (32234, counted=32231). Fix? yes Free blocks count wrong for group #261 (32234, counted=32231). Fix? yes Free blocks count wrong for group #262 (32234, counted=32231). Fix? yes Free blocks count wrong for group #263 (32234, counted=32231). Fix? yes Free blocks count wrong for group #264 (32234, counted=32231). Fix? yes Free blocks count wrong for group #265 (32234, counted=32231). Fix? yes Free blocks count wrong for group #266 (32234, counted=32231). Fix? yes Free blocks count wrong for group #267 (32234, counted=32231). Fix? yes Free blocks count wrong for group #268 (32234, counted=32231). Fix? yes Free blocks count wrong for group #269 (32234, counted=32231). Fix? yes Free blocks count wrong for group #270 (32234, counted=32231). Fix? yes Free blocks count wrong for group #271 (32234, counted=32231). Fix? yes Free blocks count wrong for group #273 (32234, counted=32231). Fix? yes Free blocks count wrong for group #274 (32234, counted=32231). Fix? yes Free blocks count wrong for group #275 (32234, counted=32231). Fix? yes Free blocks count wrong for group #276 (32234, counted=32231). Fix? yes Free blocks count wrong for group #277 (32234, counted=32231). Fix? yes Free blocks count wrong for group #278 (32234, counted=32231). Fix? yes Free blocks count wrong for group #279 (32234, counted=32231). Fix? yes Free blocks count wrong for group #280 (32234, counted=32231). Fix? yes Free blocks count wrong for group #281 (32234, counted=32231). Fix? yes Free blocks count wrong for group #282 (32234, counted=32231). Fix? yes Free blocks count wrong for group #283 (32234, counted=32231). Fix? yes Free blocks count wrong for group #284 (32234, counted=32231). Fix? yes Free blocks count wrong for group #285 (32234, counted=32231). Fix? yes Free blocks count wrong for group #286 (32234, counted=32231). Fix? yes Free blocks count wrong for group #287 (32234, counted=32231). Fix? yes Free blocks count wrong for group #288 (32234, counted=32231). Fix? yes Free blocks count wrong for group #289 (32234, counted=32231). Fix? yes Free blocks count wrong for group #290 (32234, counted=32231). Fix? yes Free blocks count wrong for group #291 (32234, counted=32231). Fix? yes Free blocks count wrong for group #292 (32234, counted=32231). Fix? yes Free blocks count wrong for group #293 (32234, counted=32231). Fix? yes Free blocks count wrong for group #294 (32234, counted=32231). Fix? yes Free blocks count wrong for group #295 (32234, counted=32231). Fix? yes Free blocks count wrong for group #296 (32234, counted=32231). Fix? yes Free blocks count wrong for group #297 (32234, counted=32231). Fix? yes Free blocks count wrong for group #298 (32234, counted=32231). Fix? yes Free blocks count wrong for group #299 (32234, counted=32231). Fix? yes Free blocks count wrong for group #300 (32234, counted=32231). Fix? yes Free blocks count wrong for group #301 (32234, counted=32231). Fix? yes Free blocks count wrong for group #302 (32234, counted=32231). Fix? yes Free blocks count wrong for group #303 (32234, counted=32231). Fix? yes Free blocks count wrong for group #304 (32234, counted=32231). Fix? yes Free blocks count wrong for group #305 (32234, counted=32231). Fix? yes Free blocks count wrong for group #306 (32234, counted=32231). Fix? yes Free blocks count wrong for group #307 (32234, counted=32231). Fix? yes Free blocks count wrong for group #308 (32234, counted=32231). Fix? yes Free blocks count wrong for group #309 (32234, counted=32231). Fix? yes Free blocks count wrong for group #310 (32234, counted=32231). Fix? yes Free blocks count wrong for group #311 (32234, counted=32231). Fix? yes Free blocks count wrong for group #312 (32234, counted=32231). Fix? yes Free blocks count wrong for group #313 (32234, counted=32231). Fix? yes Free blocks count wrong for group #314 (32234, counted=32231). Fix? yes Free blocks count wrong for group #315 (32234, counted=32231). Fix? yes Free blocks count wrong for group #316 (32234, counted=32231). Fix? yes Free blocks count wrong for group #317 (32234, counted=32231). Fix? yes Free blocks count wrong for group #318 (32234, counted=32231). Fix? yes Free blocks count wrong for group #319 (32234, counted=32231). Fix? yes Free blocks count wrong for group #320 (32234, counted=32231). Fix? yes Free blocks count wrong for group #321 (32234, counted=32231). Fix? yes Free blocks count wrong for group #322 (32234, counted=32231). Fix? yes Free blocks count wrong for group #323 (32234, counted=32231). Fix? yes Free blocks count wrong for group #324 (32234, counted=32231). Fix? yes Free blocks count wrong for group #325 (32234, counted=32231). Fix? yes Free blocks count wrong for group #326 (32234, counted=32231). Fix? yes Free blocks count wrong for group #327 (32234, counted=32231). Fix? yes Free blocks count wrong for group #328 (32234, counted=32231). Fix? yes Free blocks count wrong for group #329 (32234, counted=32231). Fix? yes Free blocks count wrong for group #330 (32234, counted=32231). Fix? yes Free blocks count wrong for group #331 (32234, counted=32231). Fix? yes Free blocks count wrong for group #332 (32234, counted=32231). Fix? yes Free blocks count wrong for group #333 (32234, counted=32231). Fix? yes Free blocks count wrong for group #334 (32234, counted=32231). Fix? yes Free blocks count wrong for group #335 (32234, counted=32231). Fix? yes Free blocks count wrong for group #336 (32234, counted=32231). Fix? yes Free blocks count wrong for group #337 (32234, counted=32231). Fix? yes Free blocks count wrong for group #338 (32234, counted=32231). Fix? yes Free blocks count wrong for group #339 (32234, counted=32231). Fix? yes Free blocks count wrong for group #340 (32234, counted=32231). Fix? yes Free blocks count wrong for group #341 (32234, counted=32231). Fix? yes Free blocks count wrong for group #342 (32234, counted=32231). Fix? yes <snipped again> and then the file is missing. This seems to be incorrect behaviour. How should my application know that there was an error writing the file and provide the correct recovery ? I've yet to test this on other filesystems or journal modes. Does anyone have any explanation for this behaviour, or suggestion for workarounds ? I can provide the test program if anyone is interested in reproducing. Regards, Yuval -- Yuval Yeret Exanet yuval@exanet.com http://www.exanet.com Tel. 972-9-9717782 Fax. 972-9-9717778 <<Yuval Yeret (E-mail).vcf>>
Hi, normally we do use write back cache on the RAID (which is backed up by battery so should be ok...) but in order to verify this isn't the cause we switched to write-through and it didn't help. The best behaviour I've seen so far was when I configured DATA=JOURNAL, but even then it's all very timing-sensitive and I still get writes/closes which finish ok but are not on the disk later on... And I've yet to see a real explanation/coverage of how journalling filesystems in general, and EXT3 specifically, handle disk failure situations like power loss or FC cable disconnection. Currently our direction is to monitor the loop state (via /proc) and initiate a killall on the application and umount once we see a loop down indication. A brute-force test of this mechanism seems to work... Anyhow thanks for the sugggestion, will be happy to continue experimenting. Yuval -----Original Message----- From: Robert Liu [mailto:rgliu@sr5tech.com] Sent: Friday, February 14, 2003 8:13 AM To: ext3-users@redhat.com Cc: Yuval Yeret Subject: RE: Lost writes upon disk power failure Response to the following posting: http://www.redhat.com/mailing-lists/ext3-users/msg09093.html Hi Yuval, Do you know if write back cache is enabled on the disks? If so, the problem you're observing maybe due to lost data in the cache when you powered down the RAID array. We've performed power cycle experiments on IDE RAID arrays and have observed similar problems: http://www.sr5tech.com/write_back_cache_experiments.htm Rob ------------------------------------------------------------ Robert Liu Boon Storage Technologies, Inc. rgliu@sr5tech.com (408)393-8998 www.sr5tech.com ------------------------------------------------------------>Hi, > >I'm encountering a strange situation. > >I'm writing to an ext3 filesystem mounted like this: >/dev/scsi/host2/bus0/target0/lun0/part3 on /tmp/volumes/md0 type ext3 >(rw,noatime,data=ordered) > >with the physical device being a FC-connected RAID (IBM DF4000) >no software raid personality is used >I'm using 2.4.18-19.7 > >My test program writes files in a loop, with the essential system calls being: >[pid 13335] mkdir("/tmp/volumes/md0/1/13335/00028", 0777) = 0 >[pid 13335] fsync(4) = 0 >[pid 13335] time(NULL) = 1044515538 >[pid 13335] gettimeofday({1044515538, 96401}, {4294967176, 0}) = 0 >[pid 13335] getpid() = 13335 >[pid 13335] sendto(3, "[Feb 6 9:12:18.096401 13335]: mkdir >/tmp/volumes/md0/1/13335/00028\n", 69, 0, {sin_family=AF_INET, >sin_port=htons(12345), sin_addr=inet_addr("127.0.0.1")}}, 16) = 69 >[pid 13335] open("/tmp/volumes/md0/1/13335/00028", O_RDONLY|O_SYNC) = 5 >[pid 13335] open("/tmp/volumes/md0/1/13335/00028/file", O_RDWR|O_CREAT|O_SYNC,>027777774120) = 6 >[pid 13335] fsync(5) = 0 >[pid 13335] time(NULL) = 1044515538 >[pid 13335] gettimeofday({1044515538, 112558}, {4294967176, 0}) = 0 >[pid 13335] getpid() = 13335 >[pid 13335] sendto(3, "[Feb 6 9:12:18.112558 13335]: open >/tmp/volumes/md0/1/13335/00028/file\n", 73, 0, {sin_family=AF_INET, >sin_port=htons(12345), sin_addr=inet_addr("127.0.0.1")}}, 16) = 73 >[pid 13335] getpid() = 13335 >[pid 13335] write(6, >"\31\0\1\2\3\4\5\6\7\10\t\n\v\f\r\16\17\20\21\22\23\24\25\26\27\30\31\0\1\2\3\4\5\6\7\10\t\n\v\f\r\16\17\20\21\22\23\24\25\26\27\30\31\0\1\2\3\4\5\6\7\10\t\n\v \f\r\16\17\20\21\22\23\24\25\26\27\30\31\0\1\2\3\4\5\6\7\10\t\n\v\f\r\16\17\20\2 1\22\23\24\25\26\27\30\31\0\1\2\3\4\5\6\7\10\t\n\v\f\r\16\17\20\21\22\23\24\25\2 6\27\30\31\0\1\2\3\4\5\6\7\10\t\n\v\f\r\16\17\20\21\22\23\24\25\26\27\30\31\0\1\ 2\3\4\5\6\7\10\t\n\v\f\r\16\17\20\21\22\23\24\25\26\27\30\31\0\1\2\3\4\5\6\7\10\ t\n\v\f\r\16\17\20\21\22\23\24\25\26\27\30\31\0\1\2\3\4\5\6\7\10\t\n\v\f\r\16\17 \20\21\22\23\24\25\26\27\30\31\0\1\2\3\4\5\6\7\10\t\n\v\f\r\16\17\20\21\22\23\24 \25\26\27\30\31\0\1\2\3\4\5\6\7\10\t\n\v\f\r\16\17\20\21\22\23\24\25\26\27\30\31 \0\1\2\3\4\5\6\7\10\t\n\v\f\r\16\17\20\21\22\23\24\25\26\27\30\31\0\1\2\3\4\5\6\ 7\10\t\n\v\f\r\16\17\20\21\22\23\24\25\26\27\30\31\0\1\2\3\4\5\6\7\10\t\n\v\f\r\ 16\17\20\21\22\23\24\25\26\27\30\31\0\1\2\3\4\5\6\7\10\t\n\v\f\r\16\17\20\21\22\ 23\24\25\26\27\30\31\0\1\2\3\4\5"...,> 8192) = 8192 > [pid 13335] fsync(6) = 0 > [pid 13335] time(NULL) = 1044515538 > [pid 13335] gettimeofday({1044515538, 160822}, {4294967176, 0}) = 0 > [pid 13335] getpid() = 13335 > [pid 13335] sendto(3, "[Feb 6 9:12:18.160822 13335]: write > /tmp/volumes/md0/1/13335/00028/file\n", 74, 0, {sin_family=AF_INET, > sin_port=htons(12345), sin_addr=inet_addr("127.0.0.1")}}, 16) = 74 > [pid 13335] close(6) = 0 > [pid 13335] close(5) = 0 > [pid 13335] time(NULL) = 1044515538 > [pid 13335] gettimeofday({1044515538, 161148}, {4294967176, 0}) = 0 > [pid 13335] getpid() = 13335 > [pid 13335] sendto(3, "[Feb 6 9:12:18.161148 13335]: finished > /tmp/volumes/md0/1/13335/00028/file\n", 77, 0, {sin_family=AF_INET, > sin_port=htons(12345), sin_addr=inet_addr("127.0.0.1")}}, 16) = 77 > [ > > > In the middle of this loop of writing I turn down the power to my RAID > > in the first minute the last writes/actions seem to be pending. After that > the > write actions return successfully (which is very strange) and all following > actions return with errors. > > When I turn on the power again, the fs seems to restore, and the file is > available with the correct data. > > But when I do e2fsck to the device I get: > [root@node2-integ /]# e2fsck -y -f /dev/scsi/host2/bus0/target0/lun0/part3 > e2fsck 1.27 (8-Mar-2002) > Pass 1: Checking inodes, blocks, and sizes > Pass 2: Checking directory structure > Entry '8075' in /1 (1687558) has an incorrect filetype (was 2, should be 1). > Fix? yes > > Entry '00036' in /1/8022 (1736710) has deleted/unused inode 1179660. Clear? > yes > > Entry '00037' in /1/8024 (1802246) has deleted/unused inode 1146892. Clear? > yes > > Entry '00036' in /1/8027 (1998854) has deleted/unused inode 2654219. Clear? > yes > > Entry '00036' in /1/8035 (2097158) has deleted/unused inode 2588683. Clear? > yes > > Entry '00036' in /1/8033 (2228230) has deleted/unused inode 1212428. Clear? > yes > > Entry '00037' in /1/8038 (2326534) has deleted/unused inode 1114124. Clear? > yes > > Entry '00009' in /2/8179 (2981900) is a link to directory /1/8069 (1785868). > Clear? yes > > Entry '00013' in /2/8176 (2998284) is a link to directory /1/8045/00036 > (1261580). > Clear? yes > > Entry '00013' in /2/8178 (3047436) is a link to directory /1/8039/00036 > (1196044). > Clear? yes > > Entry '00009' in /2/8184 (3227660) is a link to directory /1/8071 (1949708). > Clear? yes > > Entry '00012' in /2/8182 (3293196) is a link to directory /1/8029/00037 > (1277964). > Clear? yes > > Entry '00012' in /2/8193 (3375116) is a link to directory /1/8021/00037 > (1228812). > Clear? yes > > Entry '00008' in /2/8188 (3522572) is a link to directory /1/8065 (1523724). > Clear? yes > > Entry '00009' in /2/8186 (3588108) is a link to directory /1/8070 (1884172). > Clear? yes > > Entry '00012' in /2/8198 (3637260) is a link to directory /1/8025/00038 > (1294348). > Clear? yes > > Entry '00008' in /2/8195 (3735564) is a link to directory /1/8066 (1556492). > Clear? yes > > Entry '00009' in /2/8204 (3964940) is a link to directory /1/8068 (1753100). > Clear? yes > > Entry '00008' in /2/8205 (3981319) is a link to directory /1/8067 (1589260). > Clear? yes > > Entry '00012' in /2/8205 (3981319) is a link to directory /1/8034/00036 > (1245196). > Clear? yes > > Pass 3: Checking directory connectivity > '..' in /1/8039/00036 (1196044) is /2/8178 (3047436), should be /1/8039 > (2359302). > Fix? yes > > '..' in /1/8021/00037 (1228812) is /2/8193 (3375116), should be /1/8021 > (1703942). > Fix? yes > > '..' in /1/8034/00036 (1245196) is /2/8205 (3981319), should be /1/8034 > (2179078). > Fix? yes > > '..' in /1/8045/00036 (1261580) is /2/8176 (2998284), should be /1/8045 > (2703366). > Fix? yes > > '..' in /1/8029/00037 (1277964) is /2/8182 (3293196), should be /1/8029 > (1900550). > Fix? yes > > '..' in /1/8025/00038 (1294348) is /2/8198 (3637260), should be /1/8025 > (1867782). > Fix? yes > > '..' in /1/8065 (1523724) is /2/8188 (3522572), should be /1 (1687558). > Fix? yes > > '..' in /1/8066 (1556492) is /2/8195 (3735564), should be /1 (1687558). > Fix? yes > > '..' in /1/8067 (1589260) is /2/8205 (3981319), should be /1 (1687558). > Fix? yes > > '..' in /1/8068 (1753100) is /2/8204 (3964940), should be /1 (1687558). > Fix? yes > > '..' in /1/8069 (1785868) is /2/8179 (2981900), should be /1 (1687558). > Fix? yes > > Unconnected directory inode 1835020 (/1/8069/???) > Connect to /lost+found? yes > > '..' in /1/8070 (1884172) is /2/8186 (3588108), should be /1 (1687558). > Fix? yes > > '..' in /1/8071 (1949708) is /2/8184 (3227660), should be /1 (1687558). > Fix? yes > > Unconnected directory inode 2064396 (/1/8071/???) > Connect to /lost+found? yes > > Pass 4: Checking reference counts > Inode 1671181 ref count is 1, should be 2. Fix? yes > > Inode 1687558 ref count is 73, should be 65. Fix? yes > > Inode 1703942 ref count is 41, should be 40. Fix? yes > > Inode 1736710 ref count is 39, should be 38. Fix? yes > > Inode 1785868 ref count is 1, should be 2. Fix? yes > > Inode 1802246 ref count is 40, should be 39. Fix? yes > > Inode 1835020 ref count is 3, should be 2. Fix? yes > > Inode 1867782 ref count is 42, should be 41. Fix? yes > > Inode 1900550 ref count is 41, should be 40. Fix? yes > > Inode 1949708 ref count is 1, should be 2. Fix? yes > > Inode 1998854 ref count is 39, should be 38. Fix? yes > > Inode 2064396 ref count is 3, should be 2. Fix? yes > > Inode 2097158 ref count is 39, should be 38. Fix? yes > > Inode 2179078 ref count is 40, should be 39. Fix? yes > > Inode 2228230 ref count is 39, should be 38. Fix? yes > > Inode 2326534 ref count is 40, should be 39. Fix? yes > > Inode 2359302 ref count is 40, should be 39. Fix? yes > > Inode 2703366 ref count is 40, should be 39. Fix? yes > > Pass 5: Checking group summary information > Block bitmap differences: -2228755 -2294289 -2359827 -2392595 -2425363 > -2458131 -2490899 -2523667 -2556435 -2589203 -(3441172--3441173) > -(5177874--5177876) -(5308946--5308948) -11862554 -12157466 -12223002 > -12255770 > -12288538 > Fix? yes > > Free blocks count wrong for group #68 (32232, counted=32235). > Fix? yes > > Free blocks count wrong for group #70 (32232, counted=32235). > Fix? yes > > Free blocks count wrong for group #72 (32232, counted=32235). > Fix? yes > > Free blocks count wrong for group #73 (32232, counted=32235). > Fix? yes > > <snipped groups > > > Free blocks count wrong for group #74 (32232, counted=32235). > Fix? yes > > Free blocks count wrong for group #75 (32232, counted=32235). > Fix? yes > > Free blocks count wrong for group #76 (32229, counted=32232). > Fix? yes > > Free blocks count wrong for group #77 (32232, counted=32235). > Fix? yes > > Free blocks count wrong for group #78 (32232, counted=32235). > Fix? yes > > Free blocks count wrong for group #79 (32232, counted=32235). > Fix? yes > > Free blocks count wrong for group #81 (32233, counted=32235). > Fix? yes > > Free blocks count wrong for group #82 (32233, counted=32235). > Fix? yes > > Free blocks count wrong for group #83 (32233, counted=32235). > Fix? yes > > Free blocks count wrong for group #84 (32233, counted=32235). > Fix? yes > > Free blocks count wrong for group #85 (32233, counted=32235). > Fix? yes > > Free blocks count wrong for group #86 (32233, counted=32235). > Fix? yes > > Free blocks count wrong for group #87 (32233, counted=32235). > Fix? yes > > Free blocks count wrong for group #88 (32233, counted=32235). > Fix? yes > > Free blocks count wrong for group #89 (32233, counted=32234). > Fix? yes > > Free blocks count wrong for group #90 (32233, counted=32234). > Fix? yes > > Free blocks count wrong for group #91 (32233, counted=32234). > Fix? yes > > Free blocks count wrong for group #92 (32233, counted=32234). > Fix? yes > > Free blocks count wrong for group #93 (32233, counted=32234). > Fix? yes > > Free blocks count wrong for group #94 (32233, counted=32234). > Fix? yes > > Free blocks count wrong for group #95 (32233, counted=32234). > Fix? yes > > Free blocks count wrong for group #96 (32233, counted=32234). > Fix? yes > > Free blocks count wrong for group #97 (32233, counted=32234). > Fix? yes > > Free blocks count wrong for group #101 (32233, counted=32235). > Fix? yes > > Free blocks count wrong for group #102 (32233, counted=32235). > Fix? yes > > Free blocks count wrong for group #105 (32231, counted=32234). > Fix? yes > > Free blocks count wrong for group #107 (32233, counted=32234). > Fix? yes > > Free blocks count wrong for group #109 (32233, counted=32234). > Fix? yes > > Free blocks count wrong for group #113 (32233, counted=32234). > Fix? yes > > Free blocks count wrong for group #115 (32233, counted=32234). > Fix? yes > > Free blocks count wrong for group #118 (32233, counted=32234). > Fix? yes > > Free blocks count wrong for group #119 (32233, counted=32234). > Fix? yes > > Free blocks count wrong for group #158 (32233, counted=32236). > Fix? yes > > Free blocks count wrong for group #162 (32233, counted=32236). > Fix? yes > > Free blocks count wrong for group #256 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #257 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #258 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #259 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #260 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #261 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #262 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #263 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #264 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #265 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #266 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #267 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #268 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #269 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #270 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #271 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #273 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #274 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #275 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #276 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #277 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #278 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #279 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #280 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #281 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #282 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #283 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #284 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #285 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #286 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #287 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #288 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #289 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #290 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #291 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #292 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #293 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #294 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #295 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #296 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #297 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #298 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #299 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #300 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #301 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #302 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #303 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #304 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #305 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #306 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #307 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #308 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #309 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #310 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #311 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #312 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #313 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #314 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #315 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #316 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #317 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #318 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #319 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #320 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #321 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #322 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #323 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #324 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #325 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #326 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #327 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #328 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #329 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #330 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #331 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #332 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #333 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #334 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #335 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #336 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #337 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #338 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #339 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #340 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #341 (32234, counted=32231). > Fix? yes > > Free blocks count wrong for group #342 (32234, counted=32231). > Fix? yes > > <snipped again> > > > > and then the file is missing. > > This seems to be incorrect behaviour. How should my application know that > there > was an error writing the file and provide the correct recovery ? > > I've yet to test this on other filesystems or journal modes. > > Does anyone have any explanation for this behaviour, or suggestion for > workarounds ? > > I can provide the test program if anyone is interested in reproducing. > > > Regards, > Yuval > > -- > Yuval Yeret > Exanet > yuval@exanet.com > http://www.exanet.com > Tel. 972-9-9717782 > Fax. 972-9-9717778 > >
Hi, normally we do use write back cache on the RAID (which is backed up by battery so should be ok...) but in order to verify this isn't the cause we switched to write-through and it didn't help. The best behaviour I've seen so far was when I configured DATA=JOURNAL, but even then it's all very timing-sensitive and I still get writes/closes which finish ok but are not on the disk later on... And I've yet to see a real explanation/coverage of how journalling filesystems in general, and EXT3 specifically, handle disk failure situations like power loss or FC cable disconnection. Currently our direction is to monitor the loop state (via /proc) and initiate a killall on the application and umount once we see a loop down indication. A brute-force test of this mechanism seems to work... Anyhow thanks for the sugggestion, will be happy to continue experimenting. Yuval -----Original Message----- From: Robert Liu [mailto:rgliu@sr5tech.com] Sent: Friday, February 14, 2003 8:13 AM To: ext3-users@redhat.com Cc: Yuval Yeret Subject: RE: Lost writes upon disk power failure Response to the following posting: http://www.redhat.com/mailing-lists/ext3-users/msg09093.html Hi Yuval, Do you know if write back cache is enabled on the disks? If so, the problem you're observing maybe due to lost data in the cache when you powered down the RAID array. We've performed power cycle experiments on IDE RAID arrays and have observed similar problems: http://www.sr5tech.com/write_back_cache_experiments.htm Rob ------------------------------------------------------------ Robert Liu Boon Storage Technologies, Inc. rgliu@sr5tech.com (408)393-8998 www.sr5tech.com ------------------------------------------------------------>Hi, > >I'm encountering a strange situation. >[Yuval Yeret] Snipped >> Regards, > Yuval > > -- > Yuval Yeret > Exanet > yuval@exanet.com > http://www.exanet.com > Tel. 972-9-9717782 > Fax. 972-9-9717778 > >
Hi, normally we do use write back cache on the RAID (which is backed up by battery so should be ok...) but in order to verify this isn't the cause we switched to write-through and it didn't help. The best behaviour I've seen so far was when I configured DATA=JOURNAL, but even then it's all very timing-sensitive and I still get writes/closes which finish ok but are not on the disk later on... And I've yet to see a real explanation/coverage of how journalling filesystems in general, and EXT3 specifically, handle disk failure situations like power loss or FC cable disconnection. Currently our direction is to monitor the loop state (via /proc) and initiate a killall on the application and umount once we see a loop down indication. A brute-force test of this mechanism seems to work... Anyhow thanks for the sugggestion, will be happy to continue experimenting. Yuval -----Original Message----- From: Robert Liu [mailto:rgliu@sr5tech.com] Sent: Friday, February 14, 2003 8:13 AM To: ext3-users@redhat.com Cc: Yuval Yeret Subject: RE: Lost writes upon disk power failure Response to the following posting: http://www.redhat.com/mailing-lists/ext3-users/msg09093.html Hi Yuval, Do you know if write back cache is enabled on the disks? If so, the problem you're observing maybe due to lost data in the cache when you powered down the RAID array. We've performed power cycle experiments on IDE RAID arrays and have observed similar problems: http://www.sr5tech.com/write_back_cache_experiments.htm Rob ------------------------------------------------------------ Robert Liu Boon Storage Technologies, Inc. rgliu@sr5tech.com (408)393-8998 www.sr5tech.com ------------------------------------------------------------