Nathan Anderson
2009-Jul-07 16:08 UTC
A plea for help; or, how to shoot yourself in the foot with ext3
All, I need help. I did something dumb and shot myself in the foot. What I was doing when it happened was something I've done plenty of other times before, and I simply acted carelessly during the process.

I tried to install a copy of MikroTik RouterOS (Linux-based routing software) onto a USB thumb-drive using my laptop last weekend, but I forgot to take the hard drive out of the laptop first like I usually do when attempting to do such things. And the install program found my hard drive instead of the USB drive, and...well, you can guess what happened next. HDD was wiped, repartitioned, and those partitions formatted. I lost everything as a result. MikroTik RouterOS appears to use ext3, which is why I'm here.

I have a bit-for-bit clone backup of my HDD from a few months back that I really should have been refreshing more regularly, but lucky for me my partition layout on the laptop hadn't changed since the last snapshot, so I was able to at least restore my MBR and get the partition boundaries back. And, after doing so, miracle of miracles: I was able to see and access the filesystems of all three partitions I had on the drive (1 NTFS, 1 HFS+, and 1 ext3)!! It wasn't a great mystery to me that I was able to see the contents of the last 2 partitions, but I figured the filesystem structure of the NTFS partition at the beginning of the drive had to have been completely clobbered (since RouterOS did, in fact, complete the installation, and not just get as far as the formatting). I was ecstatic to learn otherwise!

But I was in for a(nother) shock. Despite the fact that the filesystem metadata seemed to be intact, the data itself that was contained therein didn't appear to be so lucky. The corruption, or whatever it was, was so bad that I couldn't boot the operating systems contained on the latter 2 partitions even though they should have been minimally touched during the RouterOS disk format.

Right now, I'm trying to figure out if this is the fault of ext3 and the mkfs.ext3 (may its name be forever cursed[1]) format process, or what. I took a large-ish (~200MB) file from one of the partitions, and compared it to a known-good copy of the file. Here is what I found after analyzing the differences (which I'm guessing will not be news to most of you; also, my math may be a little off since I did this in rather a hurry):

* There are 2MiB + 8KiB contiguous "chunks" consisting completely of 0s with the exception of the first 64 bytes of the chunk, 63 of which are FF/255 and the 64th which is value 3. Where those "chunks" exist, my data is gone/overwritten.

* These chunks of 0'd out data seem to occur in regular intervals of roughly 124Mbytes.

* About 12MiB or so (actually exactly 80KiB short of 12MiB) before each "hole" is another much smaller blanked-out area, 4KiB in size, that roughly consists of all 0s as well but which also contains a few unique values at the beginning.

Other files that I looked at in all three partitions had similar "holes" in them. I am guessing that all of this lovely handiwork was in fact the result of mkfs.ext3 (may its firstborn perish in agony[1]) during the portion of the RouterOS install where it said "Partitioning and formatting disk," but am not sure because I don't have a deep enough knowledge of ext2 or ext3 to know whether this kind of pattern is to be expected from a format/mkfs. Whatever it was that caused this, it took a shotgun to my data, and now it looks like swiss cheese.

Based on this info, does this sound like something that mkfs.ext3 (may it be exposed to the flames of Hades for all eternity[1]) would do/have done? And does the ext3 formatting process really have to be so destructive?

I doubt that, whatever the cause, it can be undone now, unless there is something that I'm missing here and there is somebody out there who might be able to suggest how I can reconstruct or re-discover the seemingly missing data. After learning of The Great Zero Challenge (http://16systems.com/zero/index.html and http://hostjury.com/blog/view/195/the-great-zero-challenge-remains-unaccepted), though, it doesn't sound as though there is much hope.

Thanks for listening, and if anybody either has any suggestions or can at least confirm for me that it is a lost cause so I can stop worrying about it and waste no more time on research, I would be grateful.

[1] I realize that the responsibility for the lost data lies solely with me and not with the author(s) of mkfs.ext3; this venting and poor attempt at comic relief is merely one method I've used to try to deal with the loss.

--
Nathan Anderson
nathan at anderson-net.com
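
The file comparison described in this post (a damaged file checked against a known-good copy) can be sketched in Python as below. This is only an illustration, not the tool the poster used: the function name `differing_runs`, the 4 KiB block size, and the file paths are all assumptions. It walks both files block by block and reports each contiguous run of differing blocks, along with whether the damaged side of that run is entirely zero.

```python
def differing_runs(damaged_path, good_path, block_size=4096):
    """Return a list of (offset, length, all_zero) tuples, one per contiguous
    run of blocks where the damaged file differs from the known-good copy.

    all_zero is True when every differing block in the run is zero-filled
    on the damaged side. block_size=4096 is an assumption, not a measured value.
    """
    runs = []
    run_start = None   # byte offset where the current run began, or None
    run_zero = True    # whether the current run is all zeros so far
    offset = 0
    with open(damaged_path, "rb") as bad, open(good_path, "rb") as good:
        while True:
            a = bad.read(block_size)
            b = good.read(block_size)
            if not a and not b:
                break
            if a != b:
                if run_start is None:
                    run_start, run_zero = offset, True
                run_zero = run_zero and not any(a)
            elif run_start is not None:
                # A matching block closes the current run.
                runs.append((run_start, offset - run_start, run_zero))
                run_start = None
            offset += max(len(a), len(b))
    if run_start is not None:
        runs.append((run_start, offset - run_start, run_zero))
    return runs
```

Calling `differing_runs("damaged.bin", "good.bin")` (hypothetical file names) would list the offsets and sizes of the holes, which makes it easier to check whether they really recur at a fixed interval.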
Matija Nalis
2009-Jul-07 18:24 UTC
A plea for help; or, how to shoot yourself in the foot with ext3
On Tue, Jul 07, 2009 at 09:08:02AM -0700, Nathan Anderson wrote:
> * There are 2MiB + 8KiB contiguous "chunks" consisting completely of 0s
> with the exception of the first 64 bytes of the chunk, 63 of which are
> FF/255 and the 64th which is value 3. Where those "chunks" exist, my
> data is gone/overwritten.
>
> Based on this info, does this sound like something that mkfs.ext3 (may
> it be exposed to the flames of Hades for all eternity[1]) would do/have
> done? And does the ext3 formatting process really have to be so
> destructive?

Yes, ext2/ext3 does scatter data structures around the disk in such a way. The reasons are both performance (keeping some metadata closer to the data it relates to, thus requiring less head movement) and safety (so if you overwrite just the beginning of the disk, for example, you could recover undamaged parts of the FS from backup superblocks etc).

> Thanks for listening, and if anybody either has any suggestions or can
> at least confirm for me that it is a lost cause so I can stop worrying
> about it and waste no more time on research, I would be grateful.

As for ideas on recovering, it's probably easiest to just go with the last backup and then try to recover (using a boot CD/USB) the relatively smaller files, which have higher chances of having survived the mkfs.ext3 shotgun (like config files, documents, bookmarks, etc), or at least the most important of those.

Alternatively, you might want to write a program which will go over the damaged parts (you should first calculate where exactly they are) and copy over them only those damaged blocks from your old bit-for-bit backup. That way, you might even be able to boot the system (system files have a higher chance of being recovered that way, as they don't move around the disk much unless you do big system upgrades). You also *might* recover some more of the user stuff that way, but not all.

Of course, if you've been moving files around much since the last backup and/or been defragmenting your filesystems, your chances of success are much lower.

You should make a bit-for-bit image (of this broken filesystem!) first (for which you will need extra space, of course), as you might not get the program right on the first try, and you don't want to incur any more damage...

--
Opinions above are GNU-copylefted.
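
A minimal Python sketch of the block-copy repair suggested in this reply, under stated assumptions, might look like the following. The function name `restore_zeroed_blocks`, the 4 KiB block size, and the "zero-filled now, non-zero in the backup" test for a damaged block are all illustrative choices, not something specified in the thread. In particular, this heuristic will not catch overwritten blocks that now contain non-zero bytes (such as the FF-filled start of each chunk), and it cannot tell whether the backup block is still current. It is meant to be run only against a working copy of the broken image, never the original disk.

```python
def restore_zeroed_blocks(backup_path, work_path, block_size=4096):
    """Copy blocks from the old backup image into the working image wherever
    the working image's block is all zeros but the backup's block is not.

    Returns the number of blocks restored. Both paths are assumed to be
    image files of the same partition layout; block_size is an assumption.
    """
    restored = 0
    offset = 0
    with open(backup_path, "rb") as backup, open(work_path, "r+b") as work:
        while True:
            # Seek explicitly so reads and writes on the work file never
            # depend on the position left by the previous operation.
            backup.seek(offset)
            work.seek(offset)
            old = backup.read(block_size)
            cur = work.read(block_size)
            if not old or not cur:
                break
            n = min(len(old), len(cur))
            if not any(cur[:n]) and any(old[:n]):
                work.seek(offset)
                work.write(old[:n])
                restored += 1
            offset += n
    return restored
```

For example, `restore_zeroed_blocks("backup.img", "broken-copy.img")` (hypothetical file names) would patch the copy in place and report how many blocks it replaced. Keeping the untouched image of the broken disk alongside lets the run be repeated with a different heuristic if the first attempt turns out wrong.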