Hi,

I had a RAID5 double disk failure (40 megs or so of bad sectors near
the middle of the second failed disk). Bad news, but I recovered what
I was able to.

The RAID contained a dm-crypt physical volume, which in turn contained
four logical volumes: two ext4 and two btrfs, each about 1TB in size.

The failure occurred while the volumes were online and in use, so in
addition to what was unreadable, all pending writes to the device
between the failure and when the problem was discovered were lost as
well.

The two ext4 volumes, fortunately, had some relatively minor corruption
which was cleaned up with a few rounds of fsck. The two btrfs volumes
are completely unhappy, though, and I do not know how to proceed, since
btrfs problems are new to me. Any suggestions are welcome.

Here is the basic picture of what is going on.

# cat /etc/fstab
# <file system>                   <mount point>  <type>  <options>                          <dump>  <pass>
#/dev/mapper/tr5ut-media          /mnt/media     btrfs   defaults,compress=lzo,space_cache  0       2
/dev/mapper/tr5ut-media           /mnt/media     ext4    defaults                           0       2
/dev/mapper/tr5ut-vicep--library  /vicepa        auto    defaults,compress=lzo,space_cache  0       2
/dev/mapper/tr5ut-vicep--clones   /vicepb        auto    defaults,compress=lzo,space_cache  0       2

You can see that btrfs device scan does not find anything, while
btrfs-show finds one of the volumes and not the other. Fscking the
found volume halts due to checksum and assertion errors, while fscking
the other volume fails completely, I guess due to a missing
'superblock'-type structure?

seraph:~# btrfs device scan
Scanning for Btrfs filesystems
failed to read /dev/sr0

seraph:~# btrfs-show
**
** WARNING: this program is considered deprecated
** Please consider to switch to the btrfs utility
**
failed to read /dev/sr0: No medium found
Label: vicep-library  uuid: 89b14d35-b31a-4fbe-a2d9-cb83cbcd3851
        Total devices 1 FS bytes used 254.35GB
        devid    1 size 1.00TB used 299.04GB path /dev/dm-32

Btrfs Btrfs v0.19

seraph:~# btrfsck /dev/mapper/tr5ut-vicep--library
checksum verify failed on 317874630656 wanted 8E19212D found FFFFFFA6
checksum verify failed on 317874630656 wanted 8E19212D found FFFFFFA6
checksum verify failed on 317874630656 wanted 491D9C1A found FFFFFFA6
checksum verify failed on 317874630656 wanted 8E19212D found FFFFFFA6
Csum didn't match
btrfsck: root-tree.c:46: btrfs_find_last_root: Assertion `!(path->slots[0] == 0)' failed.
Aborted

seraph:~# btrfsck /dev/mapper/tr5ut-vicep--clones
No valid Btrfs found on /dev/mapper/tr5ut-vicep--clones

seraph:~# dpkg -l btrfs-tools
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name         Version          Description
+++-============-================-===============================================
ii  btrfs-tools  0.19+20111105-2  Checksumming Copy on Write Filesystem utilities

-- 
Ryan C. Underwood, <nemesis@icequake.net>
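Two quick signature probes that don't depend on the btrfs tooling can
help tell a destroyed superblock apart from confused tools; a minimal
sketch using standard utilities against the device names above:

    # blkid probes for known filesystem signatures; empty output here
    # is consistent with the primary btrfs superblock being gone
    blkid /dev/mapper/tr5ut-vicep--clones

    # file -s reads the raw device and reports whatever it recognizes
    file -s /dev/mapper/tr5ut-vicep--library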
Does anyone have any idea how I should proceed with the situation
quoted below? Unfortunately, I am going to have to give up on btrfs if
it is really so fragile. I am using kernel 3.2.2 and btrfs-tools from
November.

On Sun, Feb 05, 2012 at 12:41:28PM -0600, Ryan C. Underwood wrote:
> I had a RAID5 double disk failure (40 megs or so of bad sectors near
> the middle of the second failed disk). Bad news, but I recovered what
> I was able to.
> [snip - full report quoted above]

-- 
Ryan C. Underwood, <nemesis@icequake.net>
On 02/07/2012 11:39 AM, Ryan C. Underwood wrote:
> Does anyone have any idea how I should proceed with the situation
> quoted below? Unfortunately, I am going to have to give up on btrfs
> if it is really so fragile. I am using kernel 3.2.2 and btrfs-tools
> from November.
>
> On Sun, Feb 05, 2012 at 12:41:28PM -0600, Ryan C. Underwood wrote:
>> The failure occurred while the volumes were online and in use, so in
>> addition to what was unreadable, all pending writes to the device
>> between the failure and when the problem was discovered were lost as
>> well.

Hi Ryan,

So on the failure, what does dmesg show? Checksum errors?

>> The two ext4 volumes, fortunately, had some relatively minor
>> corruption which was cleaned up with a few rounds of fsck. The two
>> btrfs volumes are completely unhappy, though, and I do not know how
>> to proceed, since btrfs problems are new to me. Any suggestions are
>> welcome.

btrfsck is not ready for data recovery, but only for error checking.
But btrfs-tools does have some features that may help us, e.g.
zero-log.

For more recovery details, refer to this thread from Hugo:
http://www.spinics.net/lists/linux-btrfs/msg14890.html

thanks,
liubo

>> [snip - full report quoted above]
Ryan C. Underwood posted on Mon, 06 Feb 2012 21:39:45 -0600 as excerpted:

> Does anyone have any idea how I should proceed with the below quoted
> situation? Unfortunately, I am going to have to give up on btrfs if it
> is really so fragile. I am using kernel 3.2.2 and btrfs-tools from
> November.

Regardless of the technical details of your situation, keep in mind
that btrfs is still experimental at this time and remains under heavy
development, as you'll have noticed if you read the kernel's changelogs
or this list at all. Kernel 3.2.2 is relatively recent, although you
could try the latest 3.3 rc or a git kernel as well, but I'd suggest a
btrfs-tools rebuild, as November really isn't particularly current
there.

However, complaining about the fragility of a still-in-development,
marked-experimental filesystem would seem disingenuous at best.
Particularly when it's used on top of a dm-crypt layer that btrfs was
known to have issues with (see the wiki), **AND** when you were using
RAID-5 and had not just a single-spindle failure but a double-spindle
failure, a situation that's well outside anything RAID-5 claims to
handle (RAID-6, OTOH... or triple-redundant RAID-1 or RAID-10...).

OK, so given you're running an experimental filesystem on a
block-device stack it's known to have problems with, you surely had
backups if the data was at all important to you. Simply restore from
those backups. If you didn't care to make backups when running in such
a known-unstable situation, well, obviously the data couldn't have been
so important to you after all, as you obviously didn't care about it
enough to do those backups, and by the sound of things, not even enough
to be informed about the development and stability status of the
filesystem and block-device stack you were using.

IOW, yes, btrfs is to be considered fragile at this point. It's still
in development, there's not even an error-correcting btrfsck yet, and
you were using it on a block-device stack that the wiki specifically
mentions is problematic. Both the btrfs kernel option and the wiki have
big warnings about its stability at this point, specifically stating
that it's not to be trusted to safely hold data yet. If you were using
it contrary to those warnings and lost data due to lack of backups,
there's no one to blame but yourself.

-- 
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
> > Unfortunately, I am going to have to give up on btrfs if it
> > is really so fragile.
>
> However, complaining about the fragility of a still in development
> and marked experimental filesystem would seem disingenuous at best.
[snip paragraphs of tut-tutting]
> IOW, yes, btrfs is to be considered fragile at this point.

So you re-stated my position. I gave btrfs a chance, but it is still
apparently far more fragile than ext4 when corruption is introduced --
even though btrfs is the filesystem of the two that is specifically
designed to provide internal fault tolerance and resilience. Is there a
fine line between "user feedback" and "disingenuous complaining" that I
am not aware of?

The data in question is not that important, though I would like to have
it back, considering it should mostly still be there, as on the ext4
volumes. 40MB of bad sectors on one 2TB disk in a 6TB volume does not
seem like a lot. Even if the whole beginning of the volume was wiped
out, surely there is the equivalent of backup superblocks? I can hack
if I could just get a clue where to start.

-- 
Ryan C. Underwood, <nemesis@icequake.net>
On Tue, Feb 7, 2012 at 8:04 AM, Ryan C. Underwood
<nemesis-lists@icequake.net> wrote:
> The data in question is not that important, though I would like to
> have it back, considering it should mostly still be there, as on the
> ext4 volumes. 40MB of bad sectors on one 2TB disk in a 6TB volume
> does not seem like a lot. Even if the whole beginning of the volume
> was wiped out, surely there is the equivalent of backup superblocks?
> I can hack if I could just get a clue where to start.

Since you're getting "failed to read /dev/sr0" messages, that might be
an indication there are some newer btrfs-progs tools available.

You might want to try building btrfs-progs from the git repository:
http://git.kernel.org/?p=linux/kernel/git/mason/btrfs-progs.git;a=summary

There are some recovery tools there that may extract your data (look at
the "recover" program).
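For reference, a rough sketch of the usual clone-and-build steps; the
git:// URL is inferred from the gitweb link above, and the default make
target is assumed to build the recovery tools:

    git clone git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-progs.git
    cd btrfs-progs
    make    # some tools may want explicit targets, e.g. 'make restore'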
On Tue, Feb 07, 2012 at 12:17:23PM +0800, Liu Bo wrote:
> >> The failure occurred while the volumes were online and in use, so in
> >> addition to what was unreadable, all pending writes to the device
> >> between the failure and when the problem was discovered were lost as
> >> well.
>
> Hi Ryan,
>
> So on the failure, what does dmesg show? Checksum errors?

Dmesg at the time showed block errors on the RAID due to the multi-disk
failure. I do have a log from that time, which I have attached,
including btrfs unhappiness at the time.

Here is the oops I currently get on 3.2.2 when trying to mount the one
of the two btrfs volumes that btrfs-show is able to detect:

[ 1023.151683] device label vicep-library devid 1 transid 575931 /dev/mapper/tr5ut-vicep--library
[ 1023.152136] btrfs: use lzo compression
[ 1023.152174] btrfs: disk space caching is enabled
[ 1023.191409] btrfs: dm-32 checksum verify failed on 317874630656 wanted 28ABE8A6 found 8E19212D level 0
[ 1023.211750] btrfs: dm-32 checksum verify failed on 317874630656 wanted 28ABE8A6 found 491D9C1A level 0
[ 1023.216243] btrfs: dm-32 checksum verify failed on 317874630656 wanted 28ABE8A6 found 8E19212D level 0
[ 1023.224252] btrfs: dm-32 checksum verify failed on 317874630656 wanted 28ABE8A6 found 491D9C1A level 0
[ 1023.224521] btrfs: dm-32 checksum verify failed on 317874630656 wanted 28ABE8A6 found 491D9C1A level 0
[ 1023.232211] btrfs: dm-32 checksum verify failed on 317874630656 wanted 28ABE8A6 found 8E19212D level 0
[ 1023.232456] btrfs: dm-32 checksum verify failed on 317874630656 wanted 28ABE8A6 found 491D9C1A level 0
[ 1023.232549] ------------[ cut here ]------------
[ 1023.232591] kernel BUG at fs/btrfs/disk-io.c:1203!
[ 1023.232627] invalid opcode: 0000 [#1] SMP
[ 1023.232723] CPU 1
[ 1023.232755] Modules linked in: ext2 ext4 jbd2 crc16 it87 hwmon_vid loop snd_hda_codec_hdmi tpm_tis tpm tpm_bios snd_hda_codec_realtek pcspkr evdev wmi snd_hda_intel i2c_piix4 i2c_core k8temp edac_core edac_mce_amd snd_hda_codec snd_hwdep snd_pcm snd_timer snd soundcore snd_page_alloc shpchp processor button thermal_sys pci_hotplug parport_pc parport ext3 jbd mbcache dm_snapshot aes_x86_64 aes_generic dm_crypt dm_mod raid1 md_mod nbd btrfs zlib_deflate crc32c libcrc32c xts gf128mul sg sr_mod cdrom sd_mod crc_t10dif ata_generic ohci_hcd pata_atiixp firewire_ohci ahci libahci firewire_core ehci_hcd libata tulip crc_itu_t scsi_mod usbcore floppy r8169 mii usb_common [last unloaded: scsi_wait_scan]
[ 1023.235168]
[ 1023.235203] Pid: 4829, comm: mount Not tainted 3.2.2 #3 Gigabyte Technology Co., Ltd. GA-MA78GPM-DS2H/GA-MA78GPM-DS2H
[ 1023.235335] RIP: 0010:[<ffffffffa00c0a9a>]  [<ffffffffa00c0a9a>] find_and_setup_root+0x5c/0xdc [btrfs]
[ 1023.235437] RSP: 0018:ffff8801a7607b98  EFLAGS: 00010282
[ 1023.235473] RAX: 00000000fffffffe RBX: ffff8801a798b800 RCX: 0000000000000005
[ 1023.235510] RDX: 00000000fffffffb RSI: 000000000001af60 RDI: ffffea00069b1d40
[ 1023.235547] RBP: ffff8801a798f800 R08: ffffffffa00bc092 R09: 0000000000000000
[ 1023.235584] R10: ffff8801a798f800 R11: 0000000000000000 R12: 0000000000000002
[ 1023.235621] R13: ffff8801a7989400 R14: 000000000008c9bb R15: ffff8801a772f718
[ 1023.235659] FS:  00007fee836557e0(0000) GS:ffff8801afc40000(0000) knlGS:0000000000000000
[ 1023.235699] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1023.235735] CR2: 00007fee836a4000 CR3: 00000001a7f41000 CR4: 00000000000006e0
[ 1023.235772] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1023.235809] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 1023.235846] Process mount (pid: 4829, threadinfo ffff8801a7606000, task ffff8801a7f521c0)
[ 1023.235886] Stack:
[ 1023.235920]  0000000000000002 000000000008c9bb ffff8801a772f000 ffff8801a798f800
[ 1023.236080]  ffff8801a772d000 ffffffffa00c4436 0000000000000003 ffffffff817b16e0
[ 1023.236240]  0000000000001000 00001000811a57f0 ffff8801a7989680 ffff8801a798b800
[ 1023.236400] Call Trace:
[ 1023.236448]  [<ffffffffa00c4436>] ? open_ctree+0xf6c/0x1535 [btrfs]
[ 1023.236489]  [<ffffffff810ff823>] ? sget+0x39a/0x3ac
[ 1023.236501]  [<ffffffffa00a9fb5>] ? btrfs_mount+0x3a2/0x539 [btrfs]
[ 1023.236501]  [<ffffffff810d54ed>] ? pcpu_next_pop+0x37/0x43
[ 1023.236501]  [<ffffffff810d50f3>] ? cpumask_next+0x18/0x1a
[ 1023.236501]  [<ffffffff810d6502>] ? pcpu_alloc+0x875/0x8be
[ 1023.236501]  [<ffffffff810ff3ab>] ? mount_fs+0x6c/0x14a
[ 1023.236501]  [<ffffffff81113715>] ? vfs_kern_mount+0x61/0x97
[ 1023.236501]  [<ffffffff81114a2c>] ? do_kern_mount+0x49/0xd6
[ 1023.236501]  [<ffffffff811151e1>] ? do_mount+0x728/0x792
[ 1023.236501]  [<ffffffff810ee478>] ? alloc_pages_current+0xa7/0xc9
[ 1023.236501]  [<ffffffff811152d3>] ? sys_mount+0x88/0xc3
[ 1023.236501]  [<ffffffff81341152>] ? system_call_fastpath+0x16/0x1b
[ 1023.236501] Code: 24 24 e8 23 f5 ff ff 48 8d 53 20 48 8d 8b 0f 01 00 00 4c 89 e6 48 89 ef e8 fc b4 ff ff 89 c2 b8 fe ff ff ff 83 fa 00 7f 79 74 04 <0f> 0b eb fe 80 bb 0e 01 00 00 00 48 8b ab c0 00 00 00 75 08 8b
[ 1023.236501] RIP  [<ffffffffa00c0a9a>] find_and_setup_root+0x5c/0xdc [btrfs]
[ 1023.236501]  RSP <ffff8801a7607b98>
[ 1023.239093] ---[ end trace 5cc1f71c489542ef ]---

> btrfsck is not ready for data recovery, but only for error checking.
> But btrfs-tools does have some features that may help us, e.g.
> zero-log.
>
> For more recovery details, refer to this thread from Hugo:
> http://www.spinics.net/lists/linux-btrfs/msg14890.html

I'll take a look at the 'restore' tool, thanks.

-- 
Ryan C. Underwood, <nemesis@icequake.net>
On Tue, Feb 07, 2012 at 08:36:15AM -0600, Mitch Harder wrote:
> Since you're getting "failed to read /dev/sr0" messages, that might be
> an indication there are some newer btrfs-progs tools available.
>
> You might want to try building btrfs-progs from the git repository:
> http://git.kernel.org/?p=linux/kernel/git/mason/btrfs-progs.git;a=summary

I did so; here's the new output (not much changed):

# /usr/local/btrfs-progs/bin/btrfs-show
**
** WARNING: this program is considered deprecated
** Please consider to switch to the btrfs utility
**
failed to read /dev/sr0: No medium found
Label: vicep-library  uuid: 89b14d35-b31a-4fbe-a2d9-cb83cbcd3851
        Total devices 1 FS bytes used 254.35GB
        devid    1 size 1.00TB used 299.04GB path /dev/dm-32

Btrfs Btrfs v0.19

# /usr/local/btrfs-progs/bin/btrfs device scan
Scanning for Btrfs filesystems
failed to read /dev/sr0

> There are some recovery tools there that may extract your data (look
> at the "recover" program).

I found a 'restore' program; or are you referring to the mount option
'-o recovery'?

-- 
Ryan C. Underwood, <nemesis@icequake.net>
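Those are two separate mechanisms, for what it's worth: 'restore' is
the offline extractor in btrfs-progs, while '-o recovery' (added around
kernel 3.2; later kernels renamed it 'usebackuproot') asks the kernel
itself to fall back to an older tree root at mount time. A minimal
sketch:

    # read-only plus recovery keeps the attempt non-destructive
    mount -t btrfs -o ro,recovery /dev/mapper/tr5ut-vicep--library /mnt2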
Output of btrfs-zero-log attempts is similar to btrfsck:

# ./btrfs-zero-log /dev/mapper/tr5ut-vicep--clones
No valid Btrfs found on /dev/mapper/tr5ut-vicep--clones

# ./btrfs-zero-log /dev/mapper/tr5ut-vicep--library
checksum verify failed on 317874630656 wanted 8E19212D found FFFFFFA6
checksum verify failed on 317874630656 wanted 8E19212D found FFFFFFA6
checksum verify failed on 317874630656 wanted 491D9C1A found FFFFFFA6
checksum verify failed on 317874630656 wanted 8E19212D found FFFFFFA6
Csum didn't match
btrfs-zero-log: root-tree.c:46: btrfs_find_last_root: Assertion `!(path->slots[0] == 0)' failed.
Aborted

-- 
Ryan C. Underwood, <nemesis@icequake.net>
Output of 'restore':

# /usr/local/btrfs-progs/bin/restore -v /dev/mapper/tr5ut-vicep--clones /mnt2
No valid Btrfs found on /dev/mapper/tr5ut-vicep--clones
Could not open root, trying backup super
Check tree block failed, want=298807296, have=13791616683601169802
Check tree block failed, want=298807296, have=13791616683601169802
Check tree block failed, want=298807296, have=3150973834573588028
Check tree block failed, want=298807296, have=13791616683601169802
Check tree block failed, want=298807296, have=13791616683601169802
read block failed check_tree_block
Couldn't read tree root
Could not open root, trying backup super
Check tree block failed, want=298807296, have=13791616683601169802
Check tree block failed, want=298807296, have=13791616683601169802
Check tree block failed, want=298807296, have=3150973834573588028
Check tree block failed, want=298807296, have=13791616683601169802
Check tree block failed, want=298807296, have=13791616683601169802
read block failed check_tree_block
Couldn't read tree root
Could not open root, trying backup super
[...here the output ends, seems not to complete?]

# /usr/local/btrfs-progs/bin/restore -v /dev/mapper/tr5ut-vicep--library /mnt2
checksum verify failed on 317874630656 wanted 8E19212D found FFFFFFA6
checksum verify failed on 317874630656 wanted 8E19212D found FFFFFFA6
checksum verify failed on 317874630656 wanted 491D9C1A found FFFFFFA6
checksum verify failed on 317874630656 wanted 8E19212D found FFFFFFA6
Csum didn't match
restore: root-tree.c:46: btrfs_find_last_root: Assertion `!(path->slots[0] == 0)' failed.
Aborted

-- 
Ryan C. Underwood, <nemesis@icequake.net>
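The "trying backup super" lines suggest restore already knows about the
superblock mirror copies. If one of those copies is intact, btrfs-progs
also ships btrfs-select-super, which copies a backup superblock over
the damaged primary; a sketch, with the caveat that it writes to the
device, so run it against an image copy if at all possible:

    # -s picks which superblock mirror to promote over the primary;
    # mirror 1 (the copy at 64 MiB) is assumed here for illustration
    ./btrfs-select-super -s 1 /dev/mapper/tr5ut-vicep--clones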
On Tuesday 07 February 2012 20:53:59 Duncan wrote:
> Kernel 3.2.2 is relatively recent, although you could try the latest
> 3.3 rc or a git kernel as well

Please keep in mind that work done in git does not appear to get
backported to the stable updates for releases (such as 3.2.x); in other
words, you'll have the same btrfs code as in the first 3.2 release. You
will need to use RCs (or git) for the current btrfs kernel code.

> Particularly when it's used on top of a dmcrypt layer that btrfs was
> known to have issues with

I believe the issues between btrfs and dm-crypt have been sorted out as
of 3.2 (going on an earlier posting of Chris Mason's).

Returning to the OP's case, I'm surprised that ext4 is able to get
anything back, and I'd say that's a testament to its long development
life (ext -> ext2 -> ext3 -> ext4) in comparison to btrfs. If that
happened on a system I was sysadmining (and it has -- losing an entire
tray of drives in a RAID array due to controller firmware bugs really
spoils your day) I'd be reaching for the backup tapes about now.

Best of luck!
Chris
-- 
Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC

This email may come with a PGP signature as a file. Do not panic.
For more info see: http://en.wikipedia.org/wiki/OpenPGP
So, I examined the filesystem below, the one of the two that I would
really like to restore. There is basically nothing but zeros, and very
occasionally a sparse string of data, until exactly offset 0x200000, at
which point the data is suddenly very packed and looks like usual
compressed data should.

Is there a way one could de-LZO the data chunkwise and dump it to
another device, so I could at least get an idea of what I am looking
at? What about a 'superblock' signature I can scan for?

> # /usr/local/btrfs-progs/bin/restore -v /dev/mapper/tr5ut-vicep--library /mnt2
> checksum verify failed on 317874630656 wanted 8E19212D found FFFFFFA6
> [snip]
> Csum didn't match
> restore: root-tree.c:46: btrfs_find_last_root: Assertion
> `!(path->slots[0] == 0)' failed.
> Aborted

-- 
Ryan C. Underwood, <nemesis@icequake.net>
On Sun, Feb 12, 2012 at 10:31:34AM -0600, Ryan C. Underwood wrote:
> So, I examined the filesystem below, the one of the two that I would
> really like to restore. There is basically nothing but zeros, and
> very occasionally a sparse string of data, until exactly offset
> 0x200000,

This matches the start of an allocation cluster.

> ... at which point the data is suddenly very packed and looks like
> usual compressed data should. Is there a way one could de-LZO the
> data chunkwise and dump it to another device, so I could at least get
> an idea of what I am looking at?

If the blocks are in the right order, you can decompress the raw data
from the format

  [4B total length] [4B compressed chunk length][chunk data] [another chunk]

There is no signature marking the compressed extent boundaries, but the
stored lengths are always smaller than 128K, so it's hex values like

  23 04 00 00 | 34 01 00 00 | <lzo data...>

and should be detectable in the block sequence.

> What about a 'superblock' signature I can scan for?

_BHRfS_M at offset 0x40 in a 4kb aligned block

david
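A minimal sketch of acting on both hints with standard tools; the
mirror offsets used below (64 KiB, 64 MiB, 256 GiB) are the standard
btrfs superblock copy locations rather than anything stated above, and
host byte order is assumed little-endian:

    dev=/dev/mapper/tr5ut-vicep--clones

    # 1) Check the three standard superblock copies directly; the
    #    magic "_BHRfS_M" sits 0x40 bytes into each copy.
    for off in $((64*1024)) $((64*1024*1024)) $((256*1024*1024*1024)); do
        magic=$(dd if="$dev" bs=1 skip=$((off + 64)) count=8 2>/dev/null)
        echo "offset $off: '$magic'"
    done

    # 2) Brute-force scan for stray copies at any 4 KiB-aligned block;
    #    slow, and grep can be memory-hungry on long newline-free runs.
    grep -aob _BHRfS_M "$dev" | while IFS=: read -r pos _; do
        start=$((pos - 64))
        [ $((start % 4096)) -eq 0 ] && echo "candidate super at byte $start"
    done

    # 3) Peek at the two little-endian u32 headers of a suspected
    #    compressed extent, e.g. at the 0x200000 boundary noted above;
    #    both values should decode to less than 131072.
    dd if="$dev" bs=4096 skip=512 count=1 2>/dev/null | od -An -tu4 -N8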
I made a little bit of progress recovering this mess; it seems
btrfs-progs has improved since I last tried.

# ./btrfs-find-root /dev/mapper/tr5ut-vicep--library
[..]
Well block 317865713664 seems great, but generation doesn't match, have=574372, want=575931
Well block 317874491392 seems great, but generation doesn't match, have=575930, want=575931
Found tree root at 317874626560

It seems like a good sign that btrfs-find-root was able to find the
root. But I'm still stuck on this when trying to run btrfs-restore:

# ./btrfs-restore -v -i -u 1 -t 317874626560 /dev/mapper/tr5ut-vicep--library .
checksum verify failed on 317874630656 wanted 8E19212D found FFFFFFA6
checksum verify failed on 317874630656 wanted 8E19212D found FFFFFFA6
checksum verify failed on 317874630656 wanted 491D9C1A found FFFFFFA6
checksum verify failed on 317874630656 wanted 8E19212D found FFFFFFA6
Csum didn't match
btrfs-restore: disk-io.c:441: find_and_setup_root: Assertion `!(ret)' failed.
Aborted

It seems like -i should ignore the csum mismatch; what am I missing?

-- 
Ryan C. Underwood, <nemesis@icequake.net>
Finally made some more progress on one of my melted-down btrfs
filesystems from earlier this year.

First I hacked find-root.c to not stop scanning the disk when it thinks
it has found the real root; I wanted it to print out all possible
roots. I saved the stderr output to a logfile. About 1226 possible
roots were found.

Then I used btrfs-restore, iterating over each one of these to attempt
to use it as the root and see what files could be found:

for temp in `sed 's/^.*block \([0-9]\+\).*$/\1/' log`; do
    echo $temp
    nice ./btrfs-restore -t $temp /dev/mapper/tr5ut-vicep--library /mnt/recovery
done

In this way I was able to recover about 36GB of data, and the directory
structure of what was recovered looks fine. The data also looks fine,
judging by scanning MIME types with "file" and selecting a few text or
HTML files to check manually.

There is still a lot of data missing, though. If I am reading this
correctly, there was about 300GB of data which compressed to 254GB
on-disk:

Label: 'vicep-library'  uuid: 89b14d35-b31a-4fbe-a2d9-cb83cbcd3851
        Total devices 1 FS bytes used 254.35GB
        devid    1 size 1.00TB used 299.04GB path /dev/dm-27

A lot of my btrfs-restore output looks like this:

318259351552
parent transid verify failed on 318259351552 wanted 575931 found 546662
parent transid verify failed on 318259351552 wanted 575931 found 546662
parent transid verify failed on 318259351552 wanted 575931 found 546662
parent transid verify failed on 318259351552 wanted 575931 found 546662
Ignoring transid failure
parent transid verify failed on 318125375488 wanted 541528 found 572360
parent transid verify failed on 318125375488 wanted 541528 found 572360
parent transid verify failed on 318125375488 wanted 541528 found 572360
parent transid verify failed on 318125375488 wanted 541528 found 572360
Ignoring transid failure
parent transid verify failed on 561016832 wanted 544038 found 574369
parent transid verify failed on 561016832 wanted 544038 found 574369
parent transid verify failed on 561016832 wanted 544038 found 574369
parent transid verify failed on 561016832 wanted 544038 found 574369
Ignoring transid failure
leaf parent key incorrect 561016832
Root objectid is 5
parent transid verify failed on 164073472 wanted 544650 found 562972
parent transid verify failed on 164073472 wanted 544650 found 562972
parent transid verify failed on 164073472 wanted 544650 found 562972
parent transid verify failed on 164073472 wanted 544650 found 562972
Ignoring transid failure
leaf parent key incorrect 164073472
Error searching -1

As far as I can see, only the #5 root object was found; at least I
don't see any others in the output. This could account for the missing
data. How could I get to the other root objects?

-- 
Ryan C. Underwood, <nemesis@icequake.net>
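One refinement on the loop above, since output restored from different
candidate roots into the same directory can silently overwrite earlier
hits: give each candidate its own target directory and log. A sketch,
reusing the same logfile and paths:

    sed 's/^.*block \([0-9]\+\).*$/\1/' log | while read -r root; do
        dest=/mnt/recovery/root-$root
        mkdir -p "$dest"
        nice ./btrfs-restore -t "$root" \
            /dev/mapper/tr5ut-vicep--library "$dest" > "$dest.log" 2>&1
        rmdir "$dest" 2>/dev/null   # removes the dir only if it stayed empty
    done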
On 15 Nov 2012 10:00 -0600, from nemesis-lists@icequake.net (Ryan C. Underwood):
> There is still a lot of data missing, though. If I am reading this
> correctly, there was about 300GB of data which compressed to 254GB
> on-disk:
>
> Label: 'vicep-library'  uuid: 89b14d35-b31a-4fbe-a2d9-cb83cbcd3851
>         Total devices 1 FS bytes used 254.35GB
>         devid    1 size 1.00TB used 299.04GB path /dev/dm-27

Isn't that the other way around? The file system has allocated a total
of 299 GB across devid 1 (to system, metadata and data), and of those,
254 GB are in actual use currently. Remember btrfs "overallocates" so
it doesn't have to constantly allocate more space for each separate
kind of usage as data is added to the file system.

That does not radically alter your conclusion, though. 36 GB is still
only a small fraction of the total amount of data stored, assuming that
the btrfs tool output you showed can be trusted.

-- 
Michael Kjörling • http://michael.kjorling.se • michael@kjorling.se
“People who think they know everything really annoy
those of us who know we don’t.” (Bjarne Stroustrup)
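On a healthy, mountable btrfs the allocated-versus-used distinction is
visible directly; for illustration only, since this volume won't mount
(the mount point here is hypothetical):

    btrfs filesystem show /dev/dm-27   # per-device "used" = space allocated to chunks
    btrfs filesystem df /mnt/media     # mounted-fs breakdown: data/metadata/system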