I have a 4-disk RAID1 setup that fails to {mount,btrfsck} when disk 4 is connected.

With disk 4 attached, btrfsck errors with:

btrfsck: root-tree.c:46: btrfs_find_last_root: Assertion `!(path->slots[0] == 0)' failed

(I'd have to reboot into a non-functioning state to get the full output.)

I can mount the filesystem in a degraded state with the 4th drive removed. I believe there is some data corruption, as I see lines like this in /var/log/messages from the degraded,ro filesystem:

BTRFS info (device sdd1): csum failed ino 4433 off 3254538240 csum 1033749897 private 2248083221

I'm at the point where all I can think to do is wipe disk 4 and then add it back in. Is there anything else I should try first? I have booted btrfs-next with the latest btrfs-progs.

Thanks.

-- 
Sandy McArthur
"He who dares not offend cannot be honest." - Thomas Paine
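A rough sketch of the degraded mount and diagnostics described above; /dev/sdd1 comes from the csum log line, while /mnt/pool is a placeholder mount point rather than the poster's actual one:

# Mount the remaining three devices read-only in degraded mode
# (disk 4 disconnected).
mount -o degraded,ro /dev/sdd1 /mnt/pool

# Show which devices btrfs thinks belong to the filesystem and
# which one is reported missing.
btrfs filesystem show

# Watch for csum and mount errors as the filesystem is accessed.
dmesg | grep -i btrfs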
Sandy McArthur <sandymac@gmail.com> schrieb:

> I have a 4 disk RAID1 setup that fails to {mount,btrfsck} when disk 4
> is connected.
>
> With disk 4 attached btrfsck errors with:
> btrfsck: root-tree.c:46: btrfs_find_last_root: Assertion
> `!(path->slots[0] == 0)' failed
> (I'd have to reboot in a non-functioning state to get the full output.)
>
> I can mount the filesystem in a degraded state with the 4th drive
> removed. I believe there is some data corruption as I see lines in
> /var/log/messages from the degraded,ro filesystem like this:
>
> BTRFS info (device sdd1): csum failed ino 4433 off 3254538240 csum
> 1033749897 private 2248083221
>
> I'm at the point where all I can think to do is wipe disk 4 and then
> add it back in. Is there anything else I should try first? I have
> booted btrfs-next with the latest btrfs-progs.

It is a RAID-1 so why bother with the faulty drive? Just wipe it, put it
back in, then run a btrfs balance... There should be no data loss
because all data is stored twice (two-way mirroring).

Regards,
Kai
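A sketch of the wipe-and-rebalance sequence Kai describes, assuming the filesystem is mounted read-write in degraded mode at /mnt/pool and the returning disk is /dev/sde1; both names are placeholders, and whether the "delete missing" step is needed depends on how the filesystem still records the old device:

# Blow away the btrfs signature on the problem disk so it is no
# longer recognized as part of the old filesystem.
wipefs -a /dev/sde1

# Add it back to the degraded-mounted filesystem as a fresh device.
btrfs device add /dev/sde1 /mnt/pool

# Drop the record of the old, now-missing copy of that device.
btrfs device delete missing /mnt/pool

# Rebalance so the RAID1 copies are re-created across the new device.
btrfs balance start /mnt/pool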
Kai Krakow posted on Sun, 04 Aug 2013 14:41:54 +0200 as excerpted:

> It is a RAID-1 so why bother with the faulty drive? Just wipe it, put it
> back in, then run a btrfs balance... There should be no data loss
> because all data is stored twice (two-way mirroring).

The caveat would be if it didn't start as btrfs raid1, and there's still
some data (or possibly metadata, if it was a single drive at one point or
they're SSDs, as btrfs defaults to single metadata in ssd mode) that
hasn't been duplicated elsewhere.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman
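One way to check for Duncan's caveat before wiping anything is to look at the block group profiles; a minimal sketch, with /mnt/pool again standing in for the real mount point:

# Show how data and metadata chunks are allocated.  Everything should
# report RAID1; any "single" lines mean some chunks exist in only one
# copy and would be lost along with the drive they live on.
btrfs filesystem df /mnt/pool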
Duncan <1i5t5.duncan@cox.net> schrieb:

>> It is a RAID-1 so why bother with the faulty drive? Just wipe it, put it
>> back in, then run a btrfs balance... There should be no data loss
>> because all data is stored twice (two-way mirroring).
>
> The caveat would be if it didn't start as btrfs raid1, and there's still
> some data (or possibly metadata if it was the single drive at one point
> or they're ssds, as btrfs defaults to metadata single in ssd mode) that
> hasn't been duped elsewhere.

Oh... That's actually a pitfall... :-\

Note to myself: Ensure balance has been run successfully and completely.
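A sketch of how one might verify that, assuming a mounted filesystem at /mnt/pool and a kernel/progs recent enough for the balance filters; the convert filters are only needed if some chunks are not already raid1:

# Check whether a balance is still running, paused, or finished.
btrfs balance status /mnt/pool

# If any chunks are still in the "single" profile, convert them so
# both data and metadata end up mirrored.
btrfs balance start -dconvert=raid1 -mconvert=raid1 /mnt/pool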
On Aug 4, 2013, at 4:19 PM, Duncan <1i5t5.duncan@cox.net> wrote:

> Kai Krakow posted on Sun, 04 Aug 2013 14:41:54 +0200 as excerpted:
>
>> It is a RAID-1 so why bother with the faulty drive? Just wipe it, put it
>> back in, then run a btrfs balance... There should be no data loss
>> because all data is stored twice (two-way mirroring).
>
> The caveat would be if it didn't start as btrfs raid1, and there's still
> some data (or possibly metadata if it was the single drive at one point
> or they're ssds, as btrfs defaults to metadata single in ssd mode) that
> hasn't been duped elsewhere.

I agree. I think tossing the data on the problematic device is a bit of a
hammer. It may be necessary, but I don't think enough information has been
provided to conclusively determine all other options have been explored.

What kernel versions have been used?
What does dmesg record beginning at the time of a normal mount attempt with
all four devices available?
What does btrfsck (without repair) report?
Are there any prior kernel messages related to the controller or libata
messages related to the suspect drive?
What's the smartctl -x output for the suspect drive?
Has mounting with -o recovery been attempted, and if so what were the
messages recorded?

Chris Murphy
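The information Chris asks for might be gathered roughly like this; /dev/sde and /dev/sde1 stand in for the suspect drive, /dev/sdd1 and /mnt/pool for a known-good member and the mount point:

uname -r                        # kernel version in use
dmesg | grep -iE 'btrfs|ata'    # mount-time btrfs errors plus any libata/controller messages
btrfsck /dev/sde1               # read-only check of the filesystem; no repair options
smartctl -x /dev/sde            # extended SMART and device statistics for the suspect drive
mount -o recovery /dev/sdd1 /mnt/pool   # try older tree roots; then check dmesg for what happened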
FYI: I ended up wipefs'ing the drive and adding it back in. I also had to
abort the residual balance process to get the filesystem back to a state
where I could add the disk. I didn't realize this until after wiping the
drive; maybe if I'd known to look for that, I could have recovered the
drive before the wipe. Anyway, all seems fine now, and I'm no longer mixing
and matching connection types.

More history: The filesystem came to a failed state during a balance just
after adding the problem disk. That disk had also been installed inside the
case on SATA instead of inside an external multi-drive enclosure; my thought
at the time (now known to be semi-faulty) was that it would be faster to
push data onto the disk that way. When the machine hard-locked, this one
drive was different enough from the other three that I simply could not get
btrfs to work with all four disks at once.

On Sun, Aug 4, 2013 at 7:05 PM, Kai Krakow <hurikhan77+btrfs@gmail.com> wrote:
> Duncan <1i5t5.duncan@cox.net> schrieb:
>
>>> It is a RAID-1 so why bother with the faulty drive? Just wipe it, put it
>>> back in, then run a btrfs balance... There should be no data loss
>>> because all data is stored twice (two-way mirroring).
>>
>> The caveat would be if it didn't start as btrfs raid1, and there's still
>> some data (or possibly metadata if it was the single drive at one point
>> or they're ssds, as btrfs defaults to metadata single in ssd mode) that
>> hasn't been duped elsewhere.
>
> Oh... That's actually a pitfall... :-\
>
> Note to myself: Ensure balance has been run successfully and completely.

-- 
Sandy McArthur
"He who dares not offend cannot be honest." - Thomas Paine
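For anyone hitting the same "residual balance" problem, the interrupted balance can usually be dealt with before resorting to a wipe; a sketch with the same placeholder names, assuming a kernel new enough to have the skip_balance mount option:

# Keep the interrupted balance from resuming automatically at mount time.
mount -o degraded,skip_balance /dev/sdd1 /mnt/pool

# Then cancel the leftover balance so device add/delete operations
# can proceed again.
btrfs balance cancel /mnt/pool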