Rob Levy
2010-May-20 13:52 UTC
[zfs-discuss] reconstruct recovery of rpool zpool and zfs file system with bad sectors
Folks, I posted this question on (OpenSolaris - Help) without any replies http://opensolaris.org/jive/thread.jspa?threadID=129436&tstart=0 and am re-posting here in the hope someone can help. I have also updated the wording a little (in an attempt to clarify).

I currently use OpenSolaris on a Toshiba M10 laptop.

One morning the system wouldn't boot OpenSolaris 2009.06 (it was simply unable to progress to the second-stage grub). On further investigation I discovered the hdd partition slice holding rpool appeared to have bad sectors. Faced with either a rebuild or an attempt at recovery, I first made an attempt to recover the slice before rebuilding.

The c7t0d0 HDD (p0) was divided into p1 (NTFS, 24GB), p2 (OpenSolaris, 24GB), p3 (OpenSolaris ZFS pool for data, 160GB) and p4 (50GB extended, with 32GB pcfs, 12GB Linux and Linux swap) partitions (or something close to that). On the first Solaris partition (p2), slice 0 held the OpenSolaris rpool zpool.

To attempt recovery I booted the OpenSolaris 2009.06 live CD and was able to import the ZFS pool configured on p3. Against the p2 device (the Solaris boot partition which wouldn't boot) I then ran 'dd if=/dev/rdsk/c7t0d0s2 of=/p0/s2image.dd bs=512 conv=noerror,sync'. Due to sector read error timeouts, this took longer than my maintenance window allowed and I ended up aborting the attempt with a significant number of sectors already captured.

On block examination of the (so far) captured image, I noticed the first two s0 vdev labels appeared to be intact. I then skipped the expected number of s2 sectors to get to the start of s0 and copied blocks to attempt to reconstruct the s0 rpool; running zdb -l against this reported the first two labels, which gave me the encouragement necessary to continue the exercise. At the next opportunity I ran the command again, using the skip directive to capture the balance of the slice.

The result was that I had two files (images) comprising the good c7t0d0s0 sectors (with, I expect, the bad ones padded), i.e. an s0image_start.dd and an s0image_end.dd. As mentioned, at this stage I was able to run 'zdb -l s0image_start.dd' and see the first two vdev labels, and 'zdb -l s0image_end.dd' and see the last two vdev labels.

I then combined the two files (I tried various approaches, e.g. cat and dd with the append directive), however only the first two vdev labels appear to be readable in the resulting s0image_s0.dd. The resulting file size, which I expect is largely good sectors with padding for the bad ones, matches the prtvtoc s0 sector count multiplied by 512.

Can anyone advise why I am unable to read the third and fourth vdev labels once the start and end files are combined? Is there another approach that may prove more fruitful?

Once I have the file (with the labels in the correct places) I was intending to attempt to import the vdev zpool as rpool2, or attempt any repair procedures I could locate (as far as was possible anyway), to see what data could be recovered (besides, it was an opportunity to get another close look at ZFS). Incidentally, *only* the c7t0d0s0 slice appeared to have bad sectors (I do wonder what the significance of this is).
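For reference, the sequence was roughly along these lines ($S0_OFFSET, $S0_SECTORS and $DONE are placeholders here; the real numbers came from the prtvtoc output for the slice and from how far the first pass got):

    # initial capture of the whole Solaris partition (s2); aborted part way
    dd if=/dev/rdsk/c7t0d0s2 of=/p0/s2image.dd bs=512 conv=noerror,sync

    # carve the start of s0 out of what was captured (s0 begins $S0_OFFSET
    # 512-byte sectors into s2, per prtvtoc)
    dd if=/p0/s2image.dd of=s0image_start.dd bs=512 skip=$S0_OFFSET

    # next window: capture the balance of the slice directly from the disk,
    # skipping the $DONE sectors already in hand
    dd if=/dev/rdsk/c7t0d0s2 of=s0image_end.dd bs=512 \
       skip=$(( S0_OFFSET + DONE )) count=$(( S0_SECTORS - DONE )) \
       conv=noerror,sync

    # each piece shows two of the four vdev labels
    zdb -l s0image_start.dd    # labels 0 and 1
    zdb -l s0image_end.dd      # labels 2 and 3

    # combining the pieces; only labels 0 and 1 are reported afterwards
    cat s0image_start.dd s0image_end.dd > s0image_s0.dd
    zdb -l s0image_s0.dd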
Roy Sigurd Karlsbakk
2010-May-20 14:59 UTC
[zfs-discuss] reconstruct recovery of rpool zpool and zfs file system with bad sectors
----- "Rob Levy" <Rob.Levy at oracle.com> skrev:> Folks I posted this question on (OpenSolaris - Help) without any > replies > http://opensolaris.org/jive/thread.jspa?threadID=129436&tstart=0 and > am re-posting here in the hope someone can help ... I have updated the > wording a little too (in an attempt to clarify) > > I currently use OpenSolaris on a Toshiba M10 laptop. > > One morning the system wouldn''t boot OpenSolaris 2009.06 (it was > simply unable progress to the second stage grub). On further > investigation I discovered the hdd partition slice with rpool appeared > to have bad sectors.I would recommend against debugging a filesystem like you have described here. If you have bad sectors on a drive, get a new drive, connect the other drive (directly or with an USB dock or something), import the pools, move the data (rsync if you just want the data or zfs send/receive if you also want the snapshots etc). This might take a while with bad sectors and disk timeouts, but you''ll get (most of?) your data moved over without much hassle. Just my two c. Vennlige hilsener / Best regards roy -- Roy Sigurd Karlsbakk (+47) 97542685 roy at karlsbakk.net http://blogg.karlsbakk.net/ -- I all pedagogikk er det essensielt at pensum presenteres intelligibelt. Det er et element?rt imperativ for alle pedagoger ? unng? eksessiv anvendelse av idiomer med fremmed opprinnelse. I de fleste tilfeller eksisterer adekvate og relevante synonymer p? norsk.
Rob Levy
2010-May-25 13:26 UTC
[zfs-discuss] reconstruct recovery of rpool zpool and zfs file system with bad sectors
Roy,

Thanks for your reply. I did get a new drive and attempted that approach (as you suggest, though before your reply), however once booted off the OpenSolaris Live CD (or the rebuilt new drive), I was not able to import the rpool (which I had established had sector errors). I expect I would have had some success if the vdev labels had been intact (I currently suspect some critical boot files are impacted by bad sectors, resulting in the failed boot attempts from that partition slice). Unfortunately, I didn't keep a copy of the messages (if any - I have tried many permutations since).

At my last attempt I installed Knoppix (Debian) on one of the partitions (which also gave me access to smartctl and hdparm; I was hoping to reduce the read timeout to speed up the exercise), then added zfs-fuse (to access the space I will use to stage the recovery file) and the dd_rescue and GNU ddrescue packages. smartctl appears unable to manage the disk while it is attached via USB (though I am guessing, because I don't have much experience with it).

At this point I attempted dd_rescue to create an image of the partition with bad sectors (hoping there were efficiencies beyond normal dd), but it had only reached 5.6GB after 36 hours, so again I needed to abort. It does, however, log the blocks attempted so far, so hopefully I can skip past them when I next get an opportunity. It now appears that GNU ddrescue is the preferred of the two utilities, so I may opt to use it to create an image of the partition before attempting recovery of the slice (rpool).

As an aside, I noticed that the Knoppix 'dmesg | grep sd' output, which shows the primary partition devices, no longer appears to show the Solaris partition (p2) slice devices (whereas it does show the logical partition devices configured within the extended p4 partition). I suspect that, because of this, the rpool (on one of the Solaris partition slices) is not detected by the Knoppix zfs-fuse 'zpool import' (although I can access the zpool which exists on partition p3). I wonder if this is related to the transition from ufs to zfs?
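In case it is useful to anyone following along, the GNU ddrescue plan for the next maintenance window looks roughly like this (the device name, the staging paths and the $S0_OFFSET/$S0_SECTORS values are examples and placeholders only):

    # first pass: copy the easy sectors and keep a log (map) file so the
    # run can be stopped and resumed across maintenance windows
    ddrescue -n /dev/sda2 /staging/p2image.dd /staging/p2image.log

    # later passes: retry only the areas the log has marked as bad
    ddrescue -r3 /dev/sda2 /staging/p2image.dd /staging/p2image.log

    # carve slice 0 (the rpool) out of the partition image using the VTOC
    # offsets, then ask zfs-fuse to search that directory for pools
    dd if=/staging/p2image.dd of=/staging/s0image.dd bs=512 \
       skip=$S0_OFFSET count=$S0_SECTORS
    zpool import -d /staging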