Hi all, I'm trying to simulate a drive failure and recovery on a raidz
array. I'm able to do so using 'replace', but this requires an extra disk
that was not part of the array. How do you manage when you don't have, or
don't yet need, an extra disk? For example, when I 'dd if=/dev/zero
of=/dev/ad6', or physically remove the drive for a while, then 'online'
the disk, after it resilvers I'm typically left with the following after
scrubbing:

root@file:~# zpool status
  pool: pool
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: scrub completed after 0h0m with 0 errors on Fri Dec 10 23:45:56 2010
config:

        NAME        STATE     READ WRITE CKSUM
        pool        ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            ad12    ONLINE       0     0     0
            ad13    ONLINE       0     0     0
            ad4     ONLINE       0     0     0
            ad6     ONLINE       0     0     7

errors: No known data errors

http://www.sun.com/msg/ZFS-8000-9P lists my above actions as a cause for
this state and rightly doesn't consider them serious. When I 'clear' the
errors, though, and then offline/fault another drive and reboot, the array
faults. That tells me ad6 was never fully integrated back in. Can I tell
the array to re-add ad6 from scratch? 'detach' and 'remove' don't work for
raidz. Otherwise I need to use 'replace' to get out of this situation.

My system:

root@file:~# uname -a
FreeBSD file 8.2-PRERELEASE FreeBSD 8.2-PRERELEASE #0: Sun Nov 28 13:36:08
SAST 2010     root@file:/usr/obj/usr/src/sys/COWNEL  amd64
root@file:~# dmesg | grep ZFS
ZFS filesystem version 4
ZFS storage pool version 15
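For reference, the failure simulation described above corresponds roughly
to this sequence (a sketch only; the explicit 'offline' step is an
assumption, since the post doesn't say whether the disk was taken out of
service before being overwritten):

    zpool offline pool ad6              # take the member out of service (assumed)
    dd if=/dev/zero of=/dev/ad6 bs=1m   # simulate losing the drive's contents
    zpool online pool ad6               # bring it back; a resilver starts
    zpool scrub pool                    # then verify the whole pool
    zpool status -v pool                # check the CKSUM column afterwards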
zfs at lordcow.org said:
> For example, when I 'dd if=/dev/zero of=/dev/ad6', or physically remove the
> drive for a while, then 'online' the disk, after it resilvers I'm typically
> left with the following after scrubbing:
>
> root@file:~# zpool status
>   pool: pool
>  state: ONLINE
> status: One or more devices has experienced an unrecoverable error.  An
>         attempt was made to correct the error.  Applications are unaffected.
> action: Determine if the device needs to be replaced, and clear the errors
>         using 'zpool clear' or replace the device with 'zpool replace'.
>    see: http://www.sun.com/msg/ZFS-8000-9P
>  scrub: scrub completed after 0h0m with 0 errors on Fri Dec 10 23:45:56 2010
> config:
>
>         NAME        STATE     READ WRITE CKSUM
>         pool        ONLINE       0     0     0
>           raidz1    ONLINE       0     0     0
>             ad12    ONLINE       0     0     0
>             ad13    ONLINE       0     0     0
>             ad4     ONLINE       0     0     0
>             ad6     ONLINE       0     0     7
>
> errors: No known data errors
>
> http://www.sun.com/msg/ZFS-8000-9P lists my above actions as a cause for
> this state and rightly doesn't consider them serious. When I 'clear' the
> errors, though, and then offline/fault another drive and reboot, the array
> faults. That tells me ad6 was never fully integrated back in. Can I tell
> the array to re-add ad6 from scratch? 'detach' and 'remove' don't work for
> raidz. Otherwise I need to use 'replace' to get out of this situation.

After you "clear" the errors, do another "scrub" before trying anything
else.  Once you get a complete scrub with no new errors (and no checksum
errors), you should be confident that the damaged drive has been fully
re-integrated into the pool.

Regards,

Marion
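The sequence Marion describes would look roughly like this, using the pool
name from the original post (a sketch, not verified on that system):

    zpool clear pool        # reset the READ/WRITE/CKSUM error counters
    zpool scrub pool        # re-read and verify every block in the pool
    zpool status -v pool    # wait for the scrub to finish; counters should stay 0

If the counters stay at zero after a completed scrub, every block on ad6
has been read back and checked against its checksum.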
Hello everyone,

I have a pool consisting of 28 1TB SATA disks configured as 15*2 vdevs,
raid1 (2 disks per mirror), 2 SSDs in mirror for the ZIL and 3 SSDs for
L2ARC, and recently I added two more disks.
For some reason the resilver process kicked in, and the system is
noticeably slower, but I'm clueless about what I should do, because zpool
status says that the resilver process has finished.

This system is running OpenSolaris snv_134, has 32GB of memory, and here's
the zpool output:

zpool status -xv vol0
  pool: vol0
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress for 13h24m, 100.00% done, 0h0m to go
config:

        [... output snipped ...]

          mirror-12                ONLINE       0     0     0
            c8t5000C5001A11A4AEd0  ONLINE       0     0     0
            c8t5000C5001A10CFB7d0  ONLINE       0     0     0  1.71G resilvered
          mirror-13                ONLINE       0     0     0
            c8t5000C5001A0F621Dd0  ONLINE       0     0     0
            c8t5000C50019EB3E2Ed0  ONLINE       0     0     0
          mirror-14                ONLINE       0     0     0
            c8t5000C5001A0F543Dd0  ONLINE       0     0     0
            c8t5000C5001A105D8Cd0  ONLINE       0     0     0
          mirror-15                ONLINE       0     0     0
            c8t5000C5001A0FEB16d0  ONLINE       0     0     0
            c8t5000C50019C1D460d0  ONLINE       0     0     0  4.06G resilvered

Any idea for this type of situation?

Thanks,
Bruno
On Tue, Dec 14, 2010 at 6:34 AM, Bruno Sousa <bsousa at epinfante.com> wrote:
> Hello everyone,
>
> I have a pool consisting of 28 1TB SATA disks configured as 15*2 vdevs,
> raid1 (2 disks per mirror), 2 SSDs in mirror for the ZIL and 3 SSDs for
> L2ARC, and recently I added two more disks.
> For some reason the resilver process kicked in, and the system is
> noticeably slower, but I'm clueless about what I should do, because zpool
> status says that the resilver process has finished.
>
> [...]
>
> Any idea for this type of situation?

http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6899970

-- 
Giovanni Tirloni
gtirloni at sysdroid.com
On Dec 14, 2010, at 1:58 AM, Giovanni Tirloni wrote:
> On Tue, Dec 14, 2010 at 6:34 AM, Bruno Sousa <bsousa at epinfante.com> wrote:
> > Hello everyone,
> >
> > I have a pool consisting of 28 1TB SATA disks configured as 15*2 vdevs,
> > raid1 (2 disks per mirror), 2 SSDs in mirror for the ZIL and 3 SSDs for
> > L2ARC, and recently I added two more disks.
> > For some reason the resilver process kicked in, and the system is
> > noticeably slower, but I'm clueless about what I should do, because zpool
> > status says that the resilver process has finished.
> >
> > [...]
> >
> > Any idea for this type of situation?
>
> http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6899970

If you have snapshot deletion/creation ongoing, then see 6981250/6891824.
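A quick way to check whether ongoing snapshot activity applies here (a
sketch; it assumes the stock OpenSolaris auto-snapshot/time-slider
services, which may not be what this system uses):

    zfs list -t snapshot -o name,creation | tail   # any snapshots created very recently?
    svcs -a | grep auto-snapshot                   # are auto-snapshot services online?

If snapshots are being created or destroyed continuously, that is the
situation the follow-up above refers to.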
On Mon 2010-12-13 (16:41), Marion Hakanson wrote:
> After you "clear" the errors, do another "scrub" before trying anything
> else.  Once you get a complete scrub with no new errors (and no checksum
> errors), you should be confident that the damaged drive has been fully
> re-integrated into the pool.

OK, I did a scrub after zeroing the errors with 'clear', and the array came
back clean, apparently, but same final result - the array faults as soon as
I 'offline' a different vdev. The 'clear' is just a
pretend-the-errors-aren't-there directive, and the scrub seems to be
listening to that. What I need in this situation is a way to prompt ad6 to
resilver from scratch.

Btw I can reproduce this behaviour every time. I can also produce faultless
behaviour by offlining and then onlining, or replacing disks repeatedly, as
expected.
On Wed, Dec 15, 2010 at 3:29 PM, Gareth de Vaux <zfs at lordcow.org> wrote:
> On Mon 2010-12-13 (16:41), Marion Hakanson wrote:
>> After you "clear" the errors, do another "scrub" before trying anything
>> else.  Once you get a complete scrub with no new errors (and no checksum
>> errors), you should be confident that the damaged drive has been fully
>> re-integrated into the pool.
>
> OK, I did a scrub after zeroing the errors with 'clear', and the array came
> back clean, apparently, but same final result - the array faults as soon as
> I 'offline' a different vdev. The 'clear' is just a
> pretend-the-errors-aren't-there directive, and the scrub seems to be
> listening to that. What I need in this situation is a way to prompt ad6 to
> resilver from scratch.

I think a scrub only repairs data in the active datasets - it doesn't
rewrite the superblocks, drive labels, or other on-disk structures outside
of them.

Have you tried zpool replace? i.e. remove ad6, fill it with zeroes, put it
back, and run "zpool replace tank ad6". That should simulate a drive
failure and replacement with a new disk.

-- 
- Tuomas
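A rough version of what Tuomas suggests, using the device name from the
original post (a sketch; there the pool is called 'pool' rather than
'tank', and '-f' may be needed if an old label is still detected):

    zpool offline pool ad6              # take the disk out of service
    dd if=/dev/zero of=/dev/ad6 bs=1m   # blank it, as if it were a brand-new disk
    zpool replace pool ad6              # replace the device with itself
    zpool status pool                   # watch the resilver

When no new_device argument is given, 'zpool replace' reuses the same
device path as the replacement, so no extra disk is required.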
On Sat 2010-12-18 (14:55), Tuomas Leikola wrote:
> Have you tried zpool replace? i.e. remove ad6, fill it with zeroes, put it
> back, and run "zpool replace tank ad6". That should simulate a drive
> failure and replacement with a new disk.

'replace' requires a different disk to replace with. How do you "remove
ad6"?
Hi, I'm copying the list - assume you meant to send it there.

On Sun 2010-12-19 (15:52), Miles Nordin wrote:
> If 'zpool replace /dev/ad6' will not accept that the disk is a
> replacement, then you can unplug the disk, erase the label in a
> different machine using
>
>   dd if=/dev/zero of=/dev/thedisk bs=512 count=XXX
>   dd if=/dev/zero of=/dev/thedisk bs=512 count=XXX seek=YYY
>
> then plug it back into its old spot and issue 'zpool replace /dev/ad6'.
>
> XXX should be about a megabyte's worth of sectors, and YYY should be the
> LBA about 1 megabyte from the end of the disk. You can read or experiment
> to determine the exact values. You do need to know the size of your disk
> in sectors, though. There's a copy of the EFI label at the end of the disk
> and another at the beginning, which is why you have to do this.

Awesome, that does the trick, thanks. I assumed it was identifying the disk
by serial number or something. I don't need to unplug the disk though; it
works if I zero it from the same machine. This should probably be
implemented as a zpool function, if it hasn't already been in later
versions.

> In general, especially when a disk has corrupt data on it rather than
> unreadable sectors, it's best to do the replacement in a way that the old
> and new disks are available simultaneously, because ZFS will use the old
> disk sometimes in places where the old disk is correct. If you take away
> the old disk, then the old disk can't be used at all even when it's
> correct, so if there are a few spots where there are problems with the
> other good disks in the raidz you will not be able to recover that, while
> with a suspect old disk you could. OTOH if the old disk has unreadable
> sectors, the controller and ZFS will freeze whenever it touches those
> unreadable sectors, causing the replacement to take forever. This is kind
> of bullshit and should be solved with software IMNSHO, but it's how
> things are, so if you have a physically failing disk I would suggest
> running the replace/resilver with the physically failing disk physically
> removed (while if the disk has bad data on it and is not physically
> failing I suggest keeping it connected somehow). So... yeah... if there
> is corrupt data on this disk, you'll have to buy another disk to follow
> my advice in this paragraph. You can go ahead and break the advice, wipe
> the label, replace, though.

Noted. Though if "there are a few spots where there are problems with the
other good disks", ZFS should know about them, right?
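For anyone repeating the label wipe on FreeBSD, the exact sector values can
be taken from diskinfo(8); a sketch (2048 sectors is just "about a
megabyte" of 512-byte sectors per the advice above, SECTORS must be filled
in by hand, and the pool/device names are from the original post):

    diskinfo -v /dev/ad6                # note the "mediasize in sectors" line
    SECTORS=...                         # fill in from the diskinfo output
    dd if=/dev/zero of=/dev/ad6 bs=512 count=2048                        # front label copies
    dd if=/dev/zero of=/dev/ad6 bs=512 count=2048 seek=$((SECTORS-2048)) # back label copies
    zpool replace pool ad6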