I have a 24-disk SATA array running on OpenSolaris Nevada, b78. We had a
drive fail, and I've replaced the device but can't get the system to
recognize that I replaced the drive.
zpool status -v shows the failed drive:
[mstalnak at mondo4 ~]$ zpool status -v
pool: LogData
state: DEGRADED
status: One or more devices are faulted in response to persistent errors.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Replace the faulted device, or use 'zpool clear' to mark the device
        repaired.
scrub: resilver completed with 0 errors on Wed Feb 27 11:51:45 2008
config:
        NAME         STATE     READ WRITE CKSUM
        LogData      DEGRADED     0     0     0
          raidz2     DEGRADED     0     0     0
            c0t12d0  ONLINE       0     0     0
            c0t5d0   ONLINE       0     0     0
            c0t0d0   ONLINE       0     0     0
            c0t4d0   ONLINE       0     0     0
            c0t8d0   ONLINE       0     0     0
            c0t16d0  ONLINE       0     0     0
            c0t20d0  ONLINE       0     0     0
            c0t1d0   ONLINE       0     0     0
            c0t9d0   ONLINE       0     0     0
            c0t13d0  ONLINE       0     0     0
            c0t17d0  ONLINE       0     0     0
            c0t20d0  FAULTED      0     0     0  too many errors
            c0t2d0   ONLINE       0     0     0
            c0t6d0   ONLINE       0     0     0
            c0t10d0  ONLINE       0     0     0
            c0t14d0  ONLINE       0     0     0
            c0t18d0  ONLINE       0     0     0
            c0t22d0  ONLINE       0     0     0
            c0t3d0   ONLINE       0     0     0
            c0t7d0   ONLINE       0     0     0
            c0t11d0  ONLINE       0     0     0
            c0t15d0  ONLINE       0     0     0
            c0t19d0  ONLINE       0     0     0
            c0t23d0  ONLINE       0     0     0
errors: No known data errors
I tried doing a zpool clear with no luck:
[root at mondo4 ~]# zpool clear LogData c0t20d0
[root at mondo4 ~]# zpool status -v
pool: LogData
state: DEGRADED
status: One or more devices are faulted in response to persistent errors.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Replace the faulted device, or use 'zpool clear' to mark the device
        repaired.
scrub: resilver completed with 0 errors on Wed Feb 27 11:51:45 2008
config:
        NAME         STATE     READ WRITE CKSUM
        LogData      DEGRADED     0     0     0
          raidz2     DEGRADED     0     0     0
            c0t12d0  ONLINE       0     0     0
            c0t5d0   ONLINE       0     0     0
            c0t0d0   ONLINE       0     0     0
            c0t4d0   ONLINE       0     0     0
            c0t8d0   ONLINE       0     0     0
            c0t16d0  ONLINE       0     0     0
            c0t20d0  ONLINE       0     0     0
            c0t1d0   ONLINE       0     0     0
            c0t9d0   ONLINE       0     0     0
            c0t13d0  ONLINE       0     0     0
            c0t17d0  ONLINE       0     0     0
            c0t20d0  FAULTED      0     0     0  too many errors
            c0t2d0   ONLINE       0     0     0
            c0t6d0   ONLINE       0     0     0
            c0t10d0  ONLINE       0     0     0
            c0t14d0  ONLINE       0     0     0
            c0t18d0  ONLINE       0     0     0
            c0t22d0  ONLINE       0     0     0
            c0t3d0   ONLINE       0     0     0
            c0t7d0   ONLINE       0     0     0
And I've tried zpool replace:
[root at mondo4 ~]#
[root at mondo4 ~]# zpool replace -f LogData c0t20d0
invalid vdev specification
the following errors must be manually repaired:
/dev/dsk/c0t20d0s0 is part of active ZFS pool LogData. Please see zpool(1M).
So... what am I missing here, folks?
Any help would be appreciated.
-Mike
Michael Stalnaker wrote:
> [...]

Did you pull out the old drive and add a new one in its place hot?

What does cfgadm -al report? Your drives should look like this:

sata0/0::dsk/c7t0d0            disk    connected    configured   ok
sata0/1::dsk/c7t1d0            disk    connected    configured   ok
sata1/0::dsk/c8t0d0            disk    connected    configured   ok
sata1/1::dsk/c8t1d0            disk    connected    configured   ok

If c0t20d0 isn't configured, use

# cfgadm -c configure sata1/1::dsk/c0t20d0

before attempting the zpool replace.

hth

- Bart

--
Bart Smaalders            Solaris Kernel Performance
barts at cyber.eng.sun.com    http://blogs.sun.com/barts
"You will contribute more with mercurial than with thunderbird."
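Putting Bart's suggestion together with the commands Mike already ran, the sequence would look roughly like the sketch below. The attachment point name sata1/4 is a placeholder; the real ap_id for the c0t20d0 slot comes from the cfgadm -al listing on mondo4 and may differ.

# Check whether the new disk in the c0t20d0 slot is configured
cfgadm -al | grep c0t20d0

# If the attachment point shows "unconfigured", configure it first
# (sata1/4 is a placeholder -- use the ap_id that cfgadm -al actually reports)
cfgadm -c configure sata1/4

# Then retry the in-place replace and watch the resilver
zpool replace LogData c0t20d0
zpool status -v LogData

Once the replace is accepted, zpool status should show the new disk resilvering in place of the FAULTED entry.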