Jonathan
2010-Apr-14 07:05 UTC
[zfs-discuss] Replaced drive in zpool, was fine, now degraded - ohno
I just started replacing drives in this zpool (to increase storage). I pulled the first drive and replaced it with a new drive, and all was well. It resilvered with 0 errors. That was 5 days ago. Just today I was looking around and noticed that my pool was degraded (I see now that this occurred last night). Sure enough, there are 12 read errors on the new drive.

I'm on snv_111b. I attempted to get smartmontools working, but it doesn't seem to want to work as these are all SATA drives. fmdump indicates that the read errors occurred within about 10 minutes of one another.

Is it safe to say this drive is bad, or is there anything else I can do about this?

Thanks,
Jon

--------------------------------------------------------
$ zpool status MyStorage
  pool: MyStorage
 state: DEGRADED
status: One or more devices are faulted in response to persistent errors.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Replace the faulted device, or use 'zpool clear' to mark the device
        repaired.
 scrub: scrub completed after 8h7m with 0 errors on Sun Apr 11 13:07:40 2010
config:

        NAME          STATE     READ WRITE CKSUM
        MyStorage     DEGRADED     0     0     0
          raidz1      DEGRADED     0     0     0
            c5t0d0    ONLINE       0     0     0
            c5t1d0    ONLINE       0     0     0
            c6t1d0    ONLINE       0     0     0
            c7t1d0    FAULTED     12     0     0  too many errors

errors: No known data errors
--------------------------------------------------------
$ fmdump
TIME                 UUID                                 SUNW-MSG-ID
Apr 09 16:08:04.4660 1f07d23f-a4ba-cbbb-8713-d003d9771079 ZFS-8000-D3
Apr 13 22:29:02.8063 e26c7e32-e5dd-cd9c-cd26-d5715049aad8 ZFS-8000-FD
--------------------------------------------------------
That first log is the original drive being replaced. The second is the read errors on the new drive.
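(For reference, a drive swap like the one described above usually comes down to the sequence sketched below. The exact commands were not shown in the thread, so the offline step and the assumption that the new disk comes up at the same c7t1d0 path are guesses, not what was actually typed.)

--------------------------------------------------------
# Hypothetical replacement sequence for one disk in the raidz1 vdev.
# Take the old disk out of service before pulling it.
$ zpool offline MyStorage c7t1d0

# (physically swap the disk)

# Tell ZFS to resilver onto the new disk at the same path.
$ zpool replace MyStorage c7t1d0

# Watch the resilver and confirm it finishes with 0 errors.
$ zpool status -v MyStorage
--------------------------------------------------------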
Richard Elling
2010-Apr-14 16:45 UTC
[zfs-discuss] Replaced drive in zpool, was fine, now degraded - ohno
On Apr 14, 2010, at 12:05 AM, Jonathan wrote:

> I just started replacing drives in this zpool (to increase storage). I pulled the first drive and replaced it with a new drive, and all was well. It resilvered with 0 errors. That was 5 days ago. Just today I was looking around and noticed that my pool was degraded (I see now that this occurred last night). Sure enough, there are 12 read errors on the new drive.
>
> I'm on snv_111b. I attempted to get smartmontools working, but it doesn't seem to want to work as these are all SATA drives. fmdump indicates that the read errors occurred within about 10 minutes of one another.

Use "iostat -En" to see the nature of the I/O errors.

> Is it safe to say this drive is bad, or is there anything else I can do about this?

It is safe to say that there was trouble reading from the drive at some time in the past. But you have not determined the root cause -- the info available in zpool status is not sufficient.
 -- richard
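(To check just the suspect disk rather than every device, the command can take the disk name as an operand; c7t1d0 here is simply the device named in the zpool status output above.)

--------------------------------------------------------
# Extended device error statistics (-E) with descriptive cXtYdZ names (-n),
# for the one drive that is showing read errors.
$ iostat -En c7t1d0

# Or list every device and scan for nonzero Hard/Media error counts.
$ iostat -En
--------------------------------------------------------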
Jonathan
2010-Apr-14 16:56 UTC
[zfs-discuss] Replaced drive in zpool, was fine, now degraded - ohno
I just ran 'iostat -En'. This is what was reported for the drive in question (all other drives showed 0 errors across the board). All of the drives reported the "Illegal Request ... Predictive Failure Analysis" counts, not just this one.

------------------------------------------------------------------------------
c7t1d0  Soft Errors: 0 Hard Errors: 36 Transport Errors: 0
Vendor: ATA      Product: SAMSUNG HD203WI  Revision: 0002  Serial No:
Size: 2000.40GB <2000398934016 bytes>
Media Error: 36 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 126 Predictive Failure Analysis: 0
------------------------------------------------------------------------------
Eric Andersen
2010-Apr-14 17:00 UTC
[zfs-discuss] Replaced drive in zpool, was fine, now degraded - ohno
> I'm on snv_111b. I attempted to get smartmontools working, but it
> doesn't seem to want to work as these are all SATA drives.

Have you tried using '-d sat,12' when using smartmontools?

opensolaris.org/jive/thread.jspa?messageID=473727
Richard Elling
2010-Apr-14 17:02 UTC
[zfs-discuss] Replaced drive in zpool, was fine, now degraded - ohno
On Apr 14, 2010, at 9:56 AM, Jonathan wrote:

> I just ran 'iostat -En'. This is what was reported for the drive in question (all other drives showed 0 errors across the board). All of the drives reported the "Illegal Request ... Predictive Failure Analysis" counts, not just this one.
>
> ------------------------------------------------------------------------------
> c7t1d0  Soft Errors: 0 Hard Errors: 36 Transport Errors: 0
> Vendor: ATA      Product: SAMSUNG HD203WI  Revision: 0002  Serial No:
> Size: 2000.40GB <2000398934016 bytes>
> Media Error: 36 Device Not Ready: 0 No Device: 0 Recoverable: 0
> Illegal Request: 126 Predictive Failure Analysis: 0
> ------------------------------------------------------------------------------

Don't worry about the illegal requests; they are not permanent. Do worry about the media errors. Though this is the most common HDD error, it is also a cause of data loss. Fortunately, ZFS detected this and repaired it for you. Other file systems may not be so gracious.
 -- richard
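(If the drive stays in the pool for now, the generic way to see whether the media errors keep coming back is to clear the fault and scrub again; this is just the standard clear-and-scrub cycle implied by the "action" line in zpool status, not a step anyone in the thread prescribed.)

--------------------------------------------------------
# Reset the error counters and bring the faulted disk back online.
$ zpool clear MyStorage c7t1d0

# Re-read and verify every block in the pool against its checksums.
$ zpool scrub MyStorage

# Report status only if the pool is unhealthy; "is healthy" means the
# scrub turned up nothing new.
$ zpool status -x MyStorage
--------------------------------------------------------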
Jonathan
2010-Apr-14 17:08 UTC
[zfs-discuss] Replaced drive in zpool, was fine, now degraded - ohno
Yeah, I tried that:

------------------------------------------
$ smartctl -d sat,12 -i /dev/rdsk/c5t0d0
smartctl 5.39.1 2010-01-28 r3054 [i386-pc-solaris2.11] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

Smartctl: Device Read Identity Failed (not an ATA/ATAPI device)
------------------------------------------

I'm thinking something changed between snv_111 and snv_132 (the build mentioned in that post).
Cindy Swearingen
2010-Apr-14 17:27 UTC
[zfs-discuss] Replaced drive in zpool, was fine, now degraded - ohno
Jonathan,

For a different diagnostic perspective, you might use the fmdump -eV command to see what FMA indicates for this device. This level of diagnostics is below the ZFS level and considerably more detailed, so you can see when these errors began and how long they persisted.

Cindy
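(In practice that looks something like the sketch below. The -t/-T window and the date format are assumptions for illustration, since fmdump accepts several time formats, and the error log usually requires elevated privileges to read.)

--------------------------------------------------------
# Verbose error-report detail (-e = error events, -V = full payload),
# narrowed to the window around the Apr 13 fault shown earlier.
$ pfexec fmdump -eV -t "13Apr10 22:00:00" -T "13Apr10 23:00:00"

# Or dump everything and look for the suspect vdev's device path,
# which ZFS ereports carry in their payload.
$ pfexec fmdump -eV | grep -i c7t1d0
--------------------------------------------------------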
Jonathan
2010-Apr-14 17:28 UTC
[zfs-discuss] Replaced drive in zpool, was fine, now degraded - ohno
> Do worry about the media errors. Though this is the most common HDD
> error, it is also a cause of data loss. Fortunately, ZFS detected this
> and repaired it for you.

Right. I assume you do recommend swapping the faulted drive out, though?

> Other file systems may not be so gracious.
> -- richard

As we are all too aware, I'm sure :)