Paul B. Henson
2010-Jan-09 17:45 UTC
[zfs-discuss] x4500 failed disk, not sure if hot spare took over correctly
We just had our first x4500 disk failure (which of course had to happen
late Friday night <sigh>). I've opened a ticket on it but don't expect a
response until Monday, so I was hoping to verify the hot spare took over
correctly and we still have redundancy pending device replacement.

This is an S10U6 box:

SunOS cartman 5.10 Generic_141445-09 i86pc i386 i86pc

Looks like the first errors started yesterday morning:

Jan  8 07:46:02 cartman marvell88sx: [ID 268337 kern.warning] WARNING: marvell88sx1: device on port 2 failed to reset
Jan  8 07:46:15 cartman marvell88sx: [ID 268337 kern.warning] WARNING: marvell88sx1: device on port 2 failed to reset
Jan  8 07:46:32 cartman sata: [ID 801593 kern.warning] WARNING: /pci@0,0/pci1022,7458@2/pci11ab,11ab@1:
Jan  8 07:46:32 cartman  SATA device at port 2 - device failed
Jan  8 07:46:32 cartman scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci1022,7458@2/pci11ab,11ab@1/disk@2,0 (sd26):
Jan  8 07:46:32 cartman  Command failed to complete...Device is gone

ZFS failed the drive about 11:15PM:

Jan  8 23:15:01 cartman zpool_check[3702]: [ID 702911 daemon.error] zpool export status: One or more devices has experienced an unrecoverable error. An
Jan  8 23:15:01 cartman zpool_check[3702]: [ID 702911 daemon.error] zpool export status: attempt was made to correct the error. Applications are unaffected.
Jan  8 23:15:01 cartman zpool_check[3702]: [ID 702911 daemon.error] unknown header see
Jan  8 23:15:01 cartman zpool_check[3702]: [ID 702911 daemon.error] warning: pool export health DEGRADED

However, the errors continue still:

Jan  9 03:54:48 cartman scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci1022,7458@2/pci11ab,11ab@1/disk@2,0 (sd26):
Jan  9 03:54:48 cartman  Command failed to complete...Device is gone
[...]
Jan  9 07:56:12 cartman scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci1022,7458@2/pci11ab,11ab@1/disk@2,0 (sd26):
Jan  9 07:56:12 cartman  Command failed to complete...Device is gone
Jan  9 07:56:12 cartman scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci1022,7458@2/pci11ab,11ab@1/disk@2,0 (sd26):
Jan  9 07:56:12 cartman  drive offline

If ZFS removed the drive from the pool, why does the system keep
complaining about it? Is fault management stuff still poking at it?

Here's the zpool status output:

  pool: export
 state: DEGRADED
[...]
 scrub: scrub completed after 0h6m with 0 errors on Fri Jan  8 23:21:31 2010

        NAME          STATE     READ WRITE CKSUM
        export        DEGRADED     0     0     0
          mirror      DEGRADED     0     0     0
            c0t2d0    ONLINE       0     0     0
            spare     DEGRADED 18.9K     0     0
              c1t2d0  REMOVED      0     0     0
              c5t0d0  ONLINE       0     0 18.9K

        spares
          c5t0d0      INUSE     currently in use

Is the pool/mirror/spare still supposed to show up as degraded after the
hot spare is deployed? There are 18.9K checksum errors on the disk that
failed, but there are also 18.9K read errors on the hot spare?

The scrub started at 11pm last night, and the disk got booted at 11:15pm,
so presumably the scrub came across the failures the OS had been reporting.
The last scrub status shows that scrub completing successfully. What
happened to the resilver status? How can I tell if the resilver was
successful? Did the resilver start and complete while the scrub was still
running and its status output was lost? Is there any way to see the status
of past scrubs/resilvers, or is only the most recent one available?

Fault management doesn't report any problems:

root@cartman ~ # fmdump
TIME                 UUID                                 SUNW-MSG-ID
fmdump: /var/fm/fmd/fltlog is empty

Shouldn't this show a failed disk?
fmdump -e shows tons of bad stuff:

Jan 08 07:46:32.9467 ereport.fs.zfs.probe_failure
Jan 08 07:46:36.2015 ereport.fs.zfs.io
[...]
Jan 08 07:51:05.1865 ereport.fs.zfs.io

None of that results in a fault diagnosis?

Mostly I'd like to verify my hot spare is working correctly. Given the
spare status is "degraded", the read errors on the spare device, and the
lack of successful resilver status output, it seems like the spare might
not have been added successfully.

Thanks for any input you might provide...

-- 
Paul B. Henson  |  (909) 979-6361  |  csupomona.edu/~henson
Operating Systems and Network Analyst  |  henson@csupomona.edu
California State Polytechnic University  |  Pomona CA 91768
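(For reference, the verification being attempted above comes down to a
handful of stock Solaris 10 commands. A minimal sketch only -- the pool
name "export" is taken from the output above, and the exact wording of
the output varies between releases:)

    # Terse health check -- prints nothing if all pools are healthy
    zpool status -x

    # Full detail: look for the spare vdev, "INUSE", and the scrub/resilver line
    zpool status -v export

    # Diagnosed faults, as opposed to the raw error telemetry from fmdump -e
    fmadm faulty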
Eric Schrock
2010-Jan-09 21:09 UTC
[zfs-discuss] x4500 failed disk, not sure if hot spare took over correctly
On Jan 9, 2010, at 9:45 AM, Paul B. Henson wrote:

> If ZFS removed the drive from the pool, why does the system keep
> complaining about it?

It's not failing in the sense that it's returning I/O errors, but it's
flaky, so it's attaching and detaching. Most likely it decided to attach
again and then you got transport errors.

> Is fault management stuff still poking at it?

No.

> Is the pool/mirror/spare still supposed to show up as degraded after the
> hot spare is deployed?

Yes.

> There are 18.9K checksum errors on the disk that failed, but there are
> also 18.9K read errors on the hot spare?

This is a bug recently fixed in OpenSolaris.

> The last scrub status shows that scrub completing successfully. What
> happened to the resilver status?

If there was a scrub it will show as the last thing completed.

> How can I tell if the resilver was successful?

If the scrub was successful.

> Did the resilver start and complete while the scrub was still running
> and its status output was lost?

No, only one can be active at any time.

> Is there any way to see the status of past scrubs/resilvers, or is only
> the most recent one available?

Only the most recent one.

> None of that results in a fault diagnosis?

When the device is in the process of going away, no. From the OS
perspective this disk was physically removed from the system.

> Mostly I'd like to verify my hot spare is working correctly. Given the
> spare status is "degraded", the read errors on the spare device, and the
> lack of successful resilver status output, it seems like the spare might
> not have been added successfully.

No, it's fine. DEGRADED just means the pool is not operating at the ideal
state. By definition a hot spare is always DEGRADED. As long as the spare
itself is ONLINE it's fine.

Hope that helps,

- Eric

-- 
Eric Schrock, Fishworks                        blogs.sun.com/eschrock
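(If in doubt, the simplest way to convince yourself the spare holds a good
copy of everything is to run another scrub and let it finish cleanly. A
sketch, using the pool name from this thread:)

    # Start a fresh scrub; a clean completion means every block on the
    # spare was read back and passed its checksum.
    zpool scrub export

    # Check progress/result; repeat until it reports
    # "scrub completed ... with 0 errors"
    zpool status export | grep scrub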
Ian Collins
2010-Jan-09 22:04 UTC
[zfs-discuss] x4500 failed disk, not sure if hot spare took over correctly
Paul B. Henson wrote:

> We just had our first x4500 disk failure (which of course had to happen
> late Friday night <sigh>), I've opened a ticket on it but don't expect a
> response until Monday so was hoping to verify the hot spare took over
> correctly and we still have redundancy pending device replacement.
>
> This is an S10U6 box:
>
> Here's the zpool status output:
>
>   pool: export
>  state: DEGRADED
> [...]
>  scrub: scrub completed after 0h6m with 0 errors on Fri Jan  8 23:21:31 2010
>
>         NAME          STATE     READ WRITE CKSUM
>         export        DEGRADED     0     0     0
>           mirror      DEGRADED     0     0     0
>             c0t2d0    ONLINE       0     0     0
>             spare     DEGRADED 18.9K     0     0
>               c1t2d0  REMOVED      0     0     0
>               c5t0d0  ONLINE       0     0 18.9K
>
>         spares
>           c5t0d0      INUSE     currently in use
>
> Is the pool/mirror/spare still supposed to show up as degraded after the
> hot spare is deployed?

Yes, the spare will show as degraded until you replace it.

I had a pool on a 4500 that lost one drive, then swapped out 3 more due to
brain farts from that naff Marvell driver. It was a bit of a concern for a
while seeing two degraded devices in one raidz vdev!

> The scrub started at 11pm last night, the disk got booted at 11:15pm,
> presumably the scrub came across the failures the OS had been reporting.
> The last scrub status shows that scrub completing successfully. What
> happened to the resilver status? How can I tell if the resilver was
> successful? Did the resilver start and complete while the scrub was still
> running and its status output was lost? Is there any way to see the status
> of past scrubs/resilvers, or is only the most recent one available?

You only see the last one, but a resilver is a scrub.

> Mostly I'd like to verify my hot spare is working correctly. Given the
> spare status is "degraded", the read errors on the spare device, and the
> lack of successful resilver status output, it seems like the spare might
> not have been added successfully.

It has - "scrub completed after 0h6m with 0 errors".

-- 
Ian.
Paul B. Henson
2010-Jan-09 23:17 UTC
[zfs-discuss] x4500 failed disk, not sure if hot spare took over correctly
On Sat, 9 Jan 2010, Eric Schrock wrote:

> > If ZFS removed the drive from the pool, why does the system keep
> > complaining about it?
>
> It's not failing in the sense that it's returning I/O errors, but it's
> flaky, so it's attaching and detaching. Most likely it decided to attach
> again and then you got transport errors.

Ok, how do I make it stop logging messages about the drive until it is
replaced? It's still filling up the logs with the same errors about the
drive being offline.

Looks like hdadm isn't it:

root@cartman ~ # hdadm offline disk c1t2d0
/usr/bin/hdadm[1762]: /dev/rdsk/c1t2d0d0p0: cannot open
/dev/rdsk/c1t2d0d0p0 is not available

Hmm, I was able to unconfigure it with cfgadm:

root@cartman ~ # cfgadm -c unconfigure sata1/2::dsk/c1t2d0

It went from:

sata1/2::dsk/c1t2d0    disk    connected    configured    failed

to:

sata1/2                disk    connected    unconfigured  failed

Hopefully that will stop the errors until it's replaced and not break
anything else :).

> No, it's fine. DEGRADED just means the pool is not operating at the
> ideal state. By definition a hot spare is always DEGRADED. As long as
> the spare itself is ONLINE it's fine.

The spare shows as "INUSE", but I'm guessing that's fine too.

> Hope that helps

That was perfect, thank you very much for the review. Now I can not worry
about it until Monday :).

-- 
Paul B. Henson  |  (909) 979-6361  |  csupomona.edu/~henson
Operating Systems and Network Analyst  |  henson@csupomona.edu
California State Polytechnic University  |  Pomona CA 91768
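(A trivial way to confirm the unconfigure actually quieted things down is
to watch the system log for further complaints from that disk. A sketch,
using the sd26 instance name from the earlier messages:)

    # Silence here means the transport errors have stopped
    tail -f /var/adm/messages | grep sd26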
Cindy Swearingen
2010-Jan-11 16:56 UTC
[zfs-discuss] x4500 failed disk, not sure if hot spare took over correctly
Hi Paul,

Example 11-1 in this section describes how to replace a disk on an x4500
system:

docs.sun.com/app/docs/doc/819-5461/gbcet?a=view

Cindy

On 01/09/10 16:17, Paul B. Henson wrote:
> On Sat, 9 Jan 2010, Eric Schrock wrote:
>
>>> If ZFS removed the drive from the pool, why does the system keep
>>> complaining about it?
>>
>> It's not failing in the sense that it's returning I/O errors, but it's
>> flaky, so it's attaching and detaching. Most likely it decided to attach
>> again and then you got transport errors.
>
> Ok, how do I make it stop logging messages about the drive until it is
> replaced? It's still filling up the logs with the same errors about the
> drive being offline.
>
> Looks like hdadm isn't it:
>
> root@cartman ~ # hdadm offline disk c1t2d0
> /usr/bin/hdadm[1762]: /dev/rdsk/c1t2d0d0p0: cannot open
> /dev/rdsk/c1t2d0d0p0 is not available
>
> Hmm, I was able to unconfigure it with cfgadm:
>
> root@cartman ~ # cfgadm -c unconfigure sata1/2::dsk/c1t2d0
>
> It went from:
>
> sata1/2::dsk/c1t2d0    disk    connected    configured    failed
>
> to:
>
> sata1/2                disk    connected    unconfigured  failed
>
> Hopefully that will stop the errors until it's replaced and not break
> anything else :).
>
>> No, it's fine. DEGRADED just means the pool is not operating at the
>> ideal state. By definition a hot spare is always DEGRADED. As long as
>> the spare itself is ONLINE it's fine.
>
> The spare shows as "INUSE", but I'm guessing that's fine too.
>
>> Hope that helps
>
> That was perfect, thank you very much for the review. Now I can not worry
> about it until Monday :).
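(For readers without access to that document: replacement on this kind of
setup generally comes down to the sequence below once the new drive is
physically installed. This is only a sketch using the device names from
this thread, not a substitute for the documented example; on S10U6 the hot
spare should return to the available list on its own after the resilver,
but it can also be detached manually:)

    # Bring the SATA port back online with the new disk in place
    cfgadm -c configure sata1/2

    # Resilver onto the replacement disk in the same slot
    zpool replace export c1t2d0

    # Watch the resilver; once it completes, c5t0d0 should drop back to
    # the spares list (or detach it by hand if it does not)
    zpool status export
    # zpool detach export c5t0d0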
Paul B. Henson
2010-Jan-12 01:42 UTC
[zfs-discuss] x4500 failed disk, not sure if hot spare took over correctly
On Sat, 9 Jan 2010, Eric Schrock wrote:

> No, it's fine. DEGRADED just means the pool is not operating at the
> ideal state. By definition a hot spare is always DEGRADED. As long as
> the spare itself is ONLINE it's fine.

One more question on this: so there's no way to tell, just from the
status, the difference between a pool degraded due to a disk failure but
still fully redundant thanks to a hot spare, and a pool degraded due to a
disk failure that has lost redundancy? I guess you can review the pool
details for the specifics, but for large pools it seems it would be
valuable to be able to quickly distinguish these states from the short
status.

Thanks...

-- 
Paul B. Henson  |  (909) 979-6361  |  csupomona.edu/~henson
Operating Systems and Network Analyst  |  henson@csupomona.edu
California State Polytechnic University  |  Pomona CA 91768
Eric Schrock
2010-Jan-12 01:48 UTC
[zfs-discuss] x4500 failed disk, not sure if hot spare took over correctly
On 01/11/10 17:42, Paul B. Henson wrote:
> On Sat, 9 Jan 2010, Eric Schrock wrote:
>
>> No, it's fine. DEGRADED just means the pool is not operating at the
>> ideal state. By definition a hot spare is always DEGRADED. As long as
>> the spare itself is ONLINE it's fine.
>
> One more question on this: so there's no way to tell, just from the
> status, the difference between a pool degraded due to a disk failure but
> still fully redundant thanks to a hot spare, and a pool degraded due to a
> disk failure that has lost redundancy? I guess you can review the pool
> details for the specifics, but for large pools it seems it would be
> valuable to be able to quickly distinguish these states from the short
> status.

No, there is no way to tell if a pool has DTL (dirty time log) entries.

- Eric

-- 
Eric Schrock, Fishworks                        blogs.sun.com/eschrock
Paul B. Henson
2010-Jan-12 02:35 UTC
[zfs-discuss] x4500 failed disk, not sure if hot spare took over correctly
On Mon, 11 Jan 2010, Eric Schrock wrote:

> No, there is no way to tell if a pool has DTL (dirty time log) entries.

Hmm, I hadn't heard that term before, but based on a quick search I take
it that's the list of data in the pool that is not fully redundant? So if
a 2-way mirror vdev lost a half, everything written after the loss would
be on the DTL, and if the same device came back, recovery would entail
just running through the DTL and writing out what it missed? Although
presumably if the failed device were replaced with another device
entirely, all of the data would need to be written out.

I'm not quite sure that answered my question. My original question was,
for example: given a 2-way mirror, one half fails. There is a hot spare
available, which is pulled in, and while the pool isn't optimal, it does
have the same number of devices that it's supposed to. On the other hand,
the same mirror loses a device, there's no hot spare, and the pool is
short one device. My understanding is that in both scenarios the pool
status would be "DEGRADED", but it seems there's an important difference.
In the first case, another device could fail and the pool would still be
ok. In the second, another device failing would result in complete loss
of data.

While you can tell the difference between these two states by looking at
the detailed output and seeing whether a hot spare is in use, I was just
saying that it would be nice for the short status to have some distinction
between "device failed, hot spare in use" and "device failed, keep fingers
crossed" ;).

Back to your answer: if the existence of DTL entries means the pool
doesn't have full redundancy for some data, and you can't tell if a pool
has DTL entries, are you saying there's no way to tell if the current
state of your pool could survive a device failure? If a resilver
successfully completes, barring another device failure, doesn't that mean
the pool is restored to full redundancy? I feel like I must be
misunderstanding something :(.

Thanks...

-- 
Paul B. Henson  |  (909) 979-6361  |  csupomona.edu/~henson
Operating Systems and Network Analyst  |  henson@csupomona.edu
California State Polytechnic University  |  Pomona CA 91768
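(For what it's worth, a crude approximation of that distinction is
possible today by scraping zpool output. A sketch only -- the text format
of zpool status is not a stable interface, and matching "INUSE" is a
heuristic, not a real redundancy check:)

    #!/bin/sh
    # Flag DEGRADED pools and note whether a hot spare appears to be
    # covering the failure.
    for pool in `zpool list -H -o name`; do
        health=`zpool list -H -o health $pool`
        [ "$health" = "DEGRADED" ] || continue
        if zpool status $pool | grep INUSE > /dev/null; then
            echo "$pool: DEGRADED, but a hot spare is in use"
        else
            echo "$pool: DEGRADED with no spare in use -- redundancy may be reduced"
        fi
    done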
Eric Schrock
2010-Jan-12 03:28 UTC
[zfs-discuss] x4500 failed disk, not sure if hot spare took over correctly
On Jan 11, 2010, at 6:35 PM, Paul B. Henson wrote:

> On Mon, 11 Jan 2010, Eric Schrock wrote:
>
>> No, there is no way to tell if a pool has DTL (dirty time log) entries.
>
> Hmm, I hadn't heard that term before, but based on a quick search I take
> it that's the list of data in the pool that is not fully redundant? So if
> a 2-way mirror vdev lost a half, everything written after the loss would
> be on the DTL, and if the same device came back, recovery would entail
> just running through the DTL and writing out what it missed? Although
> presumably if the failed device were replaced with another device
> entirely, all of the data would need to be written out.
>
> I'm not quite sure that answered my question. My original question was,
> for example: given a 2-way mirror, one half fails. There is a hot spare
> available, which is pulled in, and while the pool isn't optimal, it does
> have the same number of devices that it's supposed to. On the other hand,
> the same mirror loses a device, there's no hot spare, and the pool is
> short one device. My understanding is that in both scenarios the pool
> status would be "DEGRADED", but it seems there's an important difference.
> In the first case, another device could fail and the pool would still be
> ok. In the second, another device failing would result in complete loss
> of data.
>
> While you can tell the difference between these two states by looking at
> the detailed output and seeing whether a hot spare is in use, I was just
> saying that it would be nice for the short status to have some
> distinction between "device failed, hot spare in use" and "device failed,
> keep fingers crossed" ;).
>
> Back to your answer: if the existence of DTL entries means the pool
> doesn't have full redundancy for some data, and you can't tell if a pool
> has DTL entries, are you saying there's no way to tell if the current
> state of your pool could survive a device failure? If a resilver
> successfully completes, barring another device failure, doesn't that mean
> the pool is restored to full redundancy? I feel like I must be
> misunderstanding something :(.

DTLs are a more specific answer to your question. A DTL entry implies that
a toplevel vdev has a known period of time for which there is invalid data
on it or one of its children. This may be because a device failed and is
accumulating DTL time, because a new replacing or spare vdev was attached,
or because a device was unplugged and then plugged back in. Your example
(hot spares) is but one of the ways in which this can happen, but in any
of these cases it implies that data is not fully replicated.

There is obviously a way to detect this in the kernel; it's simply not
exported to userland in any useful way. The reason I focused on DTLs is
that if any mechanism were provided to distinguish a pool lacking full
redundancy, it would be based on DTLs - nothing else makes sense.

- Eric

> Thanks...
>
> -- 
> Paul B. Henson  |  (909) 979-6361  |  csupomona.edu/~henson
> Operating Systems and Network Analyst  |  henson@csupomona.edu
> California State Polytechnic University  |  Pomona CA 91768

-- 
Eric Schrock, Fishworks                        blogs.sun.com/eschrock