Follow-up...
The problems I was having were NOT ZFS problems, but a bug in the
sd/ssd (SCSI) driver.
Last night I rebooted the server and about half of the zpools came
up FAULTED. I tried a whole host of things, including checking the ZFS
labels with zdb -l; everything looked good, but ZFS would not let me
use the pools (I had pools where all the devices were ONLINE but all
the vdevs were FAULTED).
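For anyone seeing the same symptom, the checks were roughly the
following (the device name below is just a placeholder for one of the
pool's LUNs, not a real one from this box):

    # zpool status -x                            # list only the pools reporting problems
    # zdb -l /dev/rdsk/c5tXXXXXXXXXXXXXXXXd0s0   # dump the four ZFS labels on one LUN
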
I opened a case with Oracle at about 8:30 PM and got a call back
from Stephen Foster, a support engineer in the kernel group, within
half an hour. He asked me to check a couple more things and quickly
got to the point of recommending a specific version of the sd/ssd
driver patch (specific because of other patch requirements on this box).
After hopping through a couple of small hoops to get the sd/ssd patch
and our IDR patch to play nicely together, the system came up and all
the zpools were as I expected (all but two were ONLINE; the two that
weren't were DEGRADED due to missing devices).
I 'zpool replace'd the last of the (really) failed devices, started
a scrub running on one other suspect pool, and went to bed.
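For completeness, the replace and scrub were just the standard
commands, roughly like this (pool and device names here are
placeholders, not the real ones):

    # zpool replace tank c5tDEADLUNd0 c5tSPARELUNd0   # swap the failed LUN for spare capacity
    # zpool scrub tank2                               # start a scrub on the suspect pool
    # zpool status tank2                              # check scrub / resilver progress
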
The problem the sd/ssd patch fixes is that the sd/ssd driver was not
handling a couple of very specific SCSI commands fast enough for ZFS.
The problem shows itself when importing pools under Solaris 10U9 or
when making other changes to the underlying storage. It is apparently
10U9 specific.
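If you want to see what you already have before calling support, the
loaded sd driver revision and the installed patch list are visible
with the usual Solaris commands (I am not quoting a patch number
here, since the right one depends on the other patches on the box):

    # modinfo | grep sd    # loaded kernel modules; look for the sd/ssd driver revision
    # showrev -p           # installed patches; check the sd/ssd patch level
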
On Fri, May 20, 2011 at 1:12 PM, Paul Kraus <paul at kraus-haus.org> wrote:
> I have run into a more serious and scary situation after our array
> outage yesterday.
>
> As I posted earlier today, I came in this morning and found 9 LUNs
> offline (out of over 120). Not a big deal, as the rest of the array was
> OK (and still is), and the other arrays are fine. Everything is
> mirrored across arrays. I started "zpool replace"ing bad LUNs with
> some excess capacity we have. The first two went fine, the third is
> still resilvering. The fourth, on the other hand, has been a
> nightmare. Here is the current state:
>
>   pool: deadbeef
>  state: UNAVAIL
> status: One or more devices are faulted in response to IO failures.
> action: Make sure the affected devices are connected, then run 'zpool clear'.
>    see: http://www.sun.com/msg/ZFS-8000-HC
>  scrub: resilver in progress for 2h4m, 0.07% done, 3186h1m to go
> config:
>
>         NAME                                          STATE     READ WRITE CKSUM
>         deadbeef                                      UNAVAIL      0     0     0  insufficient replicas
>           mirror-0                                    DEGRADED     0     0     0
>             c5t600C0FF00000000009278536638D9B07d0     ONLINE       0     0     0
>             replacing-1                               DEGRADED     0     0     0
>               c5t600C0FF0000000000922614781B19005d0   UNAVAIL      0     0     0  corrupted data
>               c5t600C0FF00000000009277F7905F6DD05d0   ONLINE       0     0     0  38K resilvered
>           mirror-1                                    UNAVAIL      0     0     0  corrupted data
>             c5t600C0FF0000000000927852FB91AD301d0     ONLINE       0     0     0
>             c5t600C0FF0000000000922614781B19006d0     ONLINE       0     0     0  14K resilvered
>           mirror-2                                    ONLINE       0     0     0
>             c5t600C0FF00000000009277F6FA1A14C06d0     ONLINE       0     0     0  31K resilvered
>             c5t600015D000060200000000000000B361d0     ONLINE       0     0     0
>           mirror-3                                    DEGRADED     0     0     0
>             replacing-0                               DEGRADED     0     0     0
>               c5t600C0FF000000000092261491D9A9F09d0   UNAVAIL      0     0     0  cannot open
>               c5t600015D000060200000000000000B365d0   ONLINE       0     0     0  32.9M resilvered
>             c5t600C0FF00000000009277F7905F6DD02d0     ONLINE       0     0     0  2.50K resilvered
>
> errors: 134 data errors, use '-v' for a list
>
>    Now, of all these UNAVAIL and FAULTED devices only one is actually
> bad: c5t600C0FF000000000092261491D9A9F09d0 is from the RAID set that
> is dead. When the array was cold booted yesterday there was a
> temporary outage of the LUNs from the other two RAID sets as well
> (c5t600C0FF0000000000922614781B19005d0 and
> c5t600C0FF0000000000922614781B19006d0). We have seen this before, and
> usually we just do a 'zpool clear' of the device and a resilver gets
> us back where we need to be.
>
>    This time has been different... I did a 'zpool clear deadbeef
> c5t600C0FF0000000000922614781B19005d0' and the zpool immediately went
> UNAVAIL with c5t600C0FF00000000009278536638D9B07d0 going UNAVAIL. I
> did a 'zpool clear deadbeef c5t600C0FF00000000009278536638D9B07d0' and
> it came right back.
>
>    At that point I confirmed that I could read from both
> c5t600C0FF00000000009278536638D9B07d0 and
> c5t600C0FF0000000000922614781B19005d0 using dd. I also let the
> resilver in progress complete, which it did in about an hour with no
> issues.
>
>    I then did the zpool replace on
> c5t600C0FF000000000092261491D9A9F09d0 in mirror-3 (the really dead
> device) and was rewarded with an UNAVAIL pool again. I cleared a
> number of known good devices and then got the pool back.
>
>    At this point I assumed the ZFS label on
> c5t600C0FF0000000000922614781B19005d0 had somehow gotten corrupted,
> so I tried a zpool replace of it with itself; even with -f it would
> not let me. So I tried replacing it with a different LUN, as you can
> see above. That was when it all went into the crapper and has stayed
> there. zpool clear does not even return (and can't be killed).
> mirror-1 reports UNAVAIL but both halves report ONLINE.
>
>    I am afraid to EXPORT in case it won't IMPORT, but I have also
> started the process to restore from the replicated copy of the data
> from a remote site. After lunch I will probably try an EXPORT /
> IMPORT and see if that gets me anywhere.
>
> NOTE: there are 16 other pools on this server, one of which is
> resilvering, one of which still has bad LUNs I need to replace, and
> the rest are fine. This pool has a capacity of 1.5 TB and is about
> 1.37 TB used; the remaining pool to clean up is 8 TB used out of 9 TB,
> and we really can't afford to have these kinds of problems with that one.
>
--
{--------1---------2---------3---------4---------5---------6---------7---------}
Paul Kraus
-> Senior Systems Architect, Garnet River ( http://www.garnetriver.com/ )
-> Sound Coordinator, Schenectady Light Opera Company ( http://www.sloctheater.org/ )
-> Technical Advisor, RPI Players