I would like some advice on how to replace a RAID 0 LUN. The LUN is built from a
single-disk volume group/volume on our Flexline 380 unit, so every disk in the
unit is a volume group/volume/LUN mapped to the host, and we let ZFS do the RAID.
We now have a LUN that has been getting read errors, and the underlying drive
needs to be replaced. I've done this in the past, but as I remember it was
cumbersome and didn't go that smoothly, so I would like some advice on how to go
about it. As I recall, if you just fail the disk via SANtricity, the LUN doesn't
really go offline from the host's point of view, so ZFS still tries to write to
the LUN/disk. I believe I either unmapped the LUN from the host, at which point
ZFS kicked in a hot spare, or I offlined the disk via ZFS first. At that point
the drive can be replaced on the storage, the volume re-initialized, and the LUN
remapped back to the host. So part of my question is: does the above sound
reasonable, or should I be doing this differently? Also, I'm a little unsure how
to get the original LUN back in operation and the spare back to being a spare.
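To make that concrete, the ZFS-side step I'm thinking of would be something like
this (tank and the device name below are just placeholders from my pool):

# zpool offline tank c10t<WWN-of-failing-LUN>d0

as opposed to failing the drive in SANtricity, or unmapping the LUN from the host
and letting ZFS fault the device and pull in a hot spare on its own.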
For example I have the following situation now:
    raidz2                                      ONLINE       0     0     0
      c10t600A0B800011399600007CF945E80E95d0    ONLINE       0     0     0
      c10t600A0B8000115EA20000FEEB45E8145Ed0    ONLINE       0     0     0
      spare                                     ONLINE       0     0     0
        c10t600A0B800011399600007D2345E81075d0  ONLINE       0     0     0
        c10t600A0B800011399600007CE145E80D4Dd0  ONLINE       0     0     0
      c10t600A0B800011399600007D3F45E81157d0    ONLINE       0     0     0
      c10t600A0B8000115EA20000FF1145E817BEd0    ONLINE       0     0     0
      c10t600A0B800011399600007D5D45E813EBd0    ONLINE       0     0     0
      c10t600A0B8000115EA20000FE7D45E80CDEd0    ONLINE       0     0     0
      c10t600A0B800011399600007C6145E808C7d0    ONLINE       0     0     0
      c10t600A0B8000115EA20000FE9945E80E6Ad0    ONLINE       0     0     0
      c10t600A0B800011399600007C8B45E80A59d0    ONLINE       0     0     0
      c10t600A0B800011399600007CA745E80B21d0    ONLINE       0     0     0
      c10t600A0B8000115EA20000FEB545E810D4d0    ONLINE       0     0     0
      c10t600A0B800011399600007CD145E80CD7d0    ONLINE       0     0     0
      c10t600A0B8000115EA20000FED145E8129Cd0    ONLINE       0     0     0
      c10t600A0B800011399600007CFB45E80EA5d0    ONLINE       0     0     0
      c10t600A0B8000115EA20000FEED45E8146Ed0    ONLINE       0     0     0
    spares
      c10t600A0B800011399600007CE145E80D4Dd0    INUSE     currently in use
      c10t600A0B8000115EA20000FEE145E81328d0    AVAIL
      c10t600A0B800011399600007D0B45E80F21d0    AVAIL
      c10t600A0B8000115EA20000FEFD45E81506d0    AVAIL
      c10t600A0B800011399600007D3545E81107d0    AVAIL
      c10t600A0B800011399600007D5345E81289d0    AVAIL
      c10t600A0B8000115EA20000FF2345E81864d0    AVAIL
      c10t600A0B800011399600007D6F45E8149Bd0    AVAIL
I thought if I replaced the original device with the same (but new) LUN that
should work, but I get the following:

zpool replace tank c10t600A0B800011399600007D2345E81075d0 c10t600A0B800011399600007D2345E81075d0
invalid vdev specification
use '-f' to override the following errors:
/dev/dsk/c10t600A0B800011399600007D2345E81075d0s0 is part of active ZFS pool tank. Please see zpool(1M).
The above LUN is the same as before, but the underlying disk on the storage was
replaced. Do I need to do something to this LUN to make ZFS think it is a new
disk? Or should I be doing something different?
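My guess, going by the error text, is that ZFS still considers this LUN an active
member of tank, so the override it suggests would look something like the line
below. I haven't confirmed that this is the right approach, which is part of why
I'm asking:

zpool replace -f tank c10t600A0B800011399600007D2345E81075d0 c10t600A0B800011399600007D2345E81075d0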
I now have another disk which has gone bad, so I need to fix the above situation
with the hot spare first, and then go through the process again for the second
failure.
The environment is S10U4, running on an x4600 with Flexline 380 storage units.
Tia,
David
Additional information:
It looks like the original drive is still in use and the hot spare is assigned
but not actually in use; see the zpool iostat output below:
    raidz2                                    2.76T  4.49T      0      0  29.0K  18.4K
      c10t600A0B800011399600007CF945E80E95d0      -      -      0      0  2.46K  1.33K
      c10t600A0B8000115EA20000FEEB45E8145Ed0      -      -      0      0  2.46K  1.33K
      spare                                       -      -      0      0  1.81K  1.33K
        c10t600A0B800011399600007D2345E81075d0    -      -      0      0  2.47K  1.33K
        c10t600A0B800011399600007CE145E80D4Dd0    -      -      0      0      0  1.33K
      c10t600A0B800011399600007D3F45E81157d0      -      -      0      0  2.47K  1.33K
      c10t600A0B8000115EA20000FF1145E817BEd0      -      -      0      0  2.46K  1.33K
      c10t600A0B800011399600007D5D45E813EBd0      -      -      0      0  2.46K  1.33K
      c10t600A0B8000115EA20000FE7D45E80CDEd0      -      -      0      0  2.47K  1.33K
      c10t600A0B800011399600007C6145E808C7d0      -      -      0      0  2.47K  1.33K
      c10t600A0B8000115EA20000FE9945E80E6Ad0      -      -      0      0  2.46K  1.33K
      c10t600A0B800011399600007C8B45E80A59d0      -      -      0      0  2.46K  1.33K
      c10t600A0B800011399600007CA745E80B21d0      -      -      0      0  2.47K  1.33K
      c10t600A0B8000115EA20000FEB545E810D4d0      -      -      0      0  2.47K  1.33K
      c10t600A0B800011399600007CD145E80CD7d0      -      -      0      0  2.47K  1.33K
      c10t600A0B8000115EA20000FED145E8129Cd0      -      -      0      0  2.47K  1.33K
      c10t600A0B800011399600007CFB45E80EA5d0      -      -      0      0  2.47K  1.33K
      c10t600A0B8000115EA20000FEED45E8146Ed0      -      -      0      0  2.46K  1.33K
    spares
      c10t600A0B800011399600007CE145E80D4Dd0    INUSE  currently in use
So how do I get the hot spare out of the inuse state?
Cindy.Swearingen at Sun.COM
2008-Mar-14 16:10 UTC
[zfs-discuss] Replacing a Lun (Raid 0) (Santricity)
David,

Try detaching the spare, like this:

# zpool detach pool-name c10t600A0B800011399600007CE145E80D4Dd0

Cindy
Yes! That worked to get the spare back to an available state. Thanks!

So that leaves me with trying to put together a recommended procedure for
replacing a failed LUN/disk from our Flexline 380. Does anyone else have a
configuration in which they are using RAID 0 LUNs that they need to replace?

Thanks,
David
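P.S. For the archives, here is the rough procedure I'm planning to follow for the
next failure, pieced together from the above. This is only my best guess (tank and
the device names are placeholders from my setup, and the SANtricity steps are from
memory), so corrections are welcome:

1. Offline the failing LUN in ZFS so nothing keeps writing to it (or unmap it from
   the host, which is what seemed to trigger the hot spare last time):
   # zpool offline tank c10t<WWN-of-failing-LUN>d0
2. In SANtricity: replace the failed drive, re-initialize the volume, and remap
   the LUN back to the host.
3. Have ZFS resilver onto the replaced LUN (the device name stays the same in my
   case), using -f if ZFS complains that the device is still part of the pool:
   # zpool replace -f tank c10t<WWN-of-failing-LUN>d0
4. Once the resilver completes, return the hot spare to the spare pool:
   # zpool detach tank c10t<WWN-of-hot-spare>d0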