Alex
2011-May-19 18:17 UTC
[zfs-discuss] Same device node appearing twice in same mirror; one faulted, one not...
I thought this was interesting - it looks like we have a failing drive in our mirror, but the two device nodes in the mirror are the same:

  pool: tank
 state: DEGRADED
status: One or more devices could not be used because the label is missing or
        invalid. Sufficient replicas exist for the pool to continue
        functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-4J
 scrub: scrub completed after 1h9m with 0 errors on Sat May 14 03:09:45 2011
config:

        NAME        STATE     READ WRITE CKSUM
        tank        DEGRADED     0     0     0
          mirror-0  DEGRADED     0     0     0
            c5t1d0  ONLINE       0     0     0
            c5t1d0  FAULTED      0     0     0  corrupted data

c5t1d0 does indeed appear only once in the "format" list. I wonder how to go about correcting this if I can't uniquely identify the failing drive.

"format" takes forever to spill its guts, and the zpool commands all hang... There is clearly a hardware error here, probably causing that, but I'm not sure how to identify which disk to pull.
Jim Klimov
2011-May-19 19:41 UTC
[zfs-discuss] Same device node appearing twice in same mirror; one faulted, one not...
Just a random thought: if two devices have the same ID and seem to work in turns, are you certain you have a mirror and not two paths to the same backend? A few years back I was given a box to support with "sporadically failing drives", which turned out to be seeing two paths to the same external array; configuring MPxIO failover properly helped the system detect them as actually being one device and stop complaining as long as one path worked. On the other hand, you might have done some "dd if=disk1 of=disk2" kind of cloning, which may have puzzled the system... HTH, //Jim
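P.S. A quick way to check for this, as a sketch (assuming a Solaris build with the MPxIO admin tools installed; they may not cover every SATA controller, such as a Sil3132):

# mpathadm list lu     (lists logical units under multipath control)
# stmsboot -e          (enables MPxIO for supported controllers; needs a reboot)

If mpathadm shows one logical unit with two operational paths, the "two drives" are really one disk seen twice.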
Cindy Swearingen
2011-May-20 15:34 UTC
[zfs-discuss] Same device node appearing twice in same mirror; one faulted, one not...
Hi Alex,

More scary than interesting to me.

What kind of hardware and which Solaris release?

Do you know what steps led up to this problem? Any recent hardware changes?

This output should tell you which disks were in this pool originally:

# zpool history tank

If the history identifies tank's actual disks, maybe you can determine which disk is masquerading as c5t1d0.

If that doesn't work, accessing the individual disk entries in format should tell which one is the problem, if it's only one.

I would like to see the output of this command:

# zdb -l /dev/dsk/c5t1d0s0

Make sure you have a good backup of your data. If you need to pull a disk to check cabling, or rule out controller issues, you should probably export this pool first. Have a good backup.

Others have resolved minor device issues by exporting/importing the pool, but with format/zpool commands hanging on your system, I'm not confident that this operation will work for you.

Thanks,

Cindy

On 05/19/11 12:17, Alex wrote:

> <quoted text omitted>
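P.S. One more way to tell physically identical drives apart is by serial number. A quick check, assuming a Solaris release where iostat reports the device identity pages:

# iostat -En | grep -i "serial"

The devid strings in the ZFS labels embed the drive serial numbers, so matching them against the iostat output tells you which physical disk is which.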
Alex Dolski
2011-May-22 00:05 UTC
[zfs-discuss] Same device node appearing twice in same mirror; one faulted, one not...
Hi Cindy,

Thanks for the advice. This is just a little old Gateway PC provisioned as an informal workgroup server. The main storage is two SATA drives in an external enclosure, connected to a Sil3132 PCIe eSATA controller. The OS is snv_134b, upgraded from snv_111a.

I can't identify a cause in particular. The box has been running for several months without much oversight. It's possible that the two eSATA cables got reconnected to different ports after a recent move.

A backup has been made and I will try the export & import, per your advice (if the zpool commands keep working - they do again at the moment, with no reboot!). I will also try switching the eSATA cables to opposite ports.

Thanks,
Alex

Command output follows:

# format
Searching for disks...done

AVAILABLE DISK SELECTIONS:
       0. c5t1d0 <ATA-WDC WD5000AAKS-0-1D05-465.76GB>
          /pci@0,0/pci8086,2845@1c,3/pci1095,3132@0/disk@1,0
       1. c8d0 <DEFAULT cyl 9726 alt 2 hd 255 sec 63>
          /pci@0,0/pci-ide@1f,2/ide@0/cmdk@0,0
       2. c9d0 <DEFAULT cyl 38910 alt 2 hd 255 sec 63>
          /pci@0,0/pci-ide@1f,2/ide@1/cmdk@0,0
       3. c11t0d0 <WD-Ext HDD 1021-2002-931.51GB>
          /pci@0,0/pci107b,5058@1a,7/storage@1/disk@0,0

# zpool history tank
History for 'tank':
2010-06-18.15:14:16 zpool create tank c13t0d0
2011-05-07.02:00:07 zpool scrub tank
2011-05-14.02:00:08 zpool scrub tank
2011-05-21.02:00:12 zpool scrub tank
<a million 'zfs snapshot' and 'zfs destroy' events from zfs-auto-snap omitted>

# zdb -l /dev/dsk/c5t1d0s0
--------------------------------------------
LABEL 0
--------------------------------------------
    version: 14
    name: 'tank'
    state: 0
    txg: 3374337
    pool_guid: 6242690959503408617
    hostid: 8697169
    hostname: 'wdssandbox'
    top_guid: 17982590661103377266
    guid: 1717308203478351258
    vdev_children: 1
    vdev_tree:
        type: 'mirror'
        id: 0
        guid: 17982590661103377266
        whole_disk: 0
        metaslab_array: 23
        metaslab_shift: 32
        ashift: 9
        asize: 500094468096
        is_log: 0
        children[0]:
            type: 'disk'
            id: 0
            guid: 1717308203478351258
            path: '/dev/dsk/c5t1d0s0'
            devid: 'id1,sd@SATA_____WDC_WD5000AAKS-0_____WD-WCAWF1939879/a'
            phys_path: '/pci@0,0/pci8086,2845@1c,3/pci1095,3132@0/disk@1,0:a'
            whole_disk: 1
            DTL: 27
        children[1]:
            type: 'disk'
            id: 1
            guid: 9267693216478869057
            path: '/dev/dsk/c5t1d0s0'
            devid: 'id1,sd@SATA_____WDC_WD5000AAKS-0_____WD-WCAWF1769949/a'
            phys_path: '/pci@0,0/pci8086,2845@1c,3/pci1095,3132@0/disk@1,0:a'
            whole_disk: 1
            DTL: 893
--------------------------------------------
LABEL 1
--------------------------------------------
<identical to LABEL 0, omitted>
--------------------------------------------
LABEL 2
--------------------------------------------
<identical to LABEL 0, omitted>
--------------------------------------------
LABEL 3
--------------------------------------------
<identical to LABEL 0, omitted>

On May 20, 2011, at 8:34 AM, Cindy Swearingen wrote:

> <quoted text omitted>
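P.S. Pulling just the devids out of the labels (a quick pipeline, nothing beyond standard grep/sort):

# zdb -l /dev/dsk/c5t1d0s0 | grep devid | sort -u
            devid: 'id1,sd@SATA_____WDC_WD5000AAKS-0_____WD-WCAWF1769949/a'
            devid: 'id1,sd@SATA_____WDC_WD5000AAKS-0_____WD-WCAWF1939879/a'

So the mirror really does contain two distinct drives (serials WD-WCAWF1939879 and WD-WCAWF1769949) - they just both resolve to the c5t1d0 device node right now.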
Cindy Swearingen
2011-May-24 16:58 UTC
[zfs-discuss] Same device node appearing twice in same mirror; one faulted, one not...
Hi Alex,

If the hardware and cables were moved around, then this is probably the root cause of your problem. You should see if you can move the devices/cabling back to what they were before the move.

The zpool history output provides the original device name, which isn't c5t1d0, either:

# zpool create tank c13t0d0

You might grep the zpool history output to find out which disk was eventually attached, like this:

# zpool history | grep attach

But it's clear from the zdb -l output that the devid for this particular device changed, which we've seen happen on some hardware. If the devid persists, ZFS can follow the devid of the device even if its physical path changes, and is able to recover more gracefully.

If you continue to use this hardware for your storage pool, you should export the pool before making any kind of hardware change.

Thanks,

Cindy

On 05/21/11 18:05, Alex Dolski wrote:

> <quoted text omitted>
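P.S. The safe sequence for the recabling, roughly (a sketch; the device names after the move will be whatever your system assigns):

# zpool export tank      (releases the devices so paths can change safely)
  ...power off, fix the cabling, boot...
# zpool import tank      (rediscovers the devices from their on-disk labels)
# zpool status tank      (confirm both sides of the mirror are ONLINE)

On import, ZFS re-reads the labels, so the pool should come back even if the controller/target numbers changed.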
Alex Dolski
2011-May-24 20:57 UTC
[zfs-discuss] Same device node appearing twice in same mirror; one faulted, one not...
Sure enough, Cindy - the eSATA cables had been crossed. I exported, powered off, reversed the cables, booted, imported, and the pool is currently resilvering with both c5t0d0 & c5t1d0 present in the mirror. :)

Thank you!!

Alex

On May 24, 2011, at 9:58 AM, Cindy Swearingen wrote:

> <quoted text omitted>
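P.S. For anyone who finds this thread later: the resilver progress shows up in plain zpool status output, e.g.:

# zpool status tank

which reports "resilver in progress" with a percent-done estimate while it runs.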