Alex
2011-May-19 18:17 UTC
[zfs-discuss] Same device node appearing twice in same mirror; one faulted, one not...
I thought this was interesting - it looks like we have a failing drive in our mirror, but the two device nodes in the mirror are the same:

  pool: tank
 state: DEGRADED
status: One or more devices could not be used because the label is missing or
        invalid. Sufficient replicas exist for the pool to continue
        functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-4J
 scrub: scrub completed after 1h9m with 0 errors on Sat May 14 03:09:45 2011
config:

        NAME        STATE     READ WRITE CKSUM
        tank        DEGRADED     0     0     0
          mirror-0  DEGRADED     0     0     0
            c5t1d0  ONLINE       0     0     0
            c5t1d0  FAULTED      0     0     0  corrupted data

c5t1d0 does indeed appear only once in the "format" list. I wonder how to go about correcting this if I can't uniquely identify the failing drive.

"format" takes forever to spill its guts, and the zpool commands all hang... There is clearly a hardware error here, probably causing that, but I'm not sure how to identify which disk to pull.
Jim Klimov
2011-May-19 19:41 UTC
[zfs-discuss] Same device node appearing twice in same mirror; one faulted, one not...
Just a random thought: if two devices have the same ID and seem to work in turns, are you certain you have a mirror and not two paths to the same backend? A few years back I was given a box to support with "sporadically failing drives", which turned out to be seeing two paths to the same external array; configuring MPxIO failover properly helped the system detect them as actually being one device and stop complaining as long as one path worked. On the other hand, you might have done some "dd if=disk1 of=disk2" kind of cloning, which may have puzzled the system... HTH, //Jim
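P.S. A quick way to check for this, as a sketch (assuming a Solaris build with the MPxIO admin tools installed; they may not cover every SATA controller, such as a Sil3132):

# mpathadm list lu     (lists logical units under multipath control)
# stmsboot -e          (enables MPxIO for supported controllers; needs a reboot)

If mpathadm shows one logical unit with two operational paths, the "two drives" are really one disk seen twice.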
Cindy Swearingen
2011-May-20 15:34 UTC
[zfs-discuss] Same device node appearing twice in same mirror; one faulted, one not...
Hi Alex,

More scary than interesting to me.

What kind of hardware and which Solaris release?

Do you know what steps led up to this problem? Any recent hardware changes?

This output should tell you which disks were in this pool originally:

# zpool history tank

If the history identifies tank's actual disks, maybe you can determine which disk is masquerading as c5t1d0.

If that doesn't work, accessing the individual disk entries in format should tell which one is the problem, if it's only one.

I would like to see the output of this command:

# zdb -l /dev/dsk/c5t1d0s0

Make sure you have a good backup of your data. If you need to pull a disk to check cabling, or rule out controller issues, you should probably export this pool first. Have a good backup.

Others have resolved minor device issues by exporting/importing the pool, but with format/zpool commands hanging on your system, I'm not confident that this operation will work for you.

Thanks,

Cindy

On 05/19/11 12:17, Alex wrote:

> <quoted text omitted>
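P.S. One more way to tell physically identical drives apart is by serial number. A quick check, assuming a Solaris release where iostat reports the device identity pages:

# iostat -En | grep -i "serial"

The devid strings in the ZFS labels embed the drive serial numbers, so matching them against the iostat output tells you which physical disk is which.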
Alex Dolski
2011-May-22 00:05 UTC
[zfs-discuss] Same device node appearing twice in same mirror; one faulted, one not...
Hi Cindy,

Thanks for the advice. This is just a little old Gateway PC provisioned as an informal workgroup server. The main storage is two SATA drives in an external enclosure, connected to a Sil3132 PCIe eSATA controller. The OS is snv_134b, upgraded from snv_111a.

I can't identify a cause in particular. The box has been running for several months without much oversight. It's possible that the two eSATA cables got reconnected to different ports after a recent move.

A backup has been made and I will try the export & import, per your advice (if the zpool commands keep working - they do again at the moment, with no reboot!). I will also try switching the eSATA cables to opposite ports.

Thanks,
Alex

Command output follows:

# format
Searching for disks...done

AVAILABLE DISK SELECTIONS:
       0. c5t1d0 <ATA-WDC WD5000AAKS-0-1D05-465.76GB>
          /pci@0,0/pci8086,2845@1c,3/pci1095,3132@0/disk@1,0
       1. c8d0 <DEFAULT cyl 9726 alt 2 hd 255 sec 63>
          /pci@0,0/pci-ide@1f,2/ide@0/cmdk@0,0
       2. c9d0 <DEFAULT cyl 38910 alt 2 hd 255 sec 63>
          /pci@0,0/pci-ide@1f,2/ide@1/cmdk@0,0
       3. c11t0d0 <WD-Ext HDD 1021-2002-931.51GB>
          /pci@0,0/pci107b,5058@1a,7/storage@1/disk@0,0

# zpool history tank
History for 'tank':
2010-06-18.15:14:16 zpool create tank c13t0d0
2011-05-07.02:00:07 zpool scrub tank
2011-05-14.02:00:08 zpool scrub tank
2011-05-21.02:00:12 zpool scrub tank
<a million 'zfs snapshot' and 'zfs destroy' events from zfs-auto-snap omitted>

# zdb -l /dev/dsk/c5t1d0s0
--------------------------------------------
LABEL 0
--------------------------------------------
    version: 14
    name: 'tank'
    state: 0
    txg: 3374337
    pool_guid: 6242690959503408617
    hostid: 8697169
    hostname: 'wdssandbox'
    top_guid: 17982590661103377266
    guid: 1717308203478351258
    vdev_children: 1
    vdev_tree:
        type: 'mirror'
        id: 0
        guid: 17982590661103377266
        whole_disk: 0
        metaslab_array: 23
        metaslab_shift: 32
        ashift: 9
        asize: 500094468096
        is_log: 0
        children[0]:
            type: 'disk'
            id: 0
            guid: 1717308203478351258
            path: '/dev/dsk/c5t1d0s0'
            devid: 'id1,sd@SATA_____WDC_WD5000AAKS-0_____WD-WCAWF1939879/a'
            phys_path: '/pci@0,0/pci8086,2845@1c,3/pci1095,3132@0/disk@1,0:a'
            whole_disk: 1
            DTL: 27
        children[1]:
            type: 'disk'
            id: 1
            guid: 9267693216478869057
            path: '/dev/dsk/c5t1d0s0'
            devid: 'id1,sd@SATA_____WDC_WD5000AAKS-0_____WD-WCAWF1769949/a'
            phys_path: '/pci@0,0/pci8086,2845@1c,3/pci1095,3132@0/disk@1,0:a'
            whole_disk: 1
            DTL: 893
--------------------------------------------
LABEL 1
--------------------------------------------
<identical to LABEL 0, omitted>
--------------------------------------------
LABEL 2
--------------------------------------------
<identical to LABEL 0, omitted>
--------------------------------------------
LABEL 3
--------------------------------------------
<identical to LABEL 0, omitted>

On May 20, 2011, at 8:34 AM, Cindy Swearingen wrote:

> <quoted text omitted>
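P.S. Pulling just the devids out of the labels (a quick pipeline, nothing beyond standard grep/sort):

# zdb -l /dev/dsk/c5t1d0s0 | grep devid | sort -u
            devid: 'id1,sd@SATA_____WDC_WD5000AAKS-0_____WD-WCAWF1769949/a'
            devid: 'id1,sd@SATA_____WDC_WD5000AAKS-0_____WD-WCAWF1939879/a'

So the mirror really does contain two distinct drives (serials WD-WCAWF1939879 and WD-WCAWF1769949) - they just both resolve to the c5t1d0 device node right now.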
Cindy Swearingen
2011-May-24 16:58 UTC
[zfs-discuss] Same device node appearing twice in same mirror; one faulted, one not...
Hi Alex,

If the hardware and cables were moved around, then this is probably the root cause of your problem. You should see if you can move the devices/cabling back to what they were before the move.

The zpool history output provides the original device name, which isn't c5t1d0, either:

# zpool create tank c13t0d0

You might grep the zpool history output to find out which disk was eventually attached, like this:

# zpool history | grep attach

But it's clear from the zdb -l output that the devid for this particular device changed, which we've seen happen on some hardware. If the devid persists, ZFS can follow the devid of the device even if its physical path changes, and is able to recover more gracefully.

If you continue to use this hardware for your storage pool, you should export the pool before making any kind of hardware change.

Thanks,

Cindy

On 05/21/11 18:05, Alex Dolski wrote:

> <quoted text omitted>
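P.S. The safe sequence for the recabling, roughly (a sketch; the device names after the move will be whatever your system assigns):

# zpool export tank      (releases the devices so paths can change safely)
  ...power off, fix the cabling, boot...
# zpool import tank      (rediscovers the devices from their on-disk labels)
# zpool status tank      (confirm both sides of the mirror are ONLINE)

On import, ZFS re-reads the labels, so the pool should come back even if the controller/target numbers changed.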
Alex Dolski
2011-May-24 20:57 UTC
[zfs-discuss] Same device node appearing twice in same mirror; one faulted, one not...
Sure enough, Cindy - the eSATA cables had been crossed. I exported, powered off, reversed the cables, booted, imported, and the pool is currently resilvering with both c5t0d0 & c5t1d0 present in the mirror. :)

Thank you!!

Alex

On May 24, 2011, at 9:58 AM, Cindy Swearingen wrote:

> <quoted text omitted>
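P.S. For anyone who finds this thread later: the resilver progress shows up in plain zpool status output, e.g.:

# zpool status tank

which reports "resilver in progress" with a percent-done estimate while it runs.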