Anonymous Remailer (austria)
2012-Jan-17 12:11 UTC
[zfs-discuss] Failing WD desktop drive in mirror, how to identify?
I have a desktop system with 2 ZFS mirrors. One drive in one mirror is starting to produce read errors and slowing things down dramatically. I detached it and the system is running fine. I can't tell which drive it is though! The error message and format command let me know which pair the bad drive is in, but I don't know how to get any more info than that, like the serial number etc., to know which disk it is. Is there a command to do this? All it shows is a 500G WD drive producing read errors and format shows identical info for both of the 500G WD drives.
Casper.Dik at oracle.com
2012-Jan-17 12:17 UTC
[zfs-discuss] Failing WD desktop drive in mirror, how to identify?
> I have a desktop system with 2 ZFS mirrors. One drive in one mirror is
> starting to produce read errors and slowing things down dramatically. I
> detached it and the system is running fine. I can't tell which drive it is
> though! The error message and format command let me know which pair the bad
> drive is in, but I don't know how to get any more info than that like the
> serial number etc. to know which disk it is. Is there a command to do this?
> All it shows is a 500G WD drive producing read errors and format shows
> identical info for both of the 500G WD drives.

Did you try:

    iostat -En

messages in /var/adm/messages perhaps? They should include the path
to the disk in error.

Casper
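A minimal sketch of condensing "iostat -En" output to one line per device, so that error counters and serial numbers can be compared side by side. It assumes the usual Solaris layout, where a device line carrying "Soft Errors:", "Hard Errors:" and "Transport Errors:" is followed by a line carrying "Serial No:". The function name iostat_summary and the "(none)" placeholder are illustrative choices, not part of any tool.

```shell
# Condense `iostat -En` output read from stdin to one line per device.
# Assumes the device line precedes the "Serial No:" line (typical Solaris layout).
iostat_summary() {
  awk '
    /Soft Errors:/ { dev = $1; soft = $4; hard = $7; transport = $10 }
    /Serial No:/ {
      serial = $0
      sub(/.*Serial No:[ \t]*/, "", serial)   # keep only what follows the label
      sub(/[ \t]+$/, "", serial)              # trim trailing blanks
      if (serial == "") serial = "(none)"     # some drivers leave the field empty
      printf "%s soft=%s hard=%s transport=%s serial=%s\n", dev, soft, hard, transport, serial
    }
  '
}
```

Usage would be "iostat -En | iostat_summary". If the serial field comes back empty, /var/adm/messages is the next place to look.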
Jim Klimov
2012-Jan-17 12:29 UTC
[zfs-discuss] Failing WD desktop drive in mirror, how to identify?
2012-01-17 16:17, Casper.Dik at oracle.com wrote:
>> I have a desktop system with 2 ZFS mirrors. One drive in one mirror is
>> starting to produce read errors and slowing things down dramatically. I
>> detached it and the system is running fine. I can't tell which drive it is
>> though! The error message and format command let me know which pair the bad
>> drive is in, but I don't know how to get any more info than that like the
>> serial number etc. to know which disk it is. Is there a command to do this?
>> All it shows is a 500G WD drive producing read errors and format shows
>> identical info for both of the 500G WD drives.
>
> Did you try:
>
>     iostat -En
>
> messages in /var/adm/messages perhaps? They should include the path
> to the disk in error.

Further on, when the system initialises, it may include serial numbers
as part of disk info in /var/adm/messages. Alternatively, you can try
rescanning your devices to the same effect with "devfsadm -Cv".

As another alternative, you can try to see if ZFS knows the info about
serial numbers, by inspecting component disks (slices) of your pool, i.e.:

# zdb -l /dev/dsk/c4t1d0s0 | egrep 'path|devid|name'
    name: 'rpool'
    hostname: 'testhost'
    path: '/dev/dsk/c4t1d0s0'
    devid: 'id1,sd@AST3320620AS=____________6QF16KGR/a'
    phys_path: '/pci@0,0/pci108e,534b@5/disk@1,0:a'
    path: '/dev/dsk/c4t0d0s0'
    devid: 'id1,sd@AST3320620AS=____________6QF181M4/a'
    phys_path: '/pci@0,0/pci108e,534b@5/disk@0,0:a'
(repeats 4x...)

In my systems, the "devid" line contains the model and serial number info.
This depends on the HDD capabilities and drivers, maybe.

You can find the disk device names in "zpool status" or "zpool import"
outputs, or deduce them by running format:

# echo "" | format
Searching for disks...done

AVAILABLE DISK SELECTIONS:
       0. c4t0d0 <DEFAULT cyl 38910 alt 2 hd 255 sec 63>
          /pci@0,0/pci108e,534b@5/disk@0,0
       1. c4t1d0 <ATA-ST3320620AS-K cyl 38910 alt 2 hd 255 sec 63>
          /pci@0,0/pci108e,534b@5/disk@1,0
Specify disk (enter its number): Specify disk (enter its number):
#

HTH,
//Jim
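The zdb approach can be sketched as a small filter that pairs each "path" with the serial number embedded in the following "devid" line. This is only a heuristic: it assumes the serial is whatever follows the last "=" or "_" in the devid, which matches the output shown above but may not hold for every driver. The function name zdb_devids is made up for illustration.

```shell
# Pair each vdev path with the serial embedded in its devid.
# Reads `zdb -l <device>` output from stdin; values are wrapped in single quotes.
zdb_devids() {
  awk -F"'" '
    $1 ~ /^[ \t]*path:/  { path = $2 }
    $1 ~ /^[ \t]*devid:/ {
      serial = $2
      sub("/[^/]*$", "", serial)     # drop the trailing slice suffix such as "/a"
      sub("^.*[=_]", "", serial)     # heuristic: keep text after the last "=" or "_"
      print path, serial
    }
  ' | sort -u                        # the labels repeat, so collapse duplicates
}
```

For example, "zdb -l /dev/dsk/c4t1d0s0 | zdb_devids" would list one "path serial" pair per mirror member.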
Edward Ned Harvey
2012-Jan-17 13:43 UTC
[zfs-discuss] Failing WD desktop drive in mirror, how to identify?
> From: zfs-discuss-bounces at opensolaris.org [mailto:zfs-discuss-
> bounces at opensolaris.org] On Behalf Of Anonymous Remailer (austria)
>
> I have a desktop system with 2 ZFS mirrors. One drive in one mirror is
> starting to produce read errors and slowing things down dramatically. I
> detached it and the system is running fine. I can't tell which drive it is
> though! The error message and format command let me know which pair the
> bad drive is in, but I don't know how to get any more info than that like
> the serial number etc. to know which disk it is. Is there a command to do
> this? All it shows is a 500G WD drive producing read errors and format
> shows identical info for both of the 500G WD drives.

A few things you could do...

This will read for 1 second, then pause for one second. Hopefully making a
nice consistent blinking light for you to find in your server.

    export baddisk=/dev/rdsk/cXtYdZ
    while true ; do dd if="$baddisk" of=/dev/null bs=1024k count=128 ; sleep 1 ; done

But if you don't have lights or something...
The safest thing for you to do is to zpool export, then shut down and remove
one disk. Power on, run devfsadm -Cv, and try to zpool import -a.
When the bad disk is gone, you'll be able to import no problem. If you
accidentally pull the wrong disk, it will not cause any harm. The pool will
refuse to import.
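A bounded variant of the blink loop above, as a sketch: it stops after a fixed number of read/pause cycles and gives up if the device cannot be read, so it does not need to be interrupted by hand. The function name blink_disk and the default of 30 cycles are hypothetical choices. The device path is passed in by the caller.

```shell
# Read from a disk in bursts so its activity LED blinks, then stop.
# $1: raw device path (for example the /dev/rdsk/cXtYdZ placeholder used above)
# $2: number of read/pause cycles (illustrative default: 30)
blink_disk() {
  disk=$1
  cycles=${2:-30}
  i=0
  while [ "$i" -lt "$cycles" ]; do
    # Read up to 128 MB and discard it; bail out if the device is unreadable.
    dd if="$disk" of=/dev/null bs=1024k count=128 2>/dev/null || return 1
    sleep 1
    i=$((i + 1))
  done
}
```

This only helps on hardware with a separate activity LED per drive.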
Anonymous Remailer (austria)
2012-Jan-17 16:35 UTC
[zfs-discuss] Failing WD desktop drive in mirror, how to identify?
Hello all,

Trying to reply to everyone so far in one post.

Casper.Dik at oracle.com said:

> Did you try:
>
>     iostat -En

I issued that command and I see (soft) errors from all 4 drives. There is a
serial no. field in the message headers but it has no contents.

> messages in /var/adm/messages perhaps? They should include the path
> to the disk in error.

Success, thank you! It has the serial numbers of the drives along with the
path. I had the path before but I couldn't relate it to the physical
connectors on the mobo. The serial number is what I needed.

Edward Ned Harvey said:

> A few things you could do...
>
> This will read for 1 second, then pause for one second. Hopefully making a
> nice consistent blinking light for you to find in your server.
> export baddisk=/dev/rdsk/cXtYdZ
> while true ; do dd if=$baddisk of=/dev/null bs=1024k count=128 ; sleep 1 ;
> done

Alas, this is a desktop box and there is one LED that lights for any disk
activity.

> But if you don't have lights or something...
> The safest thing for you to do is to zpool export, then shutdown, remove
> one disk. Power on, devfsadm -Cv, and try to zpool import -a When the bad
> disk is gone, you'll be able to import no problem. If you accidentally
> pull the wrong disk, it will not cause any harm. Pool will refuse to
> import.

Sounds like a good plan B. I will keep this in mind. Since I got the serial
number from /var/adm/messages I am good to go.

I couldn't copy and paste Jim's message since he posted with MIME instead
of regular ASCII text. Thank you Jim for the help.

This turned out to be a Solaris question, not a ZFS question. Sorry, and
thank you Casper and Jim and Edward Ned!
Richard Elling
2012-Jan-17 16:55 UTC
[zfs-discuss] Failing WD desktop drive in mirror, how to identify?
On Jan 17, 2012, at 4:11 AM, Anonymous Remailer (austria) wrote:
> I have a desktop system with 2 ZFS mirrors. One drive in one mirror is
> starting to produce read errors and slowing things down dramatically. I
> detached it and the system is running fine. I can't tell which drive it is
> though! The error message and format command let me know which pair the bad
> drive is in, but I don't know how to get any more info than that like the
> serial number etc. to know which disk it is. Is there a command to do this?
> All it shows is a 500G WD drive producing read errors and format shows
> identical info for both of the 500G WD drives.

If the errors bubble up to ZFS, then they will be shown in the output of
"zpool status".

Otherwise, you can use "iostat -En", as Casper noted, to show the error
counters per disk. For more detailed information, use "fmdump -eV".

 -- richard

--
ZFS Performance and Training
Richard.Elling at RichardElling.com
+1-760-896-4422
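As a sketch of the "zpool status" check, the filter below prints only the rows of the config table whose READ, WRITE or CKSUM counter is not zero. It assumes the usual "NAME STATE READ WRITE CKSUM" header followed by one row per vdev and a blank line closing the table. The function name zpool_errors is illustrative.

```shell
# Show only pools/vdevs with non-zero error counters.
# Reads `zpool status` output from stdin.
zpool_errors() {
  awk '
    $1 == "NAME" && $3 == "READ" { in_config = 1; next }   # start of the config table
    NF == 0                      { in_config = 0 }         # a blank line ends the table
    in_config && NF >= 5 && ($3 != "0" || $4 != "0" || $5 != "0") {
      print $1, "read=" $3, "write=" $4, "cksum=" $5
    }
  '
}
```

Usage would be "zpool status | zpool_errors". A row only names the device (for example c4t1d0), so mapping it to a physical drive still needs the serial number from one of the other sources mentioned in the thread.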
Anonymous Remailer (austria)
2012-Jan-17 20:33 UTC
[zfs-discuss] Failing WD desktop drive in mirror, how to identify?
Richard Elling said:

> If the errors bubble up to ZFS, then they will be shown in the output of
> "zpool status"

On the console I was seeing retryable read errors that eventually failed.
The block number and drive path were included but not any info I could
relate to the actual disk. zpool status showed a nonzero count of READ
errors but nothing more.

> Otherwise, you can use "iostat -En," as Casper noted, to show the error
> counters per disk. For more detailed information, use "fmdump -eV"

What I needed was to identify the drive, and Casper and Jim's suggestion to
look at /var/adm/messages was where I found the info. It is just a drive
going bad AFAIK and ZFS is working fine, including letting me detach the bad
drive from a mirror and not screwing up my data.

Thanks again to everyone.