I'm wondering how to interpret what ZFS is telling me in regard to the errors being reported. One of my disks (in a 5-disk raidz array) reports about 4-5 write/read errors every few days. All 5 are directly connected to the motherboard SATA ports, with no RAID controller card in between.

How bad is it? Should I think about replacing the drive? (I imagine it will be difficult to get it RMAed when most OSes won't even realise it's screwing up.)

Or are these small enough not to bother with, and I should just keep running 'zpool clear' and ignoring it until something major happens? (As you might be able to tell, I'm new to OpenSolaris/ZFS.)

An example of my output:

  pool: storage
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: resilver completed after 0h0m with 0 errors on Tue Oct 13 18:34:39 2009
config:

        NAME        STATE     READ WRITE CKSUM
        storage     ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            c7t0d0  ONLINE       0     0     0
            c8d1    ONLINE       0     4     0  2.60M resilvered
            c9d0    ONLINE       0     0     0
            c9d1    ONLINE       0     0     0
            c10d0   ONLINE       0     0     0

--
This message posted from opensolaris.org
On Tue, 13 Oct 2009, Ren Pillay wrote:

> I'm wondering how to interpret what ZFS is telling me in regard to
> the errors being reported. One of my disks (in a 5-disk raidz array)
> reports about 4-5 write/read errors every few days. All 5 are
> directly connected to the motherboard SATA ports, no RAID controller
> card in between.
>
> How bad is it? Should I think about replacing the drive? (I imagine
> it will be difficult to get it RMAed when most OSes won't even
> realise it's screwing up)

Recurring problems usually indicate failing hardware, and since you are only using raidz1 (which can survive the loss of just one disk) you should be concerned (but not alarmed) about it. It is wise to obtain a replacement drive.

You didn't mention whether you periodically scrub your pool, but if you haven't been, you may find that many more issues are turned up by 'zpool scrub'. The failing drive may be riddled with errors.

It is wise to do a full 'zpool scrub' before voluntarily replacing the suspect drive, in case there is some undetected data error on one of the other drives which can still be corrected.

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
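
For anyone who wants to keep an eye on a suspect disk between scrubs, the per-device READ/WRITE/CKSUM counters in the config table of 'zpool status' are the thing to watch. Below is a minimal Python sketch that picks out devices with non-zero counters. It is an illustration only, not an official tool: the function name devices_with_errors and the SAMPLE text (copied from the status output in this thread) are my own, and it assumes the counters are plain integers, which may not hold once the counts get large enough for zpool to abbreviate them.

```python
import re

# Config table copied from the 'zpool status' output in this thread.
SAMPLE = """\
        NAME        STATE     READ WRITE CKSUM
        storage     ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            c7t0d0  ONLINE       0     0     0
            c8d1    ONLINE       0     4     0  2.60M resilvered
            c9d0    ONLINE       0     0     0
            c9d1    ONLINE       0     0     0
            c10d0   ONLINE       0     0     0
"""

# A device row: name, state, then three integer counters.
# Assumption: counters are plain integers; the header row does not match.
ROW = re.compile(r"^\s*(\S+)\s+([A-Z]+)\s+(\d+)\s+(\d+)\s+(\d+)")


def devices_with_errors(config_text):
    """Return (name, read, write, cksum) for every row with a non-zero counter."""
    flagged = []
    for line in config_text.splitlines():
        match = ROW.match(line)
        if match is None:
            continue  # header, blank line, or anything that is not a device row
        name = match.group(1)
        read, write, cksum = (int(match.group(i)) for i in (3, 4, 5))
        if read or write or cksum:
            flagged.append((name, read, write, cksum))
    return flagged


if __name__ == "__main__":
    for name, read, write, cksum in devices_with_errors(SAMPLE):
        print(f"{name}: read={read} write={write} cksum={cksum}")
```

Run against the sample above, this flags only c8d1, with 4 write errors. In practice the text would come from running 'zpool status' on the pool; logging the result every few days would show whether the counters on that disk keep climbing after each 'zpool clear'.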