Hello, I have a question about ZFS-8000-8A and block volumes. I have 2
mirror sets in one zpool. Build 134 amd64 (upgraded since it was released
from 2009.06). Pool version is still 13.

  pool: data
 state: DEGRADED
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: scrub completed after 4h16m with 2 errors on Fri Sep 17 13:19:04 2010
config:

        NAME        STATE     READ WRITE CKSUM
        data        DEGRADED     0     0    28
          mirror-0  DEGRADED     0     0    56
            c0t0d0  DEGRADED     0     0    56  too many errors
            c9t0d0  DEGRADED     0     0    56  too many errors
          mirror-1  ONLINE       0     0     0
            c0t1d0  ONLINE       0     0     0
            c9t1d0  ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        data/zlun02:<0x1>
        data/zlun03:<0x1>

I SAN boot my systems and this block volume is a Windows install; Windows
boots and runs fine. It does, however, report a bad block on the disk in
Event Viewer. I have been running this setup since build 99 and boot
CentOS, win2k8 and Vista/7 from it. ZFS is now unable to get data from the
mirror but was able to write it before, so I assume this is either a
controller/system fault, a disk fault or a ZFS fault. Could a client-side
FC driver/HBA fault cause this? I did a full scrub of the pool twice. The
first time only zlun02 showed up; then I accessed zlun03 via the Windows 7
install running on zlun02 and now it shows as a problem too.

The system also runs 2 virtual CentOS machines from this pool "data". I
also have a Samba share configured and data is sometimes accessed heavily
from it, with no problems so far for anything else. All other pools on the
system are also fine. The system has been running for 64 days (uptime)
with no ungraceful shutdown on either the server or the system accessing
the block volume, and then the problems started. I did a full shutdown and
power on, did a zpool clear and did a scrub. Still it remains.

My real question here is: how can I make a backup/move of the block volume
zlun02 via ZFS, or is this impossible? Due to licensing on some software it
is a real nightmare to reinstall (once I find out what the problem is). I
tried making a snapshot of the fs and tried to use zfs send/recv, but this
fails, as can be expected. Any ideas would be welcome. Also, if anyone
knows of a tool I can use to test the disk(s) without causing damage to
ZFS, please post; offline or online does not matter, any tool that has
been tested with ZFS on an affected disk.

Needless to say these disks are under heavy load and it could be that I am
really unlucky for them to fail at the same time. I even split each mirror
set between 2 controllers. I have 4 spare (cold) disks, but I do not know
what the result would be of replacing each disk. I assume the
rebuild/resilver will fail since not all data is available anymore.

Thanks,
--
This message posted from opensolaris.org
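A minimal sketch of the steps described above, for reference; the snapshot
name and destination dataset are illustrative, not from the original post:

    # Clear the error counters and re-scrub the pool
    zpool clear data
    zpool scrub data
    zpool status -v data

    # Attempted backup of the block volume via snapshot and send/recv
    # (snapshot and destination names are illustrative)
    zfs snapshot data/zlun02@backup
    zfs send data/zlun02@backup | zfs recv otherpool/zlun02-copy
    # On this pool the send fails because some blocks of data/zlun02 are
    # unreadable, which is why dd with conv=noerror,sync is suggested in
    # the reply below.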
On Sat, 18 Sep 2010, Heinrich wrote:

> I SAN boot my systems and this block volume is a Windows install;
> Windows boots and runs fine. It does, however, report a bad block on
> the disk in Event Viewer. I have been running this setup since build
> 99 and boot CentOS, win2k8 and Vista/7 from it. ZFS is now unable to
> get data from the mirror but was able to write it before, so I assume
> this is either a controller/system fault, a disk fault or a ZFS fault.
> Could a client-side FC driver/HBA fault cause this? I did a full scrub
> of the pool twice. The first time only zlun02 showed up; then I
> accessed zlun03 via the Windows 7 install running on zlun02 and now it
> shows as a problem too.

It is very unusual to obtain the same number of errors (probably the same
errors) from two devices in a pair. This suggests a common cause such as a
memory error (does your system have ECC?), a controller glitch, or a
shared power supply issue.

> My real question here is: how can I make a backup/move of the block
> volume zlun02 via ZFS, or is this impossible? Due to licensing on some
> software it is a real nightmare to reinstall (once I find out what the
> problem is). I tried making a snapshot of the fs and tried to use zfs
> send/recv, but this fails, as can be expected. Any ideas would be
> welcome. Also, if anyone knows of a tool I can use to test

It seems like you could use 'dd' with the 'noerror' option and "sync
conversion" to do a low-level copy of the data:

     noerror
         Does not stop processing on an input error. When an input
         error occurs, a diagnostic message is written on standard
         error, followed by the current input and output block counts
         in the same format as used at completion. If the sync
         conversion is specified, the missing input is replaced with
         null bytes and processed normally. Otherwise, the input block
         will be omitted from the output.

The copy could be to a new zvol in the same pool (assuming you trust the
disks) or you could pipe it over ssh to another 'dd'.

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
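A minimal sketch of the dd approach Bob describes, assuming the zvols are
exposed under the usual /dev/zvol/rdsk/<pool>/<volume> raw device paths;
the destination pool/volume names and the volume size are illustrative:

    # Create a destination zvol (size is illustrative; match the
    # original volsize of data/zlun02)
    zfs create -V 40g data2/zlun02-copy

    # Low-level copy; conv=noerror,sync keeps going past read errors and
    # pads unreadable blocks with nulls so the offsets stay aligned
    dd if=/dev/zvol/rdsk/data/zlun02 of=/dev/zvol/rdsk/data2/zlun02-copy \
        bs=1024k conv=noerror,sync

    # Or pipe it over ssh to a dd on another host
    dd if=/dev/zvol/rdsk/data/zlun02 bs=1024k conv=noerror,sync | \
        ssh otherhost "dd of=/dev/zvol/rdsk/backup/zlun02-copy bs=1024k"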
> From: zfs-discuss-bounces at opensolaris.org [mailto:zfs-discuss-
> bounces at opensolaris.org] On Behalf Of Bob Friesenhahn
>
> It is very unusual to obtain the same number of errors (probably the
> same errors) from two devices in a pair. This suggests a common cause
> such as a memory error (does your system have ECC?), a controller
> glitch, or a shared power supply issue.

Bob's right. I didn't notice that both sides of the mirror have precisely
56 checksum errors. Ignore what I said about adding a 3rd disk to the
mirror; it won't help. The 3rd mirror would only have been useful if the
corrupt blocks on these 2 disks weren't the same blocks.

I think you have to acknowledge the fact that you have corrupt data. And
you should run some memory diagnostics on your system to see if you can
detect failing memory. The cause is not necessarily memory, as Bob pointed
out, but a typical way to produce the result you're seeing is this: ZFS
calculates a checksum of a block it is about to write to disk, and that
checksum is held in RAM. If it is held in corrupt RAM, then when it is
written to disk the checksum will mismatch the data, and the faulty
checksum gets written to both sides of the mirror. The mismatch is only
discovered later, during your scrub, and there is no un-corrupt copy of
the data that ZFS thought it wrote.

At least it is detected by ZFS. Without checksumming, that error would
pass undetected.
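Assuming Solaris FMA is available on this build, the fault manager logs
may be a quicker first check than a full memory test; a rough sketch
(option syntax can vary by release, and the grep filter is only a
heuristic):

    # List any resources the fault manager currently considers faulted
    fmadm faulty

    # Dump the error telemetry log and look for memory/controller events
    # around the time of the checksum errors
    fmdump -e
    fmdump -eV | grep -i -e mem -e ecc -e pci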
I have registered ECC memory in the system. I will run some memory
diagnostics as well, but mentioning the power supply got me thinking:
around the time of the errors we had a storm and the lights dimmed in my
house quite a few times. It was not enough of a drop to shut the system
down, but perhaps it had something to do with it. Hopefully it is as
simple as that. A UPS is now on my list.

I took Bob's advice, added more disks and created another pool, since I do
not trust the old pool. I used dd with noerror and sync to a new block
volume and that did the trick; thanks Bob, and thanks Edward for the
explanation. I was a bit unsure about using dd on the zvol directly, so I
added another LUN (on the new pool) to the system's view and used
Clonezilla; I booted it to the command prompt and used dd from there to
duplicate the device. Any thoughts on directly accessing the zvol via dd?
I assume it is the same as any other device and should not be a problem.

Another thing I noticed is the high % of wait I/O on the disks of the
problematic pool. I am not sure if it was ever this high before. My new
pool is on a different controller and it is a different RAID type, so I
cannot compare. This time I selected raidz2.

Thanks for the replies, really appreciate it.
--
This message posted from opensolaris.org
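For comparing the wait I/O between the old and new pools, iostat and zpool
iostat give per-disk and per-vdev numbers; a sketch, with the interval and
count values arbitrary and the new pool's name ("data2") illustrative:

    # Per-device view: %w is the percent of time transactions are
    # waiting in the queue, %b the percent of time the disk is busy
    iostat -xn 5 10

    # Per-vdev view of the suspect pool and the new pool
    zpool iostat -v data 5 10
    zpool iostat -v data2 5 10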