Brian Lionberger wrote:
> Is there a preferred method to test a raidz2?
> I would like to see the disks recover on their own after simulating
> a disk failure.
> I have a 4 disk configuration.

It really depends on what failure mode you're interested in. The
most common failure we see from disks in the field is an uncorrectable
read. Pulling a disk will not simulate an uncorrectable read.

For such tests, there are really two different parts of the system
you are exercising: the fault detection and the recovery/reconfiguration.
When we do RAS benchmarking, we often find that the recovery/reconfiguration
code path is the interesting part and the fault detection less so.
In other words, there will be little difference in the recovery/
reconfiguration between initiating a zpool replace from the command line
vs fault injection. Unless you are really interested in the maze of
fault detection code, you might want to stick with the command line
interfaces to stimulate a reconfiguration.
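
For the command-line route, a minimal sketch might look like the following. The pool name tank and the device names c1t2d0 and c1t3d0 are placeholders, not taken from the original question, and the run helper is an illustrative wrapper that only prints each command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry-run helper (hypothetical): print the command unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Trigger a reconfiguration by replacing one device with another disk,
# checking pool status before and after. All arguments are placeholders.
exercise_replace() {
    pool=$1 old=$2 new=$3
    run zpool status "$pool"
    run zpool replace "$pool" "$old" "$new"
    run zpool status -v "$pool"
}

# Example: exercise_replace tank c1t2d0 c1t3d0
```

With the default dry-run setting this only lists the commands, which makes it easy to review the device names before anything touches the pool.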

If you really do want to stimulate the fault detection code, then a
simple online test, which requires no hands-on changes, is to change
the partition table to zero out the size of the partition or slice.
This will have the effect of causing an I/O to receive an ENXIO error
which should then kick off the recovery.

prtvtoc will show you a partition map which can be sent to fmthard -s
to populate the VTOC. Be careful here: this is a place where mistakes
can be painful to overcome.
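
As a rough sketch of the partition-table approach, the helper below rewrites prtvtoc-style text so that one slice has a sector count of zero. The function name zero_slice, the file names, and the device name are all hypothetical, and the column layout (partition, tag, flags, first sector, sector count) is assumed from typical prtvtoc output, so check it against your own system and try it on a scratch disk first.

```shell
#!/bin/sh
# Read prtvtoc-style text on stdin and set the sector count of one slice
# to zero. Comment lines (starting with '*') pass through unchanged.
# Assumed columns: partition, tag, flags, first sector, sector count.
zero_slice() {
    awk -v slice="$1" '
        /^\*/ { print; next }
        NF >= 5 && $1 == slice { print $1, $2, $3, $4, 0; next }
        { print }
    '
}

# Hypothetical usage (device name is a placeholder):
#   prtvtoc /dev/rdsk/c1t2d0s2 > /tmp/vtoc.orig        # keep a backup
#   zero_slice 0 < /tmp/vtoc.orig > /tmp/vtoc.zeroed
#   fmthard -s /tmp/vtoc.zeroed /dev/rdsk/c1t2d0s2     # inject the fault
#   fmthard -s /tmp/vtoc.orig /dev/rdsk/c1t2d0s2       # restore afterwards
```

Keeping the unmodified prtvtoc output in a file is the important part: it is what gets fed back to fmthard -s to undo the change.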

DTrace can be used to perform all sorts of nasty fault injection,
but that may be more than you want to bite off at first.

b77 adds a zpool failmode property which will allow you to set the
mode to something other than panic. The options are wait (default),
continue, and panic. See zpool(1m) for more info. You will want to
know the failmode if you are experimenting with fault injection.
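
To make that concrete, here is a small sketch that builds the zpool set command only when the requested mode is one of the three values listed above. The function name failmode_cmd and the pool name tank are made up for illustration. The current value can be checked with zpool get failmode on the pool.

```shell
#!/bin/sh
# Print the command that would set failmode, after checking that the
# requested mode is one of wait, continue, or panic.
failmode_cmd() {
    pool=$1 mode=$2
    case "$mode" in
        wait|continue|panic)
            echo "zpool set failmode=$mode $pool" ;;
        *)
            echo "unknown failmode: $mode" >&2
            return 1 ;;
    esac
}

# Example (pool name is a placeholder):
#   zpool get failmode tank
#   failmode_cmd tank continue   # prints: zpool set failmode=continue tank
```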

Finally, you will want to be aware of the FMA commands for viewing
reports and diagnosis status. See fmadm(1m), fmdump(1m), and fmstat(1m).
If you want to experiment with fault injection, you'll want to pay
particular attention to the SERD engines and reset them between runs.
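
A between-runs routine could be sketched along these lines. The module name zfs-diagnosis is an assumption about which diagnosis engine is involved on your system (fmstat with no arguments lists the loaded modules), and the run helper is an illustrative wrapper that only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry-run helper (hypothetical): print the command unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Inspect FMA state after a test run, then reset the diagnosis module so
# the next run starts clean. The default module name is an assumption.
fma_between_runs() {
    module=${1:-zfs-diagnosis}
    run fmdump -eV               # error reports (ereports), verbose
    run fmdump                   # fault log
    run fmadm faulty             # resources currently considered faulty
    run fmstat -s -m "$module"   # SERD engines for the module
    run fmadm reset "$module"    # reset the module and its persistent state
}

# Example: fma_between_runs zfs-diagnosis
```

Resetting the whole module is the blunt option. fmadm reset also accepts -s with a SERD engine name if you only want to clear a single engine.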
-- richard