On Tue, Jan 18, 2005 at 12:21:59PM +0100, David Elsing wrote:
> Quote from the manual of the 4th example of the chapter "HOW TO SET UP
> VINUM":
> "In addition, the volume specification includes the keyword
> setupstate, which ensures that all plexes are up after creation."
>
> But a couple of weeks later I read the following in the manual:
>
> "Note that you must use the init command with RAID-5 plexes: otherwise
> extreme data corruption will result if one subdisk fails."
Yes, this particular gotcha bit me a while back and I lost quite a bit
of data to it (my fault for not having good backups). I still consider
it a documentation bug, though: that warning is buried in a command
reference section rather than being in bold in the "HOW TO" guide.
> I read this to my horror after I filled the volume with data. You've
> probably noticed I didn't init my volume. The disks are in good
> condition. The volume is almost filled to the maximum capacity. So a
> backup is a bit difficult due to the size of it. Are there any other
> options? If one disk fails, do I still get corrupted data?
Yes, if one disk fails for any reason, every third sector will be
garbage and it's unlikely you'll be able to recover anything useful from
it.
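
To make the failure mode concrete, here is a minimal Python sketch of
RAID-5 reconstruction. It is only an illustration, not how vinum is
implemented: the three-disk layout, the 16-byte toy sector, and the
names xor_blocks and reconstruct are all assumptions of mine. A lost
block is rebuilt as the XOR of the surviving blocks in its stripe, so if
the parity block was never initialised, the rebuilt block is garbage.

```python
from functools import reduce

SECTOR = 16  # toy sector size in bytes (hypothetical, for illustration)


def xor_blocks(*blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def reconstruct(stripe, failed):
    """Rebuild the failed disk's block from the surviving blocks."""
    survivors = [blk for i, blk in enumerate(stripe) if i != failed]
    return xor_blocks(*survivors)


d0 = b"important data 0"
d1 = b"important data 1"
assert len(d0) == SECTOR and len(d1) == SECTOR

# Initialised stripe: the parity block is the XOR of the data blocks.
good_stripe = [d0, d1, xor_blocks(d0, d1)]

# Uninitialised stripe: the parity block holds whatever was on the disk.
stale_parity = bytes([0xAA]) * SECTOR
bad_stripe = [d0, d1, stale_parity]

# Simulate losing disk 0 and rebuilding its block in each case.
recovered_good = reconstruct(good_stripe, failed=0)
recovered_bad = reconstruct(bad_stripe, failed=0)

print(recovered_good == d0)  # True: valid parity restores the data
print(recovered_bad == d0)   # False: stale parity yields garbage
```

With three disks, a third of the data blocks live on any one disk, which
matches the "every third sector" figure above.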
I would highly recommend backing up whatever is critically important to
you asap. If you like living dangerously and all the drives are in good
health, the parity data can be (theoretically) repaired with the "vinum
rebuildparity" command, but do so at your own risk... That did allow me
to recover a couple of my partitions that hadn't been trashed yet.
Also, if a good disk gets marked as "down" somehow before you can
correct this, whatever you do, do NOT issue a "vinum start" command on
it. In the current state of the array, that would be destructive and
irreversible. That's what happened to me: an ATA timeout caused one of
the drives to detach temporarily, and corrupt filesystems then caused a
panic. If a disk does get marked down, you're better off using setstate
to force it to up, as wrong as that would normally be.
I've also noticed that sometimes the parity gets out of sync on its own,
but I don't know the cause. I used to have a cron job that ran vinum
checkparity once a month in the background and occasionally it would
find sectors that needed to be rebuilt. Unfortunately this doesn't seem
to be implemented with gvinum (5.3) yet.
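
For readers unfamiliar with what a parity check does, the idea can be
sketched in Python as follows. This is a simplified model and not
vinum's code: it assumes each stripe is a list of equal-sized blocks
with the parity block last (real RAID-5 rotates parity across disks),
and check_parity and rebuild_parity are hypothetical names. A stripe is
consistent when all of its blocks XOR to zero; rebuilding simply
recomputes parity from the data blocks, which is why it is only safe
while every disk is healthy.

```python
from functools import reduce


def xor_blocks(*blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def check_parity(stripes):
    """Return the indices of stripes whose blocks do not XOR to zero."""
    return [i for i, stripe in enumerate(stripes) if any(xor_blocks(*stripe))]


def rebuild_parity(stripes, indices):
    """Recompute the (last) parity block of the given stripes in place.

    This trusts the data blocks completely, so it must not be used
    while any data block is missing or suspect.
    """
    for i in indices:
        stripes[i][-1] = xor_blocks(*stripes[i][:-1])


# Toy array: three stripes, each [data, data, parity].
stripes = [
    [b"\x01\x02", b"\x03\x04", b"\x02\x06"],  # consistent
    [b"\x10\x20", b"\x01\x02", b"\xff\xff"],  # parity out of sync
    [b"\x0f\x0f", b"\xf0\xf0", b"\xff\xff"],  # consistent
]

bad = check_parity(stripes)
print(bad)                    # [1]
rebuild_parity(stripes, bad)
print(check_parity(stripes))  # []
```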
> ...know if any manual maintainer is reading this, but is it possible
> to add this warning to the RAID5 example at the end of vinum(8)?
I'll second this idea, at least for 4.x. I'm not sure how gvinum
handles things as I haven't built any RAID5 volumes from scratch using
it.
Craig