We're just using Linux software RAID for the first time - RAID1, and the
other day, a drive failed. We have a clone machine to play with, so it's
not that critical, but....

I partitioned a replacement drive. On the clone, I marked the RAID
partitions on /dev/sda failed, removed them, and pulled the drive. After
several iterations, I waited a minute or two, until all messages had
stopped and there was only /dev/sdb*, and then put the new one in... and
it appears as /dev/sdc. I don't want to reboot the box, and after googling
a bit, it looks as though I *might* be able to use udevadm to change
that... but the manpage leaves something to be desired... like a man page.
The one that appears interesting to me is udevadm --test --action=<string>,
which shows the actual command that would run, but there is *ZERO*
information as to what actions are available, other than the default of
"add".

Clues for the poor, folks?

mark
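(For reference, the fail/remove step and the udevadm side of this usually
look roughly like the sketch below. /dev/md0, /dev/sda1 and /dev/sdc are
illustrative names, not confirmed in the post; the commonly documented
udev actions are "add", "remove" and "change":)

    # mark the member failed, then take it out of the array
    mdadm /dev/md0 --fail /dev/sda1
    mdadm /dev/md0 --remove /dev/sda1

    # dry-run the udev rules for one device and show what would be executed;
    # the valid actions are the uevent types, e.g. add, remove, change
    udevadm test --action=remove /sys/block/sdc

    # or re-run the rules for all block devices with a given event type
    udevadm trigger --subsystem-match=block --action=change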
Am 06.12.2011 19:28, schrieb m.roth at 5-cent.us:
> We're just using Linux software RAID for the first time - RAID1, and the
> other day, a drive failed. We have a clone machine to play with, so it's
> not that critical, but....
>
> I partitioned a replacement drive. On the clone, I marked the RAID
> partitions on /dev/sda failed, removed them, and pulled the drive.

dd if=/dev/sdx of=/dev/sdy bs=512 count=1
reboot

to clone the whole MBR and partition-table

> several iterations, I waited a minute or two, until all messages had
> stopped, and there was only /dev/sdb*, and then put the new one in... and
> it appears as /dev/sdc.

the device name is totally uninteresting; the IDs are what matter

mdadm /dev/mdx --add /dev/sdex
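(A sketch of that approach - copy the partition table from the surviving
disk to the new one, then re-add the member. The sfdisk alternative and all
device names below are assumptions for illustration, not from the thread:)

    # copy just the MBR, including the partition table, good disk -> new disk
    dd if=/dev/sdb of=/dev/sdc bs=512 count=1

    # or replicate only the partition table, leaving the boot code alone
    sfdisk -d /dev/sdb | sfdisk /dev/sdc

    # have the kernel re-read the new disk's partition table
    partprobe /dev/sdc

    # put the new partition back into the degraded mirror
    mdadm /dev/md0 --add /dev/sdc1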
Reindl Harald wrote:
>
> Am 06.12.2011 19:28, schrieb m.roth at 5-cent.us:
>> We're just using Linux software RAID for the first time - RAID1, and the
>> other day, a drive failed. We have a clone machine to play with, so it's
>> not that critical, but....
>>
>> I partitioned a replacement drive. On the clone, I marked the RAID
>> partitions on /dev/sda failed, removed them, and pulled the drive.
<snip>
>> several iterations, I waited a minute or two, until all messages had
>> stopped, and there was only /dev/sdb*, and then put the new one in...
>> and it appears as /dev/sdc.
>
> the device name is totally uninteresting; the IDs are what matter
> mdadm /dev/mdx --add /dev/sdex

No, it's not uninteresting. I can't be sure that when it reboots, it won't
come back as /dev/sda. And the few places I find that have howtos on
replacing failed RAID drives don't seem to have run into this issue with
udev (I assume) and /dev/sda.

mark
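(If the concern is telling the physical disks apart no matter which sdX
name they land on, the persistent udev symlinks are one way to check;
purely illustrative:)

    # stable names derived from the drive model and serial number
    ls -l /dev/disk/by-id/
    # stable names derived from the controller port the drive is plugged into
    ls -l /dev/disk/by-path/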
On Tuesday, December 06, 2011 02:21:09 PM m.roth at 5-cent.us wrote:
> Reindl Harald wrote:
> > the device name is totally uninteresting; the IDs are what matter
> > mdadm /dev/mdx --add /dev/sdex
>
> No, it's not uninteresting. I can't be sure that when it reboots, it won't
> come back as /dev/sda.

The RAID sets will be assembled by UUID, not by device name. It doesn't
matter whether it comes up as /dev/sda or /dev/sdah or whatever, except for
booting purposes, which you'll need to handle manually by making sure all
the bootloader sectors (some of which can be outside any partition) are
properly copied over; just getting the MBR is not enough in some cases.

I have, on upstream EL 6.1, a box with two 750G drives in RAID1 (they are
not the boot drives). The device names for the component devices do not
come up the same on every boot; some boots they come up as /dev/sdx and
/dev/sdy, some boots as /dev/sdw and /dev/sdab, and so on. The /dev/md
devices come up fine and get mounted fine even though the component device
names are somewhat nondeterministic.
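(To see the UUIDs the assembly is keyed on, and to take care of the boot
loader on the replacement disk, something along these lines is typical on
EL6 with grub legacy; the device names are assumptions:)

    # UUID recorded for the array and in each member's md superblock
    mdadm --detail /dev/md0 | grep UUID
    mdadm --examine /dev/sdb1 | grep UUID

    # the ARRAY lines /etc/mdadm.conf would key on
    mdadm --detail --scan

    # reinstall the boot loader on the new disk so either drive can boot
    grub-install /dev/sdc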
On Tue, Dec 6, 2011 at 12:28 PM, <m.roth at 5-cent.us> wrote:
> We're just using Linux software RAID for the first time - RAID1, and the
> other day, a drive failed. We have a clone machine to play with, so it's
> not that critical, but....
>
> I partitioned a replacement drive. On the clone, I marked the RAID
> partitions on /dev/sda failed, removed them, and pulled the drive. After
> several iterations, I waited a minute or two, until all messages had
> stopped and there was only /dev/sdb*, and then put the new one in... and
> it appears as /dev/sdc. I don't want to reboot the box, and after googling
> a bit, it looks as though I *might* be able to use udevadm to change
> that... but the manpage leaves something to be desired... like a man page.
> The one that appears interesting to me is udevadm --test --action=<string>,
> which shows the actual command that would run, but there is *ZERO*
> information as to what actions are available, other than the default of
> "add".
>
> Clues for the poor, folks?

If your drive controllers support hot-swap, a freshly swapped drive should
appear at the lowest available sd? letter, and a removed one should
disappear fairly quickly, leaving its identifier available for re-use. But
the disk name does not matter at all.

Put the disk in, do a 'dmesg' to see the name the kernel picks for it, add
the matching partition and mark it as type 'FD' for future autoassembly.
Do a 'cat /proc/mdstat' to see the current raid status. You probably need
to 'mdadm --remove /dev/md? /dev/sd?' to remove the failed partition from
the running array. Then use 'mdadm --add /dev/md? /dev/sd?' with the raid
device and new partition names. This will start the mirror sync and should
be all you need to do.

Then, assuming you are using kernel autodetect to assemble at boot time, it
won't matter if the disk is recognized as the same name at bootup, it will
still be paired correctly.

--
  Les Mikesell
  lesmikesell at gmail.com
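(Spelled out as a command sequence - the array and partition names here are
placeholders; the real ones come out of dmesg and /proc/mdstat:)

    dmesg | tail                        # name the kernel gave the new disk, e.g. /dev/sdc
    fdisk /dev/sdc                      # create the partition, set type 'fd' (Linux raid autodetect)
    cat /proc/mdstat                    # which array is degraded, which member failed
    mdadm /dev/md0 --remove /dev/sda1   # drop the failed member if it is still listed
    mdadm /dev/md0 --add /dev/sdc1      # add the new partition; the resync starts right away
    watch cat /proc/mdstat              # follow the rebuild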