On Fri, 2020-11-06 at 12:08 +0100, Thomas Bendler wrote:
> On Fri, 6 Nov 2020 at 00:52, hw <hw at gc-24.de> wrote:
> > [...]
> > logicaldrive 1 (14.55 TB, RAID 1+0, Ready for Rebuild)
> > [...]
>
> Have you checked the rebuild priority?
>
>   ssacli ctrl slot=0 show config detail | grep "Rebuild Priority"
>   Rebuild Priority: Medium
>
> The slot needs to be adjusted to your configuration.

Yes, I've set it to high:

  ssacli ctrl slot=3 show config detail | grep Prior
     Rebuild Priority: High
     Expand Priority: Medium

Some search results indicate that other disks in the array may have read
errors which can prevent a RAID 5 from rebuilding. I don't know whether
there are read errors here, and since this is a RAID 1+0, I would expect
that only errors on the disk mirroring the failed one could block the
rebuild. But if the RAID is striped across all the disks, it could be
any or all of them.

The array is still in production and still works, so it should just
rebuild.

The plan now is to use another 8TB disk once it arrives, make a new
RAID 1 with the two new disks and copy the data over. The remaining 4TB
disks can then be used to make a new array.

Learn from this that it can be a bad idea to use a RAID 0 for backups
and that at least one generation of backups must be on redundant
storage ...
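
P.S.: To check whether the controller has logged read/media errors for
any member drive, something like the following may work (slot=3 as
above; the exact field names in the output vary by controller firmware,
so the grep pattern is only a guess):

  ssacli ctrl slot=3 pd all show detail | grep -iE 'status|error'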
On Fri, 6 Nov 2020 at 20:38, hw <hw at gc-24.de> wrote:
> [...]
> Some search results indicate that other disks in the array may have
> read errors which can prevent a RAID 5 from rebuilding. I don't know
> whether there are read errors here, and since this is a RAID 1+0, I
> would expect that only errors on the disk mirroring the failed one
> could block the rebuild. But if the RAID is striped across all the
> disks, it could be any or all of them.
>
> The array is still in production and still works, so it should just
> rebuild.
>
> The plan now is to use another 8TB disk once it arrives, make a new
> RAID 1 with the two new disks and copy the data over. The remaining
> 4TB disks can then be used to make a new array.
>
> Learn from this that it can be a bad idea to use a RAID 0 for backups
> and that at least one generation of backups must be on redundant
> storage ...

I just checked on one of my HP boxes; you indeed cannot find out whether
one of the discs has read errors. Do you have the option to reboot the
box and check on the controller directly?

Kind regards
Thomas
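
P.S.: If a reboot is hard to schedule, smartmontools can sometimes read
the SMART error logs of drives behind a Smart Array controller from the
running OS. A sketch, assuming the hpsa/cciss driver is in use (the
device node /dev/sg0 and the drive index 0 are guesses and need to be
adjusted to your setup):

  smartctl -a -d cciss,0 /dev/sg0 | grep -i -A3 'error'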
On Mon, 2020-11-09 at 16:30 +0100, Thomas Bendler wrote:
> On Fri, 6 Nov 2020 at 20:38, hw <hw at gc-24.de> wrote:
> > [...]
> > Some search results indicate that other disks in the array may have
> > read errors which can prevent a RAID 5 from rebuilding. [...]
>
> I just checked on one of my HP boxes; you indeed cannot find out
> whether one of the discs has read errors. Do you have the option to
> reboot the box and check on the controller directly?

Thanks! The controller (its BIOS) doesn't show up during boot, so I
can't check there for errors.

The controller is extremely finicky: the plan to make a RAID 1 from the
two new drives has failed because the array with the failed drive is
unusable when the failed drive is missing entirely. In the process of
moving the 8TB drives back and forth, it turned out that when an array
made from them is missing one drive, that array is unusable, and when
the missing drive is put back in, the array remains 'Ready for Rebuild'
without the rebuild ever starting. There is also no way to delete an
array that is missing a drive.

So the theory that the array isn't being rebuilt because other disks
have errors is likely wrong. That means that whenever a disk fails and
is replaced, there is no way to rebuild the array (unless it happened
automatically, which it doesn't).

With this experience, these controllers are now deprecated. RAID
controllers that can't rebuild an array after a disk has failed and
been replaced are virtually useless.
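
P.S.: For anyone hitting the same wall, these are the commands I would
try before writing the array off. 'modify reenable forced' is
documented for some Smart Array firmware revisions, but I can't say
whether every controller accepts it (slot=3 and logical drive 1 as in
my configuration above):

  ssacli ctrl slot=3 ld all show status
  ssacli ctrl slot=3 ld 1 modify reenable forced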