thr3ads.net - CentOS - [CentOS] OT, hardware: HP smart array drive issue [Jul 2015]

m.roth at 5-cent.us

2015-Jul-10 17:49 UTC

[CentOS] OT, hardware: HP smart array drive issue

Jason Warr wrote:
> On July 10, 2015 11:47:09 AM CDT, m.roth at 5-cent.us wrote:
>> Hi. Anyone working with these things? I've got a drive in "predictive
>> failure" in a RAID5. Now here's the thing: there was an issue
>> yesterday when I got in, and I wound up power cycling the RAID;
>> first boot of attached server had issues, and said the controller
>> had a failure, and a drive had failed, and wouldn't continue
>> booting; when I gave it the three-finger salute, this time on the
>> way up, during POST, it noted the controller issue... but the
>> thing came up, looking like it did a couple of days ago.
>>
>> Trying to prevent this from happening again, I've decided to replace
>> the drive that's in predictive failure. The array has a hot spare.
>> I tried to remove it using hpacucli, but it refuses with "operation
>> not permitted", and there doesn't *seem* to be a "mark as failed"
>> command. *Do* I just yank the drive?
>>
> Yep, just yank it.  It should start auto rebuilding on the spare.
>
> If you didn't have a spare you would pull the suspect drive and replace it
> with one of equal or greater capacity and it would auto rebuild as well.
>
> I have a bunch of them at home and have been using them at work for years.

Thanks for your quick reply, Jason. I'm used to LSI/MegaRAID/PERCs, where
you have to fail it first. Oddity: I had the drive out for more than five
minutes while getting it out of the sled, putting the new one in, oh, and
dusting out the slot (gotta do that for all of them, next maintenance
window), but after I put in the replacement and used hpacucli to check,
to my surprise it was rebuilding with the replacement, *not* with the
spare.

        mark
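
For anyone who wants to script the "used hpacucli to check" step, here is a minimal Python sketch. It assumes the controller sits in slot 0 and that `hpacucli ctrl slot=0 ld all show status` prints one line per logical drive shaped like `logicaldrive 1 (1.8 TB, RAID 5): Recovering, 35% complete`. That line shape is an assumption and may differ between hpacucli versions. The names `parse_ld_status` and `read_ld_status` are made up for this example.

```python
import re
import subprocess

# Assumed line shape (may vary between hpacucli versions):
#   logicaldrive 1 (1.8 TB, RAID 5): Recovering, 35% complete
LD_LINE = re.compile(
    r"logicaldrive\s+(?P<id>\d+)\s+\((?P<size>[^,]+),\s*(?P<raid>[^)]+)\):\s*"
    r"(?P<status>[^,\n]+?)(?:,\s*(?P<pct>\d+(?:\.\d+)?)%\s*complete)?\s*$",
    re.MULTILINE,
)


def parse_ld_status(text):
    """Return one dict per logical drive line found in `text`."""
    drives = []
    for m in LD_LINE.finditer(text):
        drives.append({
            "id": int(m.group("id")),
            "raid": m.group("raid").strip(),
            "status": m.group("status").strip(),
            # None when the line carries no "NN% complete" suffix
            "percent": float(m.group("pct")) if m.group("pct") else None,
        })
    return drives


def read_ld_status(slot=0):
    """Run hpacucli for the given slot (slot number is an assumption)."""
    out = subprocess.run(
        ["hpacucli", "ctrl", f"slot={slot}", "ld", "all", "show", "status"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_ld_status(out)
```

Calling `read_ld_status()` in a loop with a sleep between calls would show whether a rebuild is progressing. To see which physical drive is the rebuild target you would still need to look at the physical drive listing.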

Jason Warr

2015-Jul-10 18:13 UTC


On 7/10/2015 12:49 PM, m.roth at 5-cent.us wrote:
> Thanks for your quick reply, Jason. I'm used to LSI/MegaRAID/PERCs, where
> you have to fail it first. Oddity: I had the drive out for more than five
> minutes while getting it out of the sled, putting the new one in, oh, and
> dusting out the slot (gotta do that for all of them, next maintenance
> window), but after I put in the replacement and used hpacucli to check,
> to my surprise it was rebuilding with the replacement, *not* with the
> spare.
>
>          mark

It has been a while since I have used a spare, but what might have
happened is that the spare went back to being a spare when the real drive
was replaced.  It seems to me that is the default behavior, as a spare can
be attached to more than one RAID group.  That way it keeps your physical
drive placement consistent.

Thomas Eriksson

2015-Jul-10 18:18 UTC


On 07/10/2015 10:49 AM, m.roth at 5-cent.us wrote:
> Thanks for your quick reply, Jason. I'm used to LSI/MegaRAID/PERCs, where
> you have to fail it first. Oddity: I had the drive out for more than five
> minutes while getting it out of the sled, putting the new one in, oh, and
> dusting out the slot (gotta do that for all of them, next maintenance
> window), but after I put in the replacement and used hpacucli to check,
> to my surprise it was rebuilding with the replacement, *not* with the
> spare.
>

HP's RAID controllers appear to have some logic whereby, if the rebuild to
the spare disk has not yet reached 50% when you insert the replacement, the
controller abandons the rebuild to the spare and rebuilds to the
replacement instead.

I don't have any documentation to prove it, but I have observed it
numerous times.

	Thomas
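
The behaviour Thomas describes can be written down as a small decision rule. This Python sketch only models his undocumented observation. It is not HP firmware logic, the 50% threshold is his estimate, and the function name `rebuild_target` is made up for illustration.

```python
def rebuild_target(spare_progress_pct, replacement_inserted, threshold_pct=50.0):
    """Model of the observed (undocumented) behaviour.

    spare_progress_pct: how far the rebuild onto the hot spare has got (0-100).
    replacement_inserted: True once a new drive is in the failed drive's bay.
    threshold_pct: the ~50% cut-off reported in the thread (an assumption).
    """
    if not replacement_inserted:
        # Nothing else to rebuild onto, so the spare stays the target.
        return "spare"
    if spare_progress_pct < threshold_pct:
        # Observed: the controller drops the spare rebuild and starts over
        # on the replacement drive.
        return "replacement"
    # Past the threshold, the rebuild onto the spare is assumed to continue.
    return "spare"
```

Under this model, mark's case (replacement inserted after about five minutes, so the spare rebuild was presumably still at a low percentage) gives `rebuild_target(5, True) == "replacement"`, which matches what he saw.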
