thr3ads.net - CentOS - [CentOS] C7, mdadm issues [Jan 2019]

mark

2019-Jan-29 19:42 UTC

[CentOS] C7, mdadm issues

Alessandro Baggi wrote:
> On 29/01/19 18:47, mark wrote:
>> Alessandro Baggi wrote:
>>> On 29/01/19 15:03, mark wrote:
>>>
>>>> I've no idea what happened, but the box I was working on last week
>>>> has a *second* bad drive. Actually, I'm starting to wonder about
>>>> that particular hot-swap bay.
>>>>
>>>> Anyway, mdadm --detail shows /dev/sdb1 removed. I've added
>>>> /dev/sdi1...
>>>> but see both /dev/sdh1 and /dev/sdi1 as spare, and have yet to find
>>>> a reliable way to make either one active.
>>>>
>>>> Actually, I would have expected the linux RAID to replace a failed
>>>> one with a spare....
>>> can you report your raid configuration like raid level and raid devices
>>> and the current status from /proc/mdstat?
>>>
>> Well, nope. I got to the point of rebooting the system (xfs had the RAID
>> volume, and wouldn't let go); I also commented out the RAID volume.
>>
>> It's RAID 5, /dev/sdb *also* appears to have died. If I do
>> mdadm --assemble --force -v /dev/md0 /dev/sd[cefgdh]1
>> mdadm: looking for devices for /dev/md0
>> mdadm: /dev/sdc1 is identified as a member of /dev/md0, slot 0.
>> mdadm: /dev/sdd1 is identified as a member of /dev/md0, slot -1.
>> mdadm: /dev/sde1 is identified as a member of /dev/md0, slot 2.
>> mdadm: /dev/sdf1 is identified as a member of /dev/md0, slot 3.
>> mdadm: /dev/sdg1 is identified as a member of /dev/md0, slot 4.
>> mdadm: /dev/sdh1 is identified as a member of /dev/md0, slot -1.
>> mdadm: no uptodate device for slot 1 of /dev/md0
>> mdadm: added /dev/sde1 to /dev/md0 as 2
>> mdadm: added /dev/sdf1 to /dev/md0 as 3
>> mdadm: added /dev/sdg1 to /dev/md0 as 4
>> mdadm: no uptodate device for slot 5 of /dev/md0
>> mdadm: added /dev/sdd1 to /dev/md0 as -1
>> mdadm: added /dev/sdh1 to /dev/md0 as -1
>> mdadm: added /dev/sdc1 to /dev/md0 as 0
>> mdadm: /dev/md0 assembled from 4 drives and 2 spares - not enough to
>> start the array.
>>
>> --examine shows me /dev/sdd1 and /dev/sdh1, but both as spares.
> Hi Mark,
> please post the result from
>
> cat /sys/block/md0/md/sync_action
There is none. There is no /dev/md0. mdadm refuses, saying that it's lost
too many drives.
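
For anyone scripting the check that was asked for above, here is a minimal Python sketch that reads sync_action and treats a missing file as "array not assembled", which is the situation described here. The function name `read_sync_action` and the `sysfs_root` parameter are made up for illustration; only the /sys/block/md0/md/sync_action path comes from the thread.

```python
from pathlib import Path
from typing import Optional


def read_sync_action(md_name: str = "md0", sysfs_root: str = "/sys/block") -> Optional[str]:
    """Return the array's sync_action, or None if the md device is absent."""
    path = Path(sysfs_root) / md_name / "md" / "sync_action"
    try:
        return path.read_text().strip()
    except (FileNotFoundError, NotADirectoryError):
        # No such entry in sysfs: the array was never assembled or started.
        return None


if __name__ == "__main__":
    action = read_sync_action("md0")
    if action is None:
        print("md0 is not assembled; there is no sync_action to read")
    else:
        print(f"md0 sync_action: {action}")
```

The sketch only reads from sysfs; it does not touch the array or its member devices.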

      mark

Alessandro Baggi

2019-Jan-30 08:45 UTC

[CentOS] C7, mdadm issues

On 29/01/19 20:42, mark wrote:
> Alessandro Baggi wrote:
>> Hi Mark,
>> please post the result from
>>
>> cat /sys/block/md0/md/sync_action
> 
> There is none. There is no /dev/md0. mdadm refuses, saying that it's lost
> too many drives.
> 
>        mark
> 
> _______________________________________________
> CentOS mailing list
> CentOS at centos.org
> https://lists.centos.org/mailman/listinfo/centos
> 

I suppose that your config is 5 drives and 1 spare, with 1 drive failed.
It's strange that your spare was not used for resync.
Then you added a new drive, but the array does not start because it marks the
new disk as spare, and you have a raid5 with 4 devices and 2 spares.
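
The arithmetic behind "not enough to start the array" can be sketched in Python, using the member lines from the mdadm output posted earlier in the thread. This is only an illustration, not mdadm's own logic: the function names are hypothetical, and the count of six raid devices is an assumption inferred from the "no uptodate device for slot 5" line (slots 0 to 5). If the array had only five raid devices, four active members would normally be enough for a degraded RAID 5 start, so the refusal hints that the array was defined with six.

```python
import re

# Member lines of "mdadm --assemble --force -v" as posted in this thread.
ASSEMBLE_LOG = """\
mdadm: /dev/sdc1 is identified as a member of /dev/md0, slot 0.
mdadm: /dev/sdd1 is identified as a member of /dev/md0, slot -1.
mdadm: /dev/sde1 is identified as a member of /dev/md0, slot 2.
mdadm: /dev/sdf1 is identified as a member of /dev/md0, slot 3.
mdadm: /dev/sdg1 is identified as a member of /dev/md0, slot 4.
mdadm: /dev/sdh1 is identified as a member of /dev/md0, slot -1.
"""

MEMBER_RE = re.compile(r"mdadm: (\S+) is identified as a member of \S+, slot (-?\d+)\.")


def classify_members(log: str):
    """Split devices into active slots ({slot: device}) and spares (slot -1)."""
    active, spares = {}, []
    for device, slot in MEMBER_RE.findall(log):
        if int(slot) >= 0:
            active[int(slot)] = device
        else:
            spares.append(device)
    return active, spares


def raid5_can_start(active_count: int, raid_devices: int) -> bool:
    """RAID 5 tolerates one missing member, so it needs raid_devices - 1 active."""
    return active_count >= raid_devices - 1


active, spares = classify_members(ASSEMBLE_LOG)
# ASSUMPTION: six slots (0-5), inferred from "no uptodate device for slot 5".
RAID_DEVICES = 6
print(f"active slots: {sorted(active)}, spares: {spares}")
print(f"can start: {raid5_can_start(len(active), RAID_DEVICES)}")
```

With slots 1 and 5 both empty, two members are missing, which is one more than RAID 5 can tolerate; spares in slot -1 do not count until they have been rebuilt into a slot.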

First, I hope that you have a backup of all your data; don't run any
exotic command before backing up your data. If you can't back up your
data, it's a problem.

Have you tried to remove the last added device, sdi1, restart the raid,
and force it to start a resync?

Have you tried to remove these 2 devices and re-add only the device that
will be useful for resync? Maybe you can set 5 devices for your raid
and not 6; if it works (after resync) you can add your spare device,
growing your raid set.

From what I read on Google, many users run --zero-superblock before
re-adding the device.

Other users reassemble the raid using --assume-clean, but I don't know
what effect it will produce.

Hope that this helps.

mark

2019-Jan-30 13:02 UTC

[CentOS] C7, mdadm issues

On 01/30/19 03:45, Alessandro Baggi wrote:
> I suppose that your config is 5 drives and 1 spare, with 1 drive failed.
> It's strange that your spare was not used for resync.
> Then you added a new drive, but the array does not start because it marks the
> new disk as spare, and you have a raid5 with 4 devices and 2 spares.
>
> First, I hope that you have a backup of all your data; don't run any
> exotic command before backing up your data. If you can't back up your data,
> it's a problem.

This is at work. We have automated nightly backups, and I do offline backups
of the backups every two weeks.

> Have you tried to remove the last added device, sdi1, restart the raid,
> and force it to start a resync?

The thing is, it had one? two? spares when /dev/sdb1 started dying, and it
didn't use them.

> Have you tried to remove these 2 devices and re-add only the device that
> will be useful for resync? Maybe you can set 5 devices for your raid and not
> 6; if it works (after resync) you can add your spare device, growing your
> raid set.

I tried, and that's when I lost it (again), and it refuses to assemble/start
the RAID: "not enough devices".

> From what I read on Google, many users run --zero-superblock before
> re-adding the device.

I can take one out, and re-add, but I think I'm going to have to recreate the
RAID again, and again restore from backup.

> Other users reassemble the raid using --assume-clean, but I don't know what
> effect it will produce.
>
> Hope that this helps.

Thanks.

	mark
