thr3ads.net - Gluster users - [Gluster-users] Replica 3 - how to replace failed node (peer) [Apr 2019]

Martin Toth

2019-Apr-10 10:38 UTC

[Gluster-users] Replica 3 - how to replace failed node (peer)

Thanks, this looks OK to me. I will reset the brick, because I no longer have
any data on the failed node, so I can use the same path / brick name.

Is resetting a brick a dangerous command? Should I be worried about a possible
failure that would impact the remaining two nodes? I am running a really old
but stable version, 3.7.6.

Thanks,
BR!

Martin
 
> On 10 Apr 2019, at 12:20, Karthik Subrahmanya <ksubrahm at redhat.com> wrote:
> 
> Hi Martin,
> 
> After you add the new disks and create the RAID array, you can run one of the
> following commands to replace the old brick with the new one:
> 
> - If you are going to use a different name for the new brick, you can run
>   gluster volume replace-brick <volname> <old-brick> <new-brick> commit force
> 
> - If you are planning to use the same name for the new brick, you can use
>   gluster volume reset-brick <volname> <old-brick> <new-brick> commit force
>   Here the hostname and path of old-brick and new-brick should be the same.
> 
> After replacing the brick, make sure the brick comes online using volume status.
> Heal should start automatically; you can check the heal status to see that all
> the files get replicated to the newly added brick. If it does not start
> automatically, you can start it manually by running gluster volume heal <volname>.
> 
> HTH,
> Karthik
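
The steps in Karthik's reply can be collected into a small shell sketch. It is an illustration under assumptions, not a tested procedure: `myvol` is a hypothetical volume name (the thread does not state it), the helper names `run`, `replace_same_path`, `replace_new_path` and `check_and_heal` are made up here, and the commands are only printed unless DRY_RUN=0 is set. One point to verify first: reset-brick appears to have been added in GlusterFS 3.9, so it may not be available on 3.7.6; `gluster volume help` on the node should show whether it is.

```shell
# Sketch only. By default the commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"
VOL="${VOL:-myvol}"    # hypothetical volume name, not given in the thread
BRICK="${BRICK:-node2.san:/tank/gluster/gv0imagestore/brick1}"    # failed brick from the volume info

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Same hostname and path for the new brick (assumption: reset-brick may not exist before 3.9).
replace_same_path() {
    run gluster volume reset-brick "$VOL" "$BRICK" "$BRICK" commit force
}

# Different name for the new brick; $1 is the new brick as host:/path.
replace_new_path() {
    run gluster volume replace-brick "$VOL" "$BRICK" "$1" commit force
}

# Confirm the brick is online, look at pending heals, and trigger a heal by hand.
check_and_heal() {
    run gluster volume status "$VOL"
    run gluster volume heal "$VOL" info
    run gluster volume heal "$VOL"
}
```

After sourcing the sketch, `replace_new_path node2.san:/tank/gluster/gv0imagestore/brick2` (a hypothetical new path) prints the replace-brick command it would run, and `check_and_heal` prints the three follow-up commands in order.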
> 
> On Wed, Apr 10, 2019 at 3:13 PM Martin Toth <snowmailer at gmail.com> wrote:
> Hi all,
> 
> I am running a replica 3 Gluster volume with 3 bricks. One of my servers
> failed: all disks are showing errors and the RAID is in a fault state.
> 
> Type: Replicate
> Volume ID: 41d5c283-3a74-4af8-a55d-924447bfa59a
> Status: Started
> Number of Bricks: 1 x 3 = 3
> Transport-type: tcp
> Bricks:
> Brick1: node1.san:/tank/gluster/gv0imagestore/brick1
> Brick2: node2.san:/tank/gluster/gv0imagestore/brick1 <- this brick is down
> Brick3: node3.san:/tank/gluster/gv0imagestore/brick1
> 
> So one of my bricks (node2) has failed completely. It went down and all its
> data are lost (failed RAID on node2). I am now running only two bricks, on 2
> of the 3 servers.
> This is a really critical problem for us; we could lose all data. I want to
> add new disks to node2, create a new RAID array on them, and try to replace
> the failed brick on this node.
> 
> What is the procedure for replacing Brick2 on node2? Can someone advise? I
> can't find anything relevant in the documentation.
> 
> Thanks in advance,
> Martin
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users

Martin Toth

2019-Apr-11 07:12 UTC

[Gluster-users] Replica 3 - how to replace failed node (peer)

Hi Karthik,

moreover, I would like to ask whether there are any recommended
settings/parameters for the SHD (self-heal daemon) that give good, or at least
fair, I/O while the volume is being healed after I replace the brick (which
should trigger the healing process).
I had problems in the past when healing was triggered: VM disks became
unresponsive because healing took most of the I/O. My volume contains only big
files with VM disks.

Thanks for suggestions.
BR, 
Martin
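
The thread as captured here contains no answer to this question. As a hedged starting point only, the sketch below shows how self-heal related volume options could be changed and how heal progress could be watched. The option names listed are ones documented for GlusterFS replicate volumes, but whether they are present and effective on 3.7.6, and which values suit a VM image store, are assumptions to verify; no values are recommended here. `myvol` and the helper names are hypothetical, and the commands are only printed unless DRY_RUN=0 is set.

```shell
# Sketch only. By default the commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"
VOL="${VOL:-myvol}"    # hypothetical volume name, not given in the thread

# Self-heal related options one might review; availability on 3.7.6 is an assumption.
SHD_OPTIONS="cluster.data-self-heal-algorithm cluster.self-heal-window-size cluster.background-self-heal-count"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Set one option; $1 is the option name, $2 the value chosen by the operator.
set_heal_option() {
    run gluster volume set "$VOL" "$1" "$2"
}

# Show which entries still need healing.
heal_progress() {
    run gluster volume heal "$VOL" info
}

# List the candidate option names, one per line.
list_heal_options() {
    for opt in $SHD_OPTIONS; do
        echo "$opt"
    done
}
```

For instance, `set_heal_option cluster.data-self-heal-algorithm diff` prints the corresponding `gluster volume set` command; `diff` is one of the accepted values for that option and is used here purely as an example, not as advice.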
