Martin Toth
2019-Apr-10 09:42 UTC
[Gluster-users] Replica 3 - how to replace failed node (peer)
Hi all,

I am running a replica 3 gluster volume with 3 bricks. One of my servers failed - all disks are showing errors and the raid is in a fault state.

Type: Replicate
Volume ID: 41d5c283-3a74-4af8-a55d-924447bfa59a
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: node1.san:/tank/gluster/gv0imagestore/brick1
Brick2: node2.san:/tank/gluster/gv0imagestore/brick1 <-- this brick is down
Brick3: node3.san:/tank/gluster/gv0imagestore/brick1

So one of my bricks has totally failed (node2). It went down and all its data is lost (failed raid on node2). Now I am running only two bricks on 2 servers out of 3.
This is a really critical problem for us; we could lose all data. I want to add new disks to node2, create a new raid array on them and try to replace the failed brick on this node.

What is the procedure for replacing Brick2 on node2, can someone advise? I can't find anything relevant in the documentation.

Thanks in advance,
Martin
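A minimal sketch of how the degraded state can be confirmed from one of the surviving nodes before replacing anything. The volume name gv0imagestore is only assumed from the brick path; substitute the real volume name:

    # Peer and volume status - node2 should show as disconnected and its brick offline
    gluster peer status
    gluster volume status gv0imagestore

    # Pending self-heal entries that will be replicated once the new brick is in place
    gluster volume heal gv0imagestore info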
David Spisla
2019-Apr-10 10:09 UTC
[Gluster-users] Replica 3 - how to replace failed node (peer)
Hello Martin,

look here: https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.4/pdf/administration_guide/Red_Hat_Gluster_Storage-3.4-Administration_Guide-en-US.pdf on page 324. There you will find instructions for replacing a brick in case of a hardware failure.

Regards
David Spisla
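As a rough sketch of the preparation that usually precedes the replace step on the failed node, assuming the rebuilt RAID device shows up as /dev/md0 (a hypothetical device name) and the brick should live under the same path as before:

    # On node2: create a filesystem on the new array and mount it
    # (XFS with 512-byte inodes is the commonly recommended brick filesystem)
    mkfs.xfs -i size=512 /dev/md0
    mkdir -p /tank/gluster/gv0imagestore
    mount /dev/md0 /tank/gluster/gv0imagestore

    # Create the brick directory itself, but leave it empty;
    # self-heal will repopulate it from the other two replicas
    mkdir /tank/gluster/gv0imagestore/brick1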
Karthik Subrahmanya
2019-Apr-10 10:20 UTC
[Gluster-users] Replica 3 - how to replace failed node (peer)
Hi Martin,

After you add the new disks and create the raid array, you can run one of the following commands to replace the old brick with the new one:

- If you are going to use a different name for the new brick, run: gluster volume replace-brick <volname> <old-brick> <new-brick> commit force
- If you are planning to use the same name for the new brick, use: gluster volume reset-brick <volname> <old-brick> <new-brick> commit force
  Here the old-brick and new-brick hostname and path should be the same.

After replacing the brick, make sure the brick comes online using volume status. Heal should start automatically; you can check the heal status to see whether all the files get replicated to the newly added brick. If it does not start automatically, you can start it manually by running gluster volume heal <volname>.

HTH,
Karthik
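Putting these instructions together with the brick names from this thread, a sketch of the full sequence. The volume name gv0imagestore is assumed from the brick path, and /tank/gluster/gv0imagestore/brick2 is only a hypothetical new path for the different-name case:

    # Option 1: the new brick uses a different path on node2
    gluster volume replace-brick gv0imagestore \
        node2.san:/tank/gluster/gv0imagestore/brick1 \
        node2.san:/tank/gluster/gv0imagestore/brick2 \
        commit force

    # Option 2: re-use the same hostname and path for the new brick
    gluster volume reset-brick gv0imagestore \
        node2.san:/tank/gluster/gv0imagestore/brick1 \
        node2.san:/tank/gluster/gv0imagestore/brick1 \
        commit force

    # Verify the new brick comes online and watch the heal progress
    gluster volume status gv0imagestore
    gluster volume heal gv0imagestore info

    # Trigger heal manually if it does not start on its own
    gluster volume heal gv0imagestore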