Martin Toth
2019-Apr-10 10:34 UTC
[Gluster-users] [External] Replica 3 - how to replace failed node (peer)
I've read this documentation, but step 4 is really unclear to me. I don't understand the related mkdir/rmdir/setfattr steps and so on.

Step 4:

Using the gluster volume fuse mount (in this example: /mnt/r2), set up metadata so that data will be synced to the new brick (in this case from Server1:/home/gfs/r2_1 to Server1:/home/gfs/r2_5).

Why should I change trusted.non-existent-key on this volume?
It is even more confusing because the other howtos mentioned do not contain this step at all.

BR,
Martin

> On 10 Apr 2019, at 11:54, Davide Obbi <davide.obbi at booking.com> wrote:
>
> https://docs.gluster.org/en/v3/Administrator%20Guide/Managing%20Volumes/#replace-faulty-brick
>
> On Wed, Apr 10, 2019 at 11:42 AM Martin Toth <snowmailer at gmail.com> wrote:
>> Hi all,
>>
>> I am running a replica 3 gluster volume with 3 bricks. One of my servers failed - all disks are showing errors and the RAID is in a faulted state.
>>
>> Type: Replicate
>> Volume ID: 41d5c283-3a74-4af8-a55d-924447bfa59a
>> Status: Started
>> Number of Bricks: 1 x 3 = 3
>> Transport-type: tcp
>> Bricks:
>> Brick1: node1.san:/tank/gluster/gv0imagestore/brick1
>> Brick2: node2.san:/tank/gluster/gv0imagestore/brick1 <-- this brick is down
>> Brick3: node3.san:/tank/gluster/gv0imagestore/brick1
>>
>> So one of my bricks has totally failed (node2). It went down and all its data is lost (failed RAID on node2). Now I am running only two bricks on 2 servers out of 3.
>> This is a really critical problem for us; we could lose all data. I want to add new disks to node2, create a new RAID array on them and try to replace the failed brick on this node.
>>
>> What is the procedure for replacing Brick2 on node2, can someone advise? I can't find anything relevant in the documentation.
>>
>> Thanks in advance,
>> Martin
>
> --
> Davide Obbi
> Senior System Administrator
> Booking.com B.V.
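For context, the mkdir/rmdir/setfattr operations Martin is asking about come from the older replace-brick guide. A minimal sketch of what step 4 amounts to, assuming the /mnt/r2 fuse mount from the guide's example (the directory name and the xattr value "abc" are arbitrary placeholders):

    # on a client where the volume is fuse-mounted at /mnt/r2
    mkdir /mnt/r2/name-of-nonexistent-dir    # create and remove a directory that never existed before
    rmdir /mnt/r2/name-of-nonexistent-dir
    setfattr -n trusted.non-existent-key -v abc /mnt/r2    # set and remove a dummy xattr on the mount root
    setfattr -x trusted.non-existent-key /mnt/r2

These no-op operations exist only to leave pending-heal markers (AFR changelog xattrs) on the surviving bricks, so that self-heal later copies data from the good bricks onto the empty replacement brick and not the other way around - which is what Karthik's reply below explains is now handled automatically.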
Karthik Subrahmanya
2019-Apr-10 12:26 UTC
[Gluster-users] [External] Replica 3 - how to replace failed node (peer)
Hi Martin,

The reset-brick command was introduced in 3.9.0 and is not present in 3.7.6. You can try using the same replace-brick command with the force option even if you want to use the same name for the brick being replaced.

3.7.6 was EOLed long ago, and glusterfs-6 is the latest version, with lots of improvements, bug fixes and new features. The release schedule can be found at [1]. Upgrading to one of the maintained branches is highly recommended.

On Wed, Apr 10, 2019 at 4:14 PM Martin Toth <snowmailer at gmail.com> wrote:

> I've read this documentation, but step 4 is really unclear to me. I don't
> understand the related mkdir/rmdir/setfattr steps and so on.
>
> Step 4:
>
> Using the gluster volume fuse mount (in this example: /mnt/r2), set up
> metadata so that data will be synced to the new brick (in this case from
> Server1:/home/gfs/r2_1 to Server1:/home/gfs/r2_5).
>
> Why should I change trusted.non-existent-key on this volume?
> It is even more confusing because the other howtos mentioned do not
> contain this step at all.

Those steps were needed in older releases to set some metadata on the good bricks so that heal would not happen from the replaced brick to the good bricks, which could lead to data loss. Since you are on 3.7.6, we have automated all these steps for you in that branch. You just need to run the replace-brick command, which will take care of all those things.

[1] https://www.gluster.org/release-schedule/

Regards,
Karthik
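Following Karthik's advice, the replacement on 3.7.6 would boil down to something like the sketch below. The volume name is not shown in Martin's output, so <volname> is a placeholder; the brick path is taken from his volume info, and because the new brick reuses the same path the force option is needed:

    # after rebuilding the RAID on node2 and recreating an empty brick directory
    gluster volume replace-brick <volname> \
        node2.san:/tank/gluster/gv0imagestore/brick1 \
        node2.san:/tank/gluster/gv0imagestore/brick1 \
        commit force

    # self-heal then repopulates the new brick from the two healthy copies; check progress with
    gluster volume heal <volname> info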