Karthik Subrahmanya
2019-Apr-11  04:53 UTC
[Gluster-users] Replica 3 - how to replace failed node (peer)
Hi Strahil,

Can you give us some more insights on
- the volume configuration you were using?
- why you wanted to replace your brick?
- which brick(s) you tried replacing?
- what problem(s) did you face?

Regards,
Karthik

On Thu, Apr 11, 2019 at 10:14 AM Strahil <hunter86_bg at yahoo.com> wrote:
> Hi Karthik,
> I used only once the brick replace function when I wanted to change my
> Arbiter (v3.12.15 in oVirt 4.2.7) and it was a complete disaster.
> Most probably I should have stopped the source arbiter before doing that,
> but the docs didn't mention it.
>
> Thus I always use reset-brick, as it never let me down.
>
> Best Regards,
> Strahil Nikolov
>
> On Apr 11, 2019 07:34, Karthik Subrahmanya <ksubrahm at redhat.com> wrote:
> Hi Strahil,
>
> Thank you for sharing your experience with the reset-brick option.
> Since he is using gluster version 3.7.6, we do not have the
> reset-brick [1] option implemented there; it was introduced in 3.9.0. He has
> to go with replace-brick with the force option if he wants to use the same
> path & name for the new brick.
> Yes, it is recommended to have the new brick be of the same size as
> that of the other bricks.
>
> [1] https://docs.gluster.org/en/latest/release-notes/3.9.0/#introducing-reset-brick-command
>
> Regards,
> Karthik
>
> On Wed, Apr 10, 2019 at 10:31 PM Strahil <hunter86_bg at yahoo.com> wrote:
> I have used reset-brick - but I have just changed the brick layout.
> You may give it a try, but I guess you need your new brick to have the same
> amount of space (or more).
>
> Maybe someone more experienced should share a more sound solution.
>
> Best Regards,
> Strahil Nikolov
> On Apr 10, 2019 12:42, Martin Toth <snowmailer at gmail.com> wrote:
> >
> > Hi all,
> >
> > I am running a replica 3 gluster volume with 3 bricks. One of my servers failed -
> > all disks are showing errors and the raid is in a fault state.
> >
> > Type: Replicate
> > Volume ID: 41d5c283-3a74-4af8-a55d-924447bfa59a
> > Status: Started
> > Number of Bricks: 1 x 3 = 3
> > Transport-type: tcp
> > Bricks:
> > Brick1: node1.san:/tank/gluster/gv0imagestore/brick1
> > Brick2: node2.san:/tank/gluster/gv0imagestore/brick1 <-- this brick is down
> > Brick3: node3.san:/tank/gluster/gv0imagestore/brick1
> >
> > So one of my bricks has totally failed (node2). It went down and all data
> > are lost (failed raid on node2). Now I am running only two bricks on 2
> > servers out of 3.
> > This is a really critical problem for us; we could lose all data. I want to
> > add new disks to node2, create a new raid array on them and try to replace
> > the failed brick on this node.
> >
> > What is the procedure for replacing Brick2 on node2? Can someone advise?
> > I can't find anything relevant in the documentation.
> >
> > Thanks in advance,
> > Martin
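For context, a minimal sketch of the replace-brick approach Karthik describes for 3.7.x, assuming the RAID on node2 has been rebuilt and an empty brick directory recreated at the same path. The volume name "gv0" is a placeholder (the thread only shows the brick paths); substitute the real name from 'gluster volume info'.

# "gv0" is a placeholder volume name - take the real one from 'gluster volume info'.
# Reuse the same brick path & name via replace-brick with force (3.7.x has no reset-brick):
gluster volume replace-brick gv0 \
    node2.san:/tank/gluster/gv0imagestore/brick1 \
    node2.san:/tank/gluster/gv0imagestore/brick1 \
    commit force

# Let self-heal repopulate the new brick from the two healthy replicas, then watch progress:
gluster volume heal gv0 full
gluster volume heal gv0 info

If glusterd refuses the identical old/new path on that version, pointing the new brick at a fresh directory (for example brick1_new, a hypothetical name) with the same command works as well.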
Karthik Subrahmanya
2019-Apr-11  04:55 UTC
[Gluster-users] Replica 3 - how to replace failed node (peer)
On Thu, Apr 11, 2019 at 10:23 AM Karthik Subrahmanya <ksubrahm at redhat.com> wrote:
> Hi Strahil,
>
> Can you give us some more insights on
> - the volume configuration you were using?
> - why you wanted to replace your brick?
> - which brick(s) you tried replacing?
- if you remember the commands/steps that you followed, please give that as well.
> - what problem(s) did you face?
>
> Regards,
> Karthik
Strahil Nikolov
2019-Apr-11  08:10 UTC
[Gluster-users] Replica 3 - how to replace failed node (peer)
Hi Karthik,
- the volume configuration you were using?
I used the oVirt 4.2.6 Gluster Wizard, so I guess we need to involve the oVirt devs here.

- why you wanted to replace your brick?
I had deployed the arbiter at another location, as I thought I could deploy the Thin Arbiter (still waiting for the docs to be updated), but once I realized that GlusterD doesn't support Thin Arbiter, I had to build another machine for a local arbiter - thus a replacement was needed.

- which brick(s) you tried replacing?
I was replacing the old arbiter with a new one.

- what problem(s) did you face?
All oVirt VMs got paused due to I/O errors.

In the end, I rebuilt the whole setup and never tried to replace a brick this way again (I used only reset-brick, which didn't cause any issues).
As I mentioned, that was on v3.12, which is not the default for oVirt 4.3.x - so my guess is that it is OK now (current is v5.5).
Just sharing my experience.

Best Regards,
Strahil Nikolov
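For reference, the reset-brick sequence Strahil refers to (available from GlusterFS 3.9.0 onward) follows roughly the pattern below, with VOLNAME and HOSTNAME:BRICKPATH as placeholders taken from the release notes rather than from his oVirt setup:

# Take the brick offline; the volume keeps serving from the remaining replicas:
gluster volume reset-brick VOLNAME HOSTNAME:BRICKPATH start

# ... rebuild or remount the storage backing the brick ...

# Re-add the (now empty) brick under the same path and let self-heal run:
gluster volume reset-brick VOLNAME HOSTNAME:BRICKPATH HOSTNAME:BRICKPATH commit force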
On Thursday, April 11, 2019, 0:53:52 AM GMT-4, Karthik Subrahmanya
<ksubrahm at redhat.com> wrote:
 
 Hi Strahil,
Can you give us some more insights on
- the volume configuration you were using?
- why you wanted to replace your brick?
- which brick(s) you tried replacing?
- what problem(s) did you face?

Regards,
Karthik
On Thu, Apr 11, 2019 at 10:14 AM Strahil <hunter86_bg at yahoo.com> wrote:
Hi Karthik,
I used only once the brick replace function when I wanted to change my Arbiter
(v3.12.15 in oVirt 4.2.7) and it was a complete disaster.
Most probably I should have stopped the source arbiter before doing that, but
the docs didn't mention it.
Thus I always use reset-brick, as it never let me down.
Best Regards,
Strahil Nikolov
On Apr 11, 2019 07:34, Karthik Subrahmanya <ksubrahm at redhat.com> wrote:
Hi Strahil,
Thank you for sharing your experience with the reset-brick option.
Since he is using gluster version 3.7.6, we do not have the reset-brick [1] option implemented there; it was introduced in 3.9.0. He has to go with replace-brick with the force option if he wants to use the same path & name for the new brick.
Yes, it is recommended to have the new brick be of the same size as that of the other bricks.

[1] https://docs.gluster.org/en/latest/release-notes/3.9.0/#introducing-reset-brick-command

Regards,
Karthik
On Wed, Apr 10, 2019 at 10:31 PM Strahil <hunter86_bg at yahoo.com> wrote:
I have used reset-brick - but I have just changed the brick layout.
You may give it a try, but I guess you need your new brick to have the same
amount of space (or more).
Maybe someone more experienced should share a more sound solution.
Best Regards,
Strahil Nikolov
On Apr 10, 2019 12:42, Martin Toth <snowmailer at gmail.com> wrote:
>
> Hi all,
>
> I am running a replica 3 gluster volume with 3 bricks. One of my servers failed -
> all disks are showing errors and the raid is in a fault state.
>
> Type: Replicate
> Volume ID: 41d5c283-3a74-4af8-a55d-924447bfa59a
> Status: Started
> Number of Bricks: 1 x 3 = 3
> Transport-type: tcp
> Bricks:
> Brick1: node1.san:/tank/gluster/gv0imagestore/brick1
> Brick2: node2.san:/tank/gluster/gv0imagestore/brick1 <-- this brick is down
> Brick3: node3.san:/tank/gluster/gv0imagestore/brick1
>
> So one of my bricks has totally failed (node2). It went down and all data
> are lost (failed raid on node2). Now I am running only two bricks on 2
> servers out of 3.
> This is a really critical problem for us; we could lose all data. I want to
> add new disks to node2, create a new raid array on them and try to replace
> the failed brick on this node.
>
> What is the procedure for replacing Brick2 on node2? Can someone advise?
> I can't find anything relevant in the documentation.
>
> Thanks in advance,
> Martin
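A generic sanity check before attempting any brick replacement on a degraded replica 3 volume is to confirm that the peers and the two surviving bricks are healthy; "gv0" is again a placeholder volume name:

gluster peer status           # all peers should show State: Peer in Cluster (Connected)
gluster volume status gv0     # the remaining bricks should be listed as Online
gluster volume heal gv0 info  # pending heal entries on the surviving bricks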
  