Correct me if I'm wrong, but I have been left with the impression that cluster heal is a multi-process, multi-connection event and would benefit from a bonding mode like balance-alb.

I don't have much experience with xfsdump, but it looks like a single process that uses a single connection, so only LACP would be beneficial.

Am I wrong?

Best Regards,
Strahil Nikolov

On Apr 9, 2019 07:10, Aravinda <avishwan at redhat.com> wrote:
> On Mon, 2019-04-08 at 09:01 -0400, Tom Fite wrote:
> > Thanks for the idea, Poornima. Testing shows that xfsdump and
> > xfsrestore are much faster than rsync since they handle small files
> > much better. I don't have extra space to store the dumps, but I was
> > able to figure out how to pipe the xfsdump and restore via ssh. For
> > anyone else that's interested:
> >
> > On the source machine, run:
> >
> > xfsdump -J - /dev/mapper/[vg]-[brick] | ssh root@[destination fqdn] xfsrestore -J - [/path/to/brick]
>
> Nice. Thanks for sharing.
>
> > -Tom
> >
> > On Mon, Apr 1, 2019 at 9:56 PM Poornima Gurusiddaiah <pgurusid at redhat.com> wrote:
> > > You could also try xfsdump and xfsrestore if your brick filesystem
> > > is xfs and the destination disk can be attached locally. This will
> > > be much faster.
> > >
> > > Regards,
> > > Poornima
> > >
> > > On Tue, Apr 2, 2019, 12:05 AM Tom Fite <tomfite at gmail.com> wrote:
> > > > Hi all,
> > > >
> > > > I have a very large (65 TB) brick in a replica 2 volume that
> > > > needs to be re-copied from scratch. A heal will take a very long
> > > > time with performance degradation on the volume, so I investigated
> > > > using rsync to do the brunt of the work.
> > > >
> > > > The command:
> > > >
> > > > rsync -av -H -X --numeric-ids --progress server1:/data/brick1/gv0 /data/brick1/
> > > >
> > > > Running with -H ensures that the hard links in .glusterfs are
> > > > preserved, and -X preserves all of gluster's extended attributes.
> > > >
> > > > I've tested this in my test environment as follows:
> > > >
> > > > 1. Stop glusterd and kill procs
> > > > 2. Move brick volume to backup dir
> > > > 3. Run rsync
> > > > 4. Start glusterd
> > > > 5. Observe gluster status
> > > >
> > > > All appears to be working correctly. Gluster status reports all
> > > > bricks online, all data is accessible in the volume, and I don't
> > > > see any errors in the logs.
> > > >
> > > > Anybody else have experience trying this?
> > > >
> > > > Thanks
> > > > -Tom
>
> --
> regards
> Aravinda
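Spelled out as commands, Tom's five steps might look roughly like the sketch below on the node being rebuilt. This is only an illustration: the volume name gv0, brick path /data/brick1, and peer name server1 are taken from his example, the backup path is hypothetical, and service handling will differ per distribution.

# 1. Stop the gluster management daemon and any remaining brick processes
systemctl stop glusterd
pkill glusterfsd

# 2. Move the stale brick contents out of the way (backup path is hypothetical)
mv /data/brick1/gv0 /data/brick1/gv0.bak

# 3. Copy the brick from the healthy peer, preserving hard links (-H),
#    extended attributes (-X), and numeric owners, including .glusterfs
rsync -av -H -X --numeric-ids --progress server1:/data/brick1/gv0 /data/brick1/

# 4. Bring gluster back up
systemctl start glusterd

# 5. Check that all bricks are online and nothing is pending heal
gluster volume status gv0
gluster volume heal gv0 info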
Alvin Starr
2019-Apr-09 17:32 UTC
[Gluster-users] Rsync in place of heal after brick failure
The performance needs to be compared between the two in a real environment. For example, I have a system where xfsdump takes something like 4 hours for a complete dump to /dev/null, but a "find . -type f > /dev/null" takes well over a day. So it seems that xfsdump is very efficient at reading from disk.

Another thing to take into consideration is latency. If the hosts are on the same LAN, then life is good, but if the systems are milliseconds or more away from each other, you start getting side effects from the BDP (bandwidth-delay product), and this can quickly take a multi-gigabit link and turn it into a multi-megabit link.

BBCP supports piping data into and out of the program, allowing for better use of the available bandwidth, so it may be another way to get better performance out of multiple links or out of links with latency issues.
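To put rough numbers on the BDP point, and to show how bbcp's pipe mode could be combined with the xfsdump/xfsrestore idea, here is an illustrative sketch only: the window size, stream count, hosts, and paths are assumptions, and the exact -N/-s/-w flags should be checked against the bbcp man page for your version.

# Back-of-the-envelope BDP: a single TCP stream moves at most about
# (window size / round-trip time), regardless of the raw link speed.
#   64 KiB window at 0.2 ms LAN RTT  ->  ~2.6 Gbit/s (the link is the limit)
#   64 KiB window at 20 ms WAN RTT   ->  ~26 Mbit/s  (latency is the limit)
# So a multi-gigabit path with WAN latency behaves like a tens-of-megabit
# link per stream unless the window is raised or streams run in parallel.

# Hypothetical bbcp invocation piping xfsdump into xfsrestore over
# several parallel streams with a larger window (flags are assumptions):
bbcp -s 8 -w 8m -N io \
    'xfsdump -J - /dev/mapper/[vg]-[brick]' \
    '[destination fqdn]:xfsrestore -J - [/path/to/brick]'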
On 4/9/19 11:34 AM, Strahil wrote:
> Correct me if I'm wrong, but I have been left with the impression that cluster heal is a multi-process, multi-connection event and would benefit from a bonding mode like balance-alb.
>
> I don't have much experience with xfsdump, but it looks like a single process that uses a single connection, so only LACP would be beneficial.
>
> Am I wrong?
>
> Best Regards,
> Strahil Nikolov

--
Alvin Starr                   ||   land: (905)513-7688
Netvel Inc.                   ||   Cell: (416)806-0133
alvin at netvel.net           ||