Correct me if I'm wrong, but I have been left with the impression that cluster heal is a multi-process, multi-connection event and would benefit from a bonding mode like balance-alb.

I don't have much experience with xfsdump, but it looks like a single process that uses a single connection, so only LACP would be beneficial.

Am I wrong?

Best Regards,
Strahil Nikolov

On Apr 9, 2019 07:10, Aravinda <avishwan at redhat.com> wrote:
> On Mon, 2019-04-08 at 09:01 -0400, Tom Fite wrote:
> > Thanks for the idea, Poornima. Testing shows that xfsdump and
> > xfsrestore are much faster than rsync since they handle small files
> > much better. I don't have extra space to store the dumps, but I was
> > able to figure out how to pipe the xfsdump and restore via ssh. For
> > anyone else that's interested:
> >
> > On the source machine, run:
> >
> > xfsdump -J - /dev/mapper/[vg]-[brick] | ssh root@[destination fqdn] xfsrestore -J - [/path/to/brick]
>
> Nice. Thanks for sharing.
>
> > -Tom
> >
> > On Mon, Apr 1, 2019 at 9:56 PM Poornima Gurusiddaiah <pgurusid at redhat.com> wrote:
> > > You could also try xfsdump and xfsrestore if your brick filesystem
> > > is xfs and the destination disk can be attached locally. This will
> > > be much faster.
> > >
> > > Regards,
> > > Poornima
> > >
> > > On Tue, Apr 2, 2019, 12:05 AM Tom Fite <tomfite at gmail.com> wrote:
> > > > Hi all,
> > > >
> > > > I have a very large (65 TB) brick in a replica 2 volume that
> > > > needs to be re-copied from scratch. A heal will take a very long
> > > > time with performance degradation on the volume, so I investigated
> > > > using rsync to do the brunt of the work.
> > > >
> > > > The command:
> > > >
> > > > rsync -av -H -X --numeric-ids --progress server1:/data/brick1/gv0 /data/brick1/
> > > >
> > > > Running with -H ensures that the hard links in .glusterfs are
> > > > preserved, and -X preserves all of gluster's extended attributes.
> > > >
> > > > I've tested this in my test environment as follows:
> > > >
> > > > 1. Stop glusterd and kill procs
> > > > 2. Move brick volume to backup dir
> > > > 3. Run rsync
> > > > 4. Start glusterd
> > > > 5. Observe gluster status
> > > >
> > > > All appears to be working correctly. Gluster status reports all
> > > > bricks online, all data is accessible in the volume, and I don't
> > > > see any errors in the logs.
> > > >
> > > > Anybody else have experience trying this?
> > > >
> > > > Thanks
> > > > -Tom
>
> --
> regards
> Aravinda
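Spelled out as commands, Tom's five steps might look roughly like the sketch below on the node being rebuilt. This is only an illustration: the volume name gv0, brick path /data/brick1, and peer name server1 are taken from his example, the backup path is hypothetical, and service handling will differ per distribution.

# 1. Stop the gluster management daemon and any remaining brick processes
systemctl stop glusterd
pkill glusterfsd

# 2. Move the stale brick contents out of the way (backup path is hypothetical)
mv /data/brick1/gv0 /data/brick1/gv0.bak

# 3. Copy the brick from the healthy peer, preserving hard links (-H),
#    extended attributes (-X), and numeric owners, including .glusterfs
rsync -av -H -X --numeric-ids --progress server1:/data/brick1/gv0 /data/brick1/

# 4. Bring gluster back up
systemctl start glusterd

# 5. Check that all bricks are online and nothing is pending heal
gluster volume status gv0
gluster volume heal gv0 info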
Alvin Starr
2019-Apr-09 17:32 UTC
[Gluster-users] Rsync in place of heal after brick failure
The performance needs to be compared between the two in a real environment. For example, I have a system where xfsdump takes something like 4 hours for a complete dump to /dev/null, but a "find . -type f > /dev/null" takes well over a day. So it seems that xfsdump is very efficient at reading from disk.

Another thing to take into consideration is latency. If the hosts are on the same LAN, then life is good, but if the systems are milliseconds or more away from each other, you start getting side effects from the BDP (bandwidth-delay product), and this can quickly take a multi-gigabit link and turn it into a multi-megabit link.

BBCP supports piping data into and out of the program, allowing for better use of the available bandwidth, so it may be another way to get better performance out of multiple links or out of links with latency issues.
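To put rough numbers on the BDP point, and to show how bbcp's pipe mode could be combined with the xfsdump/xfsrestore idea, here is an illustrative sketch only: the window size, stream count, hosts, and paths are assumptions, and the exact -N/-s/-w flags should be checked against the bbcp man page for your version.

# Back-of-the-envelope BDP: a single TCP stream moves at most about
# (window size / round-trip time), regardless of the raw link speed.
#   64 KiB window at 0.2 ms LAN RTT  ->  ~2.6 Gbit/s (the link is the limit)
#   64 KiB window at 20 ms WAN RTT   ->  ~26 Mbit/s  (latency is the limit)
# So a multi-gigabit path with WAN latency behaves like a tens-of-megabit
# link per stream unless the window is raised or streams run in parallel.

# Hypothetical bbcp invocation piping xfsdump into xfsrestore over
# several parallel streams with a larger window (flags are assumptions):
bbcp -s 8 -w 8m -N io \
    'xfsdump -J - /dev/mapper/[vg]-[brick]' \
    '[destination fqdn]:xfsrestore -J - [/path/to/brick]'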
On 4/9/19 11:34 AM, Strahil wrote:
> Correct me if I'm wrong, but I have been left with the impression that cluster heal is a multi-process, multi-connection event and would benefit from a bonding mode like balance-alb.
>
> I don't have much experience with xfsdump, but it looks like a single process that uses a single connection, so only LACP would be beneficial.
>
> Am I wrong?
>
> Best Regards,
> Strahil Nikolov

--
Alvin Starr                   ||   land: (905)513-7688
Netvel Inc.                   ||   Cell: (416)806-0133
alvin at netvel.net           ||