Nathan Aldridge
2014-Dec-13 19:02 UTC
[Gluster-users] Is it possible to start geo-replication between two volumes with data already present in the slave?
Hi,

I have a large volume that I want to geo-replicate using Gluster (> 1 TB). I have the data rsynced on both servers and up to date. Can I start a geo-replication session without having to send the whole contents over the wire to the slave, since it's already there? I'm running Gluster 3.6.1.

I've read through all the various on-line documents I can find, but nothing pops out that describes this scenario.

Thanks in advance,

Nathan Aldridge
Aravinda
2014-Dec-15 08:33 UTC
[Gluster-users] Is it possible to start geo-replication between two volumes with data already present in the slave?
When you sync files directly with rsync, rsync creates any missing files on the slave itself. Because of this, those files end up on the slave with a different GFID than the one on the master, and that mismatch prevents geo-rep from continuing to sync to the slave. Before starting geo-replication, fix the GFID mismatches as a prerequisite.

Run this on a master node (from the glusterfs source tree, if you have downloaded the glusterfs source):

    cd $GLUSTER_SRC/extras/geo-rep
    sh generate-gfid-file.sh localhost:<MASTER VOL NAME> $PWD/get-gfid.sh /tmp/master-gfid-values.txt

Copy the generated file to the slave:

    scp /tmp/master-gfid-values.txt root@slavehost:/tmp/

Run this script on the slave:

    cd $GLUSTER_SRC/extras/geo-rep
    sh slave-upgrade.sh localhost:<SLAVE VOL NAME> /tmp/master-gfid-values.txt $PWD/gsync-sync-gfid

Once all these steps complete, the GFIDs in the master volume match the GFIDs in the slave.

Now update the stime xattr on each brick root in the master volume. Enclosed is a Python script that sets the stime of a brick root to the current time; run it on each master node, once per brick:

    sudo python set_stime.py <MASTER VOLUME ID> <SLAVE VOLUME ID> <BRICK PATH>

For example:

    sudo python set_stime.py f8c6276f-7ab5-4098-b41d-c82909940799 563681d7-a8fd-4cea-bf97-eca74203a0fe /exports/brick1

You can get the master and slave volume IDs using the gluster volume info command:

    gluster volume info <MASTER VOL> | grep -i "volume id"
    gluster volume info <SLAVE VOL> | grep -i "volume id"

Once this is done, create the geo-rep session using the force option:

    gluster volume geo-replication <MASTERVOL> <SLAVEHOST>::<SLAVEVOL> create push-pem force

Start geo-replication:

    gluster volume geo-replication <MASTERVOL> <SLAVEHOST>::<SLAVEVOL> start force

From now on, geo-rep picks up only new changes and syncs them to the slave. Let me know if you face any issues.

--
regards
Aravinda
http://aravindavk.in

[Attachment: set_stime.py]
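[The set_stime.py attachment itself was scrubbed from the archive. Below is a minimal sketch of such a script, not the original attachment: it assumes the stime xattr key is named trusted.glusterfs.<MASTER_UUID>.<SLAVE_UUID>.stime and that the value is a pair of unsigned 32-bit integers (seconds, nanoseconds) in network byte order, as gsyncd stores it. Verify both against your gluster version before relying on it.]

    #!/usr/bin/env python3
    # Hypothetical sketch of a set_stime.py-style helper (the original
    # attachment is not available in the archive).
    #
    # Sets the geo-replication stime xattr on a brick root to "now" so the
    # first crawl skips data that is already present on the slave.
    #
    # Assumptions (not taken from the original attachment):
    #   - xattr key: trusted.glusterfs.<MASTER_UUID>.<SLAVE_UUID>.stime
    #   - value:     two unsigned 32-bit ints (seconds, nanoseconds),
    #                packed in network byte order
    import os
    import struct
    import sys
    import time

    def main():
        if len(sys.argv) != 4:
            sys.exit("usage: set_stime.py <MASTER VOLUME ID> "
                     "<SLAVE VOLUME ID> <BRICK PATH>")

        master_id, slave_id, brick = sys.argv[1:4]
        key = "trusted.glusterfs.%s.%s.stime" % (master_id, slave_id)

        now = time.time()
        sec = int(now)
        nsec = int((now - sec) * 1e9)
        value = struct.pack("!II", sec, nsec)  # (seconds, nanoseconds) pair

        # trusted.* xattrs require root; the brick path must be a local
        # brick root on this node.
        os.setxattr(brick, key, value)
        print("set %s on %s" % (key, brick))

    if __name__ == "__main__":
        main()

Run it as root, once per brick on each master node, exactly as in the usage example above.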