Tony Maro
2013-Jul-08 18:04 UTC
[Gluster-users] Possible to preload data on a georeplication target? First sync taking forever...
I have about 4 TB of data in a Gluster mirror configuration on top of ZFS, mostly consisting of 20KB files. I've added a georeplication target (using an SSH destination) and the sync started OK. It ran pretty quickly for a while, but it has now taken over 2 weeks to sync just under 1 TB of data to the target server, and it appears to be getting slower. The two servers are connected to the same switch on a private Gigabit Ethernet segment, so the bottleneck is not the network; I haven't physically moved the georeplication target to the other end of the WAN yet. I really don't want to wait another 6 weeks (or worse) for the initial full sync to finish before shipping the server out.

Is it possible to manually rsync the data over myself to give it its starting position? If so, what steps should I take? In other words: break replication, delete the index, any special rsync flags I should use when copying the data over myself, etc.?

For reference, before anyone asks, the source brick that's running the georeplication is reporting the following:

top - 14:01:55 up 3:55, 1 user, load average: 0.31, 0.74, 0.85
Tasks: 221 total, 1 running, 220 sleeping, 0 stopped, 0 zombie
Cpu(s): 3.6%us, 2.9%sy, 0.0%ni, 83.2%id, 10.2%wa, 0.0%hi, 0.1%si, 0.0%st
Mem: 12297148k total, 12147752k used, 149396k free, 11684k buffers
Swap: 93180k total, 0k used, 93180k free, 3201636k cached

  PID USER PR NI VIRT RES SHR  S %CPU %MEM TIME+    COMMAND
 1711 root 20  0 835m 28m 2484 S  155  0.2 38:25.90 glusterfsd

CPU usage for glusterfsd bounces between around 20% and 160%.

Thanks,
Tony
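For illustration, this is roughly the kind of manual copy I had in mind. The paths and host name below are made up, and the flag choice is only my guess at what would preserve enough metadata (ownership, hardlinks, ACLs, xattrs); I don't know whether geo-replication will actually pick up from a pre-populated slave:

    # Hypothetical: copy from a FUSE mount of the master volume on this box
    # to a FUSE mount of the slave volume on the target box, over SSH.
    rsync -aHAX --numeric-ids --progress /mnt/master-vol/ geo-target:/mnt/slave-vol/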
Jan Hudoba
2013-Jul-19 16:52 UTC
[Gluster-users] Possible to preload data on a georeplication target? First sync taking forever...
Hi,

I don't know whether you can preload the data (if you can, I'd like to know how as well). But maybe you can try setting the following option and then stopping and starting geo-replication; it can be faster:

    gluster volume geo-replication $VOL ssh://$USER@$GEO_HOST::$GEO_VOL config sync-jobs 4

I'd suggest choosing the number based on the CPU cores / RAID disks on the slave. Beware that it will put a correspondingly higher load on the slave.

--
S pozdravom / Yours sincerely
Ing. Jan Hudoba
http://www.facebook.com/jan.hudoba
http://www.jahu.sk
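P.S. For completeness, the stop / reconfigure / start cycle I mean, with $VOL, $USER, $GEO_HOST and $GEO_VOL as placeholders (the status command is only there to confirm the session came back up):

    gluster volume geo-replication $VOL ssh://$USER@$GEO_HOST::$GEO_VOL stop
    gluster volume geo-replication $VOL ssh://$USER@$GEO_HOST::$GEO_VOL config sync-jobs 4
    gluster volume geo-replication $VOL ssh://$USER@$GEO_HOST::$GEO_VOL start
    gluster volume geo-replication $VOL ssh://$USER@$GEO_HOST::$GEO_VOL status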