We do something similar for our nightly backups (100TB between two
Gluster setups).
Each of our 6 Gluster nodes gets a set of top-level folders (one per
department in the org), and within each of those we parallelise on the
folders at the top level of each major section. That nets us over 200
concurrent rsyncs, which makes the nightly sync finish a lot faster.
I played around with parallel rsync, but could never make it work the
way I wanted. Just doing a simple "ls -d * | while read -r DIR ; do
rsync -a "/$DIR/" "remote:/$DIR/" & done" works out far better.
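
For anyone who wants to copy the idea, here's a rough sketch of that
fan-out (the paths, destination host and job cap are invented for
illustration, and it assumes the department directories already exist on
the destination side):

    #!/bin/bash
    # Sketch only, not our production script: one rsync per
    # <department>/<section> directory, at most $MAX_JOBS at a time.
    SRC_ROOT=/data            # local Gluster mount (placeholder path)
    DEST=backup01:/data       # destination host:path (placeholder)
    MAX_JOBS=20               # cap on concurrent rsyncs

    cd "$SRC_ROOT" || exit 1

    # Pick up the second-level directories; -print0/-0 keeps names with
    # spaces intact, and xargs -P runs up to $MAX_JOBS rsyncs in parallel.
    find . -mindepth 2 -maxdepth 2 -type d -print0 |
        xargs -0 -P "$MAX_JOBS" -I{} rsync -a "{}/" "$DEST/{}/"

The xargs -P cap is the main difference from the while/& loop above: it
keeps the number of simultaneous transfers bounded instead of backgrounding
every directory at once.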
-Dan
----------------
Dan Mons
Unbreaker of broken things
Cutting Edge
http://cuttingedge.com.au
On 9 July 2014 21:42, Alan Orth <alan.orth at gmail.com> wrote:
> Hi,
>
> I recently had a RAID failure on one of my Gluster replicas; luckily the
> other replica was ok, and I could re-sync all the data to the bad node's
> bricks. I used rsync to pre-seed the brick data, rather than having
> Gluster's self-heal daemon try to figure it out.
>
> It turns out I had way more files than I realized, which exposed some
> problems with "traditional" rsync invocation. I found some clever ways
> to optimize the transfer and speed up the process, and wrote up my
> experiences on my blog:
>
> http://mjanja.co.ke/2014/07/parallelizing-rsync/
>
> Hope this helps someone!
>
> --
> Alan Orth
> alan.orth at gmail.com
> http://alaninkenya.org
> http://mjanja.co.ke
> "I have always wished for my computer to be as easy to use as my
telephone; my wish has come true because I can no longer figure out how to use
my telephone." -Bjarne Stroustrup, inventor of C++
> GPG public key ID: 0x8cb0d0acb5cd81ec209c6cdfbd1a0e09c2f836c0