Joe Julian
2015-Apr-27 22:56 UTC
[Gluster-users] Disastrous performance with rsync to mounted Gluster volume.
On 04/27/2015 03:00 PM, Ernie Dunbar wrote:
> On 2015-04-27 14:09, Joe Julian wrote:
>>
>> I've also noticed that if I increase the count of those writes, the
>> transfer speed increases as well:
>>
>> 2097152 bytes (2.1 MB) copied, 0.036291 s, 57.8 MB/s
>> root at backup:/home/webmailbak# dd if=/dev/zero of=/mnt/testfile count=2048 bs=1024; sync
>> 2048+0 records in
>> 2048+0 records out
>> 2097152 bytes (2.1 MB) copied, 0.0362724 s, 57.8 MB/s
>> root at backup:/home/webmailbak# dd if=/dev/zero of=/mnt/testfile count=2048 bs=1024; sync
>> 2048+0 records in
>> 2048+0 records out
>> 2097152 bytes (2.1 MB) copied, 0.0360319 s, 58.2 MB/s
>> root at backup:/home/webmailbak# dd if=/dev/zero of=/mnt/testfile count=10240 bs=1024; sync
>> 10240+0 records in
>> 10240+0 records out
>> 10485760 bytes (10 MB) copied, 0.127219 s, 82.4 MB/s
>> root at backup:/home/webmailbak# dd if=/dev/zero of=/mnt/testfile count=10240 bs=1024; sync
>> 10240+0 records in
>> 10240+0 records out
>> 10485760 bytes (10 MB) copied, 0.128671 s, 81.5 MB/s
>>
>> This is correct: there is overhead with small files, and the smaller the
>> file, the less throughput you get. That said, since the files are smaller
>> you should get more files per second but fewer MB per second. I have found
>> that below 16k, changing the file size doesn't matter; you will get the
>> same number of 16k files per second as you do 1k files.
>>
>> The overhead happens regardless. You just notice it more when you're
>> doing it a lot more frequently.
>
> Well, it would be helpful to know what specifically rsync is trying to
> do when it's sitting there making overhead, and whether it's possible
> to tell rsync to avoid doing it and just copy files instead (which it
> does quite quickly).
>
> I suppose technically speaking it's an rsync-specific question, but
> it's all about making rsync and glusterfs play nice, and we pretty
> much all need to know that!

Yes, that's very rsync specific. Rsync not only checks the file's
metadata, it also does a hash comparison.

Each lookup() of each file requires a lookup from *each* replica
server. Lookups are triggered on open, or even fstat. Since rsync
requests the stat of every file for comparison, this costs a little
extra network time. The client queries all the replicas in case one is
out of date, to ensure it returns accurate results (and to heal a
replica if one needs it).

After building the list of files that differ between the source and the
target, rsync copies each file to a temporary filename. After completing
the temporary file, rsync renames the temporary to the target filename.
This has the disadvantage of putting the target file on the "wrong" dht
subvolume, because the hash of the temporary filename is different from
that of the target filename.

rsync's network transfer is over ssh by default, so you're also paying
for encryption and buffer overhead.

The optimum process for *initially* copying large numbers of files to
gluster is to blindly copy the list of files from the source to the
target without reading the target. If copying across a network,
maximizing the packet size is also advantageous. I've found tar (+ pv
if you want to monitor throughput) + netcat (or socat) to be much
faster than rsync.
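To make that concrete, here is a minimal sketch of the tar + pv + netcat
approach for an initial bulk copy. The host name "backup", the port 7777,
and both paths are placeholders, and the exact netcat flags depend on which
netcat variant is installed:

    # Receiving side: a client with the Gluster volume mounted (placeholder path).
    # Traditional netcat wants -p for the listen port; the OpenBSD variant is "nc -l 7777".
    nc -l -p 7777 | tar -xf - -C /mnt/gluster

    # Sending side: stream the source tree as a tar archive through pv
    # (for a throughput readout) and netcat to the receiver.
    # With traditional netcat you may need -q 1 so the sender exits at EOF.
    tar -cf - -C /data/source . | pv | nc backup 7777

If you do have to run rsync against a Gluster target, the --inplace option
at least avoids the write-to-temporary-then-rename step described above, so
the file's dht placement is calculated from its real name (in exchange you
give up rsync's usual atomic replacement of the destination file).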
David Robinson
2015-Apr-29 22:20 UTC
[Gluster-users] Disastrous performance with rsync to mounted Gluster volume.
Joe,

I switched my mounts from FUSE to NFS and the rsync is traversing the
filesystem 5x faster. Previously I had:

rsync -axv --numeric-ids --inplace -e "/root/hpnssh_for_backup/bin/ssh -p222 -ax -oNoneSwitch=yes -oNoneEnabled=yes" --delete --whole-file gfsib02b:/homegfs /backup/backup.0

where gfsib02b is one of the gluster nodes and homegfs is a FUSE-mounted
volume. /backup is a FUSE mount on my backup machine. I changed this to:

rsync -axv --numeric-ids --inplace -e "/root/hpnssh_for_backup/bin/ssh -p222 -ax -oNoneSwitch=yes -oNoneEnabled=yes" --delete --whole-file gfsib02b:/homegfs_nfs /backup/backup.0

where gfsib02b:/homegfs_nfs is an NFS mount of the same volume.

[root at gfs02b homegfs_nfs]# mount | grep homegfs
gfsib02b.corvidtec.com:/homegfs.tcp on /homegfs type fuse.glusterfs (rw,allow_other,max_read=131072)
gfsib02b.corvidtec.com:/homegfs on /homegfs_nfs type nfs (rw,vers=3,intr,bg,rsize=32768,wsize=32768,addr=10.200.71.2)

Is a 5x speed difference expected between the FUSE and NFS mounts? Note
that very little data is being transferred, and when data is being
transferred the transfer rate seems fine. My issue with gluster and rsync
was that it seemed to take an extremely long time to figure out what to
transfer, and that is largely mitigated by using NFS instead of FUSE on
the server side (the side where the data to be backed up resides). I am
still using the gluster (FUSE) mount for the backup machine, as that
seemed to have a very small effect on the timing.

David
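For anyone wanting to reproduce the comparison, a rough sketch of the two
mounts follows. The server name, mount points, and NFS options are lifted
from the mount output above, and vers=3 reflects that Gluster's built-in NFS
server only speaks NFSv3; treat this as an illustration rather than a recipe:

    # Native (FUSE) glusterfs mount of the volume:
    mount -t glusterfs gfsib02b.corvidtec.com:/homegfs /homegfs

    # NFSv3 mount of the same volume, used for the faster rsync crawl:
    mount -t nfs -o vers=3,intr,bg,rsize=32768,wsize=32768 \
        gfsib02b.corvidtec.com:/homegfs /homegfs_nfs

The NFS client's attribute caching is likely part of why the metadata-heavy
rsync crawl goes so much faster here: repeated stat results can be served
from the kernel's cache instead of each one triggering a lookup on every
replica through the FUSE client, as Joe described.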