On Sun, Jan 24, 2016 at 12:29 PM, <dbonde+forum+rsync.lists.samba.org at gmail.com> wrote:

> On 2016-01-24 03:51, Kevin Korb wrote:
>
>> Are you rsyncing from one to the other? Both of them to somewhere
>> else? One at a time to somewhere else? Why won't you just show your
>> actual command line and an ls -li of the correct source and incorrect
>> target?
>
> Are you trolling me? All the information you ask for above has been
> clearly spelled out in previous messages, messages you have replied to.

Sorry for butting in, but I hope this helps:

The command line you posted earlier reads

% rsync -HzvhErlptgoDW --stats --progress --out-format="%t %f %b" /source/ /destination/

I think Kevin is asking you to write out that /source/ and /destination/ exactly as used on the command line, so that one can better understand what is going on. The issues you're facing are rather unusual, so a more complete description may help figure out what's going on. Sure, you can mask usernames/passwords etc., but do not simplify the source and destination paths.

Also, the description "The destination is a sparse disk image bundle mounted locally (but its 'source file' is on a network storage)" is too cryptic. What kind of network storage? How is it mounted -- NFS? SMB? What kind of sparse disk image? What's a bundle?

Not that I have any clue why the transfer could be so slow or why rsync is not detecting hardlinks in your case (it should, as Kevin initially pointed out), but someone else may be able to shed some light.

Just trying to help,

Selva
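For concreteness, "written out exactly" means something like the following (these paths are invented placeholders, only meant to show the level of detail that helps):

% rsync -HzvhErlptgoDW --stats --progress --out-format="%t %f %b" \
      /Volumes/OldBackupDisk/backups/ /Volumes/BackupImage/backups/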
dbonde+forum+rsync.lists.samba.org at gmail.com
2016-Jan-24 21:48 UTC
Why is my rsync transfer slow?
On 2016-01-24 20:39, Selva Nair wrote:

> Sorry for butting in, but I hope this helps:
>
> The command line you posted earlier reads
>
> % rsync -HzvhErlptgoDW --stats --progress --out-format="%t %f %b" /source/ /destination/
>
> I think Kevin is asking you to write out that /source/ and /destination/
> exactly as used on the command line, so that one can better understand
> what is going on.

That doesn't make sense. Both the source and destination paths contain simple alphanumeric characters, no more, no less. Why would it matter whether the path is /abc/ or /def/ or even /123/?

> The issues you're facing are rather unusual, so a more complete
> description may help figure out what's going on. Sure, you can mask
> usernames/passwords etc., but do not simplify the source and destination
> paths.
>
> Also, the description "The destination is a sparse disk image bundle
> mounted locally (but its 'source file' is on a network storage)" is too
> cryptic. What kind of network storage? How is it mounted -- NFS? SMB?
> What kind of sparse disk image? What's a bundle?

It is exactly as I wrote. On a network volume (A), a "sparse disk image bundle" (B), i.e., a type of disk image used in OS X, is stored. B is then mounted locally (i.e., local to where rsync is run) on a computer (C), where it appears as one of many volumes.

In other words, B is stored on A. A is then mounted (using AFP) on C. C then mounts B (= opens a file on a network volume, but instead of opening, e.g., a spreadsheet in Excel, opening B shows a new volume on the desktop of C) stored on A.

The computer where it is mounted just sees a mounted volume - it can't distinguish between a disk image stored remotely and one stored on the computer's internal hard drive.

I assume you are familiar with the idea of disk images?
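To pin the chain down in concrete terms, the setup described above could be reproduced from a shell on C roughly as follows (server, share and image names are invented placeholders, and in practice A and B may well be mounted via the Finder rather than from the command line):

$ # mount the network volume A on C over AFP (prompts for the password)
$ mkdir -p /Volumes/NetworkVolumeA
$ mount_afp -i "afp://user@fileserver.example/NetworkVolumeA" /Volumes/NetworkVolumeA
$ # attach the sparse disk image bundle B stored on A; its volume then
$ # appears under /Volumes on C like any locally connected disk
$ hdiutil attach /Volumes/NetworkVolumeA/Backups.sparsebundle
$ # rsync then runs against that mounted volume as the destination
$ rsync -HzvhErlptgoDW --stats --progress /source/ /Volumes/Backups/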
On Sun, Jan 24, 2016 at 4:48 PM, <dbonde+forum+rsync.lists.samba.org at gmail.com> wrote:

> That doesn't make sense. Both the source and destination paths contain
> simple alphanumeric characters, no more, no less. Why would it matter
> whether the path is /abc/ or /def/ or even /123/?

Hmm.. I thought you are the one who has been asking for help. It very much matters what exactly your source and destination are.

> I assume you are familiar with the idea of disk images?

There are many different kinds of disk images, just as there are many ways of network mounting. If all you say is "I do rsync /a /b/ and it runs too slow", you are not going to get any useful responses.

Good luck,

Selva
dbonde+forum+rsync.lists.samba.org at gmail.com wrote:

> It is exactly as I wrote. On a network volume (A), a "sparse disk image
> bundle" (B), i.e., a type of disk image used in OS X, is stored. B is then
> mounted locally (i.e., local to where rsync is run) on a computer (C),
> where it appears as one of many volumes.
>
> In other words, B is stored on A. A is then mounted (using AFP) on C. C
> then mounts B (= opens a file on a network volume, but instead of opening,
> e.g., a spreadsheet in Excel, opening B shows a new volume on the desktop
> of C) stored on A.
>
> The computer where it is mounted just sees a mounted volume - it can't
> distinguish between a disk image stored remotely and one stored on the
> computer's internal hard drive.

I wouldn't count on that!

> I assume you are familiar with the idea of disk images?

I think most are familiar with disk images - but not so many with the specific implementations used by OS X.

OS X has the concept of a "bundle". To the user this appears as a single file with its own name and icon. Internally it's a folder tree with a number of files/folders. As a quick test, I've just created a 100M sparse image; here's the contents before I've added any files:

> $ ls -lRh a.sparsebundle/
> total 16
> -rw-r--r-- 1 simon staff 496B 25 Jan 14:36 Info.bckup
> -rw-r--r-- 1 simon staff 496B 25 Jan 14:36 Info.plist
> drwxr-xr-x 8 simon staff 272B 25 Jan 14:36 bands
> -rw-r--r-- 1 simon staff 0B 25 Jan 14:36 token
>
> a.sparsebundle//bands:
> total 34952
> -rw-r--r-- 1 simon staff 2.1M 25 Jan 14:37 0
> -rw-r--r-- 1 simon staff 2.4M 25 Jan 14:36 1
> -rw-r--r-- 1 simon staff 2.0M 25 Jan 14:36 2
> -rw-r--r-- 1 simon staff 912K 25 Jan 14:36 6
> -rw-r--r-- 1 simon staff 8.0M 25 Jan 14:36 b
> -rw-r--r-- 1 simon staff 1.7M 25 Jan 14:36 c

It is **NOT** the same as a unix sparse file! The contents are divided up into chunks, with each chunk stored in a file of its own. I suspect this may also have an impact on performance. As the disk is filled, the "bands" files grow in number and size - with the disk filled, the bands are complete from 0 through c, with all but c being 8M.

As an aside, there is also an unfortunate combination of name and Finder behaviour. If you set the Finder to show file extensions, it will show (e.g. in this case) "a.sparsebundle" - but if the name is a bit longer, it shows the beginning of the name, an ellipsis "...", and the end of the name including the extension. My mother was "a little confused" when she saw a folder on my screen with several "...arsebundle"s!

There are a lot of layers in your setup - any of them (or some combination thereof) could be slowing things down:

- Rsync
- Filesystem on B
- Loopback mount (and associated systems) on B
- AFP between A and B - is the host for A an OS X machine running native AFP, or something like Linux running Netatalk?
- Filesystem on A - inc. sparse bundle file support
- Disk subsystem on A

A few things come to mind ...

1) I am aware that AFP has some performance issues with some combinations of operations - no, I don't know if this is one of them.

2) More importantly, if you look back through the archives, there was a thread not long ago about poor performance of rsync for "very large" file counts - and 45 million is "large". I didn't pay much attention, but IIRC the originator of that thread was proposing some alterations to improve things.

3) While rsync is designed to operate efficiently over slow/high latency links, 100MBps is always going to have an impact on throughput.

As an experiment, can you mount the disk of A locally on B?
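For reference, a test bundle like the one above can be created and inspected with hdiutil along these lines (the names are arbitrary, and the band files you see will differ depending on what has been written to the image):

$ hdiutil create -size 100m -type SPARSEBUNDLE -fs HFS+ -volname Test a.sparsebundle
$ hdiutil attach a.sparsebundle      # the HFS+ volume mounts as /Volumes/Test
$ ls -lRh a.sparsebundle/            # Info.plist, token and the bands/ directory of up-to-8M chunks
$ hdiutil detach /Volumes/Test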
Shut down the system hosting A and put it in FireWire Target Mode, then connect it to B - A's disk then appears as a local FireWire disk on B. This will show whether AFP has any bearing on performance. If the computer hosting A doesn't support target mode then you're a bit stuffed - but there may be other options. Or alternatively, connect the external disk directly to A's host rather than to B. Either way, you can then run rsync as a local copy without the network element.

But as I write this, something far far more important comes to mind.

Files on HFS disks are not like files on other filesystems (though I believe NTFS has a feature which adds similar complications). I am not sure exactly how rsync handles this - I do recall that Apple's version adds support for the triplet of "metadata + resource fork + data fork". From memory, this results in many files getting re-copied every time regardless of whether they were modified or not. Memory is only vague, but I think it was something to do with comparing source and dest not working properly when one end is looking at the "whole file" and the other is only looking at one part.

I would suggest doing a test copy using only a small part of the tree, then doing the copy again (so no files have actually changed) and watching carefully what gets copied. I vaguely recall (from a looong time ago) that any file with a resource fork was re-copied each time even though it hadn't changed. If this is the case, and I'm not misremembering, then it's possible that the combination of "rsync not handling very large file sets well" and "resource forks causing issues" could be (at least partly) behind your performance problem.

Another test I'd be inclined to try would be to copy things one restore point at a time. As you'll be aware, each restore point is its own timestamped directory - hardlinked to the previous one for files that haven't changed. Try rsyncing only the last one, then the last two, then the last 3, then the last 4, and so on. You can use --include and --exclude to do this. See how performance varies as the number of included trees increases - I suspect it increases more than linearly given the work involved in tracking hard-links.
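As a rough sketch of the two tests suggested above (the timestamped directory names are invented - substitute the real restore point names - and -aH stands in for whatever option set you have been using):

$ # test copy of one restore point, then repeat with -n -i (dry run, itemize changes):
$ # anything listed on the second run would be re-copied despite being unchanged
$ rsync -aH /source/2016-01-24-120000/ /destination/testcopy/
$ rsync -aHni /source/2016-01-24-120000/ /destination/testcopy/

$ # copy only the most recent restore points: include them at the top level and
$ # exclude every other top-level entry (contents of included dirs still transfer)
$ rsync -aH --stats \
      --include='/2016-01-24-*' --include='/2016-01-23-*' \
      --exclude='/*' \
      /source/ /destination/

Timing each run as more restore points are included should show whether the slowdown grows faster than linearly.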