Hello,

The company I work for uses rsync for backups from client computers. Memory
usage is a problem for a lot of them, since they're already busy doing other
important things (databases, web serving, etc).

From the FAQ:

---
out of memory

The usual reason for "out of memory" when running rsync is that you are
transferring a _very_ large number of files. The size of the files doesn't
matter, only the total number of files. As a rule of thumb you should expect
rsync to consume about 100 bytes per file in the file list. This happens
because rsync builds an internal file-list structure containing all the vital
details of each file. rsync needs to hold this structure in memory because it
is being constantly traversed.

I do have a plan for how to rewrite rsync so that it consumes a fixed (small)
amount of memory no matter how many files are transferred, but I haven't yet
found a spare week of coding time to implement it!
---

Unfortunately there's no indication of who needs a spare week of coding time,
or how much a week would cost. Since it's important (to us, and probably to
others) to have a more memory-friendly rsync, and we're thankful for the work
already done, could the person responsible for that comment please respond
(publicly or privately) with a dollar figure for making the necessary changes?
This would be for public consumption; we believe in supporting open source
software.

A CC:'d reply would help (I am not subscribed to the list), but I will also
watch the archives.

Thanks.

--
Matthew S. Hallacy
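As a rough illustration of that rule of thumb, here is a hypothetical
back-of-the-envelope estimate (only the 100-bytes-per-file figure comes from
the FAQ; the file counts are made-up examples):

    # Back-of-the-envelope sketch of the FAQ's rule of thumb: ~100 bytes of
    # file-list memory per file, regardless of file size.
    BYTES_PER_FILE = 100

    def filelist_memory_mb(num_files):
        """Estimated rsync file-list memory, in megabytes."""
        return num_files * BYTES_PER_FILE / (1024 * 1024)

    for n in (100_000, 1_000_000, 10_000_000):
        print("%10d files -> ~%.0f MB" % (n, filelist_memory_mb(n)))

On a backup client that is already busy serving databases or web traffic,
the multi-hundred-megabyte figures at the top of that range are what hurts.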
On Wed, Jul 06, 2005 at 01:13:23AM -0500, Matthew S. Hallacy wrote:
> Unfortunately there's no indication of who needs a spare week of
> coding time, or how much a week would cost.

That's a really old comment, so I'm not sure if it was written by Martin Pool
or Dave Dykstra or someone else. I'm also fairly sure that a week of time
would not suffice to solve this problem in rsync, but you may wish to try to
contact one of those guys and see if they wrote the comment in question and
what they were thinking of as a solution.

Quite a while back I looked into the changes necessary to have rsync perform
a more incremental update of a hierarchy of files, and they required very
extensive changes to the protocol. I coded up a test-bed application which I
jokingly named rZync. This program got to the stage of being able to transfer
and update hierarchies of files, but I never took it to the next stage of
re-examining the design and coding up a next-gen version of rsync. (The
test-bed doesn't have enough features to be a full replacement for rsync for
most people, since options like --delete and --hard-links aren't implemented.)

Working on this is actually something that I'd like to revisit in the near
future. If you have any interest in helping to fund this, please get in touch
with me.

..wayne..
> The company I work for uses rsync for backups from client computers,
> the memory usage is a problem for a lot of them since they're already
> busy doing other important things (databases, web serving, etc).
>
> From the FAQ:
> out of memory
> The usual reason for "out of memory" when running rsync is that you
> are transferring a _very_ large number of files. The size of the
> files doesn't matter, only the total number of files.

One possible scheme would be to store the file info compressed; that might be
a smaller change than rewriting the protocol. Typically gzip gets 3x
compression on ASCII text, and there's no reason the file list couldn't do as
well. It would slow things down, so it would need an option to enable it.
This would work best with streaming compression rather than gzip's
block-oriented style, since otherwise every entry would consume at least the
minimum block size. It _is_ a hack, I admit...
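As a rough sketch of the streaming-versus-block difference (hypothetical
Python, not rsync code; the paths and entry format are invented), one
long-lived compressor shares its dictionary across all entries, so repeated
directory prefixes compress away, while compressing each entry on its own
pays per-entry header overhead:

    # Hypothetical comparison: per-entry compression vs. one streaming
    # compressor fed every file-list entry in turn.
    import zlib

    entries = [("/var/www/site/htdocs/page%06d.html" % i).encode() + b"\0"
               for i in range(100000)]

    # Block-style: each small entry compressed separately, with its own
    # header/flush overhead.
    per_entry_bytes = sum(len(zlib.compress(e)) for e in entries)

    # Streaming: a single compressor whose dictionary spans the whole list.
    comp = zlib.compressobj()
    streamed_bytes = sum(len(comp.compress(e)) for e in entries) + len(comp.flush())

    print("per-entry: %d bytes, streamed: %d bytes"
          % (per_entry_bytes, streamed_bytes))

The catch, of course, is that rsync walks the in-memory file list repeatedly,
so a compressed list would have to be decompressed (or indexed) on each
traversal, which is where the slowdown comes from.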