Yes, I've read the FAQ, just hoping for a boon...

I'm in the process of relocating a large amount of data from one NFS server to another (Network Appliance filers). The process I've been using is to NFS-mount both source and destination to a server (Solaris 8) and simply use rsync -a /source/ /dest . It works great except for the few that have > 10 million files. On these I get the following:

    ERROR: out of memory in make_file
    rsync error: error allocating core memory buffers (code 22) at util.c(232)

It takes days to resync these after the cutover with tar, rather than the few hours it would take with rsync -- this is making for some angry users. If anyone has a work-around, I'd very much appreciate it.

Thanks,
Mark Crowder
Texas Instruments, KFAB Computer Engineering
email: m-crowder@ti.com
tim.conway@philips.com
2002-Oct-21 17:53 UTC
Any work-around for very large number of files yet?
Mark:  You are S.O.L.  There's been a lot of discussion on the subject, and so far, the only answer is faster machines with more memory.

For my own application, I have had to write my own system, which can be best described as find, sort, diff, grep, cut, tar, gzip. It's a bit more complicated than that, and the find, sort, diff, grep, and cut are implemented in perl code. It also gets to use some assumptions I can make about our data, concerning file naming, dating, and sizing, and has no replacement for rsync's main magic, the incremental update of a file.

Nonetheless, a similar approach might do well for you, as chances are, most of your changes are the addition and removal of files, with changes to existing files always entailing a change in size and/or timestamp.

Tim Conway
conway.tim@sphlihp.com reorder name and reverse domain
303.682.4917 office, 303.921.0301 cell
Philips Semiconductor - Longmont TC
1880 Industrial Circle, Suite D
Longmont, CO 80501
Available via SameTime Connect within Philips, caesupport2 on AIM
"There are some who call me.... Tim?"
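Tim doesn't post his code, and his real system is perl with extra assumptions about naming, dating, and sizing. As a rough illustration only, the list/diff idea he describes might be sketched in shell like this; the paths, temp files, and listing format below are assumptions, not his implementation, and it assumes GNU find (for -printf), GNU tar (for -T), and file names without whitespace:

    #!/bin/sh
    # Rough sketch of a list/sort/diff/copy cycle -- not Tim's actual system.
    SRC=/source
    DST=/dest

    # 1. List every file with its size and mtime, relative to each tree root.
    ( cd "$SRC" && find . -type f -printf '%p %s %T@\n' | sort ) > /tmp/src.lst
    ( cd "$DST" && find . -type f -printf '%p %s %T@\n' | sort ) > /tmp/dst.lst

    # 2. Lines only in the source listing are files that are new or whose
    #    size/mtime changed.  (Deleted files would show up with comm -13.)
    comm -23 /tmp/src.lst /tmp/dst.lst | awk '{print $1}' > /tmp/changed.lst

    # 3. Copy just those files, preserving permissions and times.
    ( cd "$SRC" && tar cf - -T /tmp/changed.lst ) | ( cd "$DST" && tar xpf - )

Unlike rsync, this never builds the whole file list in memory at once; the trade-off is that changed files are copied whole, with no incremental update.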
On Mon, Oct 21, 2002 at 09:37:45AM -0500, Crowder, Mark wrote:
> It works great except for the few that have > 10 million files.
> On these I get the following:
>
> ERROR: out of memory in make_file
> rsync error: error allocating core memory buffers (code 22) at util.c(232)

Sorry. If you want to use rsync you'll need to break the job up into manageable pieces. If, and only if, mod_times reflect updates (most likely) you will get better performance in this particular case using find|cpio.

After it uses the meta-data to pick candidates, rsync will read both the source and destination files to generate the checksums. This means that your changed files will be pulled in their entirety across the network twice before even starting to copy them. --whole-file will disable that part. Rsync is at a severe disadvantage when running on nfs mounts; nfs->nfs is even worse.

In the past I found that using find was quite good for this. Use touch to create a file with a mod_time just before you started the last sync. Then from inside $src run

    find . -newer $touchfile -print | cpio -pdm $dest

Without the -u option to cpio it will skip (and warn about) any files where the mod_dates haven't changed, but that is faster than transferring the file. The use of the touchfile is, in my opinion, better than -mtime and related options because it can have been created as part of the earlier cycle and it is less prone to user error.

--
________________________________________________________________
J.W. Schultz            Pegasystems Technologies
email address:          jw@pegasys.ws

Remember Cernan and Schmitt
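Neither suggestion above comes with a full script, so here are minimal sketches of both; the paths, loop variable, and touchfile locations are placeholders, not anything from the original mails.

Breaking the job into pieces can be as simple as one rsync per top-level subdirectory, which keeps each run's in-memory file list small:

    # One rsync per top-level directory of the volume.
    # (Files sitting directly in /source itself are not covered by this loop.)
    cd /source || exit 1
    for d in */ ; do
        rsync -a "/source/$d" "/dest/$d"
    done

And one incremental pass of the touchfile + find|cpio cycle might look like this:

    #!/bin/sh
    # $SRC, $DST and the touchfile paths are placeholders.
    SRC=/source
    DST=/dest
    TOUCHFILE=/var/tmp/last_sync    # created by the previous pass

    # Stamp the start of THIS pass before scanning, so anything modified
    # while the copy is running gets picked up next time.
    touch /var/tmp/this_sync

    # Copy everything newer than the previous pass: -p pass-through mode,
    # -d create directories as needed, -m preserve modification times.
    cd "$SRC" && find . -newer "$TOUCHFILE" -print | cpio -pdm "$DST"

    # Promote the new timestamp only after the copy has finished.
    mv /var/tmp/this_sync "$TOUCHFILE"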
JW (and others),

Thanks for the input. --whole-file did indeed allow it to reach the failure point faster... I've been experimenting with find/cpio, and there's probably an answer there.

Thanks Again,
Mark

-----Original Message-----
From: jw schultz [mailto:jw@pegasys.ws]
Sent: Monday, October 21, 2002 4:27 PM
To: rsync@lists.samba.org
Subject: Re: Any work-around for very large number of files yet?
jw> In the past i found that using find was quite good for this.
jw> Use touch to create a file with a mod_time just before you
jw> started the last sync.  Then from inside $src run
jw>     find . -newer $touchfile -print|cpio -pdm $dest

For pruning, how about adding such a feature to rsync? Would it be difficult?

    --exclude-older=SECONDS   exclude files older than SECONDS before now

    --ignore-older=SECONDS    ignore any operations on files older than
                              SECONDS before now; unlike --exclude-older,
                              these files are not affected by --include
                              rules or --delete-excluded

--
MARUYAMA Shinichi <marya@st.jip.co.jp>
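Neither flag exists in rsync; if something along these lines were added, usage on Mark's case would presumably look like the following. This is purely illustrative -- the option name comes from the proposal above and the 86400-second cutoff is an arbitrary example:

    # Hypothetical -- --ignore-older is a proposed option, not a real rsync flag.
    # Only files modified in the last 24 hours (86400 s) would be considered,
    # keeping the in-memory file list small between full syncs.
    rsync -a --ignore-older=86400 /source/ /dest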