rsync -avnp remote::gif/ `find /home/www/html/ -maxdepth 1 -name "*.[j,g][pg,if]*"` /tmp/

If I run this on the local machine, the rsync server, it takes this long:

---> root@server (0.34)# time find /home/www/html/ -maxdepth 1 -name "*.[j,g][pg,if]*" -type f
/home/www/html/comparestores_2.jpg
/home/www/html/home.jpg
/home/www/html/comparestores_3.jpg
/home/www/html/specialoffer_apparel.jpg
/home/www/html/bike_gary.gif
/home/www/html/gary_bike.gif
/home/www/html/none.gif

real 0m0.015s
user 0m0.000s
sys 0m0.000s

However if I run it from a client, it will take forever. Too much to run, it seems. Our directory structure has well over a million files. And this is just one directory under /home/www/html. We can't afford the cpu and system load to traverse everything, this is why I am using the find command. Shouldn't this work? It does come back with retrieving the list from the remote server.

--
Jason G Helfman
Network Administrator
BizRate.com, Co-Owner
310.754.1264 desk
310.466.2319 cell
Fingerprint: DA13 C109 072B CC12 B568 8D84 E9A2 6A7D C479 BCFB
GnuPG http://www.gnupg.org Get Private! 1024D/D75E0A36
>rsync -avnp remote::gif/ `find /home/www/html/ -maxdepth 1
>-name "*.[j,g][pg,if]*"` /tmp/
>
>If I run this on the local machine, the rsync server, it takes this
>long:
>
>---> root@server (0.34)# time find /home/www/html/ -maxdepth 1
>-name "*.[j,g][pg,if]*" -type f
>/home/www/html/comparestores_2.jpg
>/home/www/html/home.jpg
>/home/www/html/comparestores_3.jpg
>/home/www/html/specialoffer_apparel.jpg
>/home/www/html/bike_gary.gif
>/home/www/html/gary_bike.gif
>/home/www/html/none.gif
>
>real 0m0.015s
>user 0m0.000s
>sys 0m0.000s
>
>However if I run it from a client, it will take forever. Too much to
>run, it seems. Our directory structure has well over a million files.
>And this is just one directory under /home/www/html. We can't afford the
>cpu and system load to traverse everything, this is why I am using the
>find command. Shouldn't this work? It does come back with retrieving the
>list from the remote server.

What OS are you running on both systems?

AFAIK linux with ext2/ext3 has (currently) severe problems with large directories (>5000 files). [Work is done to avoid that: see ext2 directory index patch at http://kernelnewbies.org/~phillips/ ]

Maybe that's your problem.

(In my - and strictly my - opinion, a directory with that many files is "unmaintainable". I'd do some partitioning - and if it's only sorting by filetype (.html, .gif, .jpg, ...))

Regards,

Phil
>It is a reiserfs system on the client, and ext2 on the rsync server.
>
>The file system is organized lovely. Just a ton of files.

Sorry, the homepage is at http://people.nl.linux.org/~phillips/htree/

Regards,

Phil
Martin Pool
2001-Nov-23 17:35 UTC
solution: Re: rsync takes way too long to perform this....
On 14 Nov 2001, Jason Helfman <jhelfman@bizrate.com> wrote:

> rsync -avnp remote::gif/ `find /home/www/html/ -maxdepth 1
> -name "*.[j,g][pg,if]*"` /tmp/

I can't see how this syntax will work. The shell will expand this to:

    rsync -avnp remote::gif/ /home/www/html/foo.jpg /home/www/html/bar.gif (more filenames here) /tmp/

which I think from rsync's point of view means: copy all of remote::gif/, plus a pile of local files, into /tmp/. Is that really what you want?

Secondly, when running from an rsync daemon (two-colon syntax) rsync should never expand remote backtick substitutions, because we don't want to allow weakly-authenticated remote users to run commands on the server.

Incidentally, your shell syntax is strange. (Not a flame, just a point of information. :-) Your wildcard means: "anything containing a dot then any character from the set (j, comma, or g), then any character from the set (p, g, comma, i or f)." If you wanted all images then it might be better to say

    find /home/www/html -maxdepth 1 '(' -name '*.jpg' -o -name '*.gif' ')'

Note also that -n means "don't copy anything, just list filenames".

What is it that you actually want to achieve? If you want all image files and directories to come back to /tmp/remoteimages, then try

    rsync -avp remote::gif/ --include '*.jpg' --include '*.gif' \
        --include '*/' --exclude '*' /tmp/remoteimages/

If you just want a list of them:

    rsync -avp remote::gif/ --include '*.jpg' --include '*.gif' \
        --include '*/' --exclude '*'

--
Martin
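To see the difference between the two find patterns discussed above, here is a small sketch that can be run in a scratch directory. The directory and the file names (odd.,p, notes.gfx and so on) are made up for illustration and are not from the original post; it assumes a find that supports -maxdepth, as the commands in this thread already do.

```shell
#!/bin/sh
# Scratch directory and file names are hypothetical, for illustration only.
demo=$(mktemp -d)
touch "$demo/home.jpg" "$demo/none.gif" "$demo/odd.,p" "$demo/notes.gfx" "$demo/index.html"

# Pattern from the original post: a dot, one character from the set {j , g},
# one character from the set {p g , i f}, then anything at all.
loose=$(find "$demo" -maxdepth 1 -type f -name '*.[j,g][pg,if]*' | sort)

# Explicit alternatives: only names ending in .jpg or .gif.
strict=$(find "$demo" -maxdepth 1 -type f '(' -name '*.jpg' -o -name '*.gif' ')' | sort)

loose_count=$(printf '%s\n' "$loose" | wc -l | tr -d ' ')
strict_count=$(printf '%s\n' "$strict" | wc -l | tr -d ' ')

echo "bracket pattern matched $loose_count files:"
printf '%s\n' "$loose"
echo "explicit pattern matched $strict_count files:"
printf '%s\n' "$strict"

rm -rf "$demo"
```

With these made-up names, the bracket pattern also picks up odd.,p and notes.gfx, while the explicit form returns only home.jpg and none.gif.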