Hi everybody,

I have a question about a special way to use rsync. Is it possible to use rsync to retrieve files without comparing against the local directory? For that, rsync would need to know from a catalog or log file what has already been retrieved.

Why I need that: we receive meteorological data from a remote server into a local directory (every 5 min). Once the data has arrived, it is imported by special software, and after the import it is deleted from that directory. The deletion can't be disabled. Normally I would say, OK, download everything again, but we get 80 GB of data per day.

If rsync compares against the local dir, it will download everything again, because the dir is empty. So rsync has to know what has already been downloaded and fetch only the new files WITHOUT the dir compare.

Does anybody know a way to do that? Until now we have been doing this with FTP, but it is very unstable and does not work well, so I would like to do this with rsync, if possible.

Thanks and regards,
Marc
On Wed 14 Oct 2009, Marc Mertes wrote:

> Why I need that:
> We receive meteorological data from a remote server to a local directory
> (every 5 min).
> If the data is here, it is imported by a special software, after the
> import it will be deleted from that directory. The deleting can't be
> disabled.
> Normally I would say, ok download all again, but we get 80GB data per day.
>
> If rsync compares the local dir it will download all again, because it's
> empty.
> So rsync has to know what is already downloaded, and only get the new
> files WITHOUT the dir compare.

I would use --remove-source-files, but that will probably need some adjusting on the source end of the transfer.

I've built a simple script that distributes uploaded images etc. to a number of different (load-balanced) webservers. For each webserver I link the uploaded file into a directory, and the webservers fetch the files using --remove-source-files. That ensures that all webservers get their files, and no bulk of data remains on the distribution system.

The principle of your "problem" sounds similar, but as I said, it is probably difficult to implement without the source system being modified.

> Til today we are doing this with ftp, but it's very unstable and not
> working good, so I would like to
> do this with rsync - if possible.

How do you know what files to FTP? With FTP you can also retrieve the full 80GB every time... If you know what files to fetch, that's easy to implement with rsync as well.

You could also consider using --exclude-from to exclude those files you've already seen.

Paul
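The distribution scheme Paul describes could look roughly like the sketch below. It is not his script: the function names, directory layout, and consumer names are hypothetical, and it assumes the upload and the spool directories live on the same filesystem so that hard links are possible.

```python
import os
from pathlib import Path


def spool_for_consumers(upload: Path, spool_root: Path, consumers: list[str]) -> list[Path]:
    """Hard-link one uploaded file into a per-consumer spool directory.

    Each consumer gets its own link, so one consumer removing its link
    (via rsync --remove-source-files) does not affect the others.
    """
    links = []
    for name in consumers:
        target_dir = spool_root / name
        target_dir.mkdir(parents=True, exist_ok=True)
        link = target_dir / upload.name
        if not link.exists():
            os.link(upload, link)  # hard link: needs the same filesystem
        links.append(link)
    return links


def fetch_command(remote_spool: str, local_dir: str) -> list[str]:
    """Build the rsync call a consumer would run against its own spool.

    --remove-source-files deletes each file on the sending side once it
    has been transferred successfully.
    """
    return ["rsync", "-a", "--remove-source-files", remote_spool, local_dir]
```

A consumer would pass the result of `fetch_command("host:/spool/web1/", "/data/incoming/")` to `subprocess.run`; the host and paths here are placeholders. As Paul notes, this only works if the source side can be changed to maintain such spool directories.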
> We receive meteorological data from a remote server to a local directory
> (every 5 min).
> If the data is here, it is imported by a special software, after the
> import it will be deleted from that directory. The deleting can't be disabled.
> Normally I would say, ok download all again, but we get 80GB data per day.
>
> If rsync compares the local dir it will download all again, because it's empty.
> So rsync has to know what is already downloaded, and only get the new
> files WITHOUT the dir compare.
> Does anybody know a way how to realize that?

One way: keep a local copy which rsync can update. That way only the new or changed files get transferred. Then, based on this (e.g. on the rsync output), copy the new files into a separate folder from which they are imported (and deleted).

Another way: as Paul mentioned, you first need to find out which files to copy, e.g. with a remote script that gathers the names of all new files into a text file. You first fetch this file and then feed it to rsync with --files-from.

I thought there was a way to tell rsync to sync only files from a specific period (as cp can do), but I couldn't find it; maybe it is not possible.

bye
Fabi
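A minimal sketch of the --files-from idea, under some assumptions: instead of a remote script, it keeps a local log of names already fetched (the "catalog" Marc asked about) and assumes the list of remote file names is already available by some other means. All function names and paths are hypothetical.

```python
from pathlib import Path


def new_files(remote_listing: list[str], seen_log: Path) -> list[str]:
    """Return the remote names that are not yet recorded in the seen log."""
    seen = set(seen_log.read_text().splitlines()) if seen_log.exists() else set()
    return [name for name in remote_listing if name not in seen]


def write_files_from(names: list[str], list_path: Path) -> None:
    """Write one name per line, the format rsync --files-from reads."""
    list_path.write_text("".join(f"{name}\n" for name in names))


def rsync_command(list_path: Path, remote: str, local: str) -> list[str]:
    """Build the rsync call; paths in the list are relative to `remote`."""
    return ["rsync", "-a", f"--files-from={list_path}", remote, local]


def mark_seen(names: list[str], seen_log: Path) -> None:
    """Append fetched names to the log, after a successful transfer."""
    with seen_log.open("a") as fh:
        for name in names:
            fh.write(f"{name}\n")
```

The intended order is: compute `new_files`, write the list, run the command from `rsync_command` (for example with `subprocess.run(..., check=True)`), and call `mark_seen` only if rsync succeeded, so that a failed transfer is retried on the next run.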
On Wed, Oct 14, 2009 at 2:36 AM, Marc Mertes <mertes at uni-bonn.de> wrote:

> If the data is here, it is imported by a special software, after the
> import it will be deleted from that directory. The deleting can't be
> disabled.

If you know what file just got processed, append its name onto an exclude file (one per line), and then use --exclude-from=EXCLUDE_FILE.

..wayne..
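Wayne's suggestion might be sketched as follows. The function names and paths are hypothetical, and the sketch assumes the files sit at the top level of the transferred directory, so each name is written with a leading "/" to anchor the pattern at the transfer root. File names containing rsync wildcard characters such as `*`, `?` or `[` would need extra care, which is not handled here.

```python
from pathlib import Path


def record_processed(name: str, exclude_file: Path) -> None:
    """Append one processed file name to the exclude file, one per line.

    The leading "/" anchors the pattern at the root of the transfer,
    so only that exact top-level file is excluded.
    """
    with exclude_file.open("a", encoding="utf-8") as fh:
        fh.write(f"/{name}\n")


def rsync_command(exclude_file: Path, remote: str, local: str) -> list[str]:
    """Build an rsync call that skips everything listed in the exclude file."""
    return ["rsync", "-a", f"--exclude-from={exclude_file}", remote, local]
```

The import software (or a wrapper around it) would call `record_processed` for each file just before or after it is deleted locally. With roughly 80 GB arriving per day the exclude file will keep growing, so entries for files that no longer exist on the remote side would probably need to be pruned from time to time.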