Hello.

I am launching a cron bash script that does the following:

Day 1
/usr/bin/rsync -aH --link-dest /home/backuper/.BACKUP/0000009/2018-06-25 root at 192.168.1.103:/home/ /home/backuper/.BACKUP/0000009/2018-06-26

Day 2
/usr/bin/rsync -aH --link-dest /home/backuper/.BACKUP/0000009/2018-06-26 root at 192.168.1.103:/home/ /home/backuper/.BACKUP/0000009/2018-06-27

Day 3
/usr/bin/rsync -aH --link-dest /home/backuper/.BACKUP/0000009/2018-06-27 root at 192.168.1.103:/home/ /home/backuper/.BACKUP/0000009/2018-06-28

and so on.

The backup server comes under heavy load when the number of files exceeds several million, because rsync has to scan all the files of the previous day's directory due to the --link-dest option. Is it possible to use the batch-file mechanism so that, when using --link-dest, the metadata file produced today could be reused the following day, avoiding the scan of the folder that --link-dest points at?

Yours faithfully,
Sergey Dugin  mailto:drug at qwarta.ru
QWARTA
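[A minimal sketch of the dated-directories scheme described above. The paths and host IP are taken from the message; the date arithmetic (GNU date with -d) and the variable names are assumptions about the cron environment, not the poster's actual script.]

```bash
#!/bin/bash
# Dated-directories scheme: each day's backup hard-links unchanged
# files against the previous day's directory via --link-dest.

BASE=/home/backuper/.BACKUP/0000009
TODAY=$(date +%F)                 # e.g. 2018-06-26
YESTERDAY=$(date -d yesterday +%F)

/usr/bin/rsync -aH \
    --link-dest "$BASE/$YESTERDAY" \
    root@192.168.1.103:/home/ \
    "$BASE/$TODAY"
```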
I don't believe there is anything you can do with the batch options for this. If you added a --write-batch to each of those runs you would get three batch files that would never be read without a corresponding --read-batch. And if you did use --read-batch, each batch file would only contain the differences between one backup and the backup before it, so rsync would still have to read the previous backup to make sense of the batch (and this chain would continue back to the oldest backup, making the problem worse).

Anyway, what you are asking for sounds a lot like rdiff-backup. I didn't like it myself, but maybe you would.

BTW, my experience with many millions of files vs rsync --link-dest is that running the backup isn't the problem. The problem comes when it is time to delete the oldest backup: an rm -rf took a lot longer than an rsync. If you haven't gotten there yet, maybe you should try one and see whether it is going to be as big a problem as it was for me.

On 06/26/2018 03:02 PM, Дугин Сергей via rsync wrote:
> [quoted message trimmed]

--
Kevin Korb
Systems Administrator
FutureQuest, Inc.
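[For reference, a sketch of how the batch options are normally used, which is why they don't help here: a batch file records the changes of one transfer so they can be replayed against another copy that is identical to the original destination; it is not a metadata index of the destination tree. Hostnames and paths below are illustrative, not from the thread.]

```bash
# Record the changes while updating a first mirror:
rsync -aH --write-batch=/tmp/changes-2018-06-26 /home/ /mirror1/home/

# Replay the same changes onto a second, identical mirror without
# re-reading the source tree:
rsync --read-batch=/tmp/changes-2018-06-26 /mirror2/home/
```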
On Tue, Jun 26, 2018 at 12:02 PM, Дугин Сергей via rsync <rsync at lists.samba.org> wrote:
> [quoted message trimmed]

This isn't really what you were asking, but with the "dated directories" scheme, what happens if one of your machines crashes during a backup? Don't you end up storing a lot more data in the next successful backup?
I don't know how the OP manages their backups. I write out a backupname.current symlink pointing to the new backup once it has completed. That symlink is what I use as the --link-dest parameter and what I would restore from. If a backup is aborted in the middle, doesn't happen at all, or fails, the symlink isn't changed and the last one that worked is still tagged current.

I also name the rsync target dir backupname.incomplete and rename it to backupname.$Date when the backup completes, so that there won't be any extra date-stamped dirs in the list to affect when it is time to delete old backups, and so I won't ever try to restore from an incomplete backup (at least not without knowing that I am doing so).

On 06/26/2018 04:36 PM, Dan Stromberg via rsync wrote:
> [quoted message trimmed]

--
Kevin Korb
Systems Administrator
FutureQuest, Inc.
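[A minimal sketch of the .current / .incomplete scheme described above, reusing the paths from the original question. The variable names and the exact rename/symlink steps are assumptions, not the poster's actual script.]

```bash
#!/bin/bash
set -e

BACKUPS=/home/backuper/.BACKUP/0000009
NAME=home
TODAY=$(date +%F)

# Sync into a .incomplete directory, hard-linking unchanged files
# against whatever backup the .current symlink points to.
/usr/bin/rsync -aH \
    --link-dest "$BACKUPS/$NAME.current" \
    root@192.168.1.103:/home/ \
    "$BACKUPS/$NAME.incomplete"

# Only runs if rsync succeeded (set -e): give the backup its
# date-stamped name and repoint the .current symlink at it.
mv "$BACKUPS/$NAME.incomplete" "$BACKUPS/$NAME.$TODAY"
ln -sfn "$NAME.$TODAY" "$BACKUPS/$NAME.current"
```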
You have to have a script that places a "successful" marker file in the root of the completed rsync, and use that to decide what to do about --link-dest at the top of the script. I use something more like daily.0-daily.7 and monthly.0-monthly.3 for the folders and rotate them daily, but only if the "successful" file exists. If the set does not rotate, the failed rsync from the day before is reused (i.e. I always back up into daily.0, using daily.1 as the --link-dest; a sketch of the rotation follows below). I make a hard-link replica of daily.1 into monthly.0 on the first of each month. That leaves me with 7 days of successful daily backups and 4 months of depth.

--
Larry Irwin
Email: lrirwin at alum.wustl.edu

On 06/26/2018 04:36 PM, Dan Stromberg via rsync wrote:
> [quoted message trimmed]
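[A sketch of the daily.N rotation described above, under the assumption that backups live in a hypothetical /backups/host directory and the "successful" marker is an empty file in the root of daily.0. Not the poster's actual script; the monthly.0-monthly.3 rotation is omitted.]

```bash
#!/bin/bash
set -e
ROOT=/backups/host

# Rotate only if yesterday's run finished and left its marker;
# otherwise reuse the failed daily.0 as the target again.
if [ -e "$ROOT/daily.0/successful" ]; then
    rm -rf "$ROOT/daily.7"                      # drop the oldest copy
    for i in 6 5 4 3 2 1 0; do
        [ -d "$ROOT/daily.$i" ] && mv "$ROOT/daily.$i" "$ROOT/daily.$((i+1))"
    done
    mkdir "$ROOT/daily.0"
fi

# Always back up into daily.0, hard-linking against daily.1, and leave
# the marker only if rsync succeeded.
/usr/bin/rsync -aH --link-dest "$ROOT/daily.1" \
    root@192.168.1.103:/home/ "$ROOT/daily.0/" \
  && touch "$ROOT/daily.0/successful"

# On the first of the month, keep a hard-link replica as monthly.0.
if [ "$(date +%d)" = "01" ]; then
    cp -al "$ROOT/daily.1" "$ROOT/monthly.0"
fi
```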
Hello.

What I need is for today's backup to save the file metadata into a file, so that tomorrow, when creating a backup, I could point --link-dest at that metadata file instead of at a directory. Then rsync would not have to scan the folder that --link-dest refers to; it would simply read the information about that folder from the metadata file. This would greatly reduce the time and the load on the backup server.

I do not delete old backups with rm -rf; I delete the whole ZFS dataset instead. You can also delete via find -delete, and there are other ways.

On 26 June 2018 at 22:47:56:
> [quoted message trimmed]

--
Best regards,
Дугин Сергей  mailto:drug at qwarta.ru
QWARTA
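[Two ways to drop the oldest backup faster than a plain rm -rf, as mentioned above. The dataset and path names are hypothetical examples, not taken from the thread.]

```bash
# If each day's backup lives in its own ZFS dataset, destroying the
# dataset removes everything in one operation:
zfs destroy backuppool/0000009/2018-06-25

# Otherwise, find -delete walks the tree and unlinks everything,
# directories included, depth-first:
find /home/backuper/.BACKUP/0000009/2018-06-25 -delete
```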