Considering using rsync on a couple systems for backup, I was wondering if it's possible, and if so how difficult is it, to delete files which have been backed up (in order to save space on the backup media). Anyone with experience doing this?
On Fri, Jan 11, 2013 at 11:29 AM, ken <gebser at mousecar.com> wrote:
> Considering using rsync on a couple systems for backup, I was wondering
> if it's possible, and if so how difficult is it, to delete files which
> have been backed up (in order to save space on the backup media).
>
> Anyone with experience doing this?

Can you be more specific about the problem you are trying to solve?
Backuppc normally expires/deletes backups at a specified rate by itself,
and it only stores one copy of any identical file regardless of how many
times it is backed up. You aren't going to save any space by deleting
old copies of something that is still on any target you are backing up.

--
Les Mikesell
lesmikesell at gmail.com
On Jan 11, 2013, at 10:29 AM, ken wrote:
> Considering using rsync on a couple systems for backup, I was wondering
> if it's possible, and if so how difficult is it, to delete files which
> have been backed up (in order to save space on the backup media).
>
> Anyone with experience doing this?

----
rsync has --delete, --delete-before, and --delete-after options, which I
do use sometimes. Do a 'man rsync' for details.

Craig
On Fri, Jan 11, 2013 at 12:29 PM, ken <gebser at mousecar.com> wrote:
> Considering using rsync on a couple systems for backup, I was wondering
> if it's possible, and if so how difficult is it, to delete files which
> have been backed up (in order to save space on the backup media).
>
> Anyone with experience doing this?

It's certainly feasible for a fairly lackluster backup solution (e.g.
"gonna rebuild the machine, want all of /home saved to another machine:
rsync, then reinstall to try $new distro!"), but I wouldn't recommend
rsync for production-grade backups; it'd get very complex very quickly
trying to figure out a way to do versioning. rsync would be really good
for "oops, I removed file X, but I'd copied it over to machine M, so I
can recover", but not very good at "someone changed this file 4 days ago
and now it doesn't do what I want, I'd like to go back to a previous
version". At least in my estimation.

--
Even the Magic 8 ball has an opinion on email clients: Outlook not so good.
On 01/11/2013 07:08 PM, Reindl Harald wrote:
> ....
> different versions of the backup
> daily, weekly, monthly

Ah, yes, in that kind of scenario hard links would be useful.

> unchanged files are replaced with hard-links
> the destination files are virtually on the same place

And so then could changes to a file be recorded in the daily version as
a diff against the weekly?
On Sat, Jan 12, 2013 at 3:46 AM, ken <gebser at mousecar.com> wrote:
> On 01/11/2013 07:08 PM, Reindl Harald wrote:
>> ....
>> different versions of the backup
>> daily, weekly, monthly
>
> Ah, yes, in that kind of scenario hard links would be useful.

Backuppc does it by matching the content, so it can pool the duplicate
data even if the copies are found in different places or on different
backup targets.

>> unchanged files are replaced with hard-links
>> the destination files are virtually on the same place
>
> And so then could changes to a file be recorded in the daily version as
> a diff against the weekly?

I think rdiff-backup can save deltas. Rsync itself and backuppc will
send deltas over the network but end up reconstructing a complete copy
of the target file even for small differences. At least backuppc can
compress the resulting file for storage.

--
Les Mikesell
lesmikesell at gmail.com
On 01/13/2013 02:59 AM, Reindl Harald wrote:
>> There is no common
>> mechanism for making files and databases consistent and making a
>> snapshot for backups. Admins must do this on their own. If you aren't
>> actively taking steps to make your backups consistent, they aren't.
> open-vm-tools 2012.12.26 changes:
> * vmsync is no longer being compiled on newer (3.0+) Linux
>   kernels since they support FIFREEZE/FITHAW ioctls.

That's only needed to make a filesystem consistent for a snapshot at a
lower level (such as the RAID controller or VM hypervisor). It doesn't
integrate with higher-level components to make files and databases
consistent. So, like I said, Linux falls far short of Windows' backup
infrastructure, and if you aren't actively making your files consistent
for a backup, they won't be. All of the required components are present,
but it's up to admins to do all of the integration work.
On Fri, Jan 11, 2013 12:29:48 PM -0500, ken wrote:
> Considering using rsync on a couple systems for backup, I was
> wondering if it's possible, and if so how difficult is it...

Sorry to step in so late, but I have another question on this very
topic. I have noticed that if I just _change_ the name of a folder,
rsync doesn't realize it. That is, if folder holidays_2013 contains,
say, 1000 pictures of 10 MB each, I rsync it to a remote computer and
then change its name locally to family_holidays_2013, on the next run
rsync:

- deletes the remote holidays_2013 and all its content
- creates a remote family_holidays_2013
- uploads again to it ALL the 1000 pictures of 10 MB each

even if all the "rsyncing" needed would be something equivalent to
"mv holidays_2013 family_holidays_2013" on the remote server. Is it
possible to tell rsync to behave in that way? I think not, but I'd like
to be proven wrong on this.

TIA,
Marco
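[Editor's note: the behaviour Marco describes is easy to reproduce locally, and the usual workaround (suggested later in the thread) is to perform the same rename on both ends before the next run. A sketch with throwaway paths under /tmp:]

```shell
# Set up a synced pair of trees, then rename a directory on the source.
mkdir -p /tmp/renamedemo/src/holidays_2013
echo photo > /tmp/renamedemo/src/holidays_2013/img.jpg
rsync -a /tmp/renamedemo/src/ /tmp/renamedemo/dst/

# Rename on the source only: the next rsync --delete would drop the old
# directory on the destination and re-send every file under the new name.
mv /tmp/renamedemo/src/holidays_2013 /tmp/renamedemo/src/family_holidays_2013

# Workaround: do the same rename on the destination first
# (for a remote destination this would be: ssh host 'mv old new').
mv /tmp/renamedemo/dst/holidays_2013 /tmp/renamedemo/dst/family_holidays_2013

# Now the follow-up run finds everything already in place.
rsync -a --delete /tmp/renamedemo/src/ /tmp/renamedemo/dst/
```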
Use the rsync -d (--dirs) option to synchronize only the directory
entries from source to destination, without recursing into them and
without the files inside the directories:

rsync -v -d root at 192.168.200.10:/var/lib/ .

--
With Thanks & Regards,
Keshaba Mahapatra
Technical Consultant
Complete Open Source Solutions
#512, Aditya Trade Centre
Ameerpet, Hyderabad, 500038
Ph - +91 40 66773365
Mob - +91 7569071776
Reindl Harald wrote:
>
> On 19.01.2013 15:46, Nicolas Thierry-Mieg wrote:
>> M. Fioretti wrote:
>>> On Fri, Jan 18, 2013 08:07:40 AM -0500, SilverTip257 wrote:
>>>> if you really want to eliminate that data being transferred, I
>>>> suppose you could do the extra work and rename the directory at the
>>>> same time on the source and destination. Not ideal in the least.
>>>
>>> Not ideal indeed, but I'll probably do it that way next time that some
>>> renaming like this happens on very large folders. I assume that after
>>> that, I'd also have to launch rsync with the option that says to not
>>> consider modification time.
>>
>> no I don't think you will, since the file modification times won't have
>> changed.
>
> and even if they did - who cares?
>
> * rsync does not transfer unchanged data ever
> * rsync will sync the times to them from the sources
> * so have nearly zero network traffic

Not true: if you change the modification time on a file, by default
rsync will copy the whole file again.

See man rsync:
    Rsync finds files that need to be transferred using a "quick check"
    algorithm (by default) that looks for files that have changed in
    size or in last-modified time.

and yes I've tested this before posting ;-)
to avoid this you need to use --size-only.
On 01/19/2013 10:28 AM, Nicolas Thierry-Mieg wrote:
> Not true: if you change the modification time on a file, by default
> rsync will copy the whole file again

rsync uses an efficient algorithm to compare file contents and transfer
only the differences. Reindl was correct: rsync will use very little
bandwidth in this case.

You can test this by rsyncing a large file from one system to another,
"touch"ing the file, and then rsyncing again. rsync will take a little
while to generate checksums of the data to determine what needs to be
copied, but will not transfer the entire contents of the file.

If you run rsync with the -v flag, it will report the saved bandwidth as
its "speedup". IIRC, this is expressed as the ratio of the size of files
which were detected as not matching based on the given criteria (mtime
and size by default, but possibly by checksum if given -c) to the size
of data that was actually transmitted.
Reindl Harald wrote:
>
> On 19.01.2013 19:28, Nicolas Thierry-Mieg wrote:
>>>> no I don't think you will, since the file modification times won't have
>>>> changed.
>>>
>>> and even if they did - who cares?
>>>
>>> * rsync does not transfer unchanged data ever
>>> * rsync will sync the times to them from the sources
>>> * so have nearly zero network traffic
>>
>> Not true: if you change the modification time on a file, by default
>> rsync will copy the whole file again.
>>
>> See man rsync:
>>     Rsync finds files that need to be transferred using a "quick check"
>>     algorithm (by default) that looks for files that have changed in
>>     size or in last-modified time.
>>
>> and yes I've tested this before posting ;-)
>> to avoid this you need to use --size-only
>
> bullshit
>
> yes it transfers - but with the rsync algorithm.
> RTFM how rsync works - it will generate checksums on both
> sides, transfer only the checksums, and come to the conclusion
> that the data are identical.
>
> I have been using rsync for many years for all sorts of backups
> and file transfers, and even my Thunderbird profiles over
> WAN are copied with a "virtual speed" of 200 megabytes per second
> ________________________________
>
> [harry at srv-rhsoft:~]$ ls /mnt/data/profiles/thunderbird/harry/global-messages-db.sqlite
> -rw-r----- 1 harry verwaltung 640M 2013-01-19 19:31 /mnt/data/profiles/thunderbird/harry/global-messages-db.sqlite
>
> this file will ALWAYS be changed - not only the modification times -
> but that does not change the fact that 99.8% of the file is unchanged,
> and rsync by design transfers only the changes over the wire
>
> since I am doing this DAILY between home and office machines,
> you do not need to explain to me how rsync works and in which cases
> it transfers data - really, you do not need to
>
> I sync some TB of data daily, including GB-large logfiles,
> where also only the new part is transferred each time

Woosh! Chill out, dude... again, read the man page.
This is not true if source and dest are local: then the rsync algorithm
is not used, and if the mod time is changed on the source the whole file
will be copied. So if you're rsyncing locally, e.g. to a USB drive, you
need --size-only, as I said.

Now if one of the source or dest is remote I agree with you, but this is
not always the case. I don't recall whether the OP said whether that was
the case or not, though I think he mentioned wanting to back up family
pictures, so it might very well be to a USB HD. In any case, I
definitely know I mentioned testing things locally. Which I did, and you
didn't...

Being wrong is ok, but you should really work on that attitude of yours.