Chris Green
2020-Dec-11 11:53 UTC
Is there any way to restore/create hardlinks lost in incremental backups?
Paul Slootman via rsync <rsync at lists.samba.org> wrote:
> On Thu 10 Dec 2020, Chris Green via rsync wrote:
> >
> > Occasionally, because I've moved things around or because I've done
> > something else that breaks things, the hard links aren't created as
> > they should be and I get a very space consuming backup increment.
> >
> > Is there any easy way that one can restore hard links in the *middle*
> > of a series? For example say I have:-
> >
> > day1/pictures
> > day2/pictures
> > day3/pictures
> > day4/pictures
> > day5/pictures
> >
> > and I notice that day4/pictures is using as much space as
> > day1/pictures but all the others are relatively small, i.e.
> > day2, day3 and day5 have correctly hard linked to the previous day but
> > day4 hasn't.
> >
> > It needs a tool that can scan day4, check a file is identical with the
> > one in day3, then hardlink it without losing the link from day5.
>
> If you have these files that are hardlinked:
>
> day1/pictures/1.jpg
> day2/pictures/1.jpg
> day3/pictures/1.jpg
>
> And these are hardlinked, but to a different inode:
>
> day4/pictures/1.jpg
> day5/pictures/1.jpg
>
> then there is no way of linking the second group to the first in one
> step; you will have to individually link day3/pictures/1.jpg to
> day4/pictures/1.jpg and then day3/pictures/1.jpg (or
> day4/pictures/1.jpg) to day5/pictures/1.jpg.
>
> It's not like a group of directory entries that are hardlinked to one
> inode are some sort of actual group; they just happen to be directory
> entries that point to the same inode number. There is no other relation
> between those directory entries.
>
> So you will have to incrementally process each next day against the
> previous day.

Yes, that's what I have done: I wrote a trivial[ish] script that copied
all the backups to a new destination sequentially (using --link-dest)
and then removed the original tree, having checked the new backups were
OK of course (roughly the approach sketched below).

Fortunately I have lots of spare space on the backup system at the
moment, having just upgraded it with a new 8TB drive, so duplicating the
whole backup wasn't an issue (though rather slow because it was from and
to the same drive).

> If I make a significant change in such a directory structure (e.g.
> renaming a directory) I try to remember to do the same thing on the
> backup, which some say is wrong, but it saves a lot of space, like you
> discovered :)

Yes, I've sometimes done that.

--
Chris Green
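
A minimal sketch of that kind of sequential --link-dest rebuild, not
Chris's actual script: the /backup/old and /backup/new locations and the
dayN names are assumptions for illustration only.

    #!/bin/sh
    # Rebuild a series of daily snapshots, hard-linking unchanged files
    # against the previously rebuilt day via --link-dest.
    # NOTE: /backup/old, /backup/new and the dayN names are assumptions.
    OLD=/backup/old
    NEW=/backup/new
    prev=""
    for day in day1 day2 day3 day4 day5; do
        if [ -n "$prev" ]; then
            # Files identical to the previous rebuilt day become hard links.
            rsync -aH --link-dest="$NEW/$prev" "$OLD/$day/" "$NEW/$day/"
        else
            rsync -aH "$OLD/$day/" "$NEW/$day/"
        fi
        prev="$day"
    done
    # Only after verifying $NEW should the original tree under $OLD be removed.

Because each day is re-copied with --link-dest pointing at the
previously rebuilt day, this is the "incrementally process each next day
against the previous day" approach Paul describes; as noted above, it is
slow when source and destination share the same drive.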
Guillaume Outters
2020-Dec-11 14:30 UTC
Is there any way to restore/create hardlinks lost in incremental backups?
On 2020-12-11 12:53, Chris Green wrote:
> […] wrote a trivial[ish] script that copied
> all the backups to a new destination sequentially (using --link-dest)
> and then removed the original tree, having checked the new backups
> were OK of course.

Facing the same problem as yours, I once worked out exactly the same
solution. But then, having to automate it, I worked on it a bit more and
ended up with a shell script that:

- recursively listed files as "file size - inode - path"
- with sort and awk, output the list of every size that has more than
  one inode
- for each such size, cksumed one file for each inode
- if two different inodes (with the same file size) had their cksums
  match, replaced every file for the second inode with a link to the
  first inode

If you have to run it frequently, you may want to implement something
similar (a rough sketch follows below). Although it ignores mtime info
(and thus loses it when ln-ing), it has the great benefit of finding
every duplicate, be it renamed and moved to another dir (as in
./her.2020-12-01/Library/Mail/…/Sent.mbox/…/Attachments/…/PhotoDeFamille.JPG
versus ./his.2020-11-26/perso/photos/100_9999.JPG).

(And by the way, I reimplemented it in C, "just for fun" and for speed
too: https://github.com/outtersg/dude/ . Hmm, in C, but in French.)

--
Guillaume
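
A rough shell sketch of the size/inode/cksum approach described above,
not the original script (the real implementation is the dude tool at the
GitHub URL): it assumes GNU find (-printf) and paths without whitespace,
and BACKUPDIR and the /tmp file names are placeholders.

    #!/bin/sh
    # Dedup-by-hardlink sketch: group files by size, keep sizes seen under
    # more than one inode, cksum one file per inode, and hard-link inodes
    # whose checksums match.
    BACKUPDIR=/backup                      # placeholder

    # 1. List every file as "size inode path".
    find "$BACKUPDIR" -type f -printf '%s %i %p\n' | sort -n > /tmp/all

    # 2. Keep only sizes that occur under more than one distinct inode.
    awk '{ if (!seen[$1" "$2]++) count[$1]++ }
         END { for (s in count) if (count[s] > 1) print s }' /tmp/all > /tmp/sizes

    # 3. Checksum one representative file per inode; when a later inode
    #    matches the first one of that size, re-link every path of that
    #    inode.  (For brevity, later inodes are only compared against the
    #    first one; the full script compares all pairs.)
    while read size; do
        ref_sum=""; ref_path=""
        awk -v s="$size" '$1 == s && !seen[$2]++ { print $2, $3 }' /tmp/all |
        while read inode path; do
            sum=$(cksum "$path" | awk '{ print $1 }')
            if [ -z "$ref_sum" ]; then
                ref_sum="$sum"; ref_path="$path"
            elif [ "$sum" = "$ref_sum" ]; then
                awk -v s="$size" -v i="$inode" '$1 == s && $2 == i { print $3 }' /tmp/all |
                while read dup; do
                    ln -f "$ref_path" "$dup"   # replace duplicate with a hard link
                done
            fi
        done
    done < /tmp/sizes

Compared with the --link-dest rebuild earlier in the thread, this works
in place and catches duplicates that were renamed or moved between days,
at the cost of discarding the duplicate's own mtime, as noted above.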