Hey guys,

I tried to create a simple rsync script that should create daily backups from a ZFS storage and put them into a timestamp folder. After creating the initial full backup, the following backups should only contain "new data" and the rest will be referenced via hardlinks (--link-dest).

This was at least a simple enough scenario to achieve with my pathetic scripting skills. This is what I came up with:

#!/bin/sh

# rsync copy script for rsync pull from FreeNAS to BackupNAS for Buero dataset

# Set variables
EXPIRED=`date +"%d-%m-%Y" -d "14 days ago"`

# Copy previous timefile to timeold.txt if it exists
if [ -f "/volume1/rsync/Buero/timenow.txt" ]
then
    yes | cp /volume1/rsync/Buero/timenow.txt /volume1/rsync/Buero/timeold.txt
fi

# Create current timefile
echo `date +"%d-%m-%Y-%H%M"` > /volume1/rsync/Buero/timenow.txt

# rsync command
if [ -f "/volume1/rsync/Buero/timeold.txt" ]
then
    rsync -aqzh \
        --delete --stats --exclude-from=/volume1/rsync/Buero/exclude.txt \
        --log-file=/volume1/Backup_Test/logs/rsync-`date +"%d-%m-%Y-%H%M"`.log \
        --link-dest=/volume1/Backup_Test/`cat /volume1/rsync/Buero/timeold.txt` \
        Test@192.168.2.2::Test /volume1/Backup_Test/`date +"%d-%m-%Y-%H%M"`
else
    rsync -aqzh \
        --delete --stats --exclude-from=/volume1/rsync/Buero/exclude.txt \
        --log-file=/volume1/Backup_Buero/logs/rsync-`date +"%d-%m-%Y-%H%M"`.log \
        Test@192.168.2.2::Test /volume1/Backup_Test/`date +"%d-%m-%Y-%H%M"`
fi

# Delete expired snapshots (2 weeks old)
if [ -d /volume1/Backup_Buero/$EXPIRED-* ]
then
    rm -Rf /volume1/Backup_Buero/$EXPIRED-*
fi

Well, it works, but there is a huge flaw with this approach and I am not able to solve it on my own, unfortunately. As long as the backups finish properly, everything is fine, but as soon as one backup job can't be finished for some reason (like it gets aborted accidentally or a power cut occurs), the whole backup chain is messed up and usually the script creates a new full backup, which fills up my backup storage.

What I would like to achieve is to improve the script so that a backup run that wasn't finished properly will be resumed the next time the script triggers. Only if that was successful should the next incremental backup be created, so that the files that didn't change since the previous backup can be hardlinked properly.

I did a little bit of research, and I am not sure if I am on the right track here, but apparently this can be done with return codes. I honestly don't know how to do this, though. Thank you in advance for your help, and sorry if this question may seem foolish to most of you.

Regards

Dennis
Dennis Steinkamp <dennis at lightandshadow.tv> wrote:

> i tried to create a simple rsync script that should create daily backups from a ZFS storage and put them into a timestamp folder.
> After creating the initial full backup, the following backups should only contain "new data" and the rest will be referenced via hardlinks (--link-dest)
> ...
> Well, it works, but there is a huge flaw with this approach and i am not able to solve it on my own, unfortunately.
> As long as the backups finish properly, everything is fine, but as soon as one backup job can't be finished for some reason (like it gets aborted accidentally or a power cut occurs),
> the whole backup chain is messed up and usually the script creates a new full backup, which fills up my backup storage.

Yes indeed, this is a typical flaw with many systems - you often need to throw away the partial backup.

One option that comes to mind is this: create the new backup in a directory called (for example) "new" or "in-progress". If, and only if, the backup completes, rename it to a timestamp. When you start a new backup and the in-progress folder already exists, use that, and it will be freshened to the current source state.

Also, have you looked at StoreBackup? http://storebackup.org
It does most of this automagically, keeps a definable history (e.g. one/day for 14 days, one/week for x weeks, one/30d for y years), plus it keeps file hashes so it can detect bit-rot in your backups.
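[Editor's sketch] A minimal illustration of that "in-progress" idea, reusing the backup root and rsync module from Dennis's script. The in-progress directory name and the "last" symlink are illustrative choices, the flag set is trimmed down, and it assumes your ln supports -sfn:

#!/bin/sh
# Always rsync into a fixed "in-progress" directory; only a run that finishes
# cleanly gets renamed to a timestamp and becomes the new --link-dest base.
BACKUPROOT=/volume1/Backup_Test
INPROGRESS=$BACKUPROOT/in-progress
LAST=$BACKUPROOT/last        # symlink pointing at the newest finished backup

# On the very first run the symlink does not exist yet; rsync just warns
# about the missing --link-dest directory and does a full copy.
rsync -aqzh --delete \
    --link-dest="$LAST" \
    Test@192.168.2.2::Test "$INPROGRESS"

if [ $? -eq 0 ]
then
    STAMP=`date +"%d-%m-%Y-%H%M"`
    mv "$INPROGRESS" "$BACKUPROOT/$STAMP"      # a finished backup gets its timestamp
    ln -sfn "$BACKUPROOT/$STAMP" "$LAST"       # next run hardlinks against this one
fi
# On failure, in-progress is simply left in place and reused (freshened) next run.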
On 19.06.2016 at 19:27, Simon Hobson wrote:
> Dennis Steinkamp <dennis at lightandshadow.tv> wrote:
>
>> i tried to create a simple rsync script that should create daily backups from a ZFS storage and put them into a timestamp folder.
>> After creating the initial full backup, the following backups should only contain "new data" and the rest will be referenced via hardlinks (--link-dest)
>> ...
>> Well, it works, but there is a huge flaw with this approach and i am not able to solve it on my own, unfortunately.
>> As long as the backups finish properly, everything is fine, but as soon as one backup job can't be finished for some reason (like it gets aborted accidentally or a power cut occurs),
>> the whole backup chain is messed up and usually the script creates a new full backup, which fills up my backup storage.
>
> Yes indeed, this is a typical flaw with many systems - you often need to throw away the partial backup.
>
> One option that comes to mind is this: create the new backup in a directory called (for example) "new" or "in-progress". If, and only if, the backup completes, rename it to a timestamp. When you start a new backup and the in-progress folder already exists, use that, and it will be freshened to the current source state.
>
> Also, have you looked at StoreBackup? http://storebackup.org
> It does most of this automagically, keeps a definable history (e.g. one/day for 14 days, one/week for x weeks, one/30d for y years), plus it keeps file hashes so it can detect bit-rot in your backups.

Thank you for taking the time to answer me. Your suggestion is what I also had in mind, but I wasn't sure whether it would be "best practice".

To build this idea into my script, I probably need to hardcode the target directory rsync writes to (e.g. "new" or "in-progress") and rename that directory to a timestamp only after rsync returns a code of 0, am I correct? (Or return codes 0 and 24?)

As for StoreBackup, it does sound nice, but I have to do all of this from the console of a 2-bay Synology NAS, so it's not that easy to use third-party software that may have dependencies the Synology system doesn't meet.
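[Editor's sketch] For the return-code part, the usual pattern is to capture rsync's exit status right after the call and only rename the in-progress directory when that status is acceptable. Exit code 0 is a clean run; 24 means some source files vanished while the transfer was running, which many setups also treat as success. The paths and the in-progress name below are the same illustrative ones as in the earlier sketch, and the flag set is trimmed:

rsync -aqzh --delete \
    --link-dest=/volume1/Backup_Test/last \
    Test@192.168.2.2::Test /volume1/Backup_Test/in-progress
RC=$?

case $RC in
    0|24)
        # 0 = success, 24 = some source files vanished during the transfer
        mv /volume1/Backup_Test/in-progress \
           /volume1/Backup_Test/`date +"%d-%m-%Y-%H%M"`
        ;;
    *)
        echo "rsync exited with code $RC - keeping in-progress for the next run" >&2
        ;;
esac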
Rely on the other answers here as to how to do it right. I just want to mention a few things in your script.

> yes | cp /volume1/rsync/Buero/timenow.txt /volume1/rsync/Buero/timeold.txt

yes is a program which puts out "y" (or whatever you tell it to) forever - not what you want - and cp does not accept input from a pipe unless the first argument is "-" or some similar fancier construction. You can probably just leave off the "yes | " and have the statement work exactly as it does now.

It looks like your EXPIRED logic will only find a directory which *exactly* matches that date. You might look at using something like a find command to find directories older than 14 days. Some find options which might help (a worked example follows after the quoted message below):

-ctime +14 specifies finding things whose status changed more than 14 days ago (without the "+" it matches exactly 14 days)
-type d specifies finding only directories
-maxdepth 1 specifies finding things only one level below the path find starts at
-exec ls -l {} \; specifies running a command on every result which is returned - in this case, an ls which can't hurt anything. You can replace ls with something like rm -rf {} when you're *very* sure the command is finding *exactly* what you want it to.

I didn't put the whole command together because until you understand how it works, you don't want to try something that might delete a bunch of things beyond what you actually want deleted.

Joe

On 06/19/2016 08:22 AM, Dennis Steinkamp wrote:
> Hey guys,
>
> i tried to create a simple rsync script that should create daily backups from a ZFS storage and put them into a timestamp folder.
> After creating the initial full backup, the following backups should only contain "new data" and the rest will be referenced via hardlinks (--link-dest)
>
> This was at least a simple enough scenario to achieve with my pathetic scripting skills. This is what i came up with:
>
> #!/bin/sh
>
> # rsync copy script for rsync pull from FreeNAS to BackupNAS for Buero dataset
>
> # Set variables
> EXPIRED=`date +"%d-%m-%Y" -d "14 days ago"`
>
> # Copy previous timefile to timeold.txt if it exists
> if [ -f "/volume1/rsync/Buero/timenow.txt" ]
> then
>     yes | cp /volume1/rsync/Buero/timenow.txt /volume1/rsync/Buero/timeold.txt
> fi
>
> # Create current timefile
> echo `date +"%d-%m-%Y-%H%M"` > /volume1/rsync/Buero/timenow.txt
>
> # rsync command
> if [ -f "/volume1/rsync/Buero/timeold.txt" ]
> then
>     rsync -aqzh \
>         --delete --stats --exclude-from=/volume1/rsync/Buero/exclude.txt \
>         --log-file=/volume1/Backup_Test/logs/rsync-`date +"%d-%m-%Y-%H%M"`.log \
>         --link-dest=/volume1/Backup_Test/`cat /volume1/rsync/Buero/timeold.txt` \
>         Test@192.168.2.2::Test /volume1/Backup_Test/`date +"%d-%m-%Y-%H%M"`
> else
>     rsync -aqzh \
>         --delete --stats --exclude-from=/volume1/rsync/Buero/exclude.txt \
>         --log-file=/volume1/Backup_Buero/logs/rsync-`date +"%d-%m-%Y-%H%M"`.log \
>         Test@192.168.2.2::Test /volume1/Backup_Test/`date +"%d-%m-%Y-%H%M"`
> fi
>
> # Delete expired snapshots (2 weeks old)
> if [ -d /volume1/Backup_Buero/$EXPIRED-* ]
> then
>     rm -Rf /volume1/Backup_Buero/$EXPIRED-*
> fi
>
> Well, it works, but there is a huge flaw with this approach and i am not able to solve it on my own, unfortunately.
> As long as the backups finish properly, everything is fine, but as soon as one backup job can't be finished for some reason (like it gets aborted accidentally or a power cut occurs),
> the whole backup chain is messed up and usually the script creates a new full backup, which fills up my backup storage.
>
> What I would like to achieve is to improve the script so that a backup run that wasn't finished properly will be resumed the next time the script triggers.
> Only if that was successful should the next incremental backup be created, so that the files that didn't change since the previous backup can be hardlinked properly.
>
> I did a little bit of research, and I am not sure if I am on the right track here, but apparently this can be done with return codes. I honestly don't know how to do this, though.
> Thank you in advance for your help, and sorry if this question may seem foolish to most of you.
>
> Regards
>
> Dennis
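[Editor's sketch] Putting Joe's options together, here is what the expiry step could look like with the backup path from Dennis's script. It adds -mindepth 1 so the backup root itself can never match; note that the stripped-down BusyBox find on a Synology may not support every option, and, as Joe says, run the ls version first and only switch to rm once the output is exactly what you expect:

# List the candidate directories first - nothing is deleted yet
find /volume1/Backup_Buero -mindepth 1 -maxdepth 1 -type d -ctime +14 -exec ls -ld {} \;

# Once the listing shows only expired backups, the destructive version would be:
# find /volume1/Backup_Buero -mindepth 1 -maxdepth 1 -type d -ctime +14 -exec rm -rf {} \;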
The scripts I use analyze the rsync log after it completes and then sftp a summary to the root of the just-completed rsync. If no summary is found, or the summary says it failed, the folder rotation for that set is skipped and that folder is re-used on the subsequent rsync. The key here is that the folder rotation script runs separately from the rsync script(s).

For each entity I want to rsync, I create a named folder to identify it, and the rsync'd data is held in sub-folders: daily.[1-7] and monthly.[1-3]. When I rsync, I rsync into daily.0 using daily.1 as the link-dest. Then the rotation script checks daily.0/rsync.summary - and if it worked, it removes daily.7 and renames the daily folders. On the first of the month, the rotation script removes monthly.3, renames the other 2 and makes a complete hard-link copy of daily.1 to monthly.1.

It's been running now for about 4 years and, in my environment, the 10 copies take about 4 times the space of a single copy. (We do complete copies of Linux servers - starting from /.)

If there's a good spot to post the scripts, I'd be glad to put them up.

--
Larry Irwin
Cell: 864-525-1322
Email: lrirwin at alum.wustl.edu
Skype: larry_irwin
About: http://about.me/larry_irwin

On 06/19/2016 01:27 PM, Simon Hobson wrote:
> Dennis Steinkamp <dennis at lightandshadow.tv> wrote:
>
>> i tried to create a simple rsync script that should create daily backups from a ZFS storage and put them into a timestamp folder.
>> After creating the initial full backup, the following backups should only contain "new data" and the rest will be referenced via hardlinks (--link-dest)
>> ...
>> Well, it works, but there is a huge flaw with this approach and i am not able to solve it on my own, unfortunately.
>> As long as the backups finish properly, everything is fine, but as soon as one backup job can't be finished for some reason (like it gets aborted accidentally or a power cut occurs),
>> the whole backup chain is messed up and usually the script creates a new full backup, which fills up my backup storage.
>
> Yes indeed, this is a typical flaw with many systems - you often need to throw away the partial backup.
>
> One option that comes to mind is this: create the new backup in a directory called (for example) "new" or "in-progress". If, and only if, the backup completes, rename it to a timestamp. When you start a new backup and the in-progress folder already exists, use that, and it will be freshened to the current source state.
>
> Also, have you looked at StoreBackup? http://storebackup.org
> It does most of this automagically, keeps a definable history (e.g. one/day for 14 days, one/week for x weeks, one/30d for y years), plus it keeps file hashes so it can detect bit-rot in your backups.
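[Editor's sketch] Larry offered to post his actual scripts separately; purely as an illustration of the rotation half of his scheme, a standalone version might look roughly like this. The per-entity folder path, the success marker inside rsync.summary, and the GNU "cp -al" hard-link copy are all assumptions:

#!/bin/sh
# Sketch of a daily.0 -> daily.1..7 rotation, gated on a success marker.
SET=/backups/myserver                  # assumed per-entity folder

# Only rotate if the last rsync into daily.0 reported success (assumed marker word).
grep -q "SUCCESS" "$SET/daily.0/rsync.summary" 2>/dev/null || exit 0

# Drop the oldest daily copy, shift the rest up, and promote daily.0 to daily.1.
rm -rf "$SET/daily.7"
for i in 6 5 4 3 2 1; do
    [ -d "$SET/daily.$i" ] && mv "$SET/daily.$i" "$SET/daily.$((i+1))"
done
mv "$SET/daily.0" "$SET/daily.1"

# On the 1st of the month, keep a hard-linked copy of daily.1 as monthly.1.
if [ "`date +%d`" = "01" ]; then
    rm -rf "$SET/monthly.3"
    [ -d "$SET/monthly.2" ] && mv "$SET/monthly.2" "$SET/monthly.3"
    [ -d "$SET/monthly.1" ] && mv "$SET/monthly.1" "$SET/monthly.2"
    cp -al "$SET/daily.1" "$SET/monthly.1"     # hard-link copy, needs GNU cp
fi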
On 19 June 2016 at 10:27, Simon Hobson <linux at thehobsons.co.uk> wrote:
> Dennis Steinkamp <dennis at lightandshadow.tv> wrote:
>
>> i tried to create a simple rsync script that should create daily backups from a ZFS storage and put them into a timestamp folder.
>> After creating the initial full backup, the following backups should only contain "new data" and the rest will be referenced via hardlinks (--link-dest)
>> ...
>> Well, it works, but there is a huge flaw with this approach and i am not able to solve it on my own, unfortunately.
>> As long as the backups finish properly, everything is fine, but as soon as one backup job can't be finished for some reason (like it gets aborted accidentally or a power cut occurs),
>> the whole backup chain is messed up and usually the script creates a new full backup, which fills up my backup storage.
>
> Yes indeed, this is a typical flaw with many systems - you often need to throw away the partial backup.
>
> One option that comes to mind is this: create the new backup in a directory called (for example) "new" or "in-progress". If, and only if, the backup completes, rename it to a timestamp. When you start a new backup and the in-progress folder already exists, use that, and it will be freshened to the current source state.

I have an extremely similar script for my backups, and that's exactly what I do to deal with backups that are stopped mid-way, either by power failures or by me. I rsync to a .tmp-$target directory, where $target is what I'm backing up. I have separate backups for my rootfs and /home. I also start the whole thing under ionice so that my computer doesn't get slow from all this I/O.

Lastly, before renaming the .tmp-$target to the final directory I do a `sync -f`, because rsync doesn't seem to call fsync() when copying files, and you can end up with a failed backup if a power failure happens right after the rename().

Here is my script:

#!/bin/bash

set -o errexit
set -o pipefail

target=$1

case "$target" in
    home) source=/home ;;
    root) source=/ ;;
esac

PATHTOBACKUP=/root/backup
date=$(date --utc "+%Y-%m-%dT%H:%M:%S")

ionice --class 3 rsync \
    --archive \
    --verbose \
    --one-file-system \
    --sparse \
    --delete \
    --compress \
    --log-file=$PATHTOBACKUP/.tmp-$target.log \
    --link-dest=$PATHTOBACKUP/$target-current \
    $source $PATHTOBACKUP/.tmp-$target

sync -f $PATHTOBACKUP/.tmp-$target

mv $PATHTOBACKUP/.tmp-$target.log $PATHTOBACKUP/$target-$date.log
mv $PATHTOBACKUP/.tmp-$target $PATHTOBACKUP/$target-$date
ln --symbolic --force --no-dereference $target-$date $PATHTOBACKUP/$target-current
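[Editor's note] A script like this is typically driven from cron; the entries below are purely illustrative (the script path, name, and times are assumptions, not part of the original message):

# illustrative root crontab entries - adjust the path and times to taste
30 2 * * * /root/bin/backup.sh root
45 2 * * * /root/bin/backup.sh home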