Askar Safin
2014-Dec-26 02:49 UTC
--link-dest --inplace updates files without unlinking. What to do?
Hi. This is a bug report and, at the same time, an urgent request for help. I am trying to write an rsync wrapper script that creates minutely snapshots of my data using --link-dest. I want this script to be robust: it should keep working even if I suspend/hibernate/reboot without notifying the script about these actions, if I hard-reset the computer, or if I disconnect the network. The script should also transfer large files over an unstable network link. I.e. I want the following: I connect to the internet, it transfers some part of my large file, I disconnect, then I connect again, and the script continues the transfer. So I wrote the following script (it should be run locally, i.e. on my home computer with my data):

set +e

while :; do
    if rsync --archive --copy-unsafe-links --no-owner --no-group \
            --one-file-system --compress --progress --inplace \
            --append-verify --partial --delete-after --fuzzy --timeout=60 \
            --link-dest=../latest-complete \
            "$DIR/sync" "$HOST":backups/in-progress; then
        TIME="$(TZ=UTC date +%F-%H%M%S)"
        ssh "$HOST" "mv -T ~/backups/in-progress ~/backups/$TIME && ln -sfn $TIME ~/backups/latest-complete" || :
    fi
    sleep 60
done

It seems to me that this script addresses all of these issues and is OK, right?

As I said, I want the script to work with large files over an unstable connection, so I decided to use the "--inplace --append-verify" options. Am I right? Or would just "--partial --append-verify" without "--inplace" do? How exactly does "--partial" without "--inplace" work? Where is the partial data stored, in what files? How does rsync know it should continue the transfer? Will --partial work in case of a sudden hard reset of the local computer?

Recently I noticed the following: in "$HOST":backups/2014-11-22-003120 there is a copy of one of my files that is too new. I am sure of this. Also, this file has a lot of hard links. So I think the following happened: on 2014-11-22 the correct copy of this file was created on the backup host. Then some more snapshots were created with hard links to this file. Then rsync started creating a new snapshot in "in-progress", created one more hard link to that file, and then failed for some reason (reboot, internet disconnection, etc.). Then the file was changed, and rsync started again. rsync continued updating the "in-progress" dir: it noticed that the file had changed and updated it *without unlinking it*. So all the hard links, including very old ones, were updated. And that was wrong.

So, what should I do? I.e., how do I fix the script? Or maybe this is a bug in rsync, i.e. "--link-dest --inplace doesn't unlink"? A quick fix would be good.

Maybe I should remove --inplace and leave --partial? But will the script still be as robust in that case?

=Askar Safin
http://vk.com/safinaskar
Kazan, Russia
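The effect described in this report is exactly what happens when a hard-linked file is overwritten in place: all links share one inode, so writing through any one of them changes the data seen through every snapshot. A minimal sketch of that mechanism, using made-up directory names rather than the poster's backup layout:

    # Two directories standing in for two snapshots that share a file via a hard link.
    mkdir -p snap-old snap-new
    echo "old data" > snap-old/file
    ln snap-old/file snap-new/file       # both names point at the same inode
    echo "new data" > snap-new/file      # update in place, without unlinking first
    cat snap-old/file                    # prints "new data": the old snapshot changed too
    # Removing snap-new/file and writing a fresh file instead would have left
    # snap-old/file untouched, which is what a snapshot scheme needs.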
Kevin Korb
2014-Dec-26 03:06 UTC
--link-dest --inplace updates files without unlinking. What to do?
--inplace and --append-verify are essentially irrelevant when --link-dest is in play. With --link-dest in play, the target system must write an entirely new file even for a change in permissions or timestamps, so any potential benefit from these options is out the window from the start. The only thing they can do is add the possibility of incomplete or corrupt copies on the target.

On 12/25/2014 09:49 PM, Askar Safin wrote:
> [...]

-- 
Kevin Korb                      Phone:    (407) 252-6853
Systems Administrator           Internet:
FutureQuest, Inc.               Kevin at FutureQuest.net  (work)
Orlando, Florida                kmk at sanitarium.net  (personal)
Web page:                       http://www.sanitarium.net/
PGP public key available on web site.
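A small local sketch of the --link-dest behaviour described above may help; the directory names here are illustrative, not taken from the thread. With --link-dest, a file that is unchanged relative to the previous snapshot becomes a hard link into that snapshot, while a changed file is written out as a brand-new file, leaving the previous snapshot untouched:

    mkdir -p src
    echo "stays the same" > src/small
    echo "version 1"      > src/big
    rsync -a src/ snapA/                               # first snapshot
    echo "version 2"      > src/big                    # change one file
    rsync -a --link-dest="$PWD/snapA" src/ snapB/      # second snapshot
    ls -li snapA/small snapB/small                     # same inode number: hard-linked
    cat snapA/big                                      # still "version 1": changed file was written anew

Because even a metadata-only change forces a fresh file on the receiver, --inplace and --append-verify cannot save that work; they only open the door to the kind of shared-inode corruption reported above.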
Askar Safin
2014-Dec-26 13:24 UTC
Re[2]: --link-dest --inplace updates files without unlinking. What to do?
> --inplace and --append-verify are essentially irrelevant when
> --link-dest is in play. With --link-dest in play, the target system
> must write an entirely new file even for a change in permissions or
> timestamps, so any potential benefit from these options is out the
> window from the start. The only thing they can do is add the
> possibility of incomplete or corrupt copies on the target.

Okay, so I should remove --inplace and --append-verify? But then copying a large file over an unstable internet connection will not work, right? Is there any way to keep --link-dest working correctly and at the same time be able to copy large files over an unstable connection?

=Askar Safin
http://vk.com/safinaskar
Kazan, Russia
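One possible direction, given only as a sketch and not as a tested recommendation: drop --inplace and --append-verify, keep --partial, and add --partial-dir so that an interrupted transfer of a large file is kept in a separate directory on the receiver and reused as a basis when the transfer restarts. --partial-dir is a standard rsync option, but the .rsync-partial name and the exact flag combination below are illustrative choices, not something confirmed in this thread:

    set +e

    while :; do
        if rsync --archive --copy-unsafe-links --no-owner --no-group \
                --one-file-system --compress --progress \
                --partial --partial-dir=.rsync-partial \
                --delete-after --fuzzy --timeout=60 \
                --link-dest=../latest-complete \
                "$DIR/sync" "$HOST":backups/in-progress; then
            TIME="$(TZ=UTC date +%F-%H%M%S)"
            ssh "$HOST" "mv -T ~/backups/in-progress ~/backups/$TIME && ln -sfn $TIME ~/backups/latest-complete" || :
        fi
        sleep 60
    done

On a restart, rsync finds the saved partial file in .rsync-partial on the receiving side and uses it as a basis for the delta transfer, so data that already arrived does not have to cross the link again; unlike --inplace, the final file is still written out fresh, so hard links into older snapshots are not modified. With a relative --partial-dir, rsync is supposed to add an implied exclude so --delete-after does not remove the partial directory, but that is worth verifying against the rsync version in use.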