[Now the horse has left the barn, I decided to finally implement that
backup system I'd been thinking about for ages. Disk crashes can be
great motivators]

[Web-location for the living version of this document:
http://www.lartmaker.nl/rsync/]

Goal:

Producing a working network backup / cloning system for Mac OS X
systems. The system can be used for local backups as well, for example
to FireWire disks.


Problems:

Many files on HFS+, the Mac's most common file system, have metadata.
This is partly a leftover from the past (resource forks), and partly a
new development (ACLs, extended attributes). Plain rsync doesn't (yet)
cope with this metadata.

Since 10.4 (aka Tiger), Mac OS X ships with a modified version of
rsync. An added option, '-E', enables the transfer of extended
attributes. This is done by encapsulating the resource fork, Finder data
et al in a synthetic file which is added to the rsync transfer list. The
name of this file is formed by prepending '._' to the name of the
original file, a technique which is also used when copying data from
HFS+ partitions to non-Apple file systems such as NFS mounts. It may not
be pretty or foolproof (what happens when both foo and ._foo exist?),
but at least it's documented by Apple and not likely to change in the
very near future. This rsync derivative is based on rsync-2.6.3.

However, Googling and testing have revealed four problems with Apple's
rsync. In order of severity, worst first:

1) The rsync sender will frequently crash with a Bus Error /
Segmentation Fault after generating the file list, but before
transferring any files. This turns out to be caused by a buffer overrun.

2) When used with the --delete option, the rsync receiver will try to
unlink the (fake) synthetic files, flooding the syslog with failure
reports, possibly filling the entire boot disk.

3) When files with extended attributes are transferred, the modification
time will be set to the time of the transfer, even when the user has
specified that modification times be preserved. As a result, using mtime
to determine whether a file has changed is broken.

4) Extended attributes have no modification time of their own. Since a
file's mtime is not updated when its attributes are changed, only
checksumming can be used to determine whether attribute data needs to be
transferred. With default settings, this means that ALL extended
attributes are ALWAYS copied.


The patch:

Problems 1-3 are fixed by the patch at
http://www.lartmaker.nl/rsync/rsync-tiger-fixes.diff . This patch is
released under version 2 of the GNU GPL. I know of no fix for problem 4,
but consider it mostly an annoyance.


Putting it all together:

NOTE: This requires familiarity with the Terminal. I have no .dmg or
whatnot, since I wouldn't know how to create one (and there are
licensing issues, see below). Following these steps should get you a
working rsync, though.

0) Update Tiger to 10.4.2. Install Xcode, the Apple developer tools. If
you don't have the disc (it's shipped with the Tiger install media), you
can get the latest version from Apple's developer website (free
registration required): http://developer.apple.com/tools/download/

1) Get the sources. Open the Terminal, and type:

    mkdir rsync-build
    cd rsync-build
    curl -O http://www.opensource.apple.com/darwinsource/10.4.3/rsync-20/rsync-2.6.3.tar.gz
    curl -O http://www.opensource.apple.com/darwinsource/10.4.3/rsync-20/patches/EA.diff
    curl -O http://www.opensource.apple.com/darwinsource/10.4.3/rsync-20/patches/PR-3945747-endian.diff
    curl -O http://www.lartmaker.nl/rsync/rsync-tiger-fixes.diff

2) If you don't already have it, install copyfile.h in /usr/include .
Get it from Apple's developer website
http://www.opensource.apple.com/darwinsource/10.4.3/Libc-391.2.3/darwin/copyfile.h
(again, free registration required). Once you have downloaded it, in
the Terminal:

    sudo mv -n copyfile.h /usr/include

Writing to /usr/include requires root privileges; enter your password
when prompted. The '-n' option to mv makes sure that you don't overwrite
a (newer) installed version.

NOTE: copyfile.h is *NOT* licensed under the GPL, but rather under the
Apple Public Source Licence (http://www.opensource.apple.com/apsl). You
may want to review this license; I Am Not A Lawyer so I cannot say and
will not speculate on how this affects your rights.

3) Unpack the rsync source, and apply the patches. In the Terminal:

    tar zxf rsync-2.6.3.tar.gz
    cd rsync-2.6.3
    patch -p0 < ../EA.diff
    patch -p0 < ../PR-3945747-endian.diff
    patch -p0 < ../rsync-tiger-fixes.diff

4) Configure and make rsync:

    ./configure --enable-ea-support
    make

5) You now have a patched rsync binary. If you're feeling brave, you can
replace the Apple-supplied version with it (sudo cp -f rsync /usr/bin).
Myself, I'd suggest installing it in /usr/local/bin (the default) by
doing:

    sudo make install

Note that this procedure is for a plain Xcode install. If you're using
Fink you'll need to change bits (but then, you'll probably know how).

As is documented on other sites, you'll want to make sure that the
target drive has 'Ignore Ownership on This Volume' DISABLED (Finder: Get
Info on the disk; the checkbox is under the 'Ownership & Permissions'
tab). Also, it helps to turn Spotlight off for the target volume.


Bottom line:

It Works For Me. I've run a few tests, both full and incremental, with
~60GB in just over half a million files with creation dates going back
to 1994 (Pathways into Darkness, anyone?). With rsync installed in
Server mode (see the man pages) on a Mac mini, a no-changes full
filesystem 'incremental' backup takes 45 minutes over Airport Extreme
(and less over Ethernet), during which both machines are still mostly
responsive. For reference, my command line is:

    sudo ./rsync/rsync-2.6.3-jdfix/rsync -aREx --delete --exclude='.Spotlight-*' --exclude '/private/var/vm/*' / [IP-address of Mac mini]::PowerBookBackup

I have successfully booted the Mac mini from the resulting disk clone.
Although I haven't stress-tested the system, all looked well (I could
open Photoshop and iTunes with no problems). A similar procedure should
work to an external disk attached to the source computer, although I
haven't tested that configuration.

So why didn't I just use RsyncX? Googling revealed some (perceived?)
compatibility issues between RsyncX and Tiger. Besides, RsyncX only
works between Macs, and I *really* want to use my 1.5TB RAID-5 Linux box
as backup target.

About the rsync -H option: there have been rumors of incompatibility
with OS X. I'll have to find out; however, on my PowerBook's boot drive
only 4050 of the >500000 files have a link count greater than one.

These bugs and fixes have been reported to Apple.


What's next:

Getting rsync-on-X to play nice with rsync-on-Linux.
Porting all patches to rsync-2.6.6.

DISCLAIMER: No warranties whatsoever. Do not ever trust a backup system
you haven't thoroughly tested. I know that I claim it's working, but I
might be lying or hallucinating (or missing important bits).


I have tried to include all information which I'd hoped to find in one
place when I started this journey a few days ago. Hope it's of some use
to others.

JDB
[as they say: you'll get experience right after you needed it]
--
LART. 250 MIPS under one Watt. Free hardware design files.
http://www.lartmaker.nl/
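To make the '._' naming convention from the post concrete, here is a minimal shell sketch. It only illustrates how the synthetic file name is derived and how the "both foo and ._foo exist" collision could be detected; it is not code from Apple's rsync or from the patch, and the function names appledouble_name and has_sidecar_collision are made up for this example. It assumes paths without a trailing slash.

```shell
# Hypothetical helpers illustrating the '._' naming convention only.

# Print the synthetic sidecar name for a path: '._' is prepended to the
# last path component, the directory part is left untouched.
appledouble_name() {
    path=$1
    base=${path##*/}
    case $path in
        */*) printf '%s/._%s\n' "${path%/*}" "$base" ;;
        *)   printf '._%s\n' "$base" ;;
    esac
}

# Succeed if something already exists on disk under the sidecar name,
# i.e. the "both foo and ._foo exist" situation mentioned in the post.
has_sidecar_collision() {
    [ -e "$(appledouble_name "$1")" ]
}

appledouble_name "Documents/foo"   # prints Documents/._foo
```

Note that a real '._foo' on the source volume cannot be told apart from a synthetic one by name alone, which is why the post calls the scheme not foolproof.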
My solution is to use a separate utility called SplitForks on OS X, and
then use an unpatched rsync on the Mac to transfer the files to a Linux
box for backup. SplitForks is a tool in OS X's developer toolkit which
creates those "._xyz" files on an HFS+ volume.

I developed a shell script that runs SplitForks on a volume, rsyncs it,
and then uses a find command to delete the "._xyz" files again. I delete
them because I suspect that OS X will not keep those "._xyz" files up to
date automatically on HFS+ volumes: OS X probably sees the (internal)
resource fork of the HFS+ file and uses that instead. So I run the
SplitForks routine every backup cycle to regenerate a fresh copy.

My 2 cents.

Stephen Wong @ Hong Kong
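The cleanup step of the SplitForks workflow (deleting the "._xyz" files after the rsync run) could look roughly like the sketch below. This is not Stephen's script, which was not posted; the function names list_sidecars and remove_sidecars are invented for illustration, and the SplitForks and rsync steps are deliberately left out because their exact invocation is not given in the thread.

```shell
# Hypothetical helpers; only the "find command" cleanup step is shown.

# List every regular file below directory $1 whose name starts with '._'.
list_sidecars() {
    find "$1" -type f -name '._*' -print
}

# Delete those files. This also removes any genuine file whose name
# happens to start with '._', so run list_sidecars first to review.
remove_sidecars() {
    find "$1" -type f -name '._*' -exec rm -f {} \;
}
```

A cautious run would call list_sidecars on the volume first and only call remove_sidecars once the listing looks as expected.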
This patch seems to work nicely, except for one problem: if you resync a directory that has already been synced with it, every resource fork is synced again. OS X 10.4 accepts the ._filename file and stores it properly in the resource fork, but rsync doesn't notice this, decides the file is missing, and tries to transfer it again. Conversely, in some instances it reports that the file has vanished. This all adds up to a lot of wasted time and to file pieces being moved around, with the resulting possibility of corrupting the whole. Otherwise, way cool. This needs to be fixed immediately, though.
A follow-up on the ._ resource forks being synced over and over even though the file itself is not transferred: I left out one fact. I am attempting a link-destination based multiple backup, which creates hard links to files that are unchanged and only transfers the changed files. This implementation of rsync does not notice the existing resource fork associated with the previous copy of the file, so it transfers the resource fork again. I hope this makes sense.
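For readers unfamiliar with the setup, a link-destination based backup of this kind is usually driven by rsync's --link-dest option. The sketch below is an assumption about what such an invocation might look like, not the poster's actual command: the function name snapshot, the directory layout and the RSYNC override variable are all hypothetical. The -E flag is the extended-attribute option of Apple's rsync 2.6.3 discussed in this thread; upstream rsync releases may give -E a different meaning.

```shell
# Hypothetical wrapper: back up $1 into $2/$4, hard-linking files that
# are unchanged relative to the previous snapshot $2/$3.
# RSYNC may be set to point at a specific (e.g. patched) rsync binary.
snapshot() {
    src=$1
    root=$2
    prev=$3
    new=$4
    "${RSYNC:-rsync}" -aE --delete \
        --link-dest="$root/$prev" \
        "$src/" "$root/$new/"
}

# Example call (paths are placeholders):
#   snapshot /Users/me /Volumes/Backup 2005-11-01 2005-11-02
```

With this layout, the behaviour reported above would show up as the ._ files being transferred into every new snapshot directory instead of being hard-linked from the previous one.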