To make rsync work better when syncing between OS X (whose HFS+
filesystem stores filenames in decomposed UTF-8, essentially NFD) and
just about every other OS (where filenames are, in practice, UTF-8 in
NFC form), I wrote a little patch for rsync that converts the filenames
to composed form before sending them to the other side.
It uses libidn for that, via stringprep_utf8_nfkc_normalize(), so
strictly speaking the result is NFKC rather than plain NFC. There is
currently no option to enable/disable this behaviour, but since on OS X
all filenames are UTF-8 no matter what, this shouldn't matter much. When
creating files, OS X handles UTF-8 NFC fine, so a reverse conversion
isn't needed.
I'm posting this patch here and not on sourceforge because it probably
isn't production quality, but it's useful for me and perhaps for the
countless other people who've reported this problem before as well.
If the rsync maintainers are interested in integrating this patch, I'll
be happy to polish it up a little. I'm not sure if it should be a
configure option (autoenabled on OS X maybe) or a command line option,
so in that case some input here would be nice. Or else, feel free to
adapt it.
After applying it, "-lidn" needs to be added to LIBS in the Makefile.
And without further ado, here is the patch:
----------------------------
diff -u rsync-2.6.6/flist.c rsync-2.6.6-nfc/flist.c
--- rsync-2.6.6/flist.c 2005-07-07 15:49:14.000000000 -0400
+++ rsync-2.6.6-nfc/flist.c 2005-10-26 15:18:01.000000000 -0400
@@ -1748,8 +1748,11 @@
* buffer. No size-checking is done because we checked the size when creating
* the file_struct entry.
*/
+#include <stringprep.h>
char *f_name_to(struct file_struct *f, char *fbuf)
{
+ char *norm;
+
if (!f || !f->basename)
return NULL;
@@ -1760,6 +1763,14 @@
strcpy(fbuf + len + 1, f->basename);
} else
strcpy(fbuf, f->basename);
+
+ /* Keep the original name if normalization fails and returns NULL. */
+ norm = stringprep_utf8_nfkc_normalize(fbuf, -1);
+ if (norm) {
+ strcpy(fbuf, norm);
+ free(norm);
+ }
+
return fbuf;
}
----------------------------
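A note on the copy-back step: the patch writes the normalized name back
into the caller's fixed-size buffer. Below is a minimal sketch of a more
defensive variant, not part of the patch itself. The helper name
normalize_name(), its bufsize parameter, and toy_normalize() are all
hypothetical; the toy normalizer only exists so the sketch builds without
libidn, and it composes a single sequence ("e" followed by U+0301). In
the patch's setting one would presumably pass
stringprep_utf8_nfkc_normalize and the size of fbuf instead.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Same shape as libidn's stringprep_utf8_nfkc_normalize(): returns a
 * newly allocated string, or NULL on failure. */
typedef char *(*normalize_fn)(const char *str, ssize_t len);

/* Hypothetical helper: normalize the name held in fbuf in place.
 * Returns 0 on success, -1 if fbuf was left unchanged. */
static int normalize_name(char *fbuf, size_t bufsize, normalize_fn normalize)
{
	char *norm;
	size_t n;

	assert(fbuf != NULL && normalize != NULL);

	norm = normalize(fbuf, -1);
	if (!norm)
		return -1;	/* normalizer failed; fbuf is untouched */

	n = strlen(norm);
	if (n >= bufsize) {	/* result would not fit; keep the old name */
		free(norm);
		return -1;
	}
	memcpy(fbuf, norm, n + 1);	/* n + 1 copies the trailing NUL too */
	free(norm);
	return 0;
}

/* Toy stand-in for a real normalizer (an assumption for this sketch):
 * composes only "e" + U+0301 (bytes 65 CC 81) into U+00E9 (C3 A9). */
static char *toy_normalize(const char *str, ssize_t len)
{
	size_t n = len < 0 ? strlen(str) : (size_t)len;
	char *out = malloc(n + 1);	/* output is never longer than input */
	size_t i = 0, o = 0;

	if (!out)
		return NULL;
	while (i < n) {
		if (i + 2 < n && str[i] == 'e'
		    && (unsigned char)str[i + 1] == 0xCC
		    && (unsigned char)str[i + 2] == 0x81) {
			out[o++] = (char)0xC3;
			out[o++] = (char)0xA9;
			i += 3;
		} else
			out[o++] = str[i++];
	}
	out[o] = '\0';
	return out;
}
```

The length check matters mostly for NFKC, where some compatibility
characters expand to longer sequences; going from NFD to NFC the name
normally gets shorter or stays the same size.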
--
Josef Drexler | http://jdrexler.com/home/
---------------------------------+---------------------------------------
Please help Conserve Gravity     | Email address is *valid*.
Don't do push ups                | Don't remove the "nospam" part.