Hello,

I am trying to get the rsync-3.0.0pre10 --iconv option working between two Linux hosts on a local network.

The client host is running Fedora Core 4 (kernel 2.6.17) and uses the iso8859-1 character set.
LANG=en_US

The daemon host is running CentOS 5 (kernel 2.6.18) and uses the utf-8 character set.
LANG=en_US.UTF-8

Rsync transfers files properly without the --iconv switch:

fc4: (connected using putty with translation set to ISO-8859-1:1998 (Latin-1, West Europe))
$ ls test/
example-file-???.txt  example.txt

centos5: (connected using putty with translation set to UTF-8)
$ ls test/
example-file-???.txt  example.txt
$ ls test/ | iconv -f iso88591 -t utf8
example-file-???.txt
example.txt

The module's settings in rsyncd.conf on the daemon contain the following line:
charset = utf8

When I try to use --iconv=iso88591,utf8 on the client side, the following errors are displayed:

fc4:
[receiver] cannot convert filename: test/example-file-???.txt (Invalid or incomplete multibyte or wide character)

centos5:
rsyncd[pid-number]: [receiver] cannot convert filename: test/example-file-???.txt (Invalid or incomplete multibyte or wide character)

(high-bit characters replaced with question marks above)

I have tried the transfer with the following combinations, but none of them have worked.

client side:
--iconv=.
--iconv=iso88591
--iconv=iso88591,utf8

daemon side:
charset = .
charset = utf8

I am able to convert the filenames manually on the daemon host using the convmv script, http://www.j3e.de/linux/convmv/

before convmv:
$ ls test/ | od -c
0000000   e   x   a   m   p   l   e   -   f   i   l   e   - 366 344 345
0000020   .   t   x   t  \n   e   x   a   m   p   l   e   .   t   x   t
0000040  \n
0000041

after convmv:
$ ls test/ | od -c
0000000   e   x   a   m   p   l   e   -   f   i   l   e   - 303 266 303
0000020 244 303 245   .   t   x   t  \n   e   x   a   m   p   l   e   .
0000040   t   x   t  \n
0000044

Should rsync be able to convert filenames from a single-byte character set (iso8859-1) to a multibyte character set (utf-8)?

Has anyone got iso8859-1 to utf-8 conversion working with rsync-3.0.0pre10?
Regards, Sami
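The byte change visible in the two od dumps above (366 344 345 becoming 303 266, 303 244, 303 245) can be reproduced by hand. The following is a minimal sketch, not rsync or convmv code: latin1_to_utf8 is a hypothetical helper name, and it assumes the input really is ISO-8859-1, where every byte maps directly to the Unicode code point of the same value.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of rsync or convmv): re-encode ISO-8859-1
 * bytes as UTF-8. Returns the number of bytes written, excluding the
 * terminating NUL, or 0 if dst is too small. */
static size_t latin1_to_utf8(const unsigned char *src, size_t src_len,
                             unsigned char *dst, size_t dst_size)
{
    size_t n = 0;

    assert(src != NULL && dst != NULL);
    if (dst_size == 0)
        return 0;

    for (size_t i = 0; i < src_len; i++) {
        unsigned char c = src[i];
        size_t need = (c < 0x80) ? 1 : 2;

        /* Keep one byte in reserve for the terminating NUL. */
        if (n + need >= dst_size)
            return 0;

        if (c < 0x80) {
            /* ASCII is identical in both encodings. */
            dst[n++] = c;
        } else {
            /* 0x80..0xFF becomes a two-byte sequence: 110000xx 10xxxxxx. */
            dst[n++] = (unsigned char)(0xC0 | (c >> 6));
            dst[n++] = (unsigned char)(0x80 | (c & 0x3F));
        }
    }
    dst[n] = '\0';
    return n;
}
```

Passing the 20 bytes of "example-file-\366\344\345.txt" should yield 23 bytes, which matches the difference between the two od totals (0000041 and 0000044 octal, i.e. 33 and 36 bytes including the second filename and the newlines).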
Hello,

> Has anyone got iso8859-1 to utf-8 conversion working with rsync-3.0.0pre10?

I am answering my own question.

I looked at the setup_iconv function in rsync.c and found the following code snippets:

...
if ((ic_send = iconv_open(UTF8_CHARSET, charset)) == (iconv_t)-1) {
...
...
if ((ic_recv = iconv_open(charset, UTF8_CHARSET)) == (iconv_t)-1) {
...

According to man 3 iconv_open:

iconv_t iconv_open(const char *tocode, const char *fromcode);

So if I'm interpreting that right, UTF8 is hardcoded to always be one of the conversion charsets. If rsync is sending, the conversion is always to UTF8. If rsync is receiving, the conversion is always from UTF8.

To verify the above, I tested transferring the same two files in the other direction (conversion from utf-8 to iso8859-1), and it worked properly.

Is the --iconv option intentionally limited to only work from the UTF8 charset?

Regards, Sami
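The argument order of iconv_open quoted above (destination charset first, source charset second) is easy to get backwards, so a small standalone sketch may help. This is not rsync code: convert_name is a hypothetical helper, and the charset names "UTF-8" and "ISO-8859-1" are assumed to be accepted by the local iconv implementation (they are on glibc).

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from rsync): convert the NUL-terminated string
 * src from charset `fromcode` to charset `tocode`, writing into dst.
 * Returns the number of bytes written (excluding the NUL), or -1 on error. */
static long convert_name(const char *tocode, const char *fromcode,
                         const char *src, char *dst, size_t dst_size)
{
    assert(src != NULL && dst != NULL);
    if (dst_size == 0)
        return -1;

    /* Note the order: target charset first, source charset second. */
    iconv_t cd = iconv_open(tocode, fromcode);
    if (cd == (iconv_t)-1)
        return -1;

    /* iconv() takes char ** for the input but only reads through it. */
    char *in = (char *)src;
    size_t in_left = strlen(src);
    char *out = dst;
    size_t out_left = dst_size - 1;   /* reserve room for the NUL */

    size_t rc = iconv(cd, &in, &in_left, &out, &out_left);
    iconv_close(cd);
    if (rc == (size_t)-1)
        return -1;                    /* invalid input or dst too small */

    *out = '\0';
    return (long)(out - dst);
}
```

Calling convert_name("UTF-8", "ISO-8859-1", ...) corresponds to the ic_send direction in the quoted snippet, and swapping the first two arguments corresponds to ic_recv. Feeding ISO-8859-1 bytes such as \366 to the reversed call is expected to fail, since a lone high-bit byte is not valid UTF-8; that is the same class of error as the "Invalid or incomplete multibyte or wide character" message reported earlier in the thread.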
On Tue, Feb 26, 2008 at 02:14:34PM +0200, sami.pitko@vaisala.com wrote:
> Has anyone got iso8859-1 to utf-8 conversion working with rsync-3.0.0pre10?

It was failing when communicating with a daemon due to a missing setup_iconv() call. The attached patch fixes this. Thanks for your report!

..wayne..

--- clientserver.c
+++ clientserver.c
@@ -120,6 +120,10 @@ int start_socket_client(char *host, int remote_argc, char *remote_argv[],
 	set_socket_options(fd, sockopts);
 
+#ifdef ICONV_CONST
+	setup_iconv();
+#endif
+
 	ret = start_inband_exchange(fd, fd, user, remote_argc, remote_argv);
 
 	return ret ? ret : client_run(fd, fd, -1, argc, argv);
> It was failing when communicating with a daemon due to a missing
> setup_iconv() call. The attached patch fixes this. Thanks for
> your report!

Thanks for the patch. I can confirm that the --iconv option now works for iso8859-1 to utf-8 transfers.

It would be good to add a link to the convmv script to the rsync --iconv documentation. convmv can be found at http://www.j3e.de/linux/convmv/

If the --iconv option is being added to an existing script, for example a backup script, one should first prepare the destination file structure with convmv to prevent unnecessary retransmission of "renamed" files. This is especially important if top-level folder names contain characters that need conversion: all files in such folders will be retransmitted because the top-level folder is renamed.

Regards, Sami
On Thu, Feb 28, 2008 at 12:55:11PM +0200, sami.pitko@vaisala.com wrote:
> It would be good to add link to convmv script into rsync --iconv
> documentation. convmv can be found from http://www.j3e.de/linux/convmv/

Thanks. I have mentioned that utility on the rsync resources page.

..wayne..