samba-bugs at samba.org
2020-Jun-04 16:22 UTC
[Bug 14401] New: unicode character conversion problem from MacOS to Linux despite iconv
https://bugzilla.samba.org/show_bug.cgi?id=14401

Bug ID: 14401
Summary: unicode character conversion problem from MacOS to Linux despite iconv
Product: rsync
Version: 3.1.3
Hardware: All
OS: All
Status: NEW
Severity: normal
Priority: P5
Component: core
Assignee: wayne at opencoder.net
Reporter: quickhelp at gmail.com
QA Contact: rsync-qa at samba.org
Target Milestone: ---

// SOURCE (initiating rsync):
ProductName: Mac OS X
ProductVersion: 10.15.4
BuildVersion: 19E287

Homebrew rsync:
rsync version 3.1.3 protocol version 31
Copyright (C) 1996-2018 by Andrew Tridgell, Wayne Davison, and others.
Web site: http://rsync.samba.org/
Capabilities:
    64-bit files, 64-bit inums, 64-bit timestamps, 64-bit long ints,
    socketpairs, hardlinks, symlinks, IPv6, batchfiles, inplace,
    append, ACLs, xattrs, iconv, symtimes, no prealloc, file-flags

// TARGET:
Debian Linux 10.4
Linux 4.19.0-9-amd64 #1 SMP Debian 4.19.118-2 (2020-04-29) x86_64 GNU/Linux

Debian rsync:
rsync version 3.1.3 protocol version 31
Copyright (C) 1996-2018 by Andrew Tridgell, Wayne Davison, and others.
Web site: http://rsync.samba.org/
Capabilities:
    64-bit files, 64-bit inums, 64-bit timestamps, 64-bit long ints,
    socketpairs, hardlinks, symlinks, IPv6, batchfiles, inplace,
    append, ACLs, xattrs, iconv, symtimes, prealloc

rsync comes with ABSOLUTELY NO WARRANTY. This is free software, and you
are welcome to redistribute it under certain conditions. See the GNU
General Public Licence for details.

Problem:

2020/06/04 17:12:21 [12205] [sender] cannot convert filename: Users/me/Library/Mail/V7/59923E9C-ACCC-45B0-B179-4CD4EA4D87D5/Sent Messages.mbox/DEED0205-D544-48AF-BDB2-40C0E6D5380C/Data/5/3/5/1/Attachments/1535951/3/<F0><9F><9B><84> Danke! Ihre Buchung ist besta<CC><88>tigt: Dolly Waikiki.eml (Illegal byte sequence)
2020/06/04 17:12:21 [12205] IO error encountered -- skipping file deletion

ls on the filename shows:

--
You are receiving this mail because:
You are the QA Contact for the bug.
samba-bugs at samba.org
2020-Jun-04 17:38 UTC
[Bug 14401] unicode character conversion problem from MacOS to Linux despite iconv
https://bugzilla.samba.org/show_bug.cgi?id=14401

Wayne Davison <wayne at opencoder.net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |WORKSFORME

--- Comment #1 from Wayne Davison <wayne at opencoder.net> ---
Macs use a weird utf-8-mac encoding that you need to make sure you're
specifying. If the iconv library complains about the encoding, then either the
encoding name is wrong or you have an invalid file that isn't named with the
specified encoding.

So, if you're running rsync on a mac and copying to/from a linux host, you
should be able to specify:

    --iconv=utf-8-mac,utf-8

to specify the local & remote charset.
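As a rough illustration of what a utf-8-mac to utf-8 conversion does to a name like the one in the report, here is a minimal Python sketch. It assumes that utf-8-mac can be approximated by Unicode NFD (decomposed) form and that the Linux side wants NFC (composed) form; Apple's real encoding deviates from strict NFD for some code point ranges, so this is not a replacement for iconv. The function name `mac_name_to_nfc` is made up for this example.

```python
import unicodedata


def mac_name_to_nfc(raw: bytes) -> bytes:
    """Approximate utf-8-mac -> utf-8 by recomposing to NFC.

    Assumption: utf-8-mac is treated as plain NFD here, which is only
    an approximation of what iconv's UTF-8-MAC converter does.
    """
    text = raw.decode("utf-8")  # raises UnicodeDecodeError on invalid UTF-8
    return unicodedata.normalize("NFC", text).encode("utf-8")


# 'a' followed by U+0308 COMBINING DIAERESIS (<CC><88>), as in the log line.
decomposed = b"besta\xcc\x88tigt"
composed = mac_name_to_nfc(decomposed)
print(composed)  # b'best\xc3\xa4tigt'
```

The decomposed pair `a` + `<CC><88>` becomes the single precomposed character `<C3><A4>`; characters without a decomposition, such as the emoji, pass through unchanged in this sketch.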
samba-bugs at samba.org
2020-Jun-04 17:57 UTC
[Bug 14401] unicode character conversion problem from MacOS to Linux despite iconv
https://bugzilla.samba.org/show_bug.cgi?id=14401

--- Comment #2 from Tormen <quickhelp at gmail.com> ---
Hi @wayne.

I only now noticed that my comment got truncated at the "ls" part.

After that I had also provided the rsync command used:

'/usr/local/bin/rsync' -e 'ssh -p 53146 -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null' -D --numeric-ids --links --hard-links --one-file-system --itemize-changes --times --recursive --perms --owner --group --stats --human-readable --partial --delete --iconv=utf-8-mac,utf-8 --compress --log-file '/Users/admin/.rsync-tmbackup/2020-06-04-195558.log' --exclude-from '/LINKS/etc/rsync-tmbackup.exclude-from.macado' -- '/System/Volumes/Data/' 'root@jolie:/jbackup/macado/System.Volumes.Data/2020-06-04-1955.50___FULL-BACKUP/'

As you can see, I already use(d) "--iconv=utf-8-mac,utf-8".

As the file is an email stored by the macOS Mail app, the encoding should be
the macOS standard and nothing unusual for a Mac.
samba-bugs at samba.org
2020-Jun-04 17:58 UTC
[Bug 14401] unicode character conversion problem from MacOS to Linux despite iconv
https://bugzilla.samba.org/show_bug.cgi?id=14401

--- Comment #3 from Tormen <quickhelp at gmail.com> ---
Created attachment 16018
  --> https://bugzilla.samba.org/attachment.cgi?id=16018&action=edit
File breaking the rsync --iconv=utf-8-mac,utf-8
samba-bugs at samba.org
2020-Jun-04 17:59 UTC
[Bug 14401] unicode character conversion problem from MacOS to Linux despite iconv
https://bugzilla.samba.org/show_bug.cgi?id=14401

Tormen <quickhelp at gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|WORKSFORME                  |---
             Status|RESOLVED                    |REOPENED
samba-bugs at samba.org
2020-Jun-04 18:00 UTC
[Bug 14401] unicode character conversion problem from MacOS to Linux despite iconv
https://bugzilla.samba.org/show_bug.cgi?id=14401

--- Comment #4 from Tormen <quickhelp at gmail.com> ---
Created attachment 16019
  --> https://bugzilla.samba.org/attachment.cgi?id=16019&action=edit
File breaking the rsync --iconv=utf-8-mac,utf-8
samba-bugs at samba.org
2020-Jun-04 18:04 UTC
[Bug 14401] unicode character conversion problem from MacOS to Linux despite iconv
https://bugzilla.samba.org/show_bug.cgi?id=14401

--- Comment #5 from Tormen <quickhelp at gmail.com> ---
> If the iconv library complains about the encoding, then either the
> encoding name is wrong or you have an invalid file that isn't named with the
> specified encoding.

How can I "zoom in" here to verify whether the filename is encoded in utf-8-mac
or not?
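One possible way to "zoom in" on a filename is to look at its raw bytes and code points. The sketch below is only a first-pass check under an assumption: it treats utf-8-mac as roughly equivalent to Unicode NFD, which is not exact, so a name that passes here can still be rejected by iconv. The helper name `describe_name` is hypothetical, and the sample bytes are copied from the log line in the report.

```python
import unicodedata


def describe_name(raw: bytes):
    """Return (is_nfd, is_nfc, code_points) for a raw filename.

    Returns None if the bytes are not valid UTF-8 at all.
    Assumption: NFD is used as a stand-in for utf-8-mac.
    """
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        return None
    is_nfd = unicodedata.normalize("NFD", text) == text
    is_nfc = unicodedata.normalize("NFC", text) == text
    points = [
        f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}" for ch in text
    ]
    return is_nfd, is_nfc, points


# Emoji bytes, a space, then the decomposed word from the reported filename.
sample = b"\xf0\x9f\x9b\x84 besta\xcc\x88tigt"
is_nfd, is_nfc, points = describe_name(sample)
for line in points:
    print(line)
print("NFD:", is_nfd, "NFC:", is_nfc)
```

To run this on real files, `os.listdir(b".")` returns names as raw bytes that can be passed to `describe_name` directly. For the sample above, the name is valid UTF-8 and already decomposed, which suggests the name itself is not malformed.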
samba-bugs at samba.org
2020-Jun-04 18:11 UTC
[Bug 14401] unicode character conversion problem from MacOS to Linux despite iconv
https://bugzilla.samba.org/show_bug.cgi?id=14401

--- Comment #6 from Tormen <quickhelp at gmail.com> ---
Created attachment 16020
  --> https://bugzilla.samba.org/attachment.cgi?id=16020&action=edit
File breaking the rsync --iconv=utf-8-mac,utf-8 -- in Finder
samba-bugs at samba.org
2020-Jun-04 18:12 UTC
[Bug 14401] unicode character conversion problem from MacOS to Linux despite iconv
https://bugzilla.samba.org/show_bug.cgi?id=14401

Tormen <quickhelp at gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #16018|File breaking the rsync     |File breaking the rsync
        description|--iconv=utf-8-mac,utf-8     |--iconv=utf-8-mac,utf-8 --
                   |                            |"ls" in "Terminal" App
samba-bugs at samba.org
2020-Jun-05 06:14 UTC
[Bug 14401] unicode character conversion problem from MacOS to Linux despite iconv
https://bugzilla.samba.org/show_bug.cgi?id=14401

--- Comment #7 from SATOH Fumiyasu <fumiyas at osstech.co.jp> ---
The `
samba-bugs at samba.org
2020-Jun-05 06:16 UTC
[Bug 14401] unicode character conversion problem from MacOS to Linux despite iconv
https://bugzilla.samba.org/show_bug.cgi?id=14401

--- Comment #8 from SATOH Fumiyasu <fumiyas at osstech.co.jp> ---
U+1F6C4 (the BAGGAGE CLAIM emoji, <F0><9F><9B><84> in UTF-8) is a Unicode
character outside the Basic Multilingual Plane, i.e. one that needs a
surrogate pair in UTF-16, but the UTF-8-MAC encoding in macOS's iconv(3)
does not support such characters.

Try compiling your rsync binary with my hacked GNU libiconv:
https://github.com/fumiyas/libiconv-utf8mac

... Bugzilla does not support surrogate pairs...
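To make the "surrogate pairs" remark concrete, the following Python sketch computes the UTF-16 surrogate pair for U+1F6C4 with the standard formula and cross-checks it against Python's own codecs. It only illustrates the encoding arithmetic; it says nothing about how macOS's iconv is implemented. The function name `utf16_surrogates` is made up for this example.

```python
def utf16_surrogates(code_point: int):
    """Return the (high, low) UTF-16 surrogate pair for a code point above U+FFFF."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is inside the BMP or out of range")
    offset = code_point - 0x10000
    # The top 10 bits select the high surrogate, the bottom 10 bits the low one.
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)


cp = 0x1F6C4  # BAGGAGE CLAIM
high, low = utf16_surrogates(cp)
utf8_hex = chr(cp).encode("utf-8").hex()
utf16_hex = chr(cp).encode("utf-16-be").hex()

print(utf8_hex)                   # f09f9b84
print(f"{high:04X} {low:04X}")    # D83D DEC4
print(utf16_hex)                  # d83ddec4
```

The four UTF-8 bytes match the `<F0><9F><9B><84>` sequence in the rsync error message, and the character needs two 16-bit units in UTF-16, which is the case the comment says UTF-8-MAC does not handle.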
samba-bugs at samba.org
2020-Jun-05 06:26 UTC
[Bug 14401] unicode character conversion problem from MacOS to Linux despite iconv
https://bugzilla.samba.org/show_bug.cgi?id=14401

--- Comment #9 from Wayne Davison <wayne at opencoder.net> ---
One other thing you could do when sending files to Linux is to not translate
the names. This is because Linux can create a filename with oddball character
sequences (unlike macos) so it can store and retrieve the raw filenames just
fine for something like backup and restore. It would just cause various names
to display with "?" chars when listed on Linux, or displayed with escape
sequences when listed with `ls -b`:

\360\237\233\204\ Danke!\ Ihre\ Buchung\ ist\ best?tigt:\ Dolly\ Waikiki.eml

In any case, this sounds like an iconv issue, not an rsync issue.
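The octal escapes in that `ls -b` listing are the same four bytes as the `<F0><9F><9B><84>` in the original error. A small Python sketch can turn such escapes back into raw bytes. This is a simplified decoder written for this example (the name `unescape_ls_b` is hypothetical): it handles only three-digit octal escapes and backslash-escaped literal characters such as `\ `, not C-style escapes like `\n` or `\t` that `ls -b` can also emit.

```python
import re

# Three octal digits with a value of at most 0o377, so the result fits in one byte.
_OCTAL = re.compile(r"[0-3][0-7]{2}")


def unescape_ls_b(escaped: str) -> bytes:
    """Decode \\ooo octal escapes and \\<char> pairs into raw bytes (simplified)."""
    out = bytearray()
    i = 0
    while i < len(escaped):
        ch = escaped[i]
        if ch == "\\" and _OCTAL.fullmatch(escaped[i + 1:i + 4]):
            out.append(int(escaped[i + 1:i + 4], 8))
            i += 4
        elif ch == "\\" and i + 1 < len(escaped):
            # Backslash followed by a literal character, e.g. an escaped space.
            out += escaped[i + 1].encode("utf-8")
            i += 2
        else:
            out += ch.encode("utf-8")
            i += 1
    return bytes(out)


raw = unescape_ls_b(r"\360\237\233\204\ Danke!")
decoded = raw.decode("utf-8")
print(raw.hex())  # f09f9b84 followed by the bytes of " Danke!"
```

Decoding the result as UTF-8 yields the BAGGAGE CLAIM emoji followed by " Danke!", so the untranslated name stored on Linux still carries the original bytes.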
samba-bugs at samba.org
2020-Jun-05 10:10 UTC
[Bug 14401] unicode character conversion problem from MacOS to Linux despite iconv
https://bugzilla.samba.org/show_bug.cgi?id=14401

--- Comment #10 from Tormen <quickhelp at gmail.com> ---
Thank you very much for your comments!

I agree that the problem I pointed out would then be a limitation of iconv.

The comment about not converting was really helpful. I was not sure about
leaving --iconv off, but it makes sense. I will go down this road and hope
that I always have a Mac to restore to ;)

This ticket can now be closed, but I left it open as I was not sure what
status to pick in this case.
samba-bugs at samba.org
2020-Jun-06 22:15 UTC
[Bug 14401] unicode character conversion problem from MacOS to Linux despite iconv
https://bugzilla.samba.org/show_bug.cgi?id=14401

Wayne Davison <wayne at opencoder.net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |WORKSFORME
             Status|REOPENED                    |RESOLVED