Leonard den Ottolander
2005-Oct-30 13:58 UTC
[Samba] smbmount codepage/iocharset settings vs NT4
Hi,

I'm in the process of setting up a backup server for a somewhat
antiquated NT4 server. Backup server is CentOS-4 (~ RHEL-4),
kernel-2.6.9-11.EL, samba-client-3.0.10-1.4E, rsync-2.6.3-1,
LANG=en_US.UTF-8. NT4 shares are mounted on the server and rsynced to
local disk.

This setup is working pretty well, however on the NT box there are some
files with names containing odd characters like accented characters and
ellipsis. I'm a bit at a loss as to the correct settings of the smbmount
iocharset and codepage parameters to use, and whether the display
charset and unix charset options in smb.conf are relevant to the mounts.

I've set up a test share. An ls in smbclient gives me the correct output
in a gnome-terminal, and an mget gets me the files with their correctly
utf8ified names (console seemed ok until after a toggle to X):

$ smbclient -U auser //david-bowie/Test
Password:
Domain=[EVERYTHING] OS=[Windows NT 4.0] Server=[NT LAN Manager 4.0]
smb: \> ls
  .                                    D        0  Sat Oct 29 18:58:49 2005
  ..                                   D        0  Sat Oct 29 18:58:49 2005
  ellipsis zijn heel fijn (?).doc      A    24064  Sat Oct 29 18:57:14 2005
  Nogmaals ellipsis ?.doc              A    24064  Sat Oct 29 18:58:31 2005
  ??n document ? ?50.doc               A    24064  Sat Oct 29 18:55:28 2005
  ??n document.doc                     A    24064  Sat Oct 29 18:54:20 2005
  ???.doc                              A    24064  Sat Oct 29 18:57:55 2005
  ?quotes?.doc                         A    24064  Sat Oct 29 18:53:40 2005

                52004 blocks of size 262144. 2165 blocks available

However, an smbmount without any charset options gives me the following
result:

$ sudo mount -o username=auser //david-bowie/Test /mnt/tmp
Password:
$ ls /mnt/tmp
`%'.doc
??n document.doc
ellipsis zijn heel fijn (.).doc
Nogmaals ellipsis ..doc
??n document ? ?50.doc
"quotes".doc

Using cp850 improves the output somewhat:

$ sudo mount -o username=auser,codepage=cp850 //david-bowie/Test /mnt/tmp
Password:
$ ls /mnt/tmp
`%'.doc
ellipsis zijn heel fijn (.).doc
??n document ? ?50.doc
Nogmaals ellipsis ..doc
??n document.doc
"quotes".doc

I assumed the code page used by NT4 was cp1252 ("MS-ANSI"), but using
cp1252 for the codepage gives me the same output for these files as the
mount with no codepage option set.

To make a long story short: What are the proper options to pass to
smbmount and/or set in /etc/samba/smb.conf?

Thanks,
Leonard.
On Sun, 2005-10-30 at 14:58 +0100, Leonard den Ottolander wrote:
> Hi,
>
> I'm in the process of setting up a backup server for a somewhat
> antiquated NT4 server. Backup server is CentOS-4 (~ RHEL-4),
> kernel-2.6.9-11.EL, samba-client-3.0.10-1.4E, rsync-2.6.3-1,
> LANG=en_US.UTF-8. NT4 shares are mounted on the server and rsynced to
> local disk.
>
> This setup is working pretty well, however on the NT box there are some
> files with names containing odd characters like accented characters and
> ellipsis. I'm a bit at a loss as to the correct settings of the smbmount
> iocharset and codepage parameters to use, and whether the display
> charset and unix charset options in smb.conf are relevant to the mounts.

You should use the CIFS VFS for your backup operations, as it will
correctly use unicode on the wire, and therefore allow a correct utf8
translation. smbfs is considered deprecated, and certainly should not
be used for new installations.

Andrew Bartlett

-- 
Andrew Bartlett                                http://samba.org/~abartlet/
Samba Developer, SuSE Labs, Novell Inc.        http://suse.de
Authentication Developer, Samba Team           http://samba.org
Student Network Administrator, Hawker College  http://hawkerc.net
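
A minimal sketch of what the CIFS mount suggested above might look like, reusing the test share, mount point, and user name from the original post. The iocharset=utf8 option is an assumption chosen to match LANG=en_US.UTF-8 on the backup server; it has not been verified against this NT4 box, and it presumes the kernel's cifs module and the mount.cifs helper are installed. The function name build_cifs_mount is hypothetical. The script only prints the command, so it can be reviewed before being run by hand with sudo.

```shell
#!/bin/sh
# Sketch only: print a candidate CIFS mount command instead of running it.
# Assumption: iocharset=utf8 matches the client's UTF-8 locale.

# build_cifs_mount SHARE MOUNTPOINT USER  (hypothetical helper)
build_cifs_mount() {
    share=$1
    mountpoint=$2
    user=$3
    # -t cifs selects the CIFS VFS rather than the deprecated smbfs.
    printf 'mount -t cifs -o username=%s,iocharset=utf8 %s %s\n' \
        "$user" "$share" "$mountpoint"
}

# Values taken from the original post.
build_cifs_mount //david-bowie/Test /mnt/tmp auser
```

If the printed command mounts cleanly, comparing an ls of /mnt/tmp against the smbclient listing would show whether the file names now come through intact.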