bugzilla-daemon at bugzilla.mindrot.org
2009-Aug-11 16:07 UTC
[Bug 1632] New: [PATCH] UTF-8 hint sftp-server extension
https://bugzilla.mindrot.org/show_bug.cgi?id=1632

           Summary: [PATCH] UTF-8 hint sftp-server extension
           Product: Portable OpenSSH
           Version: 5.2p1
          Platform: All
        OS/Version: Linux
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: sftp-server
        AssignedTo: unassigned-bugs at mindrot.org
        ReportedBy: cus at fazekas.hu

Created an attachment (id=1668)
 --> (http://bugzilla.mindrot.org/attachment.cgi?id=1668)
Add support for utf8 hint sftp extension to sftp-server

Currently the OpenSSH sftp-server only supports sftp protocol version 3,
and unless I am mistaken there are no plans to support newer versions of
this protocol. The encoding of the filenames for this protocol version
is unspecified, so there is no reliable way for an sftp client to detect
the encoding of the filenames.

To solve this problem, I am proposing here a new sftp extension to allow
the sftp server to give a hint to the clients whether or not they should
interpret the filenames as UTF-8.

I tried to keep the patch as simple as possible: it introduces a new
command line parameter to sftp-server, and when that parameter is
specified, the server sends this extension.

I also made contact with Martin Prikryl, the developer of WinSCP. He
wrote that if the extension got into OpenSSH, he would be happy to add
support for it:
http://winscp.net/forum/viewtopic.php?t=7108

-- 
Configure bugmail: https://bugzilla.mindrot.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are watching the assignee of the bug.
bugzilla-daemon at bugzilla.mindrot.org
2009-Aug-27 20:52 UTC
[Bug 1632] [PATCH] UTF-8 hint sftp-server extension
https://bugzilla.mindrot.org/show_bug.cgi?id=1632

--- Comment #1 from Marton Balint <cus at fazekas.hu> 2009-08-28 06:52:37 EST ---

Did any of you guys have time to review the patch? Is it acceptable?
Sorry for bugging you, but it's been over a month since my original
mailing list message, and over two weeks since I posted the patch to
the bugzilla.
bugzilla-daemon at bugzilla.mindrot.org
2009-Aug-28 17:08 UTC
[Bug 1632] [PATCH] UTF-8 hint sftp-server extension
https://bugzilla.mindrot.org/show_bug.cgi?id=1632

Damien Miller <djm at mindrot.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |djm at mindrot.org

--- Comment #2 from Damien Miller <djm at mindrot.org> 2009-08-29 03:08:26 EST ---

Hi, your patch is in the queue, but unfortunately the queue is quite
long, and since your patch is a protocol change that we will need to
support forever, we need to give it careful review.
bugzilla-daemon at bugzilla.mindrot.org
2009-Oct-01 16:24 UTC
[Bug 1632] [PATCH] UTF-8 hint sftp-server extension
https://bugzilla.mindrot.org/show_bug.cgi?id=1632

--- Comment #3 from Marton Balint <cus at fazekas.hu> 2009-10-02 02:24:43 EST ---

I understand that the queue is long, but my patch adds only 16 lines of
pretty trivial code. I really don't think it takes more than 5 minutes
to review it and decide whether or not it is the correct approach for
the desired function, because this new extension is only used to send
1 bit of information to the client: to treat filenames as UTF-8.

Please, if any of the developers have 5 free minutes, have a look at
the patch. Thanks.
bugzilla-daemon at bugzilla.mindrot.org
2010-Jan-06 16:35 UTC
[Bug 1632] [PATCH] UTF-8 hint sftp-server extension
https://bugzilla.mindrot.org/show_bug.cgi?id=1632

--- Comment #4 from Marton Balint <cus at fazekas.hu> 2010-01-07 03:35:46 EST ---

Happy new year, developers! Any news on this enhancement?
bugzilla-daemon at bugzilla.mindrot.org
2010-Jan-07 04:56 UTC
[Bug 1632] [PATCH] UTF-8 hint sftp-server extension
https://bugzilla.mindrot.org/show_bug.cgi?id=1632

--- Comment #5 from Damien Miller <djm at mindrot.org> 2010-01-07 15:56:15 EST ---

I'm not sure what problem this patch solves - I suppose it is
technically possible for platforms that OpenSSH runs on to use a
non-UTF8 encoding, but does anyone really do it in practice? (I
don't know)

From a client perspective UTF-8 should be quite easily distinguished
from other non-ASCII encodings by looking at the first character
sequence with the high bit set.

Some other questions:

Is it really the filesystem that encodes filenames as UTF-8? Or is it a
convention used by application developers using the filesystem? If it
is the filesystem itself, then shouldn't it be detectable via a mount
option, so we don't need the command-line flag?

Perhaps it would be better to just ensure that we always render
filenames in UTF-8, but really sftp-server has no way of knowing what
encoding has been used, and since Unix filesystems have traditionally
been pretty agnostic about the structure of filenames (other than to
exclude '\0' and '/') they may be entirely unstructured or have
multiple encodings active on the same filesystem. I'm not sure what the
answer is, but I'm reluctant to add a protocol extension that we will
have to honour perpetually without understanding it better.
bugzilla-daemon at bugzilla.mindrot.org
2010-Jan-07 11:39 UTC
[Bug 1632] [PATCH] UTF-8 hint sftp-server extension
https://bugzilla.mindrot.org/show_bug.cgi?id=1632

Salvador Fandiño <sfandino at yahoo.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |sfandino at yahoo.com

--- Comment #6 from Salvador Fandiño <sfandino at yahoo.com> 2010-01-07 22:39:29 EST ---

Hi,

From my point of view the proposed patch is useless, as file system
encoding is not an on-off thing.

Most servers nowadays have their file systems encoded as utf8 (even if
the OS knows nothing about it) and any modern SFTP client should
already default to utf8. Using one bit for this task as in the patch
would just say, "if I am set use utf8, if not, use utf8 anyway because
it is the most likely encoding".

> I'm not sure what problem this patch solves - I suppose it is
> technically possible for platforms that OpenSSH runs on to use a
> non-UTF8 encoding, but does anyone really do it in practice? (I
> don't know)

Well, a possible scenario is some server running an old application not
supporting utf8 or hard coded to use some specific encoding, or some
server configured a long time ago when utf8 was not the default.

> From a client perspective UTF-8 should be quite easily distinguished
> from other non-ASCII encodings by looking at the first character
> sequence with the high bit set.

AFAIK this is not true, utf8 encoded strings do not necessarily have
the high bit of the first byte set.

> Some other questions:
>
> Is it really the filesystem that encodes filenames as UTF-8?
> or is it a convention used by application developers using the
> filesystem?

On Unix file systems, the OS just sees null-terminated strings. It does
not perform any conversion itself, and it is up to the application to
decide how to render those strings (usually taking into consideration
the locale configuration).

> Perhaps it would be better to just ensure that we always render
> filenames in UTF-8.

That would require linking sftp-server against one of the libraries
supporting conversion of strings between different encodings. As the
client should also perform the inverse operation in order to save the
file using the local encoding, the full conversion process can be
pushed there.

> but really sftp-server has no way of knowing what
> encoding has been used and since Unix filesystems have traditionally
> been pretty agnostic about the structure of filenames (other than to
> exclude '\0' and '/') they may be entirely unstructured or have
> multiple encodings active on the same filesystem. I'm not sure what the
> answer is, but I'm reluctant to add a protocol extension that we will
> have to honour perpetually without understanding it better.

My conclusion is that at least a string should be used to define the
encoding.

Maybe you can abuse the extension mechanism of the SSH_FXP_INIT packet,
passing something like

  fs_encoding(latin1)@openssh.org = 1
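
The disagreement above about whether a client can recognise UTF-8 on its own can be made concrete with a small check. The following is only a sketch, not code from either patch: `is_valid_utf8` is a hypothetical helper name, and the check is a heuristic at best, since a pure-ASCII name passes in every ASCII-compatible encoding and some Latin-1 byte sequences also happen to be well-formed UTF-8.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of OpenSSH): returns 1 if the n bytes at s
 * are well-formed UTF-8, 0 otherwise. Rejects overlong forms, surrogates
 * and code points above U+10FFFF.
 */
int is_valid_utf8(const unsigned char *s, size_t n)
{
    size_t i = 0;

    assert(s != NULL || n == 0);
    while (i < n) {
        unsigned char c = s[i];
        unsigned long cp, min;
        size_t need, k;

        if (c < 0x80) {                 /* plain ASCII byte */
            i++;
            continue;
        } else if ((c & 0xE0) == 0xC0) {
            need = 1; cp = c & 0x1F; min = 0x80;
        } else if ((c & 0xF0) == 0xE0) {
            need = 2; cp = c & 0x0F; min = 0x800;
        } else if ((c & 0xF8) == 0xF0) {
            need = 3; cp = c & 0x07; min = 0x10000;
        } else {
            return 0;                   /* stray continuation or invalid lead byte */
        }
        if (n - i - 1 < need)
            return 0;                   /* sequence truncated */
        for (k = 1; k <= need; k++) {
            unsigned char cc = s[i + k];
            if ((cc & 0xC0) != 0x80)
                return 0;               /* not a continuation byte */
            cp = (cp << 6) | (cc & 0x3F);
        }
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return 0;                   /* overlong, out of range, or surrogate */
        i += need + 1;
    }
    return 1;
}
```

A name such as "café" stored as Latin-1 (final byte 0xE9) fails this check, while the same name stored as UTF-8 (0xC3 0xA9) passes; a name with no high-bit bytes at all passes either way, which is why such a check cannot replace an explicit hint from the server.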
bugzilla-daemon at bugzilla.mindrot.org
2010-Jan-07 11:54 UTC
[Bug 1632] [PATCH] UTF-8 hint sftp-server extension
https://bugzilla.mindrot.org/show_bug.cgi?id=1632

--- Comment #7 from Salvador Fandiño <sfandino at yahoo.com> 2010-01-07 22:54:17 EST ---

(In reply to comment #6)
> Maybe you can abuse the extension mechanism of the
> SSH_FXP_INIT packet passing something as
>
> fs_encoding(latin1)@openssh.org = 1

Oops, forget about that! I believed extension_data was required to be a
number, but actually it is a string, so...

  fs_encoding@openbsd.org = latin1

can be used, or

  fs_encoding@openbsd.org = 1;latin1

to allow for inclusion of the extension version number.
bugzilla-daemon at bugzilla.mindrot.org
2010-Jan-07 13:12 UTC
[Bug 1632] [PATCH] UTF-8 hint sftp-server extension
https://bugzilla.mindrot.org/show_bug.cgi?id=1632

--- Comment #8 from Marton Balint <cus at fazekas.hu> 2010-01-08 00:12:29 EST ---

The specific problem I'm trying to solve is to be able to give a sign
to WinSCP to treat remote file names as UTF-8, and convert from the
Windows internal charset (Unicode) to UTF-8 instead of latin1 when you
upload a file.

Do you think that this is a bug in WinSCP, and the default remote
charset should be UTF-8? Well, maybe, but currently it isn't, and I
don't think the developer will be eager to change it because of
compatibility reasons. That's why I came up with this solution.

In the UNIX/BSD world the current logic is simple: don't convert the
charset, it's not the filesystem's, or the file transfer program's job
to decide the charset. I'm actually fine with that, it makes sense.

However, with a UNIX server and a Windows client, charset conversion is
inevitable, and somehow we have to give a sign to the client which is
the preferred remote charset, otherwise clients which historically
defaulted to latin1 won't work with modern UTF-8 aware servers.

So my extension would only mean two things:
- If you do charset conversion anyway, convert to UTF-8.
- On this system, most of the filenames are in UTF-8 encoding, so you
  may interpret them as such.
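
To illustrate why the client's choice of target charset matters, here is a minimal sketch of the two conversions a Unicode-based client could apply to one character of a filename. The function names `encode_latin1` and `encode_utf8` are hypothetical and are not taken from WinSCP or OpenSSH.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Latin-1 (ISO-8859-1) can only represent U+0000..U+00FF, one byte each.
 * Returns the number of bytes written: 1, or 0 if cp is not representable.
 */
size_t encode_latin1(uint32_t cp, unsigned char *out)
{
    assert(out != NULL);
    if (cp > 0xFF)
        return 0;
    out[0] = (unsigned char)cp;
    return 1;
}

/*
 * UTF-8 represents every Unicode scalar value in 1 to 4 bytes.
 * out must have room for 4 bytes. Returns the number of bytes written,
 * or 0 for surrogates and values above U+10FFFF.
 */
size_t encode_utf8(uint32_t cp, unsigned char *out)
{
    assert(out != NULL);
    if (cp >= 0xD800 && cp <= 0xDFFF)
        return 0;
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

For U+00E9 ("é") the Latin-1 path produces the single byte 0xE9 while the UTF-8 path produces 0xC3 0xA9, so a file uploaded by a latin1-defaulting client shows up with a different name on a server whose other tools expect UTF-8.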
bugzilla-daemon at bugzilla.mindrot.org
2010-Jan-07 21:43 UTC
[Bug 1632] [PATCH] UTF-8 hint sftp-server extension
https://bugzilla.mindrot.org/show_bug.cgi?id=1632

--- Comment #9 from Salvador Fandiño <sfandino at yahoo.com> 2010-01-08 08:43:55 EST ---

> Do you think that this is a bug in WinSCP, and the default remote
> charset should be UTF-8? Well, maybe, but currently it isn't, and I
> don't think the developer will be eager to change it because of
> compatibility reasons. That's why I came up with this solution.

Well, I would not call it a bug. There is no way to know the encoding,
so WinSCP guesses it as latin1, which used to be the most common
encoding. That's fine. It is just that nowadays, utf8 would probably
make more hits.

> In the UNIX/BSD world the current logic is simple: don't convert the
> charset, it's not the filesystem's, or the file transfer program's job
> to decide the charset. I'm actually fine with that, it makes sense.

No, it doesn't either. For instance, if you transfer files from a
server using latin1 to a client using utf8, file names need to be
converted or you will get broken ones.

> However with a UNIX server and a Windows client, charset conversion is
> inevitable, and somehow we have to give a sign to the client which is
> the preferred remote charset, otherwise clients which historically
> defaulted to latin1 won't work with modern UTF-8 aware servers.

The world is not all utf8 or latin1, there are several other encodings
in use. If you want to solve that problem, do it right and in a general
way. Instead of a bit, use a string to pass the encoding from server to
client.

Actually, I have found that later versions of the SFTP draft already
define a similar extension:

   A server MAY include the following extension with it's version
   packet.

       string "filename-charset"
       string charset-name

   A server that can always provide a valid UTF-8 translation for
   filenames SHOULD NOT send this extension.  Otherwise, the server
   SHOULD send this extension and include the encoding most likely
   to be used for filenames.  This value will most likely be derived
   from the LC_CTYPE on most unix-like systems.

(extracted from
http://tools.ietf.org/wg/secsh/draft-ietf-secsh-filexfer/draft-ietf-secsh-filexfer-13.txt)
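
As an illustration of what the two `string` fields quoted from the draft look like on the wire, here is a sketch that serialises one extension pair. It assumes the usual SSH `string` encoding (a big-endian uint32 length followed by the raw bytes); `put_ssh_string` and `put_extension_pair` are hypothetical helpers, not the OpenSSH buffer API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: append one SSH "string" (uint32 big-endian length,
 * then the bytes, no terminator) to buf at offset off.
 * Returns the new offset, or 0 if the buffer is too small.
 */
size_t put_ssh_string(unsigned char *buf, size_t cap, size_t off, const char *s)
{
    size_t len;

    assert(buf != NULL && s != NULL);
    len = strlen(s);
    if (off > cap || cap - off < 4 || cap - off - 4 < len)
        return 0;
    buf[off]     = (unsigned char)((len >> 24) & 0xff);
    buf[off + 1] = (unsigned char)((len >> 16) & 0xff);
    buf[off + 2] = (unsigned char)((len >> 8) & 0xff);
    buf[off + 3] = (unsigned char)(len & 0xff);
    memcpy(buf + off + 4, s, len);
    return off + 4 + len;
}

/* Append an extension-name / extension-data pair; returns new offset or 0. */
size_t put_extension_pair(unsigned char *buf, size_t cap, size_t off,
    const char *name, const char *data)
{
    off = put_ssh_string(buf, cap, off, name);
    if (off == 0)
        return 0;
    return put_ssh_string(buf, cap, off, data);
}
```

With the name "filename-charset" (16 bytes) and the data "UTF-8" (5 bytes), the pair occupies 4 + 16 + 4 + 5 = 29 bytes.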
bugzilla-daemon at bugzilla.mindrot.org
2010-Jan-07 23:52 UTC
[Bug 1632] [PATCH] UTF-8 hint sftp-server extension
https://bugzilla.mindrot.org/show_bug.cgi?id=1632

--- Comment #10 from Marton Balint <cus at fazekas.hu> 2010-01-08 10:52:55 EST ---

Yes, charset problems may also arise between UNIX systems. The ultimate
solution would be the filename-charset extension, but as far as I know
that would require SFTP protocol version 6, which is not going to
happen in OpenSSH in the near future. There is a really good comparison
of protocol versions here:
http://www.greenend.org.uk/rjk/2007/sftpversions.html

I didn't want to solve everybody's problem. I tried to come up with as
simple a patch as possible to solve one of the most common charset
problems, the one between a Windows client and a UTF-8 UNIX server. And
it's important to mention that this problem will not go away in time
unless WinSCP changes its default charset to UTF-8. On the other hand,
the standard charset of UNIX systems will be UTF-8, so the different
charset problem between UNIX systems will hopefully become less and
less common.

Of course, if that's more acceptable, I may recreate my patch to give a
hint not just for UTF-8 encoding, but for the encoding of your choice.
But implementing the filename-charset extension seems too complicated
to me, because that would require protocol version 6, in-server charset
conversion, etc...
bugzilla-daemon at bugzilla.mindrot.org
2010-Jan-08 14:25 UTC
[Bug 1632] [PATCH] UTF-8 hint sftp-server extension
https://bugzilla.mindrot.org/show_bug.cgi?id=1632

--- Comment #11 from Salvador Fandiño <sfandino at yahoo.com> 2010-01-09 01:25:42 EST ---

Created an attachment (id=1769)
 --> (https://bugzilla.mindrot.org/attachment.cgi?id=1769)
filename-charset@openssh.com extension
bugzilla-daemon at bugzilla.mindrot.org
2010-Jan-08 14:26 UTC
[Bug 1632] [PATCH] UTF-8 hint sftp-server extension
https://bugzilla.mindrot.org/show_bug.cgi?id=1632

--- Comment #12 from Salvador Fandiño <sfandino at yahoo.com> 2010-01-09 01:26:48 EST ---

(In reply to comment #10)
> Yes, charset problems may also arise between UNIX systems. The ultimate
> solution would be the filename-charset extension, but as far as I know
> that would require SFTP protocol version 6, which is not going to
> happen in openSSH in the near future.

It is an *extension*, so you can add it to implementations of previous
protocol versions as long as no conflicts arise.

> Of course, if that's more acceptable I may recreate my patch, to give
> hint not just for utf-8 encoding, but for the encoding of your choice.
> But implementing the filename-charset extension seems too complicated
> to me, because that would require protocol version 6, in-server charset
> conversion, etc...

The filename-charset extension just tells the client the character
encoding used on the filesystem. We don't need to perform any character
conversion on the server.

Anyway, just to be sure that no client gets confused by the slightly
different semantics (as v6 requires the client to ask for deactivation
of the local-encoding-to-utf8 conversion), we can name the extension
"filename-charset@openssh.com".

I have just attached a patch that adds support for a new flag that sets
the encoding on the server. For example:

  sftp-server -s UTF-8

will cause clients to get the pair

  "filename-charset@openssh.com", "UTF-8"

in the SSH_FXP_VERSION packet.
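
On the client side, picking the proposed pair out of the SSH_FXP_VERSION packet could look roughly like the sketch below. It assumes that the extension pairs follow the uint32 version field as consecutive SSH strings and that the archive's " at " in the extension name stands for "@"; `get_ssh_string` and `find_extension` are hypothetical names, not functions from any existing SFTP client.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read one SSH string (uint32 big-endian length + bytes) at *off. */
static int get_ssh_string(const unsigned char *p, size_t n, size_t *off,
    const unsigned char **s, size_t *len)
{
    uint32_t l;

    if (n - *off < 4)
        return 0;
    l = ((uint32_t)p[*off] << 24) | ((uint32_t)p[*off + 1] << 16) |
        ((uint32_t)p[*off + 2] << 8) | (uint32_t)p[*off + 3];
    *off += 4;
    if (n - *off < l)
        return 0;                       /* length runs past the packet */
    *s = p + *off;
    *len = l;
    *off += l;
    return 1;
}

/*
 * ext points at the extension pairs, i.e. the bytes after the version field.
 * Copies the data of the extension called name into out as a C string.
 * Returns 1 if found, 0 if absent, malformed, or too long for out.
 */
int find_extension(const unsigned char *ext, size_t n, const char *name,
    char *out, size_t outcap)
{
    size_t off = 0, want;

    assert(ext != NULL && name != NULL && out != NULL);
    want = strlen(name);
    while (off < n) {
        const unsigned char *k, *v;
        size_t klen, vlen;

        if (!get_ssh_string(ext, n, &off, &k, &klen) ||
            !get_ssh_string(ext, n, &off, &v, &vlen))
            return 0;
        if (klen == want && memcmp(k, name, klen) == 0) {
            if (vlen >= outcap)
                return 0;
            memcpy(out, v, vlen);
            out[vlen] = '\0';
            return 1;
        }
    }
    return 0;
}
```

A client that gets no match would simply keep its existing default, so servers that never send the pair are unaffected.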
bugzilla-daemon at bugzilla.mindrot.org
2010-Jan-08 14:40 UTC
[Bug 1632] [PATCH] UTF-8 hint sftp-server extension
https://bugzilla.mindrot.org/show_bug.cgi?id=1632

--- Comment #13 from Salvador Fandiño <sfandino at yahoo.com> 2010-01-09 01:40:01 EST ---

Created an attachment (id=1770)
 --> (https://bugzilla.mindrot.org/attachment.cgi?id=1770)
...and guess encoding from environment

This patch is an extension of attachment 1769 that also tries to guess
the encoding from the environment:

  sftp-server -S

checks the locale configuration and gets the charset from there.

When the locale config returns ASCII as the charset, it is
automatically promoted to ISO-8859-1.

OpenBSD libc does not get the locale configuration from the environment
(or, AFAIK, any other globally configurable place) as GNU libc does, so
under this OS the charset is always found to be ASCII and then promoted
to ISO-8859-1.
bugzilla-daemon at bugzilla.mindrot.org
2010-Jan-09 00:27 UTC
[Bug 1632] [PATCH] UTF-8 hint sftp-server extension
https://bugzilla.mindrot.org/show_bug.cgi?id=1632

--- Comment #14 from Marton Balint <cus at fazekas.hu> 2010-01-09 11:27:13 EST ---

(In reply to comment #12)
> The filename-charset extension just tells the client the character
> encoding used on the filesystem. We don't need to perform any character
> conversion on the server.

Great! You're right, it's better not to restrict the charset hint to
UTF-8.

> Anyway, just to be sure that no client gets confused by the slightly
> different semantics (as v6 requires the client to ask for deactivation
> of the local-encoding-to-utf8 conversion) we can name the extension as
> "filename-charset@openssh.com".

I agree.

> I have just attached a patch that adds support for a new flag allowing
> to set the encoding on the server. In example:
>
>   sftp-server -s UTF-8
>
> will cause clients to get the pair
>
>   "filename-charset@openssh.com", "UTF-8"
>
> in the SSH_FXP_VERSION packet.

Perfect, thanks for the extended patch, charset detection also seems
quite useful.

Damien, what do you think?
bugzilla-daemon at bugzilla.mindrot.org
2010-Jan-09 10:28 UTC
[Bug 1632] [PATCH] UTF-8 hint sftp-server extension
https://bugzilla.mindrot.org/show_bug.cgi?id=1632

--- Comment #15 from Salvador Fandiño <sfandino at yahoo.com> 2010-01-09 21:28:24 EST ---

(In reply to comment #14)
> Perfect, thanks for the extended patch, charset detection also seems
> quite useful.

Actually, I would go further, making charset detection the default
behavior for sftp-server and using -S to disable it. That would make
the feature work for most installations without requiring any custom
setup.
bugzilla-daemon at bugzilla.mindrot.org
2010-Jan-10 17:48 UTC
[Bug 1632] [PATCH] UTF-8 hint sftp-server extension
https://bugzilla.mindrot.org/show_bug.cgi?id=1632

Salvador Fandiño <sfandino at yahoo.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Attachment #1769|0                           |1
        is obsolete|                            |
   Attachment #1770|0                           |1
        is obsolete|                            |

--- Comment #16 from Salvador Fandiño <sfandino at yahoo.com> 2010-01-11 04:48:47 EST ---

Created an attachment (id=1773)
 --> (https://bugzilla.mindrot.org/attachment.cgi?id=1773)
filename-charset@openssh.com with autodetection extension, enabled by default

This patch makes the extension enabled by default.

sftp-server tries to autodetect the charset from the locale
configuration, defaulting to ISO-8859-1.

"-s filename_charset" can be used to force a specific charset.

"-S" disables the extension.
bugzilla-daemon at bugzilla.mindrot.org
2010-Jan-11 11:06 UTC
[Bug 1632] [PATCH] UTF-8 hint sftp-server extension
https://bugzilla.mindrot.org/show_bug.cgi?id=1632

--- Comment #17 from Damien Miller <djm at mindrot.org> 2010-01-11 22:06:42 EST ---

(From update of attachment 1773)
I'm still not sure about this diff, but some comments:

>diff --git a/sftp-server.8 b/sftp-server.8
>index 27b67ed..f8d1ff1 100644
>--- a/sftp-server.8
>+++ b/sftp-server.8
>@@ -30,10 +30,11 @@
> .Nd SFTP server subsystem
> .Sh SYNOPSIS
> .Nm sftp-server
>-.Op Fl eh
>+.Op Fl ehS
> .Op Fl f Ar log_facility
> .Op Fl l Ar log_level
> .Op Fl u Ar umask
>+.Op Fl s Ar filename_charset

If we do this, I'd rather not use more option letters than necessary.
Perhaps disable it with "-s none"?

>diff --git a/sftp-server.c b/sftp-server.c
>index 27e80f0..27984df 100644
>--- a/sftp-server.c
>+++ b/sftp-server.c
> typedef struct Stat Stat;
>@@ -523,6 +530,19 @@ process_init(void)
>     /* fstatvfs extension */
>     buffer_put_cstring(&msg, "fstatvfs@openssh.com");
>     buffer_put_cstring(&msg, "2"); /* version */
>+    /* filename charset extension */
>+    if (!disable_filename_charset_ext) {
>+        if (!filename_charset) {
>+            setlocale(LC_CTYPE, "");

Wouldn't the user locale be better to use than the system locale? Each
user may be using their own filename encoding...

>+            filename_charset = nl_langinfo(CODESET);
>+            setlocale(LC_CTYPE, "C");

Wouldn't it be better to restore the original locale?

>+            if ((strcmp(filename_charset, "646") == 0) ||
>+                (strcmp(filename_charset, "ANSI_X3.4-1968") == 0))
>+                filename_charset = "ISO-8859-1";

How are these heuristics determined? Are there other aliases that we
need to be aware of?
bugzilla-daemon at bugzilla.mindrot.org
2010-Jan-11 13:55 UTC
[Bug 1632] [PATCH] UTF-8 hint sftp-server extension
https://bugzilla.mindrot.org/show_bug.cgi?id=1632

--- Comment #18 from Salvador Fandiño <sfandino at yahoo.com> 2010-01-12 00:55:57 EST ---

(In reply to comment #17)
> >+ setlocale(LC_CTYPE, "");
>
> Wouldn't the user locale be better use than the system locale? Each
> user may be using their own filename encoding...

The latest POSIX standard mandates that setlocale(..., "") must get the
locale configuration from environment variables that, in our particular
case, can be set by the user in ~/.ssh/environment or by the system,
for instance via pam. See
http://www.opengroup.org/onlinepubs/009695399/functions/setlocale.html.

The libc libraries of Linux, Solaris and probably several other Unixes
already follow that standard.

AFAIK, OpenBSD does not. Actually it doesn't have any way to configure
the locale, and applications have to do it in custom ways, for
instance by reading it from a configuration file. That means that most
applications do not support locales under OpenBSD and always use the
default ("C"). The charset associated with "C" is 646 (ASCII), but in
practice what you get is iso-8859-1, unless you customize the
console/X-Window configuration to use a different font.

> >+ filename_charset = nl_langinfo(CODESET);
> >+ setlocale(LC_CTYPE, "C");
>
> Wouldn't it be better to restore the original locale?

That's "C"! Applications are always started with locale "C" and have to
call setlocale(...) explicitly to change to another one.

> >+ if ((strcmp(filename_charset, "646") == 0) ||
> >+ (strcmp(filename_charset, "ANSI_X3.4-1968") == 0))
> >+ filename_charset = "ISO-8859-1";
>
> How are these heuristics determined?

The idea is that when ASCII is returned as the charset, what it really
means is that the locale has probably not been configured and that the
charset is iso8859-1.

The OS could be configured to use a non-iso8859-1 charset/font via
wsfontload or similar, but then sftp-server can also be customized via
"-s charset".

> Are there other aliases that we need to be aware of?

Probably yes. "646" and "ANSI_X3.4-1968" are what you get from locale
"C" in OpenBSD and Linux respectively.
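
The detection logic discussed in this exchange can be sketched as a standalone function. This is not the attached patch: `promote_ascii` and `detect_filename_charset` are hypothetical names, the alias list contains only the two names mentioned in the thread, and the sketch saves and restores whatever LC_CTYPE locale was active instead of assuming "C". It also copies the codeset before touching the locale again, on the assumption that a later setlocale() call may overwrite the string returned by nl_langinfo().

```c
#include <assert.h>
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>
#include <string.h>

/*
 * Treat the codeset names reported for an unconfigured ("C") locale as
 * ISO-8859-1. Only the two aliases named in the thread are covered;
 * other platforms may report different names.
 */
const char *promote_ascii(const char *codeset)
{
    assert(codeset != NULL);
    if (strcmp(codeset, "646") == 0 ||
        strcmp(codeset, "ANSI_X3.4-1968") == 0)
        return "ISO-8859-1";
    return codeset;
}

/*
 * Look up the codeset of the environment's LC_CTYPE locale, copy it into
 * out, then put the previous LC_CTYPE locale back.
 * Returns 1 on success, 0 if out is too small.
 */
int detect_filename_charset(char *out, size_t outcap)
{
    char saved[256];
    const char *cur, *codeset;
    int have_saved = 0, ok;

    assert(out != NULL && outcap > 0);

    /* Remember the current locale name so it can be restored later. */
    cur = setlocale(LC_CTYPE, NULL);
    if (cur != NULL && strlen(cur) < sizeof(saved)) {
        strcpy(saved, cur);
        have_saved = 1;
    }

    /* "" selects the locale from the environment (LC_ALL, LC_CTYPE, LANG). */
    setlocale(LC_CTYPE, "");
    codeset = promote_ascii(nl_langinfo(CODESET));

    /* Copy before restoring the locale. */
    ok = strlen(codeset) < outcap;
    if (ok)
        strcpy(out, codeset);

    if (have_saved)
        setlocale(LC_CTYPE, saved);
    return ok;
}
```

Keeping the alias mapping in its own function makes it easy to extend if other names for ASCII turn up on other platforms.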
bugzilla-daemon at bugzilla.mindrot.org
2010-Jan-12 22:13 UTC
[Bug 1632] [PATCH] UTF-8 hint sftp-server extension
https://bugzilla.mindrot.org/show_bug.cgi?id=1632

--- Comment #19 from Salvador Fandiño <sfandino at yahoo.com> 2010-01-13 09:13:18 EST ---

(In reply to comment #18)
> AFAIK, OpenBSD does not.

After looking at the OpenBSD libc source code I have found that it
actually *does* support setting the locale configuration from
environment variables, though it is not documented and no UTF-8 locale
definition includes LC_CTYPE data.

In any case, that does not affect the proposed patch, which is still
valid. If the locale is configured and LC_CTYPE data exists for it, the
charset will be detected and passed to the client. Otherwise,
iso-8859-1 will be used.