On 4/27/22 05:40, Ingo Schwarze wrote:
> Hi Demi,
>
> Demi Marie Obenour wrote on Tue, Apr 26, 2022 at 09:12:07PM -0400:
>> On 4/25/22 08:23, Ingo Schwarze wrote:
>
>>> As discussed in the above writeup, the only way to make ssh(1)
>>> connections safe is to manually make sure, *before connecting*,
>>> that the same locale is set on both sides - ideally UTF-8.
>
>> It is also safe for the locale to be different, so long as the
>> character encodings match. For instance, all UTF-8 locales are
>> compatible.
>
> Yes, that is what i meant. In OpenBSD, we are used to the deliberate
> decision that the C library ignores all aspects of the locale except
> the character encoding, so the locale and the character encoding are
> one and the same and your statement is obvious for us. Of course,
> your statement is also true on arbitrary other operating systems, even
> if they do take other parts of the locale into account.

Off-topic: Why did OpenBSD make this decision? In particular,
LC_MESSAGES seems to be essential to internationalization support,
without being very problematic otherwise.

Also, is it safe if the server uses the C locale (LC_ALL=C) and the
client uses UTF-8?

> Thanks for making this aspect explicit. You are right that it might
> not be obvious for users of other systems.

You're welcome.

> That said, on non-OpenBSD systems, if the locale used by a program does
> not match what the user thinks, the *semantics* of the program may still
> screw up horribly, even if the character encoding matches. For example,
> consider user input of floating point numbers with LC_NUMERIC set to a
> cultural convention the user isn't aware of. But such issues are
> only loosely related to ssh(1) and to terminal security.

When it comes to terminal security, another approach is to use
a transient tmux(1) pane or terminal window that is closed once
the session is complete. This assumes that the mismatch cannot be
exploited for code execution, but I would be highly surprised if it
could be, especially with the client in UTF-8 mode.

--
Sincerely,
Demi Marie Obenour (she/her/hers)
On Thu, 2022-04-28 at 20:29 -0400, Demi Marie Obenour wrote:
> > That said, on non-OpenBSD systems, if the locale used by a program
> > does not match what the user thinks, the *semantics* of the program
> > may still screw up horribly, even if the character encoding matches.
> > For example, consider user input of floating point numbers with
> > LC_NUMERIC set to a cultural convention the user isn't aware of.
> > But such issues are only loosely related to ssh(1) and to terminal
> > security.
>
> When it comes to terminal security, another approach is to use
> a transient tmux(1) pane or terminal window that is closed once
> the session is complete. This assumes that the mismatch cannot be
> exploited for code execution, but I would be highly surprised if it
> could be, especially with the client in UTF-8 mode.

Maybe it's too late in the night and I just miss the obvious point,...
... but what exactly is the security problem here (if one sends
LC_*/LANG ... or with locales in general)?

With or without any locale/character encoding differences, the
(possibly evil) remote side can send any arbitrary bytes to the
terminal. But how could it use this for code execution on the local
machine?

The only attack vector I see would be: a remote side tricking a user
into believing that he left SSH (but is still on the remote side)...
and then tricking him into e.g. entering a password. But that should be
independent of the locale/character encoding.

What should be the attack with e.g. LC_NUMERIC? A remote side tricking
a user into using 3,14 instead of 3.14 and that having some attacking
effect? But if the remote side can mess with the locale (on its own
side)... it can anyway already do its attack there?

Similarly, if an evil remote side could swap yesexpr and noexpr in
LC_MESSAGES? So what? Tricking a user into rm -ri / and then using 'n'
which would then mean 'y'? If the remote side can do this, it can again
just delete those (remote) files?

Cheers,
Chris.
Hi,

Demi Marie Obenour wrote on Thu, Apr 28, 2022 at 08:29:24PM -0400:
> On 4/27/22 05:40, Ingo Schwarze wrote:
>> Demi Marie Obenour wrote on Tue, Apr 26, 2022 at 09:12:07PM -0400:
>>> On 4/25/22 08:23, Ingo Schwarze wrote:

>> In OpenBSD, we are used to the deliberate
>> decision that the C library ignores all aspects of the locale except
>> the character encoding, [...]

> Off-topic: Why did OpenBSD make this decision? In particular,
> LC_MESSAGES seems to be essential to internationalization support,
> without being very problematic otherwise.

I think having libc and POSIX utility programs always reliably print
diagnostics in the same way, and always in US-ASCII rather than
sometimes in UTF-8, is more valuable than internationalization of
operating system diagnostics, both from the user perspective
(predictability and comprehensibility) and from the OS maintainer
perspective (code simplicity and hence better chance for correctness
and reliability). Even as a native German speaker, i regularly get
confused when seeing German error messages because they usually feel
quite incomprehensible.

Besides, LC_CTYPE is essential for important functionality, but
picking individual features from all the rest of LC_* for
implementation isn't going to help. It will increase code complexity
without really achieving internationalization (even full LC_* support
is not really sufficient for complete internationalization...).
So better ditch it outright than attempt some piecemeal approach.

Besides, even LC_MESSAGES has features that are prone to causing
trouble, for example changing the meaning of "yes" and "no".

> Also, is it safe if the server uses the C locale (LC_ALL=C) and the
> client uses UTF-8?

Yes, because US-ASCII is a subset of UTF-8, so what a well-behaved
server sends in the C locale is supposed to be a subset of what it
might send in a UTF-8 locale.

Of course, whether it is safe when both the server and the client use
a UTF-8 locale depends on the terminal or terminal emulator, but at
least xterm(1) in UTF-8 mode [but not in the traditional 8-bit mode
that may still be the default on some operating systems] is safe when
the server runs either the C locale or a UTF-8 locale.

[...]
>> That said, on non-OpenBSD systems, if the locale used by a program does
>> not match what the user thinks, the *semantics* of the program may still
>> screw up horribly, even if the character encoding matches. For example,
>> consider user input of floating point numbers with LC_NUMERIC set to a
>> cultural convention the user isn't aware of. But such issues are
>> only loosely related to ssh(1) and to terminal security.

> When it comes to terminal security, another approach is to use
> a transient tmux(1) pane or terminal window that is closed once
> the session is complete.

Frankly, i don't know anything about tmux(1) and simply don't know
whether it can or cannot help with the topic at hand.

> This assumes that the mismatch cannot be
> exploited for code execution, but I would be highly surprised if it
> could be, especially with the client in UTF-8 mode.

xterm(1) in UTF-8 mode is quite good because it never interprets
multibyte characters as in-band terminal control codes. Your mileage
might vary with other terminals or emulators.

Yours,
  Ingo
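
The two encoding points in the message above can be illustrated with a
minimal Python sketch. It is not taken from the thread, and the helper
names are hypothetical. It assumes the usual ECMA-48 convention that a
terminal in 8-bit mode treats the bytes 0x80-0x9F as C1 control codes
(0x9B being the 8-bit CSI). The sketch checks that every US-ASCII byte
decodes identically as UTF-8, and it lists which bytes of a character's
UTF-8 encoding fall into that C1 range, which is the kind of byte a
terminal in 8-bit mode might act on when the server sends UTF-8.

```python
# Illustrative sketch only; helper names are hypothetical and the C1 range
# is the ECMA-48 convention assumed for a terminal running in 8-bit mode.

C1_RANGE = range(0x80, 0xA0)  # bytes treated as C1 controls in 8-bit mode
CSI_8BIT = 0x9B               # 8-bit Control Sequence Introducer


def ascii_is_subset_of_utf8():
    """Return True if every US-ASCII byte decodes to the same character in UTF-8."""
    return all(
        bytes([b]).decode("utf-8") == bytes([b]).decode("ascii")
        for b in range(0x80)
    )


def c1_bytes_in_utf8(text):
    """Return the bytes of text's UTF-8 encoding that lie in the C1 range."""
    return [b for b in text.encode("utf-8") if b in C1_RANGE]


if __name__ == "__main__":
    print("ASCII subset of UTF-8:", ascii_is_subset_of_utf8())
    # "a" (U+0061), a-umlaut (U+00E4), e-caron (U+011B)
    for ch in ("a", "\u00e4", "\u011b"):
        encoded = ch.encode("utf-8")
        c1 = c1_bytes_in_utf8(ch)
        print(
            "U+%04X" % ord(ch),
            encoded.hex(" "),
            "C1 bytes:", [hex(b) for b in c1],
            "(contains 8-bit CSI)" if CSI_8BIT in c1 else "",
        )
```

A decoder that consumes complete UTF-8 sequences before looking for
control codes never sees such continuation bytes on their own, which is
consistent with the remark about xterm(1) in UTF-8 mode; how any other
terminal or emulator behaves is not something this sketch can show.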