thr3ads.net - openssh unix dev - Call for testing: OpenSSH 7.8 [Aug 2018]

If this information is useful, please help other people find it:
Share via:

Ingo Schwarze

2018-Aug-20 17:14 UTC

Call for testing: OpenSSH 7.8

Hi,

Michael Felt wrote on Mon, Aug 20, 2018 at 05:00:17PM +0200:
> ./nl_langinfo
> setlocale -> "C"
> nl_langinfo -> "ISO8859-1"
Thanks, that is helpful.

So i think i was wrong and Damien was right.  This means that
OpenSSH returns truncated messages when non-ASCII bytes occur
in them, even when the user requests LC_CTYPE=POSIX.
That's not good.

While there is no need to cater for any potential locale that users
might wilfully select, we should probably try to show complete
messages to users who specifically select the POSIX locale.

Admittedly, AIX is weird in calling ASCII "ISO8859-1", which is
probably going to mean something different elsewhere.  But given
that it is very unlikely that anything another system calls ISO8859-1
is an unsafe (ASCII-incompatible or state-dependent) encoding, i'm
proposing the following patch.

I suggest adding some comments because otherwise, we will
eventually forget where all these strings came from.

OK?
> There is a program - /usr/lib/nls/lsmle (just learned about it!)
That's non-standard.  The standard program for similar purposes is
locale(1), though that usually won't report CODESET, but only
LC_CTYPE.

Yours,
  Ingo


Index: utf8.c
==================================================================RCS file:
/cvs/src/usr.bin/ssh/utf8.c,v
retrieving revision 1.7
diff -u -p -r1.7 utf8.c
--- utf8.c	31 May 2017 09:15:42 -0000	1.7
+++ utf8.c	20 Aug 2018 17:11:33 -0000
@@ -51,9 +51,18 @@ dangerous_locale(void) {
 	char	*loc;
 
 	loc = nl_langinfo(CODESET);
-	return strcmp(loc, "US-ASCII") != 0 && strcmp(loc,
"UTF-8") != 0 &&
-	    strcmp(loc, "ANSI_X3.4-1968") != 0 && strcmp(loc,
"646") != 0 &&
-	    strcmp(loc, "") != 0;
+	return strcmp(loc, "UTF-8") != 0 &&
+	    strcmp(loc, "US-ASCII") != 0 &&
+
+	    /*
+	     * What nl_langinfo(CODESET) returns for US-ASCII
+	     * on various operating systems:
+	     */
+
+	    strcmp(loc, "ANSI_X3.4-1968") != 0 &&	/* Linux */
+	    strcmp(loc, "ISO8859-1") != 0 &&		/* AIX */
+	    strcmp(loc, "646") != 0 &&			/* Solaris, NetBSD */
+	    strcmp(loc, "") != 0;			/* Solaris 6 */
 }
 
 static int

Michael

2018-Aug-20 18:28 UTC

head link

Call for testing: OpenSSH 7.8

On 20/08/2018 19:14, Ingo Schwarze wrote:> Index: utf8.c
> ==================================================================> RCS
file: /cvs/src/usr.bin/ssh/utf8.c,v
> retrieving revision 1.7
> diff -u -p -r1.7 utf8.c
> --- utf8.c	31 May 2017 09:15:42 -0000	1.7
> +++ utf8.c	20 Aug 2018 17:11:33 -0000
> @@ -51,9 +51,18 @@ dangerous_locale(void) {
>  	char	*loc;
>  
>  	loc = nl_langinfo(CODESET);
> -	return strcmp(loc, "US-ASCII") != 0 && strcmp(loc,
"UTF-8") != 0 &&
> -	    strcmp(loc, "ANSI_X3.4-1968") != 0 && strcmp(loc,
"646") != 0 &&
> -	    strcmp(loc, "") != 0;
> +	return strcmp(loc, "UTF-8") != 0 &&
> +	    strcmp(loc, "US-ASCII") != 0 &&
> +
> +	    /*
> +	     * What nl_langinfo(CODESET) returns for US-ASCII
> +	     * on various operating systems:
> +	     */
> +
> +	    strcmp(loc, "ANSI_X3.4-1968") != 0 &&	/* Linux */
> +	    strcmp(loc, "ISO8859-1") != 0 &&		/* AIX */
> +	    strcmp(loc, "646") != 0 &&			/* Solaris, NetBSD */
> +	    strcmp(loc, "") != 0;			/* Solaris 6 */
>  }
>  
>  static int
After applying the patch:

root at
x064:[/data/prj/openbsd/mindrot/openssh-7.8.0.20]/data/prj/openbsd/mindrot/openssh-7.8.0.20/regress/unittests/utf8/test_utf8
test_utf8: .................................. 34 tests ok

August 20th snapshot patched, not the -r 1.7


-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 488 bytes
Desc: OpenPGP digital signature
URL:
<http://lists.mindrot.org/pipermail/openssh-unix-dev/attachments/20180820/1fa51f48/attachment-0001.asc>

Michael

2018-Aug-20 18:32 UTC

head link

Call for testing: OpenSSH 7.8

On 20/08/2018 19:14, Ingo Schwarze wrote:>> There is a program - /usr/lib/nls/lsmle (just learned about it!)
> That's non-standard.  The standard program for similar purposes is
> locale(1), though that usually won't report CODESET, but only
> LC_CTYPE.
p.s. aka FYI: AIX also has the program locale(1), but that gives a lot
less information, imho.

And, maybe also a bit misleading - at least for the ignorant. I would
have expected UTF-8, but instead, it uses EN_US and a symbolic link.

root at x064:[/data/prj/openbsd/mindrot/openssh-7.8.0.20]locale
LANG=EN_US
LC_COLLATE="EN_US"
LC_CTYPE="EN_US"
LC_MONETARY="EN_US"
LC_NUMERIC="EN_US"
LC_TIME="EN_US"
LC_MESSAGES="EN_US"
LC_ALL
While the default language locale(1) is:

root at x066:[/data/prj/python/python3-3.8]locale
LANG=en_US
LC_COLLATE="en_US"
LC_CTYPE="en_US"
LC_MONETARY="en_US"
LC_NUMERIC="en_US"
LC_TIME="en_US"
LC_MESSAGES="en_US"
LC_ALL
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 488 bytes
Desc: OpenPGP digital signature
URL:
<http://lists.mindrot.org/pipermail/openssh-unix-dev/attachments/20180820/444a3f21/attachment.asc>

Ingo Schwarze

2018-Aug-20 19:39 UTC

head link

Call for testing: OpenSSH 7.8

Hi Michael,

Michael wrote on Mon, Aug 20, 2018 at 08:32:03PM +0200:
> p.s. aka FYI: AIX also has the program locale(1), but that gives
> a lot less information, imho.
Sure it has, it's POSIX, and sure it provides less info than you
have shown from the non-standard tool.  But by definition, it is
unclear what the output of a non-standard tool may mean.
> And, maybe also a bit misleading - at least for the ignorant. I would
> have expected UTF-8, but instead, it uses EN_US and a symbolic link.
> 
> root at x064:[/data/prj/openbsd/mindrot/openssh-7.8.0.20]locale
> LANG=EN_US
> LC_COLLATE="EN_US"
> LC_CTYPE="EN_US"
> LC_MONETARY="EN_US"
> LC_NUMERIC="EN_US"
> LC_TIME="EN_US"
> LC_MESSAGES="EN_US"
> LC_ALL> 
> While the default language locale(1) is:
> 
> root at x066:[/data/prj/python/python3-3.8]locale
> LANG=en_US
> LC_COLLATE="en_US"
> LC_CTYPE="en_US"
> LC_MONETARY="en_US"
> LC_NUMERIC="en_US"
> LC_TIME="en_US"
> LC_MESSAGES="en_US"
> LC_ALLSeems quite unusual indeed and hard to understand - but AIX is free
to call their locales however they want.  Neither locale names nor
the return value from nl_langinfo(CODESET) are standardized.
According to POSIX, they need contain elements relating to languages,
territories, encodings, or even filenames, and the return value
from nl_langinfo(CODESET) need not match the LC_CTYPE variable.
AIX would be free to ask their users to specify "apples" or
"oranges"
in LC_CTYPE and yet report back "bananas" from nl_langinfo(CODESET).
;-)

Yours,
  Ingo

Val Baranov

2018-Aug-20 20:12 UTC

head link

Call for testing: OpenSSH 7.8

Ingo Schwarze wrote on Aug 20, 2018 at 1:15 PM +0500;
> While there is no need to cater for any potential locale that users might
wilfully select, we should probably try to show complete messages to
> users who specifically select the POSIX locale.
>
> Admittedly, AIX is weird in calling ASCII "ISO8859-1", which is
probably going to mean something different elsewhere.  But given that it is very
> unlikely that anything another system calls ISO8859-1 is an unsafe
(ASCII-incompatible or state-dependent) encoding, i'm proposing the
> following patch.
The patch has been applied - no error for UTF-8. Completed with all tests
"make tests" - no errors on AIX 7.1.

Damien Miller

2018-Aug-21 02:04 UTC

head link

Call for testing: OpenSSH 7.8

On Mon, 20 Aug 2018, Ingo Schwarze wrote:
> Hi,
> 
> Michael Felt wrote on Mon, Aug 20, 2018 at 05:00:17PM +0200:
> 
> > ./nl_langinfo
> > setlocale -> "C"
> > nl_langinfo -> "ISO8859-1"
> 
> Thanks, that is helpful.
> 
> So i think i was wrong and Damien was right.  This means that
> OpenSSH returns truncated messages when non-ASCII bytes occur
> in them, even when the user requests LC_CTYPE=POSIX.
> That's not good.
> 
> While there is no need to cater for any potential locale that users
> might wilfully select, we should probably try to show complete
> messages to users who specifically select the POSIX locale.
> 
> Admittedly, AIX is weird in calling ASCII "ISO8859-1", which is
> probably going to mean something different elsewhere.  But given
> that it is very unlikely that anything another system calls ISO8859-1
> is an unsafe (ASCII-incompatible or state-dependent) encoding, i'm
> proposing the following patch.
> 
> I suggest adding some comments because otherwise, we will
> eventually forget where all these strings came from.
> 
> OK?
ok, djm@ (I'd prefer the comment before the return statement, but up to you)
> Index: utf8.c
> ==================================================================> RCS
file: /cvs/src/usr.bin/ssh/utf8.c,v
> retrieving revision 1.7
> diff -u -p -r1.7 utf8.c
> --- utf8.c	31 May 2017 09:15:42 -0000	1.7
> +++ utf8.c	20 Aug 2018 17:11:33 -0000
> @@ -51,9 +51,18 @@ dangerous_locale(void) {
>  	char	*loc;
>  
>  	loc = nl_langinfo(CODESET);
> -	return strcmp(loc, "US-ASCII") != 0 && strcmp(loc,
"UTF-8") != 0 &&
> -	    strcmp(loc, "ANSI_X3.4-1968") != 0 && strcmp(loc,
"646") != 0 &&
> -	    strcmp(loc, "") != 0;
> +	return strcmp(loc, "UTF-8") != 0 &&
> +	    strcmp(loc, "US-ASCII") != 0 &&
> +
> +	    /*
> +	     * What nl_langinfo(CODESET) returns for US-ASCII
> +	     * on various operating systems:
> +	     */
> +
> +	    strcmp(loc, "ANSI_X3.4-1968") != 0 &&	/* Linux */
> +	    strcmp(loc, "ISO8859-1") != 0 &&		/* AIX */
> +	    strcmp(loc, "646") != 0 &&			/* Solaris, NetBSD */
> +	    strcmp(loc, "") != 0;			/* Solaris 6 */
>  }
>  
>  static int
> _______________________________________________
> openssh-unix-dev mailing list
> openssh-unix-dev at mindrot.org
> https://lists.mindrot.org/mailman/listinfo/openssh-unix-dev
>

Ingo Schwarze

2018-Aug-21 14:05 UTC

head link

Call for testing: OpenSSH 7.8

Hi Damien,

Damien Miller wrote on Tue, Aug 21, 2018 at 12:04:41PM +1000:
> ok, djm@
Thanks for checking, and thanks to Val and Michael for testing.
I just committed the patch to OpenBSD, others will likely take
care of merging it to -portable.
> (I'd prefer the comment before the return statement, but up to you)
Immediately before the return statement, it looked really confusing,
like:

  /* ASCII on various systems ... blabla */
  return UTF-8 ... blabla

So i moved it up a few more lines and appended it to the existing
comment, keeping the code concise and yet clearly explaining it.

Yours,
  Ingo

openssh unix dev - Aug 2018 - Call for testing: OpenSSH 7.8

Call for testing: OpenSSH 7.8

Call for testing: OpenSSH 7.8

Call for testing: OpenSSH 7.8

Call for testing: OpenSSH 7.8

Call for testing: OpenSSH 7.8

Call for testing: OpenSSH 7.8

Call for testing: OpenSSH 7.8